Document extraction

Extracts key-value pairs from the given document and predicts the type of each field. For example, if the key-value pair "Invoice: 101" is found in a document, its type is predicted as "INVOICE NUMBER". For now, type prediction is limited to the types identified by the AWS Textract OCR provider.

Sync Extraction

/ocrsvc/v{ver}/DocumentExtraction is called when the user wants to extract all key-value pairs.

The table shows the required values:

Component Description
API Method /ocrsvc/v{ver}/DocumentExtraction
Input
  • ver - The version of the IDP
  • ocrDocument - The document to process. Accepts .jpg, .jpeg, and .png files.
Output The response is returned as JSON.

Async Extraction

/ocrsvc/v{ver}/AsyncDocumentExtraction is called when the user wants to extract all key-value pairs and the OCR document is a PDF file.
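Because image files go to the synchronous endpoint and PDF files go to the asynchronous one, a client could pick the endpoint from the file extension. The following is a minimal sketch of that routing, not part of the service itself: the function name `extraction_path` is hypothetical, and the format of the `ver` value (for example "1") is an assumption.

```python
from pathlib import PurePath

# Extensions taken from the Input descriptions of the two endpoints.
SYNC_EXTENSIONS = {".jpg", ".jpeg", ".png"}
ASYNC_EXTENSIONS = {".pdf"}


def extraction_path(filename: str, ver: str) -> str:
    """Return the endpoint path for a document, based on its extension.

    `ver` replaces the {ver} placeholder; its exact format is an assumption.
    """
    ext = PurePath(filename).suffix.lower()
    if ext in SYNC_EXTENSIONS:
        return f"/ocrsvc/v{ver}/DocumentExtraction"
    if ext in ASYNC_EXTENSIONS:
        return f"/ocrsvc/v{ver}/AsyncDocumentExtraction"
    raise ValueError(f"Unsupported file type: {filename!r}")
```

The extension check is case-insensitive here, so `scan.PNG` is treated like `scan.png`. Whether the service applies the same rule is not stated in this document.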

The table shows the required values:

Component Description
API Method /ocrsvc/v{ver}/AsyncDocumentExtraction
Input
  • ver - The version of the IDP
  • ocrDocument - The document to process. Accepts .pdf files.
  • pageNo - The pages to process, given as individual page numbers, ranges, or a combination of both. For example, for a 10-page PDF, valid values include 1,2,3 or 1-3 or 1, 3-5 or 1-3, 7-10
Output A Task ID is generated.
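A client may want to check a pageNo value before submitting a PDF. The sketch below expands a value such as "1-3, 7-10" into individual page numbers and rejects anything outside the document. It is a client-side illustration only: the function name `parse_page_spec` is hypothetical, and the validation rules (1-based pages, no reversed ranges) are assumptions inferred from the examples above, not documented service behavior.

```python
def parse_page_spec(spec: str, total_pages: int) -> list[int]:
    """Expand a pageNo value such as "1-3, 7-10" into sorted page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"Empty entry in page specification: {spec!r}")
        if "-" in part:
            # A range such as "3-5" covers both end points.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            # A single page such as "7".
            start = end = int(part)
        if not 1 <= start <= end <= total_pages:
            raise ValueError(f"Page range {part!r} is outside 1-{total_pages}")
        pages.update(range(start, end + 1))
    return sorted(pages)
```

For a 10-page PDF, `parse_page_spec("1, 3-5", 10)` yields pages 1, 3, 4, and 5, while a value such as "11" raises `ValueError`.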

Response output

{
  "ExtractionData": [
    {
      "FieldName": "Name of the extracted entity",
      "FieldValue": "Extracted text value, possibly multi-line",
      "FieldGeometry": [
        "[Left, Top, Width, Height] of the field name",
        "[Left, Top, Width, Height] of the field value"
      ],
      "Confidence": ["Confidence score for each extracted value"],
      "PageNo": "Page number where the field was found",
      "Type": {
        "Value": "Category of the extracted field (e.g., ADDRESS, DATE, etc.)",
        "Confidence": "Confidence score for field classification"
      }
    }
  ],
  "_metadata": {
    "TotalFields": "Total number of extracted fields",
    "Confidence": "Overall confidence score of the extraction",
    "TaskID": "Unique identifier for the OCR processing job",
    "OcrProvider": "Name of the OCR service provider used",
    "TenantID": "Identifier for the tenant or client using the service",
    "NumberOfPages": "Total number of pages in the document"
  }
}
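To show how a client might consume this structure, the sketch below groups extracted fields by their predicted type. The sample payload is made up for illustration: it follows the field names above, but the values, the numeric types, and the 0-100 confidence scale are assumptions. The function name `fields_by_type` is hypothetical.

```python
import json

# Made-up payload for illustration only. It follows the structure above,
# but the values and the numeric types are assumptions.
SAMPLE_RESPONSE = """
{
  "ExtractionData": [
    {
      "FieldName": "Invoice",
      "FieldValue": "101",
      "FieldGeometry": [[0.10, 0.10, 0.20, 0.05], [0.35, 0.10, 0.10, 0.05]],
      "Confidence": [98.2],
      "PageNo": 1,
      "Type": {"Value": "INVOICE NUMBER", "Confidence": 95.0}
    },
    {
      "FieldName": "Date",
      "FieldValue": "2024-01-15",
      "FieldGeometry": [[0.10, 0.20, 0.20, 0.05], [0.35, 0.20, 0.15, 0.05]],
      "Confidence": [91.4],
      "PageNo": 1,
      "Type": {"Value": "DATE", "Confidence": 62.5}
    }
  ],
  "_metadata": {"TotalFields": 2, "NumberOfPages": 1}
}
"""


def fields_by_type(response: dict, min_type_confidence: float = 0.0) -> dict:
    """Map each predicted type to a list of (FieldName, FieldValue) pairs.

    Fields whose Type.Confidence is below `min_type_confidence` are skipped.
    """
    grouped = {}
    for field in response.get("ExtractionData", []):
        field_type = field.get("Type") or {}
        if field_type.get("Confidence", 0.0) < min_type_confidence:
            continue
        key = field_type.get("Value", "UNKNOWN")
        grouped.setdefault(key, []).append(
            (field["FieldName"], field["FieldValue"])
        )
    return grouped


response = json.loads(SAMPLE_RESPONSE)
invoice_fields = fields_by_type(response, min_type_confidence=90.0)
```

With the sample payload, only the "INVOICE NUMBER" field passes the 90.0 threshold. A suitable threshold depends on the confidence scale the service actually returns, which this document does not specify.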