Template Extraction

These endpoints extract data from a given document based on a given template.

Sync Extraction

/ocrsvc/v{ver}/TemplateExtraction is called when the user wants to extract data from a file in one of the specified image formats: .jpg, .jpeg, or .png.

The table shows the required values:


Component	Description
API Method	/ocrsvc/v{ver}/TemplateExtraction
Input	ver: the version of the IDP; ocrDocument: the input file for the OCR engine (.jpg, .jpeg, or .png); documentType: the type of the document; templateFile: a JSON file that represents the template of the document
Output	The response is of JSON type.
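
As a rough illustration of how a client might assemble these inputs, the sketch below builds the endpoint URL and collects the three inputs after checking the file extension. Only the path and the parameter names come from the table. The host `https://idp.example.com`, the helper name `build_sync_request`, and the idea of passing the inputs as form fields are assumptions. No request is sent.

```python
from pathlib import Path

# Extensions the sync endpoint accepts, per the table above.
SYNC_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def build_sync_request(base_url, ver, document_path, document_type, template_path):
    """Return the URL and input fields for a TemplateExtraction call.

    Hypothetical helper: base_url and the use of form fields are
    assumptions; the path and parameter names come from the table.
    """
    suffix = Path(document_path).suffix.lower()
    if suffix not in SYNC_EXTENSIONS:
        raise ValueError(f"unsupported file type for sync extraction: {suffix!r}")
    url = f"{base_url.rstrip('/')}/ocrsvc/v{ver}/TemplateExtraction"
    fields = {
        "ocrDocument": str(document_path),
        "documentType": document_type,
        "templateFile": str(template_path),
    }
    return url, fields


# Example with placeholder values (the host is an assumption).
url, fields = build_sync_request(
    "https://idp.example.com", 1, "order_form.png", "Direct Order Form", "directorder1.json"
)
print(url)
```

Rejecting an unsupported extension on the client side avoids uploading a file the sync endpoint is not documented to accept, such as a .pdf.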

Async Extraction

/ocrsvc/v{ver}/AysncTemplateExtraction is called when the user wants to extract data from a file in one of the specified formats, such as .pdf.

The table shows the required values:


Component	Description
API Method	/ocrsvc/v{ver}/AysncTemplateExtraction
Input	ver: the version of the IDP; ocrDocument: the input file for the OCR engine (.pdf); documentType: the type of the document; pageNo: the pages to process (for example, for a PDF with 10 pages, values could be 1,2,3 or 1-3 or 1-3, 7-10); templateFile: a JSON file that represents the template of the document
Output	The response is of JSON type.
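
The pageNo formats listed in the table (single pages, a range, or several ranges) can be checked on the client before a request is made. The sketch below expands such a string into a list of page numbers. The helper name `parse_page_spec` and the validation rules are assumptions for illustration; how the service itself interprets pageNo is not described here.

```python
def parse_page_spec(spec, page_count):
    """Expand a pageNo string such as "1-3, 7-10" into a sorted list of pages.

    Hypothetical client-side helper; the service's own parsing rules
    are not documented here.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError("empty page entry")
        if "-" in part:
            # A range such as "7-10" covers both endpoints.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            # A single page such as "2".
            start = end = int(part)
        if not 1 <= start <= end <= page_count:
            raise ValueError(f"page entry {part!r} is outside 1-{page_count}")
        pages.update(range(start, end + 1))
    return sorted(pages)


# The three example values from the table, for a 10-page PDF.
for spec in ("1,2,3", "1-3", "1-3, 7-10"):
    print(spec, "->", parse_page_spec(spec, 10))
```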

Template Creation

"DocumentTypeId": "Direct Order Form",
  "TemplateID": "directorder1",
  "Page": [
    {
      "PageID": "1",
      "StartReg": "",
      "EndReg": "",
      "Fields": [
        {
          "FieldName": "Entity_Name",
          "Type": "Text",
          "ExtractionParser": [
            {
              "Type": "REExtractor",
              "PaserInput": {
                "regtext": "Agreement between ([\\s\\S]*?) and [\\s\\S]*? \\(\"Licensee\"\\)"
              }
            }
          ]
        }