Extract PDF Text

Use this activity to extract plain text from a PDF document.

Note: Text inside tables is also extracted, but the table structure is not preserved.

This table lists the properties for the activity.

Property Type | Property Name | Data Type | Description
Common | Continue on error | Boolean | The option to continue the RPA flow even if the activity fails. This check box is selected by default.
Input | File Path | String | The location of the PDF document from which text must be extracted. For example, C:\RPA\test.pdf
Input | Page Range | String | The pages from which text must be extracted. You can specify a single page or a range of pages. For example, 1-3 or 5.
Misc | DisplayName | String | The name to be displayed for the activity.
Output | Extracted Text | String | The text extracted from the specified page range. You must create a variable to store this value.
Output | Response Code | Int32 | Response code for the activity. Possible values:
  • 200 to 299: Indicates a successful response or valid output.
  • 400 to 499: Indicates client error responses.
  • 500 to 599: Indicates server error responses.
Note: You must create and specify an Int32 variable to view the response code.
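
The Page Range and Response Code values can be interpreted in code along the following lines. This is a minimal Python sketch, not part of the product: the function names parse_page_range and classify_response_code are hypothetical, page numbers are assumed to start at 1, only the two documented Page Range forms (a single page such as 5, or a range such as 1-3) are handled, and the success band is assumed to cover every code from 200 through 299.

```python
def parse_page_range(spec: str) -> list[int]:
    """Expand a Page Range value such as "5" or "1-3" into page numbers.

    Assumption: pages are 1-based and a range includes both endpoints.
    """
    text = spec.strip()
    if "-" in text:
        # Range form, for example "1-3"
        start_text, end_text = text.split("-", 1)
        start, end = int(start_text), int(end_text)
    else:
        # Single-page form, for example "5"
        start = end = int(text)
    if start < 1 or end < start:
        raise ValueError(f"invalid page range: {spec!r}")
    return list(range(start, end + 1))


def classify_response_code(code: int) -> str:
    """Map a Response Code value to the category listed in the table."""
    if 200 <= code <= 299:  # assumed upper bound of the success band
        return "success"
    if 400 <= code <= 499:
        return "client error"
    if 500 <= code <= 599:
        return "server error"
    return "unknown"  # codes outside the documented ranges
```

For example, parse_page_range("1-3") yields pages 1, 2, and 3, and classify_response_code(404) yields "client error". A flow could apply the same check to the Int32 variable that stores the response code before it uses the extracted text.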