# Get Text from OCR Page

Reads text from the specified page of a PDF document using optical character recognition (OCR).

<table data-header-hidden><thead><tr><th width="210" valign="top"></th><th width="329" valign="top"></th></tr></thead><tbody><tr><td valign="top">File Name</td><td valign="top">[Text] The name of the PDF file from which the text will be extracted. You can enter the full file name including the path.</td></tr><tr><td valign="top">Page Number</td><td valign="top">[Number] The number of the page from which the text will be extracted. Numbering starts from 1.</td></tr><tr><td valign="top">Text Language</td><td valign="top">Select the language of the text to be recognized.</td></tr><tr><td valign="top">Module</td><td valign="top">Select the OCR module used to recognize text in the page image.</td></tr><tr><td valign="top">Segmentation Method</td><td valign="top"><p>[Text] The recognized text can be automatically segmented into sections separated by commas.</p><p>Segmentation method:</p><ul><li>0 - Use the specified block delimiter;</li><li>1 - Automatic segmentation (for Yandex only);</li><li>2 - Segment by empty spaces longer than the specified number of characters.</li></ul></td></tr><tr><td valign="top">Block Delimiter</td><td valign="top"><p>[Number] The hexadecimal code of the character used as the block delimiter. For example, a space has code 20 and a tab has code 9.</p><p>Used when segmentation method 0 is selected.</p></td></tr><tr><td valign="top">Number of Characters</td><td valign="top">[Number] The length of empty space in the recognized text, measured in characters. Used when segmentation method 2 is selected.</td></tr><tr><td valign="top">Zoom</td><td valign="top"><p>[Number] The factor by which the image is enlarged before recognition.</p><p>Depending on the engine used, enlarging the image 2 or 3 times can improve recognition quality.</p></td></tr><tr><td valign="top">Auto Rotate Page</td><td valign="top">Automatically rotate the page during recognition.</td></tr><tr><td valign="top">Process Annotations</td><td valign="top">When selected, annotations are also processed.</td></tr><tr><td valign="top">Result</td><td valign="top">[Text] Returns the text extracted from the page.</td></tr><tr><td valign="top">Error Handling Level</td><td valign="top"><p>Select the error handling level. Possible values:</p><ul><li>"Default" - default;</li><li>"Ignore" - errors are ignored;</li><li>"Handle" - errors are handled.</li></ul><p>If "Default" is selected, the value from the "Start" block of this diagram will be used.</p></td></tr><tr><td valign="top">Message Level</td><td valign="top"><p>Select the level of messages that the block outputs during operation. Possible values:</p><ul><li>"Default" - default;</li><li>"Release" - output is disabled;</li><li>"Debug" - main information output;</li><li>"Detailed" - detailed information output.</li></ul><p>If "Default" is selected, the value from the "Start" block of this diagram will be used.</p></td></tr><tr><td valign="top">Error Text</td><td valign="top">[Text] Returns detailed information about the error if the block fails to execute correctly.</td></tr></tbody></table>
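
To illustrate how segmentation methods 0 and 2 relate to the Block Delimiter and Number of Characters parameters, here is a rough Python sketch. It is not the block's actual implementation: the function `segment_text` and its parameter names are hypothetical, and the sketch assumes that the resulting sections are joined with commas and that "empty space" means a run of space characters. Method 1 (automatic segmentation for Yandex) is engine-specific and is not modeled.

```python
import re


def segment_text(text: str, method: int,
                 delimiter_hex: str = "20", num_chars: int = 2) -> str:
    """Hypothetical approximation of segmentation methods 0 and 2.

    Assumption: sections are returned as one comma-separated string.
    """
    if method == 0:
        # The Block Delimiter is given as a hexadecimal character code,
        # e.g. "20" for a space or "9" for a tab.
        delimiter = chr(int(delimiter_hex, 16))
        parts = text.split(delimiter)
    elif method == 2:
        # Split on runs of spaces LONGER than num_chars,
        # i.e. at least num_chars + 1 consecutive spaces.
        parts = re.split(r" {%d,}" % (num_chars + 1), text)
    else:
        raise ValueError("only methods 0 and 2 are sketched here")
    # Drop empty sections and surrounding whitespace before joining.
    return ",".join(p.strip() for p in parts if p.strip())


# Method 0 with a tab delimiter (hex code 9).
by_tab = segment_text("Name\tQty\tPrice", 0, delimiter_hex="9")

# Method 2: gaps of 3 or more spaces separate sections; 2 spaces do not.
by_gap = segment_text("Invoice  No   42    Total", 2, num_chars=2)
```

In this sketch, a gap must be strictly longer than the configured number of characters to start a new section, which follows the wording of method 2 above.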

