# Preprocessing the Document

Let's walk through creating a template, using the document "Cost of Work and Expenses Report" as an example.

Before the robot starts creating the template, the document must be recognized by the robot and saved to the specified path.

For example, to recognize a document in .pdf format, we add the "Get Text from OCR Page" block to the project script. In the result settings on the "Output" tab, we specify $DocPageText.

<figure><img src="https://sherparpa.ru/wp-content/uploads/2023/11/image3-573w305h.png" alt=""><figcaption></figcaption></figure>

Keep in mind that in some recognized files, the table column names may not match the column names the robot is set to output. Some columns may also be missing entirely, and some table borders may be absent.

You need to check the column names and match them in the project script "Define Columns.process". They are set in the settings on the right, under "Properties Panel" > "Variables".

<figure><img src="https://sherparpa.ru/wp-content/uploads/2023/11/image4-567w266h.png" alt=""><figcaption></figcaption></figure>
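The matching itself is configured in the "Variables" settings of "Define Columns.process", not in code. As an illustration of the idea only, here is a minimal Python sketch that maps recognized headers to the robot's column names. The header variants in `COLUMN_ALIASES` and the helper names `normalize` and `match_columns` are hypothetical and are not part of the Sherpa RPA project.

```python
import re

# Robot column names as used in this guide. The header variants on the right
# are hypothetical examples of what OCR might return for each column.
COLUMN_ALIASES = {
    "Summa": ["sum", "amount", "total"],
    "Price": ["price", "unit price"],
    "SummaNDS": ["sum with vat", "amount incl. vat"],
    "Stavka": ["rate", "vat rate"],
    "Name": ["name", "title", "description"],
    "Count": ["quantity", "qty"],
}


def normalize(header: str) -> str:
    """Lowercase the header and collapse whitespace to ignore OCR spacing noise."""
    return re.sub(r"\s+", " ", header).strip().lower()


def match_columns(recognized_headers):
    """Map each recognized header to a robot column name, or None if unknown."""
    lookup = {
        normalize(alias): column
        for column, aliases in COLUMN_ALIASES.items()
        for alias in aliases
    }
    return {header: lookup.get(normalize(header)) for header in recognized_headers}


if __name__ == "__main__":
    # "Discount" has no alias, so it maps to None and needs manual review.
    print(match_columns(["Unit  Price", "QTY", "Discount"]))
```

Headers that map to `None` are the ones that would need to be matched by hand in the project settings.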

For example, after processing the document "Cost of Work and Expenses Report", the robot should output the following columns:

<figure><img src="https://sherparpa.ru/wp-content/uploads/2023/11/image5-144w118h.png" alt=""><figcaption></figcaption></figure>

<table data-header-hidden><thead><tr><th width="147"></th><th></th></tr></thead><tbody><tr><td>Summa</td><td>sum</td></tr><tr><td>Price</td><td>price</td></tr><tr><td>SummaNDS</td><td>sum with VAT</td></tr><tr><td>Stavka</td><td>rate</td></tr><tr><td>Name</td><td>name/title</td></tr><tr><td>Count</td><td>quantity</td></tr></tbody></table>

However, when creating the template, we see that some of this data is missing from the document itself.
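A quick way to see which expected columns a document lacks is to compare the found columns against the full list. The sketch below assumes the six column names from the table above. The `found` list is a made-up example, not the actual contents of the report, and `missing_columns` is a hypothetical helper.

```python
# Columns the robot is expected to output, as listed in this guide.
EXPECTED_COLUMNS = ["Summa", "Price", "SummaNDS", "Stavka", "Name", "Count"]


def missing_columns(found_columns):
    """Return the expected columns that were not found, in the expected order."""
    found = set(found_columns)
    return [column for column in EXPECTED_COLUMNS if column not in found]


if __name__ == "__main__":
    # Hypothetical result of recognizing a document without VAT columns.
    found = ["Name", "Count", "Price", "Summa"]
    print(missing_columns(found))
```

An empty result means every expected column was found in the document.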
