Zanran - PDF Workbench

What's the problem? 

Zanran’s extraction of tables from PDF files into Excel is very good, but it will not be 100% accurate across thousands of PDF files. When operating at large scale, small errors are always possible.

What's the solution?

To alleviate this problem, we have built Zanran’s PDF Workbench, which allows a human operator to check and amend the extracted tables quickly and easily.

How does it work?

The operator views the original PDF pages overlaid with a computer-generated grid.

The operator can select any cell in the grid to split it, or any group of cells to combine them. Whole rows or columns can be merged or split in the same way by selecting them in the grey areas at the top and left of the grid.

In addition, the operator can add labels to the headers to make it clear what to extract.

After editing, the operator saves the results as an XML file which is subsequently used to generate the Excel worksheet. 
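
As a rough illustration of the editing steps above, the following Python sketch models a grid of cells, combines a group of cells, labels a header and writes the result out as XML. It is not Zanran's implementation: the Cell class, the merge_cells and to_xml functions, and the XML element and attribute names are all hypothetical, chosen only to show the idea.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Cell:
    """One grid cell (hypothetical model): position, span, text and label."""
    row: int
    col: int
    row_span: int = 1
    col_span: int = 1
    text: str = ""
    label: str = ""  # optional header label added by the operator


def merge_cells(cells, top, left, bottom, right):
    """Combine every cell lying fully inside the rectangle into one cell."""
    inside, outside = [], []
    for c in cells:
        fully_inside = (top <= c.row and c.row + c.row_span - 1 <= bottom
                        and left <= c.col and c.col + c.col_span - 1 <= right)
        (inside if fully_inside else outside).append(c)
    if not inside:
        return cells  # nothing selected, grid unchanged
    inside.sort(key=lambda c: (c.row, c.col))
    merged = Cell(top, left, bottom - top + 1, right - left + 1,
                  text=" ".join(c.text for c in inside if c.text))
    return sorted(outside + [merged], key=lambda c: (c.row, c.col))


def to_xml(cells):
    """Serialise the edited grid to an XML string (invented element names)."""
    root = ET.Element("table")
    for c in cells:
        el = ET.SubElement(root, "cell", row=str(c.row), col=str(c.col),
                           rowspan=str(c.row_span), colspan=str(c.col_span))
        if c.label:
            el.set("label", c.label)
        el.text = c.text
    return ET.tostring(root, encoding="unicode")


# A 2x2 grid in which the header was wrongly split into two cells.
grid = [Cell(0, 0, text="Revenue"), Cell(0, 1, text="(USD m)"),
        Cell(1, 0, text="2019"), Cell(1, 1, text="42.0")]
grid = merge_cells(grid, top=0, left=0, bottom=0, right=1)  # combine the header
grid[0].label = "header"                                    # tag it for extraction
xml_text = to_xml(grid)
```

Splitting a cell would be the reverse operation: replace one cell that spans several rows or columns with several smaller cells covering the same area.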

If you need very high accuracy in your data extraction, Zanran’s Workbench provides a useful manual checking stage to improve the quality of the Excel file. 

For example:

[Screenshot: the Workbench]

Zanran’s Workbench is a very flexible package. If you would like to discuss any application for editing or tagging, please contact us.