Zanran Blog

Financial reports – converting PDFs.

Posted by Jon Goldhill on 29-Nov-2016 14:59:19

PDF-to-Excel conversion for better financial analysis

When a PLC releases its annual and interim accounts, it’s driven by both duty and desire. On the one hand, it has an obligation to publish audited figures for the world in general to see, and for its shareholders in particular. On the other, it understandably wishes to present its financial performance in as positive a light as possible. The result is that, in addition to the obligatory tables showing balance sheets, earnings statements and so forth, we’re given statistics showing, say, growth in market share by sector or by geography.

Investors and market watchers depend on all this data, but that doesn’t make them its

Read More

Topics: Data Extraction, PDF-to-Excel

PDF-to-Excel - three reasons why extracting tables from PDFs is hard

Posted by Jon Goldhill on 16-Nov-2016 17:35:10

Zanran has had to put a huge amount of effort into its PDF-to-Excel software. What seems intuitive to a human – it “looks like a table” – is full of issues, exceptions and special cases for computer software.

In this, the first of a number of articles about content extraction from PDFs, I want to look at some of the fundamental problems.

Read More

Topics: Data Extraction, technology

Buried Data

Posted by Zanran News on 14-Oct-2016 17:33:00


This post is about the amount of data that you won’t normally find – the data in graphs and charts.

One of the reasons we originally got into PDFs was that the quality of the content – especially numeric content – was so high. We were finding far less junk than on HTML pages. I appreciate that this is necessarily a generalisation – no insult is intended to the producers of top-quality content in HTML.

Read More

Topics: Data Extraction

Cloud-based PDF to Excel - review

Posted by Zanran News on 16-Mar-2016 14:03:00

If you’ve ever tried to copy a table from a PDF document, you’ll know it’s a pain. You can only do so one cell at a time – which is very laborious. However, there are a lot of companies on the web that claim to be able to do ‘PDF to Excel’ processing – either cheaply or for free.

We’ve reviewed these services to establish the quality of their technologies.

Read More

Topics: Data Extraction