Back when OmniPage was developed and sold by Caere, there was an ongoing battle with Xerox’s OCR software. It seemed that a new version of each product was released on a monthly basis, which helps explain why OmniPage Pro is now on version 11.
This optical character recognition software is pretty much best of breed. It can extract text from PDF files, recognise text in over 100 languages and output in HTML format. It also has an excellent layout engine, recognising headers and footers, tables, columns, coloured text and images.
Once you’ve installed the software and configured it to work with your scanner (most models are supported, and if yours isn’t, you can import scanned images manually), the easiest way to use the program is via the three-step wizard. First, either scan in your pages or import your image files. The latter can be in just about any format: GIF, TIFF, BMP, PCX and so on.
Next, let the software handle the OCR process automatically. You can do things manually, such as defining graphics zones and table areas so that the program knows what it’s looking at, but we found that this was rarely necessary. OmniPage Pro breaks the page up into zones automatically, and seems to handle tables and images particularly well. OCR is processor-intensive, so expect it to take up to a minute per page on a moderately fast machine, depending on the complexity of the page.
Finally, proof the document with the built-in spell-checker before exporting it in the format of your choice. Most word-processor and spreadsheet formats are supported, as is HTML, and you can perform OCR operations from within Microsoft Office (97, 2000 and XP) if you so desire.
Like voice recognition software, OCR tools work very well if you follow the rules. If you don’t, you get gibberish. We got the best results by letting OmniPage Pro handle our scanner directly. If you’re importing pre-scanned images, they should ideally be 300dpi, with at least 16 levels of grey in the text. With anything less than that, recognition suffers badly. Also, the quoted 99 percent accuracy applies only to laser-printed documents in ‘standard’ fonts, so don’t expect a faxed document in a script font to come out particularly well.
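If you have a pile of pre-scanned images to import, it may be worth checking them against those two guidelines first. The following is only a rough sketch of such a check, not anything OmniPage Pro provides: the function name `check_scan` and both constants are our own invention, and counting the distinct grey values across the whole image is a crude stand-in for "levels of grey in the text". It assumes you already know the scan resolution and have the pixel values as plain numbers.

```python
# Hypothetical pre-import check based on the guidelines in this review.
# The thresholds come from the text; the names are assumptions.
MIN_DPI = 300
MIN_GREY_LEVELS = 16


def check_scan(dpi, pixels):
    """Return a list of warnings for one greyscale scan.

    dpi    -- scan resolution in dots per inch
    pixels -- iterable of greyscale pixel values (e.g. 0-255)
    """
    warnings = []
    if dpi < MIN_DPI:
        warnings.append(f"resolution {dpi}dpi is below {MIN_DPI}dpi")
    # Count the distinct grey values actually present in the image.
    levels = len(set(pixels))
    if levels < MIN_GREY_LEVELS:
        warnings.append(
            f"only {levels} grey levels found, expected at least {MIN_GREY_LEVELS}"
        )
    return warnings


# A 300dpi scan with 16 distinct grey values passes both checks.
print(check_scan(300, range(0, 256, 16)))
# A 200dpi, pure black-and-white fax fails both.
print(check_scan(200, [0, 255]))
```

An empty list means the scan meets both guidelines; each warning names the guideline that was missed, so a low-resolution fax is flagged before you spend time trying to recognise it.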
Contact: 0870 870 8085