I'm OCRing a bunch of TIFF files with Tesseract, and while it works to some degree, it's nowhere near as accurate as I'd like. Perhaps I'm doing something wrong and could tune it, but I can't find many resources on Tesseract. Am I missing something?<p>Any other recommendations for OCR tools? Ideally it would be free, but I'm willing to pay if it's not too pricey.<p>I've been trying out the trial version of FineReader, and it seems to work pretty well, so I may go with that.<p>Any help is greatly appreciated.
I've had really great success with FineReader. I tried every free OCR tool I could find and, after poor results, went with FineReader.<p>Spend some time on their website so you get the right product; they also list multiple prices for the same products. I got the latest FineReader (after a coupon code I found on Google) for between 130 and 150.<p>(I'm mostly scanning books.)
One thing that improves Tesseract's accuracy dramatically is giving it grayscale TIFF images. Run<p>mogrify -type Grayscale *.tif<p>and then run the converted files through Tesseract to see the difference. No idea why no one mentions this in the documentation.
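If you want to script the grayscale step, here is a minimal Python sketch of the same idea. It assumes ImageMagick's `convert` and the `tesseract` binary are both on your PATH (on newer ImageMagick installs the command may be `magick` instead). The function names and the separate working directory are my own choices for illustration. It uses `convert` rather than `mogrify` because `mogrify` overwrites the files in place, and you probably want to keep the originals.

```python
import subprocess
from pathlib import Path
from typing import List


def grayscale_cmd(src: Path, dst: Path) -> List[str]:
    """Build the ImageMagick command that writes a grayscale copy of src to dst."""
    # Same "-type Grayscale" option as the mogrify one-liner, but written
    # to a new file so the original scan is left untouched.
    return ["convert", str(src), "-type", "Grayscale", str(dst)]


def tesseract_cmd(image: Path, out_base: Path) -> List[str]:
    """Build the tesseract command; tesseract appends '.txt' to out_base itself."""
    return ["tesseract", str(image), str(out_base)]


def ocr_directory(src_dir: Path, work_dir: Path) -> List[Path]:
    """Convert every *.tif in src_dir to grayscale, OCR it, return the text file paths."""
    work_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for tif in sorted(src_dir.glob("*.tif")):
        gray = work_dir / tif.name        # grayscale copy
        out_base = work_dir / tif.stem    # tesseract output base (no extension)
        subprocess.run(grayscale_cmd(tif, gray), check=True)
        subprocess.run(tesseract_cmd(gray, out_base), check=True)
        results.append(work_dir / (tif.stem + ".txt"))
    return results
```

Calling `ocr_directory(Path("scans"), Path("scans_gray"))` (both directory names are hypothetical) would leave the grayscale copies and the `.txt` output side by side in `scans_gray`, which makes it easy to compare against a run on the unconverted files.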