Suggestion: OCR plugin wrapper for open-source Tesseract-OCR

Description

Greenshot 1.0 ships with a plugin (wrapper) for Microsoft Office Document Imaging (MODI) OCR, which works only if MODI is available on the machine. MODI, however, requires a Microsoft Office license and at least a partial installation of the required modules and languages. Often, not all of the required language packs are available. Not everyone likes MODI, even though it has high detection quality.

Suggestion:
=========

* implement an alternative plugin as a wrapper for the open-source OCR command-line software "Tesseract"
* the plugin would also require ImageMagick (convert) for preprocessing images (resizing +300%, conversion to TIFF)
* advantage: language data is available for almost any language
* advantage: open-source
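
The preprocessing and recognition steps listed above could be chained roughly as follows. This is only a minimal sketch in Python, not plugin code: the function names, the temporary file names, and the default language are hypothetical. It assumes that `convert` and `tesseract` are on the PATH, that "resizing +300%" means ImageMagick's `-resize 300%`, and that the requested Tesseract language data is installed.

```python
from __future__ import annotations

import subprocess
import tempfile
from pathlib import Path


def build_convert_command(source: Path, target_tiff: Path, scale_percent: int = 300) -> list[str]:
    """ImageMagick call: resize the capture and write it as TIFF.

    Assumption: "resizing +300%" is interpreted as `-resize 300%`.
    """
    return ["convert", str(source), "-resize", f"{scale_percent}%", str(target_tiff)]


def build_tesseract_command(tiff: Path, output_base: Path, language: str = "eng") -> list[str]:
    """Tesseract call: recognized text is written to <output_base>.txt."""
    return ["tesseract", str(tiff), str(output_base), "-l", language]


def ocr_screenshot(source: Path, language: str = "eng") -> str:
    """Run both tools in a temporary directory and return the recognized text."""
    with tempfile.TemporaryDirectory() as workdir:
        tiff = Path(workdir) / "capture.tif"      # hypothetical intermediate file
        output_base = Path(workdir) / "capture"   # Tesseract appends ".txt" itself
        subprocess.run(build_convert_command(source, tiff), check=True)
        subprocess.run(build_tesseract_command(tiff, output_base, language), check=True)
        return output_base.with_suffix(".txt").read_text(encoding="utf-8")
```

A call such as `ocr_screenshot(Path("capture.png"), language="deu")` would then return the recognized text, provided both tools and the German language data are installed. An actual Greenshot plugin would do the equivalent from within the plugin wrapper; the sketch only illustrates the order of the two external calls.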

References:
* https://code.google.com/p/tesseract-ocr/
* https://de.wikipedia.org/wiki/Tesseract_%28Software%29
* https://en.wikipedia.org/wiki/Tesseract_%28software%29

Environment

None

Status

Assignee

Unassigned

Reporter

Wikinaut

Labels

Priority