tesseract: tesseract (Tesseract Open Source OCR Engine)
tesseract: 
tesseract: This package contains an OCR engine (libtesseract) and a command-line
tesseract: program (tesseract). Tesseract 4 adds a new neural net (LSTM) based
tesseract: OCR engine which is focused on line recognition, but also still
tesseract: supports the legacy Tesseract OCR engine of Tesseract 3 which works by
tesseract: recognizing character patterns.
tesseract: Tesseract has Unicode (UTF-8) support, and can recognize more than 100
tesseract: languages "out of the box". It supports various output formats:
tesseract: plain text, hOCR (HTML), PDF, invisible-text-only PDF, TSV.
tesseract: The lead developer is Ray Smith. The maintainer is Zdenko Podobny.
