tesseract: tesseract (Tesseract Open Source OCR Engine)
tesseract:
tesseract: This package contains an OCR engine - libtesseract and a command line
tesseract: program - tesseract. Tesseract 4 adds a new neural net (LSTM) based
tesseract: OCR engine which is focused on line recognition, but also still
tesseract: supports the legacy Tesseract OCR engine of Tesseract 3 which works by
tesseract: recognizing character patterns.
tesseract: Tesseract has Unicode (UTF-8) support, and can recognize more than 100
tesseract: languages "out of the box". It supports various output formats:
tesseract: plain text, hOCR (HTML), PDF, invisible-text-only PDF, TSV.
tesseract: The lead developer is Ray Smith. The maintainer is Zdenko Podobny.