From b4f9a574714c9db740516f0f7a2eade8199c1bb1 Mon Sep 17 00:00:00 2001
From: Adrian Sieber
Date: Fri, 24 Feb 2017 07:39:35 +0000
Subject: [PATCH] tesseract: add page (#1267)

---
 pages/common/tesseract.md | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
 create mode 100644 pages/common/tesseract.md

diff --git a/pages/common/tesseract.md b/pages/common/tesseract.md
new file mode 100644
index 000000000..94452e235
--- /dev/null
+++ b/pages/common/tesseract.md
@@ -0,0 +1,23 @@
+# tesseract
+
+> OCR (Optical Character Recognition) engine.
+
+- Recognize text in an image and save it to `output.txt`. The file extension MUST not be mentioned:
+
+`tesseract {{image.png}} {{output}}`
+
+- Specify a custom language (default is English) with an ISO 639-2 code (e.g. deu = Deutsch = German):
+
+`tesseract -l deu {{image.png}} {{output}}`
+
+- List the ISO 639-2 codes of available languages:
+
+`tesseract --list-langs`
+
+- Specify a custom page segmentation mode (default is 3):
+
+`tesseract -psm {{0_to_10}} {{image.png}} {{output}}`
+
+- List page segmentation modes and their descriptions:
+
+`tesseract --help-psm`
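
Note on the first example in the page: because `tesseract` appends `.txt` to the output argument itself, a script that calls it usually has to strip the image's extension first. The sketch below shows one way to do that. The helper names `ocr_output_base` and `ocr_command` are hypothetical and not part of this patch, the sketch assumes the image file name has an extension, and it only prints the command rather than running `tesseract`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): derive the output base name
# by stripping the last extension, since tesseract adds ".txt" on its own.
ocr_output_base() {
    image=$1
    printf '%s\n' "${image%.*}"
}

# Hypothetical helper: print the tesseract command line for one image.
# Falling back to "eng" mirrors the page's note that English is the default.
ocr_command() {
    image=$1
    lang=${2:-eng}
    printf 'tesseract -l %s %s %s\n' "$lang" "$image" "$(ocr_output_base "$image")"
}

# Prints: tesseract -l deu scan.png scan
ocr_command scan.png deu
```

Running the printed command would, per the page's description, write the recognized text to `scan.txt`.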