# tesseract

> OCR (Optical Character Recognition) engine.
> More information: <https://github.com/tesseract-ocr/tesseract>.

- Recognize text in an image and save it to `output.txt` (the `.txt` extension is added automatically):

`tesseract {{image.png}} {{output}}`

- Specify a custom language (default is English) with an ISO 639-2 code (e.g. deu = Deutsch = German):

`tesseract -l deu {{image.png}} {{output}}`

- List the ISO 639-2 codes of available languages:

`tesseract --list-langs`

- Specify a custom page segmentation mode (default is 3):

`tesseract --psm {{0_to_13}} {{image.png}} {{output}}`

- List page segmentation modes and their descriptions:

`tesseract --help-psm`
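
The language and page segmentation options above can be combined in one call. The following is a minimal sketch, assuming Tesseract 4 or later (where the option is spelled `--psm`) and an image name that has a file extension. The function name `ocr_image` and the `TESSERACT_BIN` variable are hypothetical helpers for illustration, not part of tesseract.

```shell
# Hypothetical wrapper (not part of tesseract): runs OCR on one image with
# an optional language code and page segmentation mode.
# TESSERACT_BIN is an assumed override hook, handy for dry runs.
ocr_image() {
    image=$1
    lang=${2:-eng}       # English is the default language
    psm=${3:-3}          # 3 is the default page segmentation mode
    output=${image%.*}   # drop the extension; tesseract appends .txt itself
    "${TESSERACT_BIN:-tesseract}" -l "$lang" --psm "$psm" "$image" "$output"
}
```

For example, `ocr_image scan.png deu 6` would write the recognized German text to `scan.txt`. Setting `TESSERACT_BIN=echo` prints the arguments instead of running OCR.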