Google's Optical Character Recognition (OCR) software works for 248+ languages

Subhashish Panigrahi

Developed as a community project between 1995 and 2006 and later taken over by Google, Tesseract is considered one of the most accurate OCR engines and works for over 60 languages. However, detecting these elements is difficult and we may not always succeed. Native speakers of several Indian languages (Bangla, Malayalam, Kannada, Odia, Tamil, and Telugu) also commented on a Facebook post with feedback after testing the OCR. A tutorial covers converting text in Odia, an Indian language, from a scanned image using Google’s OCR. If you have additional feedback on the article or technology, please let us know in the comments.

Tags:

Optical character recognition

Tesseract (software)

Computing

Technology

Human communication

Software

Language