Building Resilient Systems on AWS: Learn how to design and implement a resilient, highly available, fault-tolerant infrastructure on AWS.

How to Detect Text in Images

By David Walsh on June 11, 2019

Images are a great way to communicate without text but oftentimes images are used/abused to spread text within social media and advertisements. Text in images also presents an accessibility issue. The truth is that it's important, for any number of reasons, to be able to detect text in image files. The amazing open source tool that makes detecting text in images possible is tesseract OCR!

I recommend using Homebrew to install tesseract:

brew install tesseract

To run tesseract to read text from an image, you can run the following from command line:

tesseract ~/Downloads/MyImage.png ~/Downloads/MyImage.txt -l eng

The command above extracts detected text in the English language (-l eng) into a text file (MyImage.txt). The process is very quick and there are dozens of supported languages.

Let's look at the following example:

The following text is detected:

International
‘Champions
Cup

~- TOUR SQUAD

#AFCTour2018

CECH MUSTAFI GUENDOUZI oziL
LENO SOKRATIS NELSON IWOBI
MARTINEZ MAVROPANOS SMITHROWE = NKETIAH
BELLERIN OSEI-TUTU WILLOCK PEREZ
KOLASINAC ELNENY RAMSEY LACAZETTE
CHAMBERS MAITLAND-NILES MKHITARYAN AUBAMEYANG
HOLDING

There are a number of utilities in different programming languages that plug into tesseract's functionality, but it's important to know the underlying tool! tesseract is an unbelievable tool that you should take advantage of if you need an open source utility for detecting text in an image!

Recent Features

By David WalshNovember 8, 2012
5 More HTML5 APIs You Didn’t Know Existed
The HTML5 revolution has provided us some awesome JavaScript and HTML APIs. Some are APIs we knew we've needed for years, others are cutting edge mobile and desktop helpers. Regardless of API strength or purpose, anything to help us better do our job is a...
By Garris ShiponSeptember 17, 2014
Responsive and Infinitely Scalable JS Animations
Back in late 2012 it was not easy to find open source projects using requestAnimationFrame() - this is the hook that allows Javascript code to synchronize with a web browser's native paint loop. Animations using this method can run at 60 fps and deliver fantastic...

Incredible Demos

By David WalshJune 4, 2012
Firefox Marketplace Animated Buttons
The Firefox Marketplace is an incredibly attractive, easy to use hub that promises to make finding and promoting awesome HTML5-powered web applications easy and convenient. While I don't work directly on the Marketplace, I am privy to the codebase (and so...
By David WalshAugust 24, 2009
Styling CSS Print Page Breaks
It's important to construct your websites in a fashion that lends well to print. I use a page-break CSS class on my websites to tell the browser to insert a page break at strategic points on the page. During the development of my...

How to Detect Text in Images

Recent Features

5 More HTML5 APIs You Didn’t Know Existed

Responsive and Infinitely Scalable JS Animations

Incredible Demos

Firefox Marketplace Animated Buttons

Styling CSS Print Page Breaks

Discussion