Building Resilient Systems on AWS: Learn how to design and implement a resilient, highly available, fault-tolerant infrastructure on AWS.

How to Detect Text in Images

By David Walsh on June 11, 2019

Images are a great way to communicate without text but oftentimes images are used/abused to spread text within social media and advertisements. Text in images also presents an accessibility issue. The truth is that it's important, for any number of reasons, to be able to detect text in image files. The amazing open source tool that makes detecting text in images possible is tesseract OCR!

I recommend using Homebrew to install tesseract:

brew install tesseract

To run tesseract to read text from an image, you can run the following from command line:

tesseract ~/Downloads/MyImage.png ~/Downloads/MyImage.txt -l eng

The command above extracts detected text in the English language (-l eng) into a text file (MyImage.txt). The process is very quick and there are dozens of supported languages.

Let's look at the following example:

The following text is detected:

International
‘Champions
Cup

~- TOUR SQUAD

#AFCTour2018

CECH MUSTAFI GUENDOUZI oziL
LENO SOKRATIS NELSON IWOBI
MARTINEZ MAVROPANOS SMITHROWE = NKETIAH
BELLERIN OSEI-TUTU WILLOCK PEREZ
KOLASINAC ELNENY RAMSEY LACAZETTE
CHAMBERS MAITLAND-NILES MKHITARYAN AUBAMEYANG
HOLDING

There are a number of utilities in different programming languages that plug into tesseract's functionality, but it's important to know the underlying tool! tesseract is an unbelievable tool that you should take advantage of if you need an open source utility for detecting text in an image!

Recent Features

By David WalshFebruary 19, 2019
Welcome to My New Office
My first professional web development was at a small print shop where I sat in a windowless cubical all day. I suffered that boxed in environment for almost five years before I was able to find a remote job where I worked from home. The first...
By David WalshSeptember 21, 2017
How to Create a RetroPie on Raspberry Pi – Graphical Guide
Today we get to play amazing games on our super powered game consoles, PCs, VR headsets, and even mobile devices. While I enjoy playing new games these days, I do long for the retro gaming systems I had when I was a kid: the original Nintendo...

Incredible Demos

By David WalshDecember 5, 2012
spellcheck Attribute
Many useful attributes have been provided to web developers recently: download, placeholder, autofocus, and more. One helpful older attribute is the spellcheck attribute which allows developers to control an elements ability to be spell checked or subject to grammar checks. Simple enough, right?
By David WalshSeptember 22, 2011
Elegant Overflow with CSS Ellipsis
Overflow with text is always a big issue, especially in a programmatic environment. There's always only so much space but variable content to add into that space. I was recently working on a table for displaying user information and noticed that longer strings were...

How to Detect Text in Images

Recent Features

Welcome to My New Office

How to Create a RetroPie on Raspberry Pi – Graphical Guide

Incredible Demos

spellcheck Attribute

Elegant Overflow with CSS Ellipsis

Discussion