Free OCR Software (Optical Character Recognition)
Convert Scanned Images with Text to Pure Text Documents
Free OCR software takes an image file containing text (words) and generates a text document containing those words. You usually get such pictures when you scan a document using a scanner. In general, these programs do poorly if the text on your page does not stand out clearly from its background, or if the fonts used are highly stylised.
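To see why contrast matters, here is a minimal sketch of binarisation, a step many OCR pipelines use to separate ink from paper before looking at character shapes. It is a simplified illustration only, not the method of any program listed on this page. The function name `binarize`, the fixed threshold of 128 and the pixel values are all assumptions made up for the example.

```python
def binarize(gray_rows, threshold=128):
    """Map each grayscale pixel (0 = black, 255 = white) to 1 (ink) or 0 (paper)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray_rows]


# Hypothetical strips of pixels. In both, the true pattern is
# ink, paper, ink, paper, ink, paper.
high_contrast = [[20, 230, 25, 240, 30, 235]]
low_contrast = [[120, 140, 125, 135, 130, 126]]

print(binarize(high_contrast))  # [[1, 0, 1, 0, 1, 0]]
print(binarize(low_contrast))   # [[1, 0, 1, 0, 0, 1]]
```

With dark text on a light page, the threshold recovers the pattern exactly. When the text and background are close in brightness, the last two pixels land on the wrong side of the threshold, so the shapes handed to the recogniser are already damaged.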
Some OCR programs can be trained. That is, you get the program to scan some text, and then you teach it what those characters are. In this way, the program is able to learn the shape of each character, even in unusual fonts. Many, if not most, OCR programs also consult a dictionary of words for the language in question when converting.
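The dictionary idea can be sketched roughly as follows. This is only an illustration of the general approach, not the actual algorithm of any program on this page. It uses Python's standard `difflib` module to swap a misread word for the closest dictionary entry. The helper `correct_word`, the tiny word list and the 0.8 similarity cutoff are assumptions chosen for the example.

```python
import difflib


def correct_word(word, dictionary, cutoff=0.8):
    """Return the closest dictionary word, or the word unchanged if none is close enough."""
    if word in dictionary:
        return word
    matches = difflib.get_close_matches(word, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Hypothetical dictionary and raw OCR output with two misread characters.
WORDS = ["optical", "character", "recognition", "software"]
raw = "optical charactcr recogn1tion software"

corrected = " ".join(correct_word(w, WORDS) for w in raw.split())
print(corrected)  # optical character recognition software
```

A word that resembles nothing in the dictionary is passed through untouched, which is one reason names and unusual terms can still come out wrong.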
Note that OCR software often comes free with your scanner or all-in-one machine (i.e., printer, scanner and copier combined), so you may want to see if you already have such a program before rushing out to download one. The ones bundled with your scanner are usually limited versions of commercial software, and can sometimes work better than the free ones listed here (or at least as well as OCR can be expected to work given the current state of the technology).
If you are looking for full-blown commercial OCR software, one possibility is to check out OmniPage.
Free OCR Software (Optical Character Recognition)
- Capture2Text (Windows)
Capture2Text is an optical character reader for reading the screen on Windows systems. It saves the text to the clipboard. At the time this was written, it supports numerous languages, including English, French, German, Japanese, Korean, Russian, Spanish, Chinese (both simplified and traditional), Greek (both modern and ancient), Italian, Latin, Dutch, Hebrew, Welsh, Norwegian, etc (too many to list here). It can also translate the text (using Google Translate).
- Tesseract OCR (Windows, Linux)
Currently sponsored by Google and originally developed by Hewlett Packard, this open source OCR program works under Windows and Linux. It can recognize 6 languages, is fully UTF-8 capable, is able to detect fixed pitch vs proportional pitch fonts, and can be trained. It takes a TIF image file as input (but if you need to, you can always convert your images from other formats using one of the free image and photo editing programs available). At the time I write this, the program can only handle text in a single column.
- GOCR (Linux, Windows, OS/2)
GOCR is an OCR program that converts scanned images of text into a text file. It is multiplatform and is released under the open source GNU General Public License. Executables (or binaries) are available for Linux, Windows and OS/2. This is a command line program.
- Ocropus (Linux)
Ocropy / Ocropus is a document analysis and OCR system that uses plugins for its character recognition engine and offers layout analysis, statistical natural language modelling and multi-lingual capabilities. The OCR engine uses Tesseract (see elsewhere on this page).
- Ocrad: The GNU OCR (Linux)
Ocrad is a command line OCR utility that accepts files in the format of pbm, pgm, or ppm. It is able to handle multi-column texts or blocks of text. The program is available only in source code form.
- Microsoft Office Document Imaging (Windows, Mac OS X)
If you use Microsoft Office, you will probably already have this tool on your system. (Although it doesn't have a separate free download, it is listed here since many people already have this software on their system, and are not aware of the existence of this utility.) Windows users can find it in "Microsoft Office\Microsoft Office Tools" on the Start menu.
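Ocrad, listed above, accepts pbm, pgm and ppm files. As a rough idea of what the simplest of these looks like, the sketch below builds a plain-text PBM ("P1") image from rows of 0s and 1s, where 1 means a black pixel. The function name `to_plain_pbm` and the sample bitmap are made up for this example. Whether a particular OCR program accepts this plain variant as well as the more common raw binary one is an assumption you should check against its manual.

```python
def to_plain_pbm(bitmap):
    """Serialise rows of 0/1 pixels (1 = black) as a plain-text PBM ("P1") image."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    if any(len(row) != width for row in bitmap):
        raise ValueError("all rows must have the same width")
    # Header: magic number, then width and height in pixels.
    lines = ["P1", f"{width} {height}"]
    # One text line per pixel row; fine for tiny bitmaps like this one.
    lines += [" ".join(str(pixel) for pixel in row) for row in bitmap]
    return "\n".join(lines) + "\n"


# Hypothetical 3x4 bitmap of a serif capital "I".
letter_i = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
    [1, 1, 1],
]

pbm_text = to_plain_pbm(letter_i)
print(pbm_text)
```

The resulting text can be saved with an ordinary file write, for example to a file ending in `.pbm`. For real scans, one of the free image editors is a more practical way to convert to these formats, since the plain format becomes unwieldy for full-page images.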