Example: reading words from images with Tesseract.js

What is Tesseract.js?

  • Tesseract.js is a pure JavaScript library for OCR (a port of the Tesseract OCR engine).
  • It extracts words from images and supports over 60 languages.
  • It can run both in a browser and in Node.js.
  • Demo (see)

Use in the browser

Include the tesseract.js library in your HTML

<script src='https://cdn.rawgit.com/naptha/tesseract.js/1.0.10/dist/tesseract.js'></script>

Example code

Tesseract.recognize(imgObj, {
	lang: langValue
})
.progress(function (p) {
	// Called repeatedly while the image is being processed
	console.log('progress', p);
})
.then(function (result) {
	console.log('Image read successfully');
	/* Do something with the result */
})
.catch(function (err) {
	console.log('Failed to read the image');
	/* Handle the error */
})
.finally(function (resultOrError) {
	// Runs after either success or failure
	console.log('Finally');
	/* Clean up */
});

imgObj is any ImageLike object (see).

langValue is the option value that sets the recognition language (see).
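Since the job returned by Tesseract.recognize is used like a promise above, it can be convenient to wrap the call in a native Promise that resolves to just the recognized text. The following is a minimal sketch, not part of Tesseract.js itself: `recognizeText` and `formatProgress` are hypothetical helper names. It assumes the job API shown above (`progress`, `then`, `catch`), that progress updates carry `status` and `progress` (0 to 1) fields, and that the result exposes a `text` property.

```javascript
// Hypothetical helper: turn a progress update into a readable string.
// Assumes the update looks like { status: 'recognizing text', progress: 0.5 }.
function formatProgress(p) {
  const percent = typeof p.progress === 'number' ? Math.round(p.progress * 100) : 0;
  return (p.status || 'working') + ': ' + percent + '%';
}

// Hypothetical helper: wrap Tesseract.recognize in a native Promise.
// `tesseract` is the library object (the global Tesseract in the browser).
function recognizeText(tesseract, image, lang, onProgress) {
  return new Promise(function (resolve, reject) {
    tesseract.recognize(image, { lang: lang || 'eng' }) // 'eng' default is an assumption
      .progress(function (p) {
        if (onProgress) onProgress(formatProgress(p));
      })
      .then(function (result) {
        resolve(result.text); // assumed: result.text holds the recognized text
      })
      .catch(reject);
  });
}
```

In the browser this could be called as `recognizeText(Tesseract, imgObj, langValue, console.log)`, with the returned promise handled by the usual `then` and `catch`.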

How to detect the language

Tesseract.detect(myImage)
.then(function(result){
    console.log(result.script)
})

(The then, progress, catch and finally methods can be used here as well.)

My source code for the browser
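One way to combine detection with recognition is to detect the script first and then pick a language for the recognize call. The sketch below is only an illustration: `SCRIPT_TO_LANG`, `langForScript` and `detectThenRecognize` are hypothetical names, the script names in the table and the language chosen for each script are assumptions, and the fallback to 'eng' is arbitrary.

```javascript
// Illustrative mapping from a detected script name to a Tesseract language code.
// The script names and the language picked per script are assumptions.
const SCRIPT_TO_LANG = {
  Latin: 'eng',
  Cyrillic: 'rus',
  Thai: 'tha',
  Japanese: 'jpn',
  Korean: 'kor',
  Arabic: 'ara'
};

// Hypothetical helper: choose a language code, falling back to a default.
function langForScript(script, fallback) {
  return Object.prototype.hasOwnProperty.call(SCRIPT_TO_LANG, script)
    ? SCRIPT_TO_LANG[script]
    : (fallback || 'eng');
}

// Hypothetical helper: detect the script, then recognize with the matching language.
// `tesseract` is the library object (the global Tesseract in the browser).
function detectThenRecognize(tesseract, image) {
  return new Promise(function (resolve, reject) {
    tesseract.detect(image)
      .then(function (detected) {
        const lang = langForScript(detected.script);
        tesseract.recognize(image, { lang: lang })
          .then(function (result) { resolve({ lang: lang, result: result }); })
          .catch(reject);
      })
      .catch(reject);
  });
}
```

A script such as Latin covers many languages, so a real page would likely let the user override the guessed language.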

Use in Node.js

Install the tesseract.js package with npm

npm install tesseract.js --save

(Requires Node.js v6.8.0 or greater.)

Use it:

let Tesseract = require('tesseract.js')

Example code 1

For testing, I use a picture from wiki.

My source code for Node.js; run it with this command:

node test_ocr.js
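As a rough idea of how a script like this could be organized, here is a sketch under stated assumptions; it is not the linked source. `parseArgs` and `runOcr` are hypothetical names, the command-line arguments are a hypothetical extension (the command above takes none), and it assumes that in Node.js the image can be given as a file path and that the result exposes a `text` property.

```javascript
// Hypothetical argument parsing for a script run as:
//   node test_ocr.js <imagePath> [lang]
function parseArgs(argv) {
  const args = argv.slice(2); // skip the node binary and the script path
  if (args.length === 0) {
    throw new Error('Usage: node test_ocr.js <imagePath> [lang]');
  }
  // 'eng' as the default language is an assumption of this sketch
  return { imagePath: args[0], lang: args[1] || 'eng' };
}

// Run OCR with the given library object and report the text through `log`.
// `tesseract` is whatever require('tesseract.js') returns.
function runOcr(tesseract, options, log) {
  return tesseract.recognize(options.imagePath, { lang: options.lang })
    .then(function (result) {
      log(result.text); // assumed: result.text holds the recognized text
    })
    .catch(function (err) {
      log('Failed to read the image: ' + err);
    });
}
```

In the script itself, the two pieces would be wired together with `runOcr(require('tesseract.js'), parseArgs(process.argv), console.log)`.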

Example output:

[image: example output on Node.js]

More example code and the API docs are available on GitHub.

## References