A system/method is presented for scanning entire books or document all at once using an adaptive process where the book or document has known fonts and unknown fonts. The known fonts are processed through a verification system where sure words and error words are determined. Both the sure words and error words are sent to OCR training where they are re-OCR'ed and repeatedly verified until they meet a predetermined quality criteria. Characters or word not meeting the predetermined quality criteria receive additional OCR training until all the characters and words pass the predetermined quality criteria. Unknown fonts are scanned and clustered together by shape. Outliers in the shapes are manually key-in. Those symbols that are manually classified go to OCR training and then to the known type optimization process.

 
Web www.patentalert.com

< Image recognition method and apparatus for the same method

> Systems and methods for information content delivery relating to an object

~ 00492