Techniques are described for reducing the false positive rate of regular-expression attribute extraction by means of a specific data representation and a machine-learning method that can be trained at much lower cost (with far fewer labeled examples) than a full-scale machine-learning solution would require. Attribute determinations made using the regular expression technique are represented as skeleton tokens. The skeleton tokens, along with accurate attribute determinations, are provided to a machine-learning mechanism in order to train it. Once trained, the machine-learning mechanism is used to predict the accuracy of attribute determinations represented by skeleton tokens generated for not-yet-analyzed input text.
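The flow described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the `WEIGHT_RE` pattern, the definition of a skeleton token as the matched span replaced by a placeholder plus a few position-tagged context words, the choice of a small naive Bayes classifier as the machine-learning mechanism, and the four hand-labeled training texts.

```python
import math
import re
from collections import Counter

# Hypothetical regular expression for a "weight" attribute.
WEIGHT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(kg|lbs?)\b", re.IGNORECASE)


def skeleton_tokens(text, match, window=2):
    """Assumed skeleton representation: the matched span becomes a
    placeholder, and up to `window` lowercased words on each side are
    kept with a side tag (L: or R:)."""
    left = re.findall(r"\w+", text[:match.start()].lower())[-window:]
    right = re.findall(r"\w+", text[match.end():].lower())[:window]
    return [f"L:{w}" for w in left] + ["<WEIGHT>"] + [f"R:{w}" for w in right]


class NaiveBayes:
    """Tiny naive Bayes classifier with add-one smoothing, standing in for
    the unspecified machine-learning mechanism."""

    def __init__(self):
        self.token_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter()
        self.vocab = set()

    def train(self, examples):
        for tokens, label in examples:
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        scores = {}
        for label in (True, False):
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log((self.class_counts[label] + 1) / (total + 2))
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            scores[label] = score
        return scores[True] > scores[False]


# Hand-labeled examples (invented): True means the regex match really is
# the product's weight, False means it is a false positive.
LABELED = [
    ("Item weight 5 kg shipped", True),
    ("Net weight 12 lbs total", True),
    ("Supports up to 100 kg load", False),
    ("Holds up to 250 lbs maximum", False),
]

model = NaiveBayes()
model.train(
    (skeleton_tokens(text, WEIGHT_RE.search(text)), label)
    for text, label in LABELED
)


def extraction_looks_accurate(text):
    """Predict whether the first regex match in not-yet-analyzed text is an
    accurate attribute determination; None if the regex finds nothing."""
    match = WEIGHT_RE.search(text)
    if match is None:
        return None
    return model.predict(skeleton_tokens(text, match))
```

With this toy training set, `extraction_looks_accurate("Gross weight 3 kg approx")` accepts the match, while `extraction_looks_accurate("Carries up to 80 kg safely")` rejects it as a likely false positive. The point of the design is that the classifier sees only the compact skeleton rather than the full text, which is one plausible reason such a model could be trained on few labeled examples.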

 