An extraction manager extracts information from formatted input. The input is annotated with presentation information, and parsed into a set of elements comprising a canonical representation thereof. An information analyzer analyzes the elements in order to glean additional information. An entity extractor determines entities to extract from the input. The entity extractor analyzes elements according to the specific entities to be extracted, and creates entity-specific observations for the analyzed elements. These observations comprise possible values for the relevant entities. A heuristics processor maintains a collection of entity-specific heuristics, each comprising a test to help determine the suitability of data as a value for the corresponding entity. The heuristics processor selects heuristics for the entities to be extracted, and tests observations for these entities against the selected heuristics. Responsive to this testing, ordered possible values for the entities to extract are determined.
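The flow described above (annotated input, elements, observations, heuristic tests, ordered values) can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Element`, `Observation`, `HEURISTICS`, `parse`, and `extract`, the `bold` presentation flag, the numeric scores, and the sample "price" tests are all assumptions made for illustration. The information analyzer step is omitted.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Element:
    """One parsed piece of the input plus its presentation annotations."""
    text: str
    presentation: Dict[str, str] = field(default_factory=dict)


@dataclass
class Observation:
    """A possible value for one entity, tied to the element it came from."""
    entity: str
    value: str
    element: Element


# A heuristic is a test that scores how suitable an observation is
# as a value for its entity (higher is better). Hypothetical scoring.
Heuristic = Callable[[Observation], float]

# Entity-specific heuristics (illustrative only).
HEURISTICS: Dict[str, List[Heuristic]] = {
    "price": [
        # Looks like a dollar amount, e.g. "$5" or "$19.99".
        lambda o: 1.0 if re.fullmatch(r"\$\d+(\.\d{2})?", o.value) else 0.0,
        # Emphasised text is slightly more likely to be the main value.
        lambda o: 0.5 if o.element.presentation.get("bold") == "true" else 0.0,
    ],
}


def parse(annotated: List[Tuple[str, Dict[str, str]]]) -> List[Element]:
    """Turn (text, presentation) pairs into a list of elements."""
    return [Element(text, dict(pres)) for text, pres in annotated]


def extract(annotated: List[Tuple[str, Dict[str, str]]],
            entities: List[str]) -> Dict[str, List[str]]:
    """Return, per entity, its possible values ordered best first."""
    elements = parse(annotated)
    result: Dict[str, List[str]] = {}
    for entity in entities:
        # Create one entity-specific observation per element.
        observations = [Observation(entity, el.text, el) for el in elements]
        # Select the heuristics for this entity and test each observation.
        tests = HEURISTICS.get(entity, [])
        ranked = sorted(
            observations,
            key=lambda o: sum(test(o) for test in tests),
            reverse=True,
        )
        result[entity] = [o.value for o in ranked]
    return result


SAMPLE = [
    ("Acme Widget", {"bold": "true"}),
    ("$19.99", {"bold": "true"}),
    ("$5", {}),
    ("In stock", {}),
]

if __name__ == "__main__":
    print(extract(SAMPLE, ["price"]))
```

Summing the heuristic scores is only one way to combine tests; a weighted or rule-based combination would fit the same structure.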

 

