One aspect of the invention extracts a human-readable list from a document. It does so by accessing a file containing data that represents a portion of the document, where the data is formatted in accordance with a document formatting description. The data is parsed into tokens that include container tokens and textual tokens. From the container tokens, this aspect determines a context for some of the textual tokens. Once the context is determined, this aspect determines a separator pattern between one of the textual tokens and an adjacent textual token, where both tokens have the same context. Once the separator pattern is determined, the textual tokens can be extracted responsive to the separator pattern. Finally, the textual tokens are presented as the human-readable list (for example, displayed, returned in a database, or returned in response to a function or subroutine call).
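
The steps above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes the document formatting description is HTML, treats tags as container tokens, takes the stack of open tags as the context of a textual token, and takes the sequence of tags between two adjacent same-context textual tokens as the separator pattern. The names `Tokenizer`, `extract_list`, and `VOID_TAGS`, and the choice of the most frequent pattern as the list separator, are assumptions made for illustration only.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumption: tags that never enclose text and so never change the context.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Tokenizer(HTMLParser):
    """Split markup into container tokens (tags) and textual tokens."""

    def __init__(self):
        super().__init__()
        self.tokens = []  # (kind, value, context) with kind in open/close/text
        self.stack = []   # currently open containers

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)
        self.tokens.append(("open", tag, None))

    def handle_endtag(self, tag):
        self.tokens.append(("close", tag, None))
        if tag in self.stack:
            # Pop back to (and including) the matching open container.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            # The context of a textual token is the path of open containers.
            self.tokens.append(("text", text, tuple(self.stack)))


def extract_list(markup):
    """Return the textual tokens joined by the most common separator pattern."""
    parser = Tokenizer()
    parser.feed(markup)
    parser.close()
    tokens = parser.tokens

    # Record the separator between adjacent textual tokens with equal context.
    pairs = []  # (separator, left_index, right_index)
    prev = None
    for i, (kind, _, ctx) in enumerate(tokens):
        if kind != "text":
            continue
        if prev is not None and tokens[prev][2] == ctx:
            sep = tuple((k, v) for k, v, _ in tokens[prev + 1:i])
            pairs.append((sep, prev, i))
        prev = i
    if not pairs:
        return []

    # Assumption: the most frequent separator is the one delimiting the list.
    pattern = Counter(sep for sep, _, _ in pairs).most_common(1)[0][0]

    items, last = [], None
    for sep, left, right in pairs:
        if sep != pattern:
            continue
        if last != left:
            items.append(tokens[left][1])
        items.append(tokens[right][1])
        last = right
    return items


page = "<body><h1>Fruit</h1><ul><li>Apple</li><li>Pear</li><li>Plum</li></ul></body>"
print(extract_list(page))
```

In this sketch the heading text is left out because its context differs from that of the list items, while the three items share a context and are separated by the same close/open tag pair.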

 