One aspect of the invention extracts a human-readable list from a
document. It does this by accessing a file that contains data that
represents a portion of the document. The data is formatted in accordance
with a document formatting description. The data is parsed into tokens
that include container tokens and textual tokens. From the container
tokens, this aspect determines a context for some of the textual tokens.
Once the context is determined, this aspect determines a separator
pattern between one of the textual tokens and an adjacent textual token
where both the textual token and the adjacent textual token have the same
context. Once the separator pattern is determined, the textual tokens can
be extracted responsive to the separator pattern. Finally, the textual
tokens are presented as the human-readable list (for example, displayed,
returned in a database, returned in response to a function or subroutine
call, etc.).
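
As an illustration only, the steps above could be arranged as in the following Python sketch. It assumes HTML as the document formatting description and uses the standard-library html.parser module. The names Tokenizer, VOID, and extract_list are hypothetical, and two further choices are assumptions of this sketch rather than details taken from the description: the context of a textual token is taken to be the stack of open container tags, and the separator pattern is taken to be the most frequent run of container tokens between adjacent textual tokens that share a context.

```python
from collections import Counter
from html.parser import HTMLParser

# Tags that never enclose text; kept off the context stack (assumption).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class Tokenizer(HTMLParser):
    """Splits markup into container tokens (tags) and textual tokens."""

    def __init__(self):
        super().__init__()
        self.stack = []    # containers currently open (the context)
        self.pending = []  # container tokens seen since the last textual token
        self.texts = []    # (text, context, container tokens preceding it)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)
        self.pending.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to, and including, the matching open container.
            while self.stack.pop() != tag:
                pass
        self.pending.append(f"</{tag}>")

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append((text, tuple(self.stack), tuple(self.pending)))
            self.pending = []


def extract_list(markup):
    """Returns the textual tokens joined by the dominant separator pattern."""
    tokenizer = Tokenizer()
    tokenizer.feed(markup)
    tokenizer.close()
    texts = tokenizer.texts

    # Count separator patterns between adjacent textual tokens that have
    # the same context.
    counts = Counter()
    for prev, cur in zip(texts, texts[1:]):
        if prev[1] == cur[1]:
            counts[(cur[1], cur[2])] += 1
    if not counts:
        return []
    (context, separator), _ = counts.most_common(1)[0]

    # Keep each token in that context that follows or precedes the separator.
    items = []
    for i, (text, ctx, sep) in enumerate(texts):
        if ctx != context:
            continue
        follows = sep == separator
        nxt = texts[i + 1] if i + 1 < len(texts) else None
        precedes = nxt is not None and nxt[1] == context and nxt[2] == separator
        if follows or precedes:
            items.append(text)
    return items


if __name__ == "__main__":
    page = ("<html><body><h1>Fruit</h1><ul><li>Apple</li><li>Pear</li>"
            "<li>Plum</li></ul><p>Done</p></body></html>")
    print(extract_list(page))  # ['Apple', 'Pear', 'Plum']
```

In this sketch the heading and the closing paragraph are excluded because their contexts differ from that of the list items, and the first item is kept because the token after it is joined to it by the chosen separator pattern. Returning the list from a function corresponds to only one of the presentation options mentioned above.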