A method and system for document analysis and retrieval. A document that
includes text is received from a host. Document keys (i.e., keywords and
keyphrases) associated with the text are generated. In first embodiments,
a document taxonomy is provided. The taxonomy has categories and
associated category keys (i.e., keywords and keyphrases). The category
keys of each category are compared with the document keys to determine a
distance between the document and each category as a measure of how close
the document is to each category. A subset of the categories is returned
to the host, wherein the subset of the categories reflects the determined
distances. In second embodiments, a search string is created as a logical
function of a subset of the document keys. The search string is submitted
to a search engine. Links to related documents are received from the
search engine and returned to the host.
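
The two embodiments can be illustrated with a minimal sketch. The abstract does not name a distance measure or a query syntax, so the choices here are assumptions made only for illustration: Jaccard distance over key sets stands in for the document-to-category distance, and a generic AND/OR syntax stands in for the logical function. The names jaccard_distance, closest_categories, and build_search_string are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Assumed distance: 1 - |a & b| / |a | b|.

    0.0 means identical key sets, 1.0 means no keys in common.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def closest_categories(
    document_keys: Set[str],
    taxonomy: Dict[str, Set[str]],
    limit: int = 3,
) -> List[Tuple[str, float]]:
    """First embodiments: compare each category's keys with the document
    keys and return the `limit` categories with the smallest distance."""
    distances = {
        name: jaccard_distance(document_keys, category_keys)
        for name, category_keys in taxonomy.items()
    }
    # Sort by distance, then by name so ties are resolved deterministically.
    return sorted(distances.items(), key=lambda item: (item[1], item[0]))[:limit]


def build_search_string(document_keys: List[str], must_have: int = 2) -> str:
    """Second embodiments: form a search string as a logical function of a
    subset of the document keys. Assumed rule: the first `must_have` keys
    are required (AND), the remaining keys are alternatives (OR)."""
    quoted = [f'"{key}"' for key in document_keys]
    required = " AND ".join(quoted[:must_have])
    optional = " OR ".join(quoted[must_have:])
    if required and optional:
        return f"{required} AND ({optional})"
    return required or optional


# Hypothetical document keys and taxonomy, for illustration only.
document_keys = ["neural network", "training", "gradient descent"]
taxonomy = {
    "machine learning": {"neural network", "training", "classifier"},
    "cooking": {"recipe", "oven"},
    "optimization": {"gradient descent", "convex"},
}

subset = closest_categories(set(document_keys), taxonomy, limit=2)
query = build_search_string(document_keys, must_have=2)
```

In this sketch, subset is the list of categories that would be returned to the host, and query is the string that would be submitted to the search engine. Any other distance measure or key-selection rule could be substituted without changing the overall flow.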