Various technologies and techniques are disclosed that improve the identification of related content. An article for which matching content is to be identified is received or selected. The raw text of the article is analyzed to reduce it to a core set of words, and the results are stored in a document feature vector array. The formatted text of the article is then analyzed, and the vector array scores are updated based on the formatting. Anchor text words from documents that link to the article are added to the vector array. Articles linking to and from the article are identified and added to the vector array as appropriate. Transformations are performed, such as adjusting the vector scores based on how common or generic the words are. Vector arrays are created for other potentially related documents, and the vectors are compared to determine how closely the documents are related.
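The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not specify the stop-word list, the weights, the transformation, or the comparison measure. The sketch assumes a small hypothetical stop-word set, hypothetical boost values (`emphasis_boost`, `anchor_weight`, `link_weight`), a smoothed inverse-document-frequency transformation to down-weight common words, and cosine similarity for the comparison. All function and parameter names are illustrative.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not say how the core words are chosen.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "is", "in", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric words that are not stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def build_vector(raw_text, emphasized=(), anchor_texts=(), linked_ids=(),
                 emphasis_boost=2.0, anchor_weight=1.5, link_weight=1.0):
    """Build a sparse feature vector (word -> score) for one article.

    All weights are assumed values chosen for illustration only.
    """
    vec = Counter()
    # Step 1: reduce the raw text to core words and count them.
    for w in tokenize(raw_text):
        vec[w] += 1.0
    # Step 2: raise scores for words that appear in formatted (e.g. bold, heading) text.
    for phrase in emphasized:
        for w in tokenize(phrase):
            vec[w] += emphasis_boost
    # Step 3: add anchor-text words from documents that link to the article.
    for anchor in anchor_texts:
        for w in tokenize(anchor):
            vec[w] += anchor_weight
    # Step 4: add linked articles as features (assumed "link:<id>" encoding).
    for doc_id in linked_ids:
        vec["link:" + doc_id] += link_weight
    return dict(vec)


def apply_idf(vectors):
    """Down-weight features that occur in many documents (smoothed IDF, an assumption)."""
    n = len(vectors)
    doc_freq = Counter()
    for v in vectors:
        doc_freq.update(v.keys())
    return [
        {w: s * (math.log((1 + n) / (1 + doc_freq[w])) + 1.0) for w, s in v.items()}
        for v in vectors
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(s * v.get(w, 0.0) for w, s in u.items())
    norm_u = math.sqrt(sum(s * s for s in u.values()))
    norm_v = math.sqrt(sum(s * s for s in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    article = build_vector("Cats chase mice in the garden",
                           emphasized=["cats"], anchor_texts=["cat behavior"])
    candidate = build_vector("Cats and kittens chase toys")
    weighted_article, weighted_candidate = apply_idf([article, candidate])
    print(cosine(weighted_article, weighted_candidate))
```

A sparse dictionary is used here instead of a fixed-length array because most words do not occur in any given article. A production system would likely choose its weights and transformation empirically.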

 