The invention relates to improved solutions for information retrieval, wherein the information is represented by digitized text data. This data is further presumed to be organized in terms (431-438), documents and document corpora, where each document contains at least one term (431-438) and each document corpus contains at least one document. Based on a concept vector (420-424), which conceptually classifies the contents of each document, a term-to-concept vector is generated for each term (431-438) in the document corpus. The term-to-concept vector describes a relationship between the term (431) and each of the concept vectors (420-424). On the basis of the term-to-concept vectors for the document corpus, a term-term matrix is generated, which describes a term-to-term relationship between all the terms (431-438) in the document corpus. The term-term matrix may then be processed and used for retrieving information from the document corpus, such as the fact that a first term (431) is related to a second term (436).
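The pipeline described above can be illustrated with a minimal sketch. The abstract does not say how the concept vectors are obtained or how the term-term matrix is computed, so everything here is an assumption made for illustration: each concept vector is taken to be a mapping from term to weight, a term-to-concept vector is read off as that term's weight in each concept vector, and the term-term relationship is taken to be the dot product of two term-to-concept vectors. The function names and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def term_to_concept_vectors(
    concept_vectors: List[Dict[str, float]], vocabulary: List[str]
) -> Dict[str, List[float]]:
    """For each term, collect its weight in every concept vector.

    Assumption: a concept vector is a term -> weight mapping; a term
    missing from a concept vector has weight 0.0.
    """
    return {
        term: [cv.get(term, 0.0) for cv in concept_vectors]
        for term in vocabulary
    }


def term_term_matrix(
    t2c: Dict[str, List[float]], vocabulary: List[str]
) -> List[List[float]]:
    """Build a square matrix of term-to-term scores.

    Assumption: the score for a pair of terms is the dot product of
    their term-to-concept vectors.
    """
    return [
        [sum(a * b for a, b in zip(t2c[t1], t2c[t2])) for t2 in vocabulary]
        for t1 in vocabulary
    ]


def related_terms(
    matrix: List[List[float]],
    vocabulary: List[str],
    term: str,
    threshold: float = 0.0,
) -> List[Tuple[str, float]]:
    """Return other terms whose score with `term` exceeds `threshold`,
    strongest first."""
    i = vocabulary.index(term)
    scored = [
        (vocabulary[j], score)
        for j, score in enumerate(matrix[i])
        if j != i and score > threshold
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical corpus with two concepts: vehicles and music.
    vocabulary = ["car", "engine", "music", "piano"]
    concepts = [
        {"car": 0.6, "engine": 0.8},
        {"music": 0.7, "piano": 0.7},
    ]
    t2c = term_to_concept_vectors(concepts, vocabulary)
    matrix = term_term_matrix(t2c, vocabulary)
    # "car" shares a concept with "engine" but not with the music terms.
    print(related_terms(matrix, vocabulary, "car"))
```

In this toy setup, two terms score above zero only when they carry weight in the same concept vector, which is one simple way to surface the kind of relationship the abstract mentions between a first term and a second term.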

 