A term-by-document (or part-by-collection) matrix can be used to index documents (or collections) for information retrieval applications. Reducing the rank of the indexing matrix can further reduce the complexity of information retrieval. A method for index matrix rank reduction can involve computing a singular value decomposition and then retaining singular values based on the singular values corresponding to singular values of multiple topics. The expected singular values corresponding to a topic can be determined using the roots of a specially formed characteristic polynomial. The coefficients of the special characteristic polynomial can be based on computing the determinants of a Gram matrix of term (or part) probabilities, a method of recursion, or a method of recursion further weighted by the probability of document (or collection) lengths.

 
Web www.patentalert.com

< Global network computers

> Method of database compression for database entries having a pre-determined common part

> Methods, systems, and computer program products for characterizing links to resources not activated

~ 00565