Text-based documents are compared by lexically normalising each word of the text of a first document (104) to form a first normalised representation. A vector representation of the first document is built (206) from the first normalised representation. Each word of the text of a second document (110) is likewise lexically normalised to form a second normalised representation, from which a vector representation of the second document is built (204). The alignment of the two vector representations is compared (210) to produce a score (218) of the similarity of the second document to the first document.
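
The steps described above can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how words are normalised, how the vectors are weighted, or how alignment is measured. The sketch assumes lowercasing with punctuation stripping as the normalisation, raw word counts as the vector, and cosine similarity as the alignment measure. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def normalise(text):
    """Assumed normalisation: lowercase, keep only runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vector(words):
    """Assumed vector form: a sparse word-count vector keyed by normalised word."""
    return Counter(words)


def alignment_score(vec_a, vec_b):
    """Cosine of the angle between two count vectors (assumed alignment measure)."""
    # Counter returns 0 for missing words, so only shared words contribute.
    dot = sum(count * vec_b[word] for word, count in vec_a.items())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


def compare_documents(first_text, second_text):
    """Score the similarity of the second document to the first."""
    first_vector = build_vector(normalise(first_text))
    second_vector = build_vector(normalise(second_text))
    return alignment_score(first_vector, second_vector)


if __name__ == "__main__":
    print(compare_documents("The cat sat on the mat.", "A cat sat on a mat"))
```

With these assumptions, documents that differ only in case and punctuation score close to 1, and documents that share no words score 0. A stemmer or lemmatiser could be substituted for `normalise` without changing the rest of the sketch.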

 