An index generator and query expander for use in information retrieval in a corpus. A corpus is provided as an input to an inflectional analyzer, which produces a lemmatized corpus having base forms and associated inflections for each word in the original corpus. The lemmatized corpus is provided as an input to a disambiguator, which performs part-of-speech tagging and morphosyntactic disambiguation to produce a disambiguated corpus. The disambiguated corpus is provided as an input to a derivational generator, which produces an expanded corpus having all possible valid derivatives of each word of the disambiguated corpus. The disambiguated corpus is provided as an input to a transformational analyzer, which uses a grammar and a metagrammar to analyze syntactic and morphosyntactic variations, conflating and generating variants to produce an index to the corpus having a minimum of variants. Alternatively, a query expander using similar techniques is provided.
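The stages described above can be illustrated with a minimal sketch. It is not the patented implementation: the lexicon, the derivational families, the one-rule disambiguator, and all function names (`inflectional_analyze`, `disambiguate`, `derivational_generate`, `find_term`, `build_index`) are hypothetical, and the grammar and metagrammar of the transformational analyzer are replaced by a simple stand-in that matches term words in any order within a small token window.

```python
from collections import defaultdict

# Hypothetical toy lexicon: surface form -> candidate (lemma, part-of-speech) analyses.
LEXICON = {
    "the": [("the", "DET")],
    "of": [("of", "PREP")],
    "are": [("be", "VERB")],
    "cell": [("cell", "NOUN")],
    "cells": [("cell", "NOUN")],
    "stained": [("stain", "VERB")],
    "staining": [("staining", "NOUN")],
    "stains": [("stain", "VERB"), ("stain", "NOUN")],
}

# Hypothetical derivational families (lemmas related by derivation).
FAMILIES = [{"stain", "staining"}, {"cell", "cellular"}]


def inflectional_analyze(tokens):
    """Attach every candidate (lemma, POS) analysis to each surface token."""
    return [(t, LEXICON.get(t, [(t, "UNK")])) for t in tokens]


def disambiguate(lemmatized):
    """Pick one analysis per token; toy rule: prefer a noun right after a determiner."""
    result, prev_pos = [], None
    for surface, analyses in lemmatized:
        choice = analyses[0]
        if prev_pos == "DET":
            choice = next((a for a in analyses if a[1] == "NOUN"), choice)
        result.append((surface, choice[0], choice[1]))
        prev_pos = choice[1]
    return result


def derivational_generate(disambiguated):
    """Expand each lemma to the set of lemmas in its derivational family."""
    expanded = []
    for surface, lemma, pos in disambiguated:
        family = next((f for f in FAMILIES if lemma in f), {lemma})
        expanded.append((surface, lemma, pos, frozenset(family)))
    return expanded


def find_term(expanded, term, window=4):
    """Return start positions where all term words occur, in any order, within a window.

    Stand-in for the transformational analyzer: no grammar or metagrammar is used.
    """
    hits = []
    for start in range(len(expanded)):
        span = expanded[start:start + window]
        if not any(word in span[0][3] for word in term):
            continue  # a match must begin on a term word
        if all(any(word in tok[3] for tok in span) for word in term):
            hits.append(start)
    return hits


def build_index(documents, terms):
    """Map each term to the ids of documents containing any of its variants."""
    index = defaultdict(list)
    for doc_id, text in enumerate(documents):
        tokens = text.lower().split()
        expanded = derivational_generate(disambiguate(inflectional_analyze(tokens)))
        for term in terms:
            if find_term(expanded, term):
                index[term].append(doc_id)
    return dict(index)


docs = ["the cells are stained", "staining of the cell", "the stains are old"]
index = build_index(docs, [("cell", "staining")])
print(index)  # {('cell', 'staining'): [0, 1]}
```

In this toy run, "the cells are stained" and "staining of the cell" are conflated under the single index entry ("cell", "staining"), while "the stains are old" is not indexed under it because no member of the "cell" family appears. A query expander could reuse the same stages on the query side, generating the variants of a query term rather than conflating them.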

 