Embodiments of the present invention provide a method and apparatus for segmenting text by providing orthographic and inflectional variations to a syntactic parser. Under the present invention, possible segments are first identified in an input sequence of characters, and at least two of the identified segments overlap each other. For at least one of the segments, an alternative sequence of characters is identified. In some cases, this alternative sequence is formed through inflectional morphology, which identifies a different lexical form for the word identified by the segment. In other cases, the alternative sequence represents an orthographic variant of the word identified by the segment. The identified segments and their alternative sequences are then passed to a syntactic analyzer, which produces one or more syntactic parses. The segments found in the resulting parses represent the segmentation of the input sequence of characters.
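
The candidate-generation stage described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, the lemma and spelling tables, and the names `Segment`, `find_segments`, and `covering_paths` do not come from the patent. In particular, `covering_paths` is not a syntactic analyzer. It only enumerates the segment sequences that span the whole input, which is the set of candidates a real parser would choose among.

```python
from dataclasses import dataclass

# Hypothetical toy resources; a real system would use full lexicons.
LEXICON = {"the", "he", "grey", "cat", "cats", "at", "sat"}
LEMMAS = {"cats": "cat", "sat": "sit"}   # inflectional morphology
SPELLINGS = {"grey": "gray"}             # orthographic variants


@dataclass(frozen=True)
class Segment:
    start: int                 # index of the first character
    end: int                   # index one past the last character
    surface: str               # characters as they appear in the input
    alternatives: tuple = ()   # alternative character sequences


def find_segments(text):
    """Return every lexicon match in text, including overlapping ones."""
    segments = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            surface = text[start:end]
            if surface in LEXICON:
                # Attach a lemma and/or spelling variant when one is known.
                alts = tuple(
                    table[surface]
                    for table in (LEMMAS, SPELLINGS)
                    if surface in table
                )
                segments.append(Segment(start, end, surface, alts))
    return segments


def covering_paths(segments, length):
    """Stand-in for the parser: list segment sequences spanning the input."""
    by_start = {}
    for seg in segments:
        by_start.setdefault(seg.start, []).append(seg)

    def extend(pos):
        if pos == length:
            yield []
            return
        for seg in by_start.get(pos, []):
            for rest in extend(seg.end):
                yield [seg] + rest

    return list(extend(0))


if __name__ == "__main__":
    text = "thegreycatsat"
    segments = find_segments(text)
    for path in covering_paths(segments, len(text)):
        print([(seg.surface, seg.alternatives) for seg in path])
```

In this toy input, the overlapping segments "cat" and "cats" lead to two competing segmentations. In the method described above, that choice would be left to the syntactic analyzer, which also receives the alternative sequences ("cat" for "cats", "sit" for "sat", "gray" for "grey").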

 