Roberto Gretter and Gianni Peirone
Tech. Rep. - Ref. No. 9108-01 - December 12, 1991
In Automatic Speech Recognition there is growing interest in using Language Models (LMs) to constrain the search space and thereby improve system performance. LM estimation can take advantage of labelled corpora of written text, which in turn calls for tools for semi-automatic labelling.
In this report we describe the implementation of a morphological analyzer for the Italian language, based on a lexicon extracted from the Italian dictionary ``Zingarelli''. Emphasis is placed both on the definition and extraction of the morphemes, and on the morpheme concatenation rules needed to compose well-formed words. Some possible future improvements of the system are also discussed.
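As a rough illustration of what morpheme concatenation rules can look like, the following minimal sketch splits a word into a stem and an ending and accepts the split only when the two are compatible. It is not the analyzer described in the report: the toy lexicon, the inflection class names (`noun_o`, `noun_a`, `verb_are`), the feature strings, and the `analyze` function are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, NamedTuple


class Analysis(NamedTuple):
    stem: str
    ending: str
    features: str


# Hypothetical toy lexicon: each stem lists the inflection classes it belongs to.
# A stem may belong to more than one class (e.g. "port" as noun and as verb).
STEMS: Dict[str, List[str]] = {
    "gatt": ["noun_o"],              # gatto / gatti
    "cas": ["noun_a"],               # casa / case
    "parl": ["verb_are"],            # parlo / parli / parla / parlare
    "port": ["noun_a", "verb_are"],  # porta (door) / porta (carries)
}

# Hypothetical concatenation rules: for each inflection class, the endings
# that may follow a stem of that class, with the features they contribute.
ENDINGS: Dict[str, Dict[str, str]] = {
    "noun_o": {"o": "noun masc sing", "i": "noun masc plur"},
    "noun_a": {"a": "noun fem sing", "e": "noun fem plur"},
    "verb_are": {
        "o": "verb pres 1sg",
        "i": "verb pres 2sg",
        "a": "verb pres 3sg",
        "are": "verb infinitive",
    },
}


def analyze(word: str) -> List[Analysis]:
    """Return every stem + ending split licensed by the toy rules."""
    results: List[Analysis] = []
    # Try every split point that leaves a non-empty stem and ending.
    for cut in range(1, len(word)):
        stem, ending = word[:cut], word[cut:]
        for inflection_class in STEMS.get(stem, []):
            features = ENDINGS[inflection_class].get(ending)
            if features is not None:
                results.append(Analysis(stem, ending, features))
    return results


if __name__ == "__main__":
    for w in ["gatti", "parlare", "porta", "gatta"]:
        print(w, analyze(w))
```

In this sketch an ambiguous form such as "porta" yields two analyses (noun and verb), while "gatta" is rejected because the toy rules do not allow the ending "a" after a stem of class `noun_o`. A semi-automatic labelling tool would typically leave the choice among multiple analyses to a later disambiguation step or to a human annotator.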