A Morphological Analyzer for the Italian Language

Roberto Gretter and Gianni Peirone

Istituto per la Ricerca Scientifica e Tecnologica, 38050 Povo di Trento (Italy)

Tech. Rep. - Ref. No. 9108-01 - December 12, 1991


In Automatic Speech Recognition there is growing interest in the use of Language Models (LMs) to limit the search space and thus improve system performance. Estimating an LM benefits from labelled corpora of written text, so tools for semi-automatic labelling are needed.

In this report we describe the implementation of a morphological analyzer for the Italian language, based on a lexicon extracted from the Italian dictionary "Zingarelli". Emphasis is placed both on the definition and extraction of the morphemes and on the morpheme concatenation rules needed to compose well-formed words. Some possible future improvements of the system are also discussed.
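
As a rough illustration of what a morpheme lexicon combined with concatenation rules might look like, the following is a minimal toy sketch in Python. It is not the system described in the report: the stem and ending tables, the class labels (`verb_are`, `noun_m_o`), the feature strings, and the function name `analyze` are all assumptions invented for this example, and a real analyzer would need a far richer lexicon and rule set.

```python
# Toy sketch only: the lexicon, class labels, and rule table below are
# illustrative assumptions, not the report's actual data or algorithm.

# Hypothetical stem lexicon: stem -> inflection class.
STEMS = {
    "parl": "verb_are",   # parlare, "to speak"
    "cant": "verb_are",   # cantare, "to sing"
    "libr": "noun_m_o",   # libro, "book"
}

# Hypothetical ending lexicon: ending -> list of
# (inflection class the ending may attach to, morphological features).
ENDINGS = {
    "o": [("verb_are", "pres ind 1 sg"), ("noun_m_o", "masc sg")],
    "i": [("verb_are", "pres ind 2 sg"), ("noun_m_o", "masc pl")],
    "a": [("verb_are", "pres ind 3 sg")],
    "iamo": [("verb_are", "pres ind 1 pl")],
    "are": [("verb_are", "infinitive")],
}


def analyze(word):
    """Return every (stem, ending, features) split licensed by the toy rules."""
    analyses = []
    # Try every split point that leaves a non-empty stem and ending.
    for cut in range(1, len(word)):
        stem, ending = word[:cut], word[cut:]
        stem_class = STEMS.get(stem)
        if stem_class is None:
            continue
        # Concatenation rule: the ending must be allowed for the stem's class.
        for ending_class, features in ENDINGS.get(ending, []):
            if ending_class == stem_class:
                analyses.append((stem, ending, features))
    return analyses


if __name__ == "__main__":
    for w in ["parlo", "libri", "parliamo", "libriamo"]:
        print(w, analyze(w))
```

In this sketch a word is accepted only if it splits into a known stem and a known ending whose classes agree, so "parliamo" receives one analysis while the ill-formed "libriamo" receives none.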

Paper (PostScript file, 162 kByte)