Design and Acquisition of a Task-Oriented Spontaneous-Speech Data Base

Anna Corazza, Marcello Federico, Roberto Gretter and Gianni Lazzari

In V. Roberto (ed.), Intelligent Perceptual Systems, Lecture Notes in Artificial Intelligence, Springer-Verlag, pp. 196-210, Heidelberg, Germany, 1993


The need for large databases, both for training and for testing automatic speech recognition and understanding systems, is a well-known issue. This paper presents the results of a first collection of task-oriented spontaneous-speech corpora carried out within the MAIA project, under development at IRST. About 2000 sentences were acquired from 50 subjects in two scenarios of human-machine spoken interaction: a telecontrol station for a mobile robot and an information query system. Both systems were simulated by means of the well-known "Wizard of Oz" technique. The paper focuses on the methodological issues of this approach, highlighting some important points that must be considered in the design of simulations, together with the solutions adopted. A first evaluation of the collected data concludes the exposition.

Paper (PostScript file, 208 kByte)