Daniele Falavigna, Roberto Gretter
ROBUST SPEECH RECOGNITION FOR UNKNOWN COMMUNICATION CHANNELS
PONT-a-MOUSSON, FRANCE, 17-18 April 1997
In this paper we address the problem of real-time continuous digit recognition over the telephone. We describe a telephone corpus that was acquired both to retrain Hidden Markov Models originally derived from clean speech and to test the application. Experimental comparisons of different acoustic features show that linear prediction cepstral coefficients outperform the other feature types. Cepstral mean subtraction is compared with RASTA filtering; the latter is more attractive because it allows recognition to be performed while the user is still speaking. We also describe the explicit modeling of some weak spontaneous speech phenomena, which considerably improves word accuracy. Finally, we discuss the use of a rejection strategy for small-vocabulary recognition, which is fundamental in real applications.
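
The practical difference between the two channel normalization techniques can be illustrated with a minimal sketch. It is not the authors' implementation: the RASTA coefficients below are the commonly cited band-pass filter values and are an assumption here, as are the names `cepstral_mean_subtraction` and `StreamingRasta`. Cepstral mean subtraction needs the mean over the whole utterance, whereas a causal RASTA filter can emit a normalized frame as soon as each input frame arrives.

```python
import numpy as np

# Assumed RASTA filter (not taken from this paper):
#   H(z) = 0.1 * (2 + z^-1 - z^-3 - 2 z^-4) / (1 - 0.98 z^-1)
# The numerator taps sum to zero, so a constant (channel) offset is suppressed.
NUM = 0.1 * np.array([2.0, 1.0, 0.0, -1.0, -2.0])
POLE = 0.98


def cepstral_mean_subtraction(cepstra):
    """Subtract the per-coefficient mean over the utterance.

    cepstra has shape (frames, coeffs). The mean is only known once the
    whole utterance is available, so this cannot run frame by frame.
    """
    return cepstra - cepstra.mean(axis=0, keepdims=True)


class StreamingRasta:
    """Causal RASTA-style filtering of each cepstral trajectory, one frame at a time."""

    def __init__(self, n_coeffs):
        # Last five input frames, newest first, plus the previous output frame.
        self.history = np.zeros((len(NUM), n_coeffs))
        self.prev_out = np.zeros(n_coeffs)

    def push(self, frame):
        """Consume one cepstral frame and return its filtered version."""
        self.history = np.roll(self.history, 1, axis=0)
        self.history[0] = frame
        out = NUM @ self.history + POLE * self.prev_out
        self.prev_out = out
        return out
```

In a recognizer, `push` would be called on every incoming frame and its output passed straight to the decoder, while `cepstral_mean_subtraction` can only be applied after the end of the utterance has been detected.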