Evaluation of Digit Recognition over the Telephone Network

Daniele Falavigna, Roberto Gretter


Proceedings of ESCA - NATO Workshop on

ROBUST SPEECH RECOGNITION FOR UNKNOWN COMMUNICATION CHANNELS

PONT-À-MOUSSON, FRANCE, 17-18 April 1997


Abstract

In this paper we address the problem of real-time continuous digit recognition over the telephone. We describe a telephone corpus that was acquired both to retrain Hidden Markov Models, originally derived from clean speech, and to test the application. Experimental comparisons using different acoustic features show that linear prediction cepstral coefficients outperform the other feature types. Cepstral mean subtraction is compared with RASTA filtering. The latter is more attractive because it allows recognition to proceed while the user is still speaking. We also describe explicit modeling of some weak spontaneous speech phenomena, which considerably improves word accuracy. Finally, we discuss a rejection strategy for small-vocabulary recognition, which is fundamental in real applications.
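
The difference between cepstral mean subtraction and RASTA filtering with respect to recognition during speech can be illustrated with a minimal sketch. This is not the authors' implementation. The names `cepstral_mean_subtraction` and `RastaFilter` are hypothetical, and the filter coefficients are the commonly cited RASTA band-pass values with a pole at 0.98, which the abstract does not specify. In this utterance-level form, mean subtraction needs every frame before it can normalize any of them, whereas the RASTA-style filter is causal and emits one normalized frame per input frame.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean computed over the WHOLE utterance.

    `frames` is a list of cepstral vectors (lists of floats). The mean is
    only known once the last frame has arrived, so no output is available
    while the user is still speaking.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(dim)]
    return [[f[k] - mean[k] for k in range(dim)] for f in frames]


class RastaFilter:
    """Causal band-pass filter applied to each cepstral trajectory.

    Assumed difference equation (commonly cited RASTA form):
        y[t] = pole * y[t-1] + 0.1 * (2x[t] + x[t-1] - x[t-3] - 2x[t-4])
    The numerator taps sum to zero, so a constant channel offset is removed.
    """

    NUM = (0.2, 0.1, 0.0, -0.1, -0.2)

    def __init__(self, dim, pole=0.98):
        self.pole = pole
        # Past inputs x[t-1] .. x[t-4], initialised to zero.
        self.history = [[0.0] * dim for _ in range(4)]
        self.prev_out = [0.0] * dim

    def push(self, frame):
        """Consume one frame and immediately return its filtered version."""
        taps = [list(frame)] + self.history
        out = [
            self.pole * self.prev_out[k]
            + sum(c * taps[i][k] for i, c in enumerate(self.NUM))
            for k in range(len(frame))
        ]
        self.history = taps[:4]
        self.prev_out = out
        return out
```

A streaming front end would call `RastaFilter.push` once per incoming frame and pass the result straight to the decoder, while `cepstral_mean_subtraction` can only be called after the end of the utterance has been detected.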


Paper (PostScript file, 111 kByte)