On-line Learning of Language Models With Word Error Probability Distributions

Gretter Roberto, Riccardi Giuseppe


Salt Lake City, Utah, May 7-11 2001


We address the problem of learning stochastic language models on-line, that is, without speech transcriptions, for adaptive speech recognition and understanding. We propose an algorithm that adapts to variations in the language model distributions using only the speech input, without its true transcription. The on-line probability estimate is defined as a function of the prior and word error distributions. We show the effectiveness of word-lattice-based error probability distributions in terms of Receiver Operating Characteristic (ROC) curves and word accuracy. We apply the new estimates P_{adapt}(w) to the on-line adaptation of an initial large-vocabulary trigram language model and show an improvement in word accuracy over the baseline speech recognizer.
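
The abstract does not give the form of the on-line estimate, so the following is only a minimal sketch of one way a prior and per-word error probabilities could be combined. It is not the algorithm of the paper. It assumes a unigram model, soft counts equal to one minus the error probability of each recognized word, and linear interpolation with the prior. The function name `adapt_unigram`, the parameter `weight`, and the input format are hypothetical.

```python
from collections import defaultdict


def adapt_unigram(prior, hypotheses, weight=0.5):
    """Illustrative adaptation of a unigram distribution (assumed scheme).

    prior      -- dict mapping word -> prior probability P(w)
    hypotheses -- list of (word, p_error) pairs taken from recognizer
                  output, where p_error is the estimated probability
                  that the recognized word is wrong
    weight     -- interpolation weight given to the observed evidence
    """
    # Each recognized word contributes a soft count of (1 - p_error),
    # so words that are likely misrecognized count for less.
    soft_counts = defaultdict(float)
    for word, p_error in hypotheses:
        soft_counts[word] += 1.0 - p_error

    total = sum(soft_counts.values())
    vocabulary = set(prior) | set(soft_counts)

    adapted = {}
    for w in vocabulary:
        p_prior = prior.get(w, 0.0)
        # With no usable evidence, fall back to the prior unchanged.
        p_observed = soft_counts[w] / total if total > 0 else p_prior
        adapted[w] = (1.0 - weight) * p_prior + weight * p_observed
    return adapted


example_prior = {"a": 0.5, "b": 0.5}
example_hypotheses = [("a", 0.0), ("a", 0.5), ("c", 0.5)]
example_adapted = adapt_unigram(example_prior, example_hypotheses)
```

In this sketch, a word recognized with low error probability shifts the adapted distribution more than a word that is likely to be a recognition error, and the result remains a probability distribution whenever the prior is one.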

Paper (PostScript file, 90 kByte)