8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Data Driven Example Based Continuous Speech Recognition

Mathias De Wachter, Kris Demuynck, Dirk van Compernolle, Patrick Wambacq

Katholieke Universiteit Leuven, Belgium

The dominant acoustic modeling methodology, based on Hidden Markov Models (HMMs), is known to have certain weaknesses. Partial solutions to these flaws have been presented, but the fundamental problem remains: compressing the data into a compact HMM discards useful information such as time dependencies and speaker information. In this paper, we look at pure example-based recognition as a solution to this problem. By replacing the HMM with the underlying examples, all information in the training data is retained. We show how information about speaker and environment can be used, introducing a new interpretation of adaptation. The basis for the recognizer is the well-known dynamic time warping (DTW) algorithm, which has often been used for small tasks. However, large-vocabulary speech recognition introduces new demands, resulting in an explosion of the search space. We show how this problem can be tackled using a data-driven approach that selects appropriate speech examples as candidates for DTW alignment.
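For readers unfamiliar with DTW, the idea of matching an utterance against stored examples can be illustrated with a minimal sketch. This is the textbook DTW recurrence with a symmetric step pattern and Euclidean local distance, not the recognizer described in the paper; the function names `dtw_distance` and `nearest_example`, the toy one-dimensional features, and the labels are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Cumulative DTW cost between two sequences of feature vectors.

    Assumption: Euclidean local distance and the basic step pattern
    (insertion, deletion, match); real systems typically add path
    constraints and pruning.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b frame is repeated
                cost[i][j - 1],      # b advances, a frame is repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def nearest_example(query, examples):
    """Return the label of the stored example closest to `query`.

    `examples` is a hypothetical list of (label, sequence) pairs standing
    in for the training examples that replace the HMM.
    """
    best_label, _ = min(
        ((label, dtw_distance(query, seq)) for label, seq in examples),
        key=lambda pair: pair[1],
    )
    return best_label


if __name__ == "__main__":
    examples = [
        ("rising", [(0.0,), (1.0,), (2.0,)]),
        ("falling", [(2.0,), (1.0,), (0.0,)]),
    ]
    # A time-stretched rising contour still aligns with the "rising" example.
    print(nearest_example([(0.0,), (0.0,), (1.0,), (2.0,)], examples))
```

Comparing the query against every stored example in this way scales linearly with the number of examples, which hints at why the search space grows so quickly for large vocabularies and why some form of candidate selection before alignment becomes necessary.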

Bibliographic reference. De Wachter, Mathias / Demuynck, Kris / van Compernolle, Dirk / Wambacq, Patrick (2003): "Data driven example based continuous speech recognition", In EUROSPEECH-2003, 1133-1136.