Third International Conference on Spoken Language Processing (ICSLP 94)
Detecting speech endpoints is a critical step for speech recognition systems operating in adverse conditions, yet it remains a delicate problem. We introduce two signal processing methods that offer good robustness without requiring high-level information about the signal. The first approach uses temporal parameters, the second frequency-domain ones. We discuss and compare their performance on the ARS ESPRIT database (isolated words pronounced in a car). We show that these methods, coupled with a statistical segmentation, offer very good discrimination between noise segments and speech segments, and locate the speech boundaries more precisely. This preprocessing is integrated into an HMM speech recognition system.
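To make the idea of endpoint detection from temporal parameters concrete, here is a minimal sketch of a generic short-time energy detector. It is not the method of the paper, whose parameters are not given in this abstract. The function names, the frame length, the `noise_frames` count, and the threshold `ratio` are all assumptions chosen for illustration, as is the premise that the recording opens with noise only.

```python
import math


def frame_energies(signal, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def detect_endpoints(signal, frame_len=80, noise_frames=5, ratio=10.0):
    """Return (start, end) sample indices of the detected speech span, or None.

    Assumption (illustrative only): the first `noise_frames` frames hold
    background noise, so their mean energy estimates the noise floor.
    """
    energies = frame_energies(signal, frame_len)
    if len(energies) <= noise_frames:
        return None
    noise_floor = sum(energies[:noise_frames]) / noise_frames
    threshold = ratio * max(noise_floor, 1e-12)
    # A frame counts as speech when its energy exceeds the threshold.
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


# Synthetic test signal at an assumed 8 kHz sampling rate:
# low-level alternating "noise", a 400 Hz tone standing in for speech, noise again.
noise = [0.01 if n % 2 == 0 else -0.01 for n in range(800)]
tone = [math.sin(2 * math.pi * 400 * n / 8000) for n in range(800)]
signal = noise + tone + noise

print(detect_endpoints(signal))  # (800, 1600)
```

A fixed energy threshold like this one degrades quickly in nonstationary car noise, which is the kind of situation the statistical segmentation mentioned in the abstract is meant to address.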
Bibliographic reference. Puel, Jean-Baptiste / André-Obrecht, Regine (1994): "Robust signal preprocessing for HMM speech recognition in adverse conditions", In ICSLP-1994, 259-262.