Interspeech'2005 - Eurospeech
To realize speech recognition systems that achieve high recognition accuracy for ubiquitous speech, it is crucial to make the systems flexible enough to cope with the large variability of spontaneous speech. This paper investigates two speech recognition methods that adapt to speech variation using a large number of models trained with clustering techniques: one automatically builds a model adapted to the input speech using recognition hypotheses and the clustered models, and the other directly uses the clustered models in parallel. Evaluation experiments using presentation speech have confirmed that both methods are effective. Although the latter method requires a large amount of computation, it has the advantage that it can be applied to online recognition, since it does not need recognition hypotheses. The former method can also be applied to online recognition if the text of the proceedings for the presentation can be used in place of recognition hypotheses.
Bibliographic reference. Furui, Sadaoki / Ichiba, Tomohisa / Shinozaki, Takahiro / Whittaker, Edward W. D. / Iwano, Koji (2005): "Cluster-based modeling for ubiquitous speech recognition", In INTERSPEECH-2005, 2865-2868.