Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Cluster-based Modeling for Ubiquitous Speech Recognition

Sadaoki Furui, Tomohisa Ichiba, Takahiro Shinozaki, Edward W. D. Whittaker, Koji Iwano

Tokyo Institute of Technology, Japan

To build speech recognition systems that achieve high accuracy for ubiquitous speech, it is crucial to make them flexible enough to cope with the wide variability of spontaneous speech. This paper investigates two speech recognition methods that adapt to speech variation using a large number of models trained with clustering techniques: one automatically builds a model adapted to the input speech from recognition hypotheses and clustered models, and the other uses the clustered models directly in parallel. Evaluation experiments on presentation speech confirm that both methods are effective. Although the latter method requires a large amount of computation, it has the advantage that it can be applied to online recognition, since it does not need recognition hypotheses. The former method can also be applied to online recognition if the text of the proceedings for the presentation can be used in place of recognition hypotheses.
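As a rough illustration of the second idea (using clustered models in parallel), the sketch below scores one utterance against every cluster model and keeps the best-scoring one. It is a toy under stated assumptions, not the authors' system: the abstract does not specify the model form or the combination rule, so the diagonal-Gaussian models, the maximum-likelihood selection, and all names (`diag_gauss_loglik`, `select_cluster`, `cluster_a`, `cluster_b`) are hypothetical.

```python
import math


def diag_gauss_loglik(frames, mean, var):
    """Sum of per-frame log-likelihoods under a diagonal-covariance Gaussian."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def select_cluster(frames, cluster_models):
    """Score the utterance against every cluster model (conceptually in parallel).

    Returns the name of the best-scoring cluster and the full score table.
    Assumption: selection by maximum log-likelihood; the paper may combine
    the parallel models differently.
    """
    scores = {
        name: diag_gauss_loglik(frames, mean, var)
        for name, (mean, var) in cluster_models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: two cluster models over 2-dimensional features,
# each given as (mean vector, variance vector).
cluster_models = {
    "cluster_a": ([0.0, 0.0], [1.0, 1.0]),
    "cluster_b": ([3.0, 3.0], [1.0, 1.0]),
}
frames = [[2.8, 3.1], [3.2, 2.9]]  # feature frames of one utterance

best, scores = select_cluster(frames, cluster_models)
print(best)
```

Because every cluster model must be evaluated on the same input, the cost grows linearly with the number of clusters, which is consistent with the abstract's remark that the parallel method needs a large amount of computation. No recognition hypotheses are required, so the selection can run as the audio arrives.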

Bibliographic reference.  Furui, Sadaoki / Ichiba, Tomohisa / Shinozaki, Takahiro / Whittaker, Edward W. D. / Iwano, Koji (2005): "Cluster-based modeling for ubiquitous speech recognition", In INTERSPEECH-2005, 2865-2868.