EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Live Speech Recognition in Sports Games by Adaptation of Acoustic Model and Language Model

Yasuo Ariki (1), Takeru Shigemori (1), Tsuyoshi Kaneko (1), Jun Ogata (2), Masakiyo Fujimoto (1)

(1) Ryukoku University, Japan
(2) AIST, Japan

This paper proposes a method for automatically extracting keywords from baseball radio speech using large-vocabulary continuous speech recognition (LVCSR), for use in highlight scene retrieval. For robust recognition, we adapt both the acoustic model and the language model. For the acoustic model, supervised and unsupervised adaptation were carried out using MLLR+MAP; this two-level adaptation improved word accuracy by 28%. For the language model, language model fusion and pronunciation modification were carried out, improving word accuracy by 13%. Finally, integrating both adaptations improved word accuracy by 38% and keyword accuracy by 28%.
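
The abstract reports its results as word accuracy but does not spell out the metric. As a minimal sketch, assuming the conventional definition (N - S - D - I) / N over a minimum-edit-distance alignment of reference and recognized word sequences, it could be computed as below. The function name and the example sentences are hypothetical, and the abstract does not say whether its percentages are absolute or relative gains.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Conventional word accuracy: (N - S - D - I) / N.

    N is the number of reference words; S, D and I are the substitutions,
    deletions and insertions in a minimum-edit-distance alignment.
    Assumed definition; the abstract does not state its scoring formula.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        cur = [i] + [0] * len(hyp)
        for j in range(1, len(hyp) + 1):
            substitution = prev[j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur[j] = min(substitution, deletion, insertion)
        prev = cur

    errors = prev[-1]  # S + D + I for the best alignment
    return (len(ref) - errors) / len(ref)


if __name__ == "__main__":
    # Hypothetical transcript pair: one substitution in six reference words.
    print(word_accuracy("the batter hits a home run",
                        "the batter hit a home run"))
```

Because insertions count as errors, this measure can fall below zero when the recognizer outputs many spurious words.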

Bibliographic reference. Ariki, Yasuo / Shigemori, Takeru / Kaneko, Tsuyoshi / Ogata, Jun / Fujimoto, Masakiyo (2003): "Live speech recognition in sports games by adaptation of acoustic model and language model", in EUROSPEECH-2003, 1453-1456.