EUROSPEECH 2003 - INTERSPEECH 2003
This paper proposes a method for automatically extracting keywords from baseball radio speech using large-vocabulary continuous speech recognition (LVCSR), with the aim of retrieving highlight scenes. For robust recognition, we adapted both the acoustic model and the language model. For the acoustic model, supervised and unsupervised adaptation were carried out using MLLR+MAP; this two-level adaptation improved word accuracy by 28%. For the language model, language model fusion and pronunciation modification were carried out, which improved word accuracy by 13%. Finally, integrating both adaptations yielded a 38% improvement in word accuracy and a 28% improvement in keyword accuracy.
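The abstract reports its results as word accuracy but does not give the scoring formula. As an illustration only, the sketch below assumes the conventional definition used in speech recognition scoring, accuracy = (N - S - D - I) / N, where N is the number of reference words and S, D, I are the substitutions, deletions, and insertions in a minimum-edit alignment. The function names `align_counts` and `word_accuracy` and the sample sentences are hypothetical and are not taken from the paper; the abstract also does not say whether the reported improvements are absolute or relative.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for a minimum-edit
    alignment of the hypothesis word list against the reference word list."""
    n, m = len(ref), len(hyp)
    # dp[i][j] holds (total_cost, S, D, I) for ref[:i] versus hyp[:j].
    dp = [[None] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        dp[i][0] = (i, 0, i, 0)  # all reference words deleted
    for j in range(1, m + 1):
        dp[0][j] = (j, 0, 0, j)  # all hypothesis words inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost, s, d, k = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (cost, s, d, k)          # match, no error
            else:
                diagonal = (cost + 1, s + 1, d, k)  # substitution
            cost, s, d, k = dp[i - 1][j]
            deletion = (cost + 1, s, d + 1, k)
            cost, s, d, k = dp[i][j - 1]
            insertion = (cost + 1, s, d, k + 1)
            # Tuples compare on total cost first, so min() picks a cheapest path.
            dp[i][j] = min(diagonal, deletion, insertion)
    _, s, d, k = dp[n][m]
    return s, d, k


def word_accuracy(ref, hyp):
    """Word accuracy (N - S - D - I) / N under the assumed definition."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    s, d, i = align_counts(ref, hyp)
    return (len(ref) - s - d - i) / len(ref)


# Hypothetical reference transcript and recognizer output.
reference = "the batter hits a home run".split()
hypothesis = "the batter hit a run".split()
print(align_counts(reference, hypothesis))
print(round(word_accuracy(reference, hypothesis), 3))
```

Because insertions are counted as errors, this measure can fall below zero when the recognizer outputs many spurious words, which is one reason it is usually reported alongside the raw error counts.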
Bibliographic reference. Ariki, Yasuo / Shigemori, Takeru / Kaneko, Tsuyoshi / Ogata, Jun / Fujimoto, Masakiyo (2003): "Live speech recognition in sports games by adaptation of acoustic model and language model", In EUROSPEECH-2003, 1453-1456.