9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Speech Recognition Performance of CJLC: Corpus of Japanese Lecture Contents

Satoru Kogure (1), Hiromitsu Nishizaki (2), Masatoshi Tsuchiya (3), Kazumasa Yamamoto (3), Shingo Togashi (3), Seiichi Nakagawa (3)

(1) Shizuoka University, Japan; (2) University of Yamanashi, Japan; (3) Toyohashi University of Technology, Japan

This paper discusses speech recognition of Japanese classroom lectures. In particular, we examine how differences in microphones and in language models affect speech recognition performance on classroom lectures. First, we collected actual classroom lecture contents from several universities in Japan. We recorded the lecture speech using lapel microphones because they are more commonly used to record lectures. Large-vocabulary continuous speech recognition (LVCSR) is one of the essential technologies for adding tag information to such lecture speech. Next, we therefore investigated how differences between the microphones used to record lectures influence speech recognition performance. Finally, seven types of language models trained on three types of corpora were compared on the basis of their performance on lecture speech.

