Sixth International Conference on Spoken Language Processing
In this paper, we propose a method for generating a pronunciation dictionary, in which typical pronunciations of each word are extracted from speech data uttered by Japanese speakers, as one approach to recognizing English speech uttered by Japanese speakers, who are not native speakers of English. The method comprises three processes: one in which English phoneme HMMs (Hidden Markov Models) are adapted to the speaker using English speech uttered by a Japanese speaker; one in which English speech uttered by a Japanese speaker is transcribed into an English phoneme series using a phoneme typewriter; and one in which representative phoneme series are selected, using a clustering technique, from the multiple phoneme series obtained for each word. We also propose a speaker adaptation method for the recognition phase, in which the phoneme HMMs are adapted to the target speaker using a phoneme label series that expresses the typical pronunciations extracted by the above method. Continuous speech recognition tests were carried out on English speech data uttered by five Japanese speakers, using a pronunciation dictionary generated from the data of five other Japanese speakers. The results indicated that sentence recognition errors were reduced by 72% compared with using a dictionary for native speakers.
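The selection of representative phoneme series for a word can be illustrated with a minimal sketch. The abstract does not say which clustering technique or distance measure the authors used, so the sketch assumes a Levenshtein distance over phoneme labels and a greedy medoid selection weighted by how often each series was observed. The function names, the parameter k, and the phoneme labels in the example are all hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def select_representatives(series, k=2):
    """Greedily pick up to k medoid phoneme series for one word.

    Each step adds the candidate that most reduces the total
    (count-weighted) distance from every observed series to its
    nearest chosen representative. This is an illustrative stand-in,
    not the clustering procedure from the paper.
    """
    counts = Counter(tuple(s) for s in series)
    unique = list(counts)  # distinct series, in first-seen order
    weights = [counts[u] for u in unique]
    dist = [[edit_distance(u, v) for v in unique] for u in unique]

    chosen = []
    nearest = [float("inf")] * len(unique)
    for _ in range(min(k, len(unique))):
        best, best_cost = None, None
        for c in range(len(unique)):
            if c in chosen:
                continue
            cost = sum(w * min(nearest[i], dist[i][c])
                       for i, w in enumerate(weights))
            if best_cost is None or cost < best_cost:
                best, best_cost = c, cost
        chosen.append(best)
        nearest = [min(nearest[i], dist[i][best]) for i in range(len(unique))]
    return [list(unique[c]) for c in chosen]


# Hypothetical phoneme-typewriter outputs for one word from several speakers.
observed = [
    ["w", "ao", "t", "er"],
    ["w", "ao", "t", "er"],
    ["w", "ao", "t", "er"],
    ["w", "ao", "t", "aa"],
    ["w", "a", "t", "a"],
    ["w", "a", "t", "a"],
    ["w", "a", "t", "aa"],
]
representatives = select_representatives(observed, k=2)
```

With this toy input, the two selected series are the most frequent variant of each of the two pronunciation groups, which would then serve as that word's dictionary entries. A real system would likely use a phoneme-aware substitution cost rather than a uniform cost of 1.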
Bibliographic reference. Suzuki, Tadashi / Ishii, Jun / Nakajima, Kunio (2000): "A method of generating English pronunciation dictionary for Japanese English recognition systems", In ICSLP-2000, vol.3, 634-637.