EUROSPEECH 2003 - INTERSPEECH 2003
This study aims to automatically estimate the probability that individual words of Japanese English (JE) are perceived correctly by American listeners, and to clarify which kinds (and combinations) of segmental, prosodic, and linguistic errors in those words are most detrimental to their correct perception. First, a balanced set of 360 utterances by 90 male speakers is selected from a JE speech database. Then, a listening experiment is conducted in which six American listeners are asked to transcribe all the utterances. Next, using speech and language technology, the values of many segmental, prosodic, and linguistic attributes of the words are extracted. Finally, the relation between the correct-transcription rate of each word and its attribute values is analyzed with the Classification And Regression Tree (CART) method in order to predict the probability that each JE word is transcribed correctly. The machine prediction is compared with the human predictions of seven teachers, and the method is shown to be comparable to the best American teacher. This paper also describes differences between American and Japanese teachers in how they perceive the intelligibility of the pronunciation.
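The regression-tree step can be illustrated with a minimal sketch. This is not the authors' implementation: it is a small pure-Python regression tree in the spirit of CART (binary splits chosen to minimise the sum of squared errors), and the two word attributes and all data values are hypothetical, invented only for illustration. The target is assumed to be the fraction of six listeners who transcribed a word correctly.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets):
    """Return the (feature_index, threshold) that most reduces SSE, or None."""
    best = None
    best_score = sse(targets)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            if not left or not right:
                continue  # a split must send data to both sides
            score = sse(left) + sse(right)
            if score < best_score - 1e-12:
                best_score, best = score, (f, t)
    return best


def build_tree(rows, targets, depth=0, max_depth=2):
    """Grow a binary regression tree; each leaf stores the mean target."""
    split = best_split(rows, targets) if depth < max_depth else None
    if split is None:
        return {"leaf": mean(targets)}
    f, t = split
    left = [(r, y) for r, y in zip(rows, targets) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the splits down to a leaf and return its mean target."""
    while "leaf" not in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]


# Hypothetical attributes per word: [segmental score, length in phonemes].
rows = [[0.90, 3], [0.80, 5], [0.85, 4], [0.30, 3], [0.20, 6], [0.25, 5]]
# Assumed target: fraction of 6 listeners who transcribed the word correctly.
targets = [6 / 6, 5 / 6, 6 / 6, 2 / 6, 0 / 6, 1 / 6]

tree = build_tree(rows, targets)
p_good = predict(tree, [0.95, 4])  # word with a high segmental score
p_poor = predict(tree, [0.10, 4])  # word with a low segmental score
```

Because each leaf stores the mean transcription rate of the training words that reach it, every prediction stays within [0, 1] and can be read as an estimated probability of correct transcription. The splits nearest the root indicate which attributes separate well-perceived from poorly perceived words in the toy data.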
Bibliographic reference. Minematsu, Nobuaki / Guo, Changchen / Hirose, Keikichi (2003): "CART-based factor analysis of intelligibility reduction in Japanese English", In EUROSPEECH-2003, 2069-2072.