INTERSPEECH 2004 - ICSLP
The spectrogram, the speech representation provided by acoustic phonetics, is a very noisy representation in that it shows every acoustic aspect of speech. Age, gender, size, shape, microphone, room, and line are completely irrelevant to speech recognition, pronunciation assessment, and similar tasks, yet the spectrogram is easily affected by these factors. This is the essential reason why speech systems are sometimes unreliable, and the author holds that education should not have to endure this inevitable characteristic. The author has proposed a novel acoustic representation of speech in which no dimensions corresponding to the above factors exist. The method was derived by implementing phonology, another speech science, on physics. This paper examines whether the new representation can serve as a good tool for pronunciation assessment. Experiments with good and intentionally bad pronunciations by a single speaker showed that all the students are acoustically located between the two pronunciations, indicating that every student is judged to be acoustically closer to the speaker's good pronunciation than the speaker's own intentionally bad pronunciation is. This result clearly shows that the proposed method is extremely reliable and effective in CALL.
Bibliographic reference. Minematsu, Nobuaki (2004): "Pronunciation assessment based upon the phonological distortions observed in language learners' utterances", In INTERSPEECH-2004, 1669-1672.