Perception studies have long argued that phonetic confusions are more likely to happen across some phonetic features than others (e.g., place of articulation rather than manner) [1]. Similarly, we and others have noted that pronunciation variation occurs more frequently in unstressed syllables and in syllable codas. This suggests that a phonetic information structure is at play: for decoding purposes, it is important to get phonetic information accurate in stressed syllables, but less so in unstressed syllables. In this work, we explore the role of phonetic information in clean and noisy speech by reducing the phonetic information available to the recognizer. A surprising result is that replacing some phones with manner classes in the dictionary improves recognition in one noise condition.
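To make the idea of dictionary-level information reduction concrete, the following is a minimal sketch, not the authors' implementation. It assumes an ARPAbet-style pronunciation lexicon; the `MANNER` table, the class labels, the function names, and the choice of which phones to reduce are all hypothetical and chosen only for illustration.

```python
# Hypothetical mapping from ARPAbet consonants to manner-of-articulation classes.
# The class inventory used in the paper is not given in the abstract; this is an assumption.
MANNER = {
    "P": "STOP", "B": "STOP", "T": "STOP", "D": "STOP", "K": "STOP", "G": "STOP",
    "F": "FRIC", "V": "FRIC", "S": "FRIC", "Z": "FRIC", "SH": "FRIC",
    "ZH": "FRIC", "TH": "FRIC", "DH": "FRIC", "HH": "FRIC",
    "CH": "AFFR", "JH": "AFFR",
    "M": "NASAL", "N": "NASAL", "NG": "NASAL",
    "L": "LIQUID", "R": "LIQUID",
    "W": "GLIDE", "Y": "GLIDE",
}


def reduce_pronunciation(phones, reduce_set):
    """Replace each phone in reduce_set with its manner class; keep all others."""
    return [MANNER.get(p, p) if p in reduce_set else p for p in phones]


def reduce_dictionary(lexicon, reduce_set):
    """Apply the reduction to every pronunciation variant of every word."""
    return {
        word: [reduce_pronunciation(pron, reduce_set) for pron in prons]
        for word, prons in lexicon.items()
    }


# Toy lexicon (hypothetical): each word maps to a list of pronunciation variants.
lexicon = {
    "cat": [["K", "AE", "T"]],
    "cap": [["K", "AE", "P"]],
}

# Reduce only the voiceless stops T and P, which discards their place of articulation.
reduced = reduce_dictionary(lexicon, reduce_set={"T", "P"})
```

In this toy lexicon, "cat" and "cap" end up with the same reduced pronunciation, which illustrates the trade-off: the recognizer no longer needs to distinguish place of articulation for the reduced phones, but some words become confusable at the dictionary level.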
Cite as: Fosler-Lussier, E., Rytting, C.A., Srinivasan, S. (2005) Phonetic ignorance is bliss: investigating the effects of phonetic information reduction on ASR performance. Proc. Interspeech 2005, 1249-1252, doi: 10.21437/Interspeech.2005-479
@inproceedings{foslerlussier05_interspeech,
  author={Eric Fosler-Lussier and C. Anton Rytting and Soundararajan Srinivasan},
  title={{Phonetic ignorance is bliss: investigating the effects of phonetic information reduction on ASR performance}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1249--1252},
  doi={10.21437/Interspeech.2005-479}
}