10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Improving the Robustness of Phonetic Segmentation to Accent and Style Variation with a Two-Staged Approach

Vaishali Patil, Shrikant Joshi, Preeti Rao

IIT Bombay, India

Correct and temporally accurate phonetic segmentation of speech utterances is important in applications ranging from transcription alignment to pronunciation error detection. Automatic speech recognizers used in these tasks provide insufficient temporal alignment accuracy, and their recognition performance is sensitive to accent and style variations from the training data. A two-staged approach combining HMM broad-class recognition with acoustic-phonetic knowledge-based refinement is evaluated for phonetic segmentation accuracy in the context of accent and style mismatches with the training data.

Bibliographic reference. Patil, Vaishali / Joshi, Shrikant / Rao, Preeti (2009): "Improving the robustness of phonetic segmentation to accent and style variation with a two-staged approach", in INTERSPEECH-2009, 2543-2546.