Correct and temporally accurate phonetic segmentation of speech utterances is important in applications ranging from transcription alignment to pronunciation error detection. The automatic speech recognizers used in these tasks provide insufficient temporal alignment accuracy, and their recognition performance is sensitive to accent and style variations relative to the training data. A two-staged approach that combines HMM broad-class recognition with refinement based on acoustic-phonetic knowledge is evaluated for phonetic segmentation accuracy under accent and style mismatches with the training data.
Bibliographic reference. Patil, Vaishali / Joshi, Shrikant / Rao, Preeti (2009): "Improving the robustness of phonetic segmentation to accent and style variation with a two-staged approach", In INTERSPEECH-2009, 2543-2546.