ISCA Archive Interspeech 2007

Phone boundary detection using selective refinements and context-dependent acoustic features

Sirinoot Boonsuk, Proadpran Punyabukkana, Atiwong Suchato

Accurate placement of phone boundaries improves both the performance of speech recognition systems and the quality of concatenative speech synthesis. This study proposes a post-processing technique that refines the locations of phone boundaries provided by HMM-based forced alignment. Context-dependent Linear Discriminant Analysis (LDA) classifiers, together with a confidence scoring scheme, are used to improve the precision of locating phone boundaries. Not every acoustic feature is suitable for locating boundaries between every type of phonetic segment, so feature selection is performed for each boundary type. The proposed context-dependent refinement yields a 43.9% error reduction in locating phone boundaries compared with the boundaries obtained from HMM-based forced alignment. The average deviation from manually labeled boundaries is reduced from 1.4 to 1.0 frames, with a frame size of 10 milliseconds.
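The average-deviation figures above are given in frames, so a short sketch may help relate them to time. The code below is only an illustration of one plausible way to compute such a metric: it assumes the deviation is a mean absolute difference between predicted and manually labeled boundary positions, which the abstract does not spell out. The function names and the boundary lists are hypothetical; only the 10 ms frame size and the 1.4 and 1.0 frame values come from the abstract.

```python
FRAME_MS = 10.0  # frame size stated in the abstract


def mean_abs_deviation_frames(predicted, reference):
    """Mean absolute difference, in frames, between paired boundary positions.

    Assumption: boundaries are frame indices and both lists are aligned
    one-to-one (same phone sequence, same order).
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty boundary lists of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


def frames_to_ms(frames, frame_ms=FRAME_MS):
    """Convert a deviation expressed in frames to milliseconds."""
    return frames * frame_ms


if __name__ == "__main__":
    # Hypothetical boundary positions (frame indices), for illustration only.
    manual = [10, 22, 30]
    aligned = [10, 20, 31]
    dev = mean_abs_deviation_frames(aligned, manual)
    print(f"deviation: {dev:.2f} frames = {frames_to_ms(dev):.1f} ms")

    # The reported averages, converted to milliseconds.
    for frames in (1.4, 1.0):
        print(f"{frames} frames = {frames_to_ms(frames):.1f} ms")
```

Under this reading, the reported averages of 1.4 and 1.0 frames correspond to roughly 14 ms and 10 ms. The 43.9% figure is a separate error-reduction measure and is not derived from these two averages.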


doi: 10.21437/Interspeech.2007-415

Cite as: Boonsuk, S., Punyabukkana, P., Suchato, A. (2007) Phone boundary detection using selective refinements and context-dependent acoustic features. Proc. Interspeech 2007, 1362-1365, doi: 10.21437/Interspeech.2007-415

@inproceedings{boonsuk07_interspeech,
  author={Sirinoot Boonsuk and Proadpran Punyabukkana and Atiwong Suchato},
  title={{Phone boundary detection using selective refinements and context-dependent acoustic features}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1362--1365},
  doi={10.21437/Interspeech.2007-415}
}