Subphonetic discovery through segmental clustering is a central step in building a corpus-based synthesizer. To help decide which clustering algorithm to use, we employed merge-and-split tests on English fricatives. Compared to a reference of 2%, Gaussian EM achieved a misclassification rate of 6% and K-means 10%, while predictive CART trees performed poorly.
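A merge-and-split test can be illustrated with a minimal sketch: pool the tokens of two phoneme classes, cluster the pool into two groups without using the labels, and count how many tokens end up on the wrong side. The code below is an illustration only, not the paper's implementation. It assumes each token is a small tuple of acoustic features, uses a plain 2-means clusterer, and the names `kmeans2`, `misclassification_rate`, and `merge_and_split` are hypothetical.

```python
def dist2(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans2(points, iters=20):
    """Plain 2-means; returns a cluster label (0 or 1) for each point."""
    # Deterministic start (an assumption): first point, then the point farthest from it.
    centers = [points[0], max(points, key=lambda p: dist2(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
                  for p in points]
        # Move each center to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def misclassification_rate(true_labels, cluster_labels):
    """Fraction of tokens on the wrong side of the split.

    Cluster indices carry no phoneme identity, so both possible
    cluster-to-phoneme mappings are scored and the better one is kept.
    """
    n = len(true_labels)
    errors = sum(t != c for t, c in zip(true_labels, cluster_labels))
    return min(errors, n - errors) / n

def merge_and_split(class_a, class_b):
    """Merge two phoneme classes, split them unsupervised, score the split."""
    pooled = class_a + class_b                        # merge
    truth = [0] * len(class_a) + [1] * len(class_b)   # held-out labels
    return misclassification_rate(truth, kmeans2(pooled))  # split and score

# Toy feature vectors (made up for illustration, not real fricative data).
phone_a = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)]
phone_b = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
rate = merge_and_split(phone_a, phone_b)
```

Swapping `kmeans2` for another clusterer, such as a two-component Gaussian mixture fitted with EM, leaves the scoring unchanged, which is what makes the test usable for comparing algorithms.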
Cite as: Kominek, J., Black, A.W. (2005) Measuring unsupervised acoustic clustering through phoneme pair merge-and-split tests. Proc. Interspeech 2005, 689-692, doi: 10.21437/Interspeech.2005-198
@inproceedings{kominek05b_interspeech,
  author={John Kominek and Alan W. Black},
  title={{Measuring unsupervised acoustic clustering through phoneme pair merge-and-split tests}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={689--692},
  doi={10.21437/Interspeech.2005-198}
}