EUROSPEECH 2003 - INTERSPEECH 2003
In the context of text-to-speech synthesis, this contribution deals with the segmentation of speech into phone units. Using an HMM-based segmentation system, we compare several phone-level confidence measures for detecting potential local mismatches between the phone labels and the acoustics. Beyond this purpose, these confidence measures will help the system suggest a new local graph of hypotheses for the Markovian segmentation system. We propose a new formulation of a frame-based posterior probability confidence measure, which gives the best results in all of our experiments among a set of six confidence measures. Under a hypothesis-testing formulation, this frame-based posterior measure gives an equal error rate (EER) of 12% on a randomly blurred test database.
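The abstract does not spell out the proposed formulation, so the following is only a minimal sketch of the two general ideas it names: a frame-based posterior confidence score for a labelled phone segment, and an equal error rate computed from such scores. The function names, the per-frame likelihood dictionaries, and the uniform phone prior are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def frame_posterior_confidence(frame_likelihoods, label):
    """Mean log posterior of `label` over the frames of one segment.

    frame_likelihoods: one dict per frame, mapping phone -> p(x_t | phone).
    Assumption: uniform phone priors, so the posterior is the likelihood
    of `label` divided by the sum of likelihoods over all phones.
    """
    total = 0.0
    for lik in frame_likelihoods:
        total += math.log(lik[label] / sum(lik.values()))
    return total / len(frame_likelihoods)


def equal_error_rate(correct_scores, incorrect_scores):
    """Approximate the EER by sweeping the threshold over all observed scores.

    A segment label is accepted when its score >= threshold.
    correct_scores: scores of segments whose label is right.
    incorrect_scores: scores of segments whose label is wrong.
    """
    best_gap, eer = float("inf"), None
    for thr in sorted(set(correct_scores) | set(incorrect_scores)):
        # False rejection: a correct label scored below the threshold.
        false_reject = sum(s < thr for s in correct_scores) / len(correct_scores)
        # False acceptance: an incorrect label scored at or above it.
        false_accept = sum(s >= thr for s in incorrect_scores) / len(incorrect_scores)
        gap = abs(false_reject - false_accept)
        if gap < best_gap:
            best_gap, eer = gap, (false_reject + false_accept) / 2
    return eer
```

In this sketch a low score flags a segment whose label may not match the acoustics, and the EER is the operating point at which falsely rejected correct labels and falsely accepted incorrect labels occur at the same rate.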
Bibliographic reference. Nefti, Samir / Boeffard, Olivier / Moudenc, Thierry (2003): "Confidence measures for phonetic segmentation of continuous speech", In EUROSPEECH-2003, 897-900.