8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Context Constrained-Generalized Posterior Probability for Verifying Phone Transcriptions

Hua Zhang (1), Lijuan Wang (2), Frank K. Soong (2), Wenju Liu (1)

(1) Chinese Academy of Sciences, China
(2) Microsoft Research Asia, China

A new statistical confidence measure, Context Constrained-Generalized Posterior Probability (CC-GPP), is proposed for verifying phone transcriptions in speech databases. Unlike generalized posterior probability (GPP), CC-GPP is computed by considering string hypotheses that bear a focused phone with partially matched left and right contexts. The parameters of CC-GPP include the context window length, the minimum number of matched context phones, and the verification thresholds; they are determined by minimizing verification errors on a development set. Evaluated on a test set of 500 sentences containing 2.1% phone errors, CC-GPP achieves 99.6% accuracy and 78.7% recall when 90% of the phones are accepted.
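
The idea of counting only hypotheses that carry the focused phone together with partially matched context can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: hypotheses come from an N-best list with one log score each, phones are compared position by position (no time alignment), and the names `context_match`, `cc_gpp`, `window`, and `min_match` are hypothetical.

```python
import math


def context_match(ref, hyp, pos, window, min_match):
    """Check whether hyp carries the focused phone ref[pos] and at least
    min_match matching context phones within `window` positions on each side.
    Assumption: position-by-position comparison instead of time alignment."""
    if pos >= len(hyp) or hyp[pos] != ref[pos]:
        return False
    matched = 0
    for offset in range(1, window + 1):
        for idx in (pos - offset, pos + offset):
            if 0 <= idx < len(ref) and idx < len(hyp) and hyp[idx] == ref[idx]:
                matched += 1
    return matched >= min_match


def cc_gpp(ref, pos, hypotheses, window=2, min_match=1):
    """Share of the total hypothesis score held by context-matching hypotheses.
    hypotheses: list of (phone_list, log_score) pairs (illustrative format)."""
    max_log = max(score for _, score in hypotheses)  # for numerical stability
    total = 0.0
    accepted = 0.0
    for phones, score in hypotheses:
        weight = math.exp(score - max_log)
        total += weight
        if context_match(ref, phones, pos, window, min_match):
            accepted += weight
    return accepted / total


# Toy example: verify the middle phone "ih" of the transcription s-ih-t.
ref = ["s", "ih", "t"]
nbest = [
    (["s", "ih", "t"], math.log(0.5)),  # focus and both contexts match
    (["s", "eh", "t"], math.log(0.3)),  # focused phone differs
    (["z", "ih", "d"], math.log(0.2)),  # focus matches, no context matches
]
strict = cc_gpp(ref, 1, nbest, window=1, min_match=1)
loose = cc_gpp(ref, 1, nbest, window=1, min_match=0)
```

In this toy setup, requiring at least one matched context phone excludes the third hypothesis, so `strict` is lower than `loose`. A phone would then be accepted when its score exceeds a verification threshold chosen on development data, as the abstract describes.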

Bibliographic reference.  Zhang, Hua / Wang, Lijuan / Soong, Frank K. / Liu, Wenju (2007): "Context constrained-generalized posterior probability for verifying phone transcriptions", In INTERSPEECH-2007, 1330-1333.