8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Enhanced Tree Clustering with Single Pronunciation Dictionary for Conversational Speech Recognition

Hua Yu, Tanja Schultz

Carnegie Mellon University, USA

Modeling pronunciation variation is key to recognizing conversational speech. Rather than confining pronunciation modeling to the dictionary, we argue that triphone clustering is an integral part of it. We propose a new approach called enhanced tree clustering. In contrast to traditional decision-tree-based state tying, this approach allows parameter sharing across phonemes. We show that accurate pronunciation modeling can be achieved through efficient parameter sharing in the acoustic model. Combined with a single pronunciation dictionary, the approach achieves a 1.8% absolute word error rate improvement on Switchboard, a large-vocabulary conversational speech recognition task.
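
The abstract gives no algorithmic detail, so the following is only a toy sketch of the contrast it draws, not the authors' implementation. It assumes a greedy decision tree that splits on the likelihood gain of a single 1-D Gaussian per node. The phone labels (AX, IX), the observations, the question set, and the names State, loglik and grow are all hypothetical. The same tree-growing routine is run twice: once per center phone (traditional tying) and once on states pooled across phones with a question about the center phone added, which is one plausible way to allow sharing across phonemes.

```python
import math
from collections import namedtuple

# Hypothetical triphone state: center phone, left/right context, toy 1-D observations.
State = namedtuple("State", "center left right obs")

def loglik(states):
    """Log-likelihood of all observations under one ML-fitted Gaussian."""
    xs = [x for s in states for x in s.obs]
    n = len(xs)
    mean = sum(xs) / n
    var = max(sum((x - mean) ** 2 for x in xs) / n, 1e-3)  # variance floor
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def grow(states, questions, min_gain=1.0):
    """Greedily split on the best question; return the leaves (lists of states)."""
    best = None
    for ask in questions:
        yes = [s for s in states if ask(s)]
        no = [s for s in states if not ask(s)]
        if not yes or not no:
            continue  # question does not separate this node
        gain = loglik(yes) + loglik(no) - loglik(states)
        if best is None or gain > best[0]:
            best = (gain, yes, no)
    if best is None or best[0] < min_gain:
        return [states]  # stop splitting: this node becomes a tied state
    return grow(best[1], questions, min_gain) + grow(best[2], questions, min_gain)

# Made-up data: before N the two vowels sound alike; before T they differ.
states = [
    State("AX", "K", "N", [0.90, 1.00, 1.10]),
    State("IX", "K", "N", [0.92, 1.00, 1.08]),
    State("AX", "K", "T", [4.90, 5.00, 5.10]),
    State("IX", "K", "T", [8.90, 9.00, 9.10]),
]

# Assumed question set: one context question and one center-phone question.
questions = [
    lambda s: s.right == "N",
    lambda s: s.center == "AX",
]

# Traditional tying: one tree per center phone, so leaves never mix phones.
per_phone = []
for phone in ("AX", "IX"):
    per_phone += grow([s for s in states if s.center == phone], questions)

# Cross-phoneme tying: one tree over the pooled states.
shared = grow(states, questions)

for leaf in shared:
    print(sorted({s.center for s in leaf}), [s.right for s in leaf])
```

In this toy data the pooled tree ends up with fewer leaves than the per-phone trees, because the two similar states before N land in one leaf; real systems would use multivariate Gaussian statistics and a much larger question set.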


Bibliographic reference. Yu, Hua / Schultz, Tanja (2003): "Enhanced tree clustering with single pronunciation dictionary for conversational speech recognition", in EUROSPEECH-2003, 1869-1872.