Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Diphone Subspace Models for Phone-Based HMM Complementation

Klaus Reinhard (1), Mahesan Niranjan (2)

(1) University of Cambridge, Department of Engineering, Cambridge, UK
(2) University of Sheffield, Department of Computer Science, Sheffield, UK

Given the perceptual importance of phonetic transitions as units of minimal contextual variation, this paper explicitly models the inter-phone dynamics captured in diphones, complementing phone-based HMMs. Subspace projections based on a time-constrained PCA (TC-PCA) are developed that focus on temporal evolution. They reveal characteristic trajectories in a low-dimensional spectral representation, which facilitates robust parameter estimation while simultaneously optimising the discriminant information. The multiple-hypothesis rescoring scheme applied here makes it possible to operate in a very low-dimensional parameter space. Within this paradigm, experiments on the TIMIT database show that explicitly modelling inter-phone dynamics provides effective complementary information, resulting in improved phone error rates.


Bibliographic reference. Reinhard, Klaus / Niranjan, Mahesan (1999): "Diphone subspace models for phone-based HMM complementation", in EUROSPEECH'99, 1351-1354.