Sixth European Conference on Speech Communication and Technology
Considering the perceptual importance of phonetic transitions as minimal contextual variant units, this paper addresses the problem by explicitly modelling the inter-phone dynamics covered in diphones. Subspace projections based on a time-constrained PCA (TC-PCA) are developed which focus on the temporal evolution. They reveal characteristic trajectories in a low-dimensional spectral representation, facilitating robust parameter estimation, and simultaneously optimise the discriminant information. The applied multiple-hypotheses rescoring scheme makes it possible to operate in a very low-dimensional parameter space. Using this multiple-hypotheses paradigm, the effectiveness of the complementary information gained by explicitly modelling inter-phone dynamics is demonstrated on the TIMIT database, resulting in improved phone error rates.
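The abstract does not spell out how the time constraint enters the PCA, so the following is only a minimal sketch of one plausible reading: a scaled, normalised time index is appended to each spectral frame of a segment before an ordinary PCA, so that the leading components are encouraged to follow the temporal evolution. The function name `tc_pca_project`, the `time_weight` parameter, and the random stand-in data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def tc_pca_project(frames, n_components=2, time_weight=1.0):
    """Project one segment (T frames x D features) onto a low-dimensional subspace.

    Assumption (not from the paper): the time constraint is imposed by
    appending a scaled time index in [0, time_weight] to every frame.
    """
    frames = np.asarray(frames, dtype=float)
    n_frames = frames.shape[0]
    # Normalised time index, one value per frame, scaled by time_weight.
    t = np.linspace(0.0, 1.0, n_frames)[:, None] * time_weight
    augmented = np.hstack([frames, t])
    centred = augmented - augmented.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:n_components]
    # The trajectory is the sequence of frames expressed in the subspace.
    return centred @ basis.T, basis

# Hypothetical diphone segment: 20 frames of 12 spectral coefficients.
rng = np.random.default_rng(0)
segment = rng.normal(size=(20, 12))
trajectory, basis = tc_pca_project(segment, n_components=2, time_weight=5.0)
```

In this reading, a larger `time_weight` makes the time axis dominate the variance, which pulls the first component toward the direction of temporal progression; how the weighting and the discriminant optimisation are actually handled is described in the full paper.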
Bibliographic reference. Reinhard, Klaus / Niranjan, Mahesan (1999): "Diphone subspace models for phone-based HMM complementation", In EUROSPEECH'99, 1351-1354.