10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Stream-Based Context-Sensitive Phone Mapping for Cross-Lingual Speech Recognition

Khe Chai Sim, Haizhou Li

Institute for Infocomm Research, Singapore

Recently, a Probabilistic Phone Mapping (PPM) model was proposed to facilitate cross-lingual automatic speech recognition using a foreign phonetic system. Under this framework, discrete hidden Markov models (HMMs) are used to map a foreign phone sequence to a target phone sequence. Context-sensitive mapping is made possible by expanding the discrete observation symbols to include the contexts in which the foreign phones appear in the sequence. Unfortunately, modelling the context dependencies jointly results in a dramatic increase in the number of model parameters as wider contexts are used. In this paper, the probability of observing a context-dependent symbol is decomposed into the product of the probabilities of observing the symbol and its contexts. This allows wider contexts to be modelled without greatly increasing model complexity. The decomposition can be implemented conveniently as a multiple-stream discrete HMM system in which the contexts are treated as independent streams. Experimental results are reported on the TIMIT English phone recognition task using Czech, Hungarian and Russian foreign phone recognisers.
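The decomposition described above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code assumes that each HMM state holds one discrete distribution per stream (the foreign phone itself plus each context position) and that the emission probability is the plain product of the per-stream probabilities. The function names, the unweighted product, and the inventory size used in the usage note are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def joint_table_size(num_symbols: int, num_contexts: int) -> int:
    """Entries per state when the phone and its contexts form one joint symbol."""
    return num_symbols ** (num_contexts + 1)


def stream_table_size(num_symbols: int, num_contexts: int) -> int:
    """Entries per state when the phone and each context are separate streams."""
    return num_symbols * (num_contexts + 1)


def stream_emission_prob(stream_tables, observation) -> float:
    """Product over streams of P(stream symbol | state).

    stream_tables: one 1-D probability array per stream (assumed layout).
    observation:   one symbol index per stream, in the same order.
    """
    probs = [table[symbol] for table, symbol in zip(stream_tables, observation)]
    return float(np.prod(probs))
```

For a hypothetical inventory of 45 foreign phones with one left and one right context, `joint_table_size(45, 2)` gives 91125 entries per state, while `stream_table_size(45, 2)` gives 135. The stream form grows linearly rather than exponentially as contexts are added, which is the saving the abstract refers to.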


Bibliographic reference.  Sim, Khe Chai / Li, Haizhou (2009): "Stream-based context-sensitive phone mapping for cross-lingual speech recognition", In INTERSPEECH-2009, 3019-3022.