ITRW on Non-Linear Speech Processing
(NOLISP 07)

Paris, France
May 22-25, 2007

Towards Phonetically-Driven Hidden Markov Models: Can we Incorporate Phonetic Landmarks in HMM-Based ASR?

Guillaume Gravier, Daniel Moraru

Equipe Metiss, Irisa, Rennes, France

Automatic speech recognition mainly relies on hidden Markov models (HMMs), which make little use of phonetic knowledge. As an alternative, landmark-based recognizers rely mainly on precise phonetic knowledge and exploit distinctive features. We propose a theoretical framework to combine both approaches by introducing phonetic knowledge in a non-stationary HMM decoder. To demonstrate the potential of the method, we investigate how broad phonetic landmarks could be used to improve an HMM decoder by focusing the best path search. We show that, assuming error-free landmark detection, every broad phonetic class brings a small improvement. The use of all the classes reduces the error rate from 22% to 14% on a broadcast news transcription task. We also experimentally validate that landmark boundaries do not need to be detected precisely and that the algorithm is robust to non-detection errors.
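The idea of focusing the best path search with landmarks can be illustrated with a minimal sketch. It is not the authors' decoder: it assumes a toy HMM in which each state carries a broad phonetic class label, and it treats a landmark as a hard constraint that prunes every state of a different class at that frame. The function name, the class labels and the toy probabilities are all hypothetical.

```python
import math

NEG_INF = float("-inf")

def constrained_viterbi(log_init, log_trans, log_emit, state_class, landmarks):
    """Viterbi search in which landmarks prune states of the wrong class.

    log_init[s]     : log probability of starting in state s
    log_trans[r][s] : log transition probability from r to s
    log_emit[t][s]  : log likelihood of frame t under state s
    state_class[s]  : broad phonetic class of state s (assumed labels)
    landmarks[t]    : class required at frame t, or None if no landmark
    """
    n_frames, n_states = len(log_emit), len(log_init)

    def allowed(t, s):
        return landmarks[t] is None or state_class[s] == landmarks[t]

    # Initialisation: disallowed states start at log(0).
    delta = [log_init[s] + log_emit[0][s] if allowed(0, s) else NEG_INF
             for s in range(n_states)]
    backptr = []
    for t in range(1, n_frames):
        new_delta, ptr = [], []
        for s in range(n_states):
            best_r = max(range(n_states),
                         key=lambda r: delta[r] + log_trans[r][s])
            if allowed(t, s):
                score = delta[best_r] + log_trans[best_r][s] + log_emit[t][s]
            else:
                score = NEG_INF  # pruned by the landmark at frame t
            new_delta.append(score)
            ptr.append(best_r)
        delta = new_delta
        backptr.append(ptr)

    # Backtrack from the best final state.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]

# Toy example: the acoustics always favour state 0 (a "vowel" state).
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.5), math.log(0.5)], [math.log(0.5), math.log(0.5)]]
log_emit = [[math.log(0.9), math.log(0.1)] for _ in range(3)]
state_class = ["vowel", "plosive"]

free_path, _ = constrained_viterbi(
    log_init, log_trans, log_emit, state_class, [None, None, None])
forced_path, _ = constrained_viterbi(
    log_init, log_trans, log_emit, state_class, [None, "plosive", None])
```

In this toy setting the unconstrained search stays in state 0 throughout, while a single "plosive" landmark at the middle frame forces the path through state 1 there. A softer variant could add a penalty instead of pruning, which is one way to tolerate imprecise landmark boundaries.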


Bibliographic reference. Gravier, Guillaume / Moraru, Daniel (2007): "Towards phonetically-driven hidden Markov models: can we incorporate phonetic landmarks in HMM-based ASR?", in NOLISP-2007, 55-58.