A Hidden-Articulator Markov Model (HAMM) is a Hidden Markov Model (HMM) in which each state represents an articulatory configuration. Articulatory knowledge, known to be useful for speech recognition [1], is represented by specifying a mapping of phonemes to articulatory configurations; vocal tract dynamics are represented via transitions between articulatory configurations.
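The state-space idea above can be sketched concretely. The snippet below is a minimal illustration, not the paper's implementation: it assumes a hypothetical two-feature articulatory space (lip rounding and tongue position, three values each), a toy phoneme-to-configuration mapping, and a transition matrix that encodes vocal-tract dynamics by allowing each articulator to move at most one step per frame.

```python
import itertools
import numpy as np

# Hypothetical articulatory feature space: 2 articulators, 3 positions each.
LIP = [0, 1, 2]      # e.g., spread, neutral, rounded
TONGUE = [0, 1, 2]   # e.g., front, mid, back

# Each HAMM state is one articulatory configuration (9 states here).
STATES = list(itertools.product(LIP, TONGUE))

# Toy phoneme -> configuration mapping (illustrative values only).
PHONEME_CONFIG = {"iy": (0, 0), "uw": (2, 2), "aa": (1, 1)}

def transition_matrix(max_step=1):
    """Build a row-stochastic transition matrix in which each
    articulator may move at most `max_step` positions per frame,
    so transitions model smooth vocal-tract dynamics."""
    n = len(STATES)
    A = np.zeros((n, n))
    for i, src in enumerate(STATES):
        for j, dst in enumerate(STATES):
            # Allow the transition only if every articulator's
            # movement is within the per-frame limit.
            if all(abs(a - b) <= max_step for a, b in zip(src, dst)):
                A[i, j] = 1.0
        A[i] /= A[i].sum()  # normalize to a probability distribution
    return A

A = transition_matrix()
```

Under this constraint, a jump from configuration (0, 0) directly to (2, 2) has zero probability, whereas the intermediate configuration (1, 1) is reachable in one frame; a real HAMM would additionally attach acoustic emission distributions to each configuration and train the transition weights from data.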
In previous work [2], we extended the articulatory-feature model introduced by Erler [3] by using diphone units and a new technique for model initialization. By comparing it with a purely random model, we showed that the HAMM can take advantage of articulatory knowledge.
In this paper, we extend that work in three ways. First, we decrease the number of parameters, making the model comparable in size to standard HMMs. Second, we evaluate our model in noisy conditions, verifying that articulatory knowledge can provide benefits in adverse acoustic environments. Third, we use a corpus of side-by-side speech and articulator trajectories to show that the HAMM can reasonably predict the movement of the articulators.
[1] L. Deng and D. Sun (1994), "Phonetic classification and recognition using HMM representation of overlapping articulatory features for all classes of English sounds," Proc. ICASSP, pp. 45-48.
[2] M. Richardson, J. Bilmes and C. Diorio (2000), "Hidden-Articulator Markov Models for Speech Recognition," Proc. ASR2000.
[3] K. Erler and G. H. Freeman (1996), "An HMM-based speech recognizer using overlapping articulatory features," J. Acoust. Soc. Am. 100, pp. 2500-2513.
Cite as: Richardson, M., Bilmes, J., Diorio, C. (2000) Hidden-articulator Markov models: performance improvements and robustness to noise. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 131-134, doi: 10.21437/ICSLP.2000-495
@inproceedings{richardson00_icslp,
  author={Matt Richardson and Jeff Bilmes and Chris Diorio},
  title={{Hidden-articulator Markov models: performance improvements and robustness to noise}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 3, 131-134},
  doi={10.21437/ICSLP.2000-495}
}