ISCA Archive Interspeech 2008

Gammatone-domain model combination for consonant recognition in noisy environments

Jae Sam Yoon, Ji Hun Park, Hong Kook Kim

In this paper, a gammatone-domain model combination method is proposed for consonant recognition in noisy environments. For this task, we first define a gammatone cepstral coefficient (GCC) as the cepstral representation of the averaged envelopes of a gammatone-filtered signal. We then determine a suitable phonetic unit by comparing monophone, diphone, and triphone acoustic models; consonant recognition experiments show that the diphone hidden Markov models (HMMs) provide the best performance. Next, a gammatone-domain model combination method is developed to combine the clean and noise models in the linear gammatone-envelope domain. We evaluate the GCC-based feature and the proposed model combination on English intervocalic consonants in vowel-consonant-vowel (VCV) contexts, covering 24 different consonants. The experiments show that the GCC-based feature achieves a relative improvement of 47.46% in recognition rate over mel-frequency cepstral coefficients (MFCCs). In addition, applying the model combination to the GCC-based diphone HMM system yields a relative accuracy improvement of 77.67% under noisy conditions.
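
The idea of combining clean and noise models in a linear envelope domain can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the cepstrum is an orthonormal DCT of the log envelope, that speech and noise are additive in the linear domain, and that only mean vectors are combined (a log-add style approximation that ignores variances). The names `dct_matrix`, `combine_means`, and `gain` are hypothetical and do not come from the paper.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix (assumed cepstral transform); its transpose is its inverse."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * k * (2 * m + 1) / (2 * n))
                     for m in range(n)])
    return rows


def combine_means(clean_cep, noise_cep, gain=1.0):
    """Combine clean and noise cepstral means in the linear envelope domain.

    Assumption: both vectors hold as many cepstral coefficients as there
    are filterbank channels, so the DCT is square and invertible.
    """
    n = len(clean_cep)
    C = dct_matrix(n)

    def to_linear(cep):
        # Inverse DCT back to log envelopes, then exponentiate.
        return [math.exp(sum(C[k][m] * cep[k] for k in range(n)))
                for m in range(n)]

    clean_lin = to_linear(clean_cep)
    noise_lin = to_linear(noise_cep)

    # Additive combination per channel in the linear domain, then back to log.
    noisy_log = [math.log(s + gain * v) for s, v in zip(clean_lin, noise_lin)]

    # Forward DCT returns the combined mean to the cepstral domain.
    return [sum(C[k][m] * noisy_log[m] for m in range(n)) for k in range(n)]


if __name__ == "__main__":
    clean = [1.0, 0.5, -0.2, 0.1]   # made-up clean-model mean
    noise = [0.3, 0.0, 0.1, -0.1]   # made-up noise-model mean
    print(combine_means(clean, noise, gain=0.5))
```

In this sketch `gain` scales the noise level relative to the speech. Setting it to zero returns the clean mean unchanged, which is a quick sanity check on the transform pair.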


doi: 10.21437/Interspeech.2008-488

Cite as: Yoon, J.S., Park, J.H., Kim, H.K. (2008) Gammatone-domain model combination for consonant recognition in noisy environments. Proc. Interspeech 2008, 1773-1776, doi: 10.21437/Interspeech.2008-488

@inproceedings{yoon08_interspeech,
  author={Jae Sam Yoon and Ji Hun Park and Hong Kook Kim},
  title={{Gammatone-domain model combination for consonant recognition in noisy environments}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={1773--1776},
  doi={10.21437/Interspeech.2008-488}
}