ISCA Archive Interspeech 2005

Improving robustness of speech recognition performance to aggregate of noises by two-dimensional visualization

Makoto Shozakai, Goshu Nagino

This paper proposes a new methodology for improving the robustness of speech recognition performance to an aggregate of noises by means of a two-dimensional visualization technique. First, as many of the noises occurring in adverse environments as possible are collected. Then, a hidden Markov model (HMM) is trained for each collected noise. The set of trained HMMs is visualized as a two-dimensional map by a statistical multidimensional scaling technique called the COSMOS method. The noises corresponding to the HMMs located on the periphery of the map are overlaid on the clean speech used to train the HMMs of the acoustic models. The new methodology is shown to significantly reduce the recognition error rate, by around 60%, for non-stationary noises overlaid on the voice interval of a word.
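
The map-and-select step described above can be illustrated with a minimal sketch. It does not reproduce the COSMOS method, whose details are not given in this abstract: classical (Torgerson) multidimensional scaling is used as a stand-in, the distance matrix between noise models is assumed to be already available, and "periphery" is approximated as the k points farthest from the centroid of the map. The function names `classical_mds` and `periphery_indices` and the toy data are hypothetical.

```python
import numpy as np


def classical_mds(dist, dim=2):
    """Embed items in `dim` dimensions from a symmetric distance matrix.

    Classical (Torgerson) MDS, used here as a stand-in for COSMOS.
    """
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the `dim` largest eigenvalues and their eigenvectors.
    order = np.argsort(eigvals)[::-1][:dim]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def periphery_indices(coords, k):
    """Return indices of the k points farthest from the map centroid.

    Assumption: this is one simple way to define "periphery".
    """
    radii = np.linalg.norm(coords - coords.mean(axis=0), axis=1)
    return np.argsort(radii)[::-1][:k]


# Toy data: each row stands in for one trained noise model (hypothetical).
rng = np.random.default_rng(0)
models = rng.normal(size=(20, 5))
# Assumed model-to-model distance: plain Euclidean distance between rows.
dist = np.linalg.norm(models[:, None, :] - models[None, :, :], axis=-1)

coords = classical_mds(dist)                 # 20 points on a 2-D map
selected = periphery_indices(coords, k=5)    # candidate noises for overlay
```

In this sketch, the noises indexed by `selected` would be the ones overlaid on clean training speech. A real system would need a distance defined between HMMs rather than between raw vectors.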


doi: 10.21437/Interspeech.2005-219

Cite as: Shozakai, M., Nagino, G. (2005) Improving robustness of speech recognition performance to aggregate of noises by two-dimensional visualization. Proc. Interspeech 2005, 921-924, doi: 10.21437/Interspeech.2005-219

@inproceedings{shozakai05_interspeech,
  author={Makoto Shozakai and Goshu Nagino},
  title={{Improving robustness of speech recognition performance to aggregate of noises by two-dimensional visualization}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={921--924},
  doi={10.21437/Interspeech.2005-219}
}