EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

        

Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics in Various Noisy Environments

Shingo Yamade, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano

Nara Institute of Science and Technology, Japan

Noise and speaker adaptation techniques are essential for robust speech recognition in noisy environments. In this paper, we first implement a noise-robust speech recognition algorithm by superimposing a small amount of noise data on input speech after spectral subtraction. Recognition experiments show that superimposing noise at 30 dB SNR on the spectrally subtracted input speech significantly increases robustness against different noises. Next, we apply this noise-robust recognition to the unsupervised speaker adaptation algorithm based on HMM sufficient statistics in different noise environments. The HMM sufficient statistics for each speaker are calculated beforehand from a speech database with office noise added at 25 dB SNR. We successfully evaluate the proposed unsupervised speaker adaptation algorithm in noisy environments on a 20k dictation task using 11 kinds of noise, including office, car, exhibition, and crowd noises.
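The noise superimposition step can be illustrated with a minimal sketch of mixing noise into a signal at a target SNR. This is not the authors' implementation. The function names (`mean_power`, `superimpose_noise`, `measure_snr_db`) are hypothetical, the signals are synthetic stand-ins (a sine tone and Gaussian noise), and the spectral subtraction that precedes this step in the paper is not shown.

```python
import math
import random


def mean_power(samples):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def superimpose_noise(speech, noise, target_snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals
    `target_snr_db`, then add it to `speech`.

    Returns the mixed signal and the scaled noise that was added.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    # Choose gain g so that p_speech / (g^2 * p_noise) == 10^(SNR/10).
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (target_snr_db / 10.0)))
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


def measure_snr_db(speech, noise):
    """SNR in dB between a clean signal and the noise added to it."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


# Synthetic stand-ins (assumptions): 1 s of a 440 Hz tone at 16 kHz
# in place of speech, and Gaussian noise in place of recorded noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# 30 dB is the superimposition level reported in the abstract.
mixed, scaled_noise = superimpose_noise(speech, noise, 30.0)
achieved = measure_snr_db(speech, scaled_noise)
```

Under the same assumptions, calling `superimpose_noise` with `25.0` would correspond to the level at which office noise is added to the database used for the HMM sufficient statistics.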


Bibliographic reference. Yamade, Shingo / Lee, Akinobu / Saruwatari, Hiroshi / Shikano, Kiyohiro (2003): "Unsupervised speaker adaptation based on HMM sufficient statistics in various noisy environments", in EUROSPEECH-2003, 1493-1496.