ISCA Archive Interspeech 2005

Robust distant speaker recognition based on position dependent cepstral mean normalization

Longbiao Wang, Norihide Kitaoka, Seiichi Nakagawa

In a distant environment, channel distortion may drastically degrade speaker recognition performance. In this paper, we propose a robust speaker recognition method based on position dependent Cepstral Mean Normalization (CMN), which compensates for channel distortion according to the speaker position. Position dependent CMN was shown in [1] to be robust for speech recognition in a distant environment. We extend this method to speaker recognition and show that it is highly effective for this task. In the training stage, the system measures the transmission characteristics from a set of grid points in the room to the microphone and estimates the compensation parameters a priori. In the recognition stage, the system estimates the speaker position, selects the compensation parameters corresponding to the estimated position, applies CMN to the speech, and then performs speaker recognition. In our previous study [2], we proposed a text-independent speaker recognition method that combines speaker-specific Gaussian Mixture Models (GMMs) with syllable-based HMMs adapted to the speakers by MAP, and evaluated its robustness to changes in speaking style in a close-talking environment. We integrated this method with the proposed position dependent CMN for distant speaker recognition. Our experiments showed that the proposed method remarkably improved speaker recognition performance in a distant environment.
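
The two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the compensation parameter for each grid point is a single cepstral mean vector estimated from speech recorded at that point, and that the recognition stage uses the grid point nearest to the estimated speaker position. The abstract does not specify either detail, nor how the position itself is estimated, so the position estimate is taken as a given input. All function names are hypothetical.

```python
import numpy as np


def estimate_position_means(training_cepstra):
    """Training stage (assumed form): one cepstral mean vector per grid point.

    training_cepstra maps a grid point (x, y) to an array of shape
    (num_frames, num_coeffs) holding cepstra of speech recorded there.
    """
    return {pos: frames.mean(axis=0) for pos, frames in training_cepstra.items()}


def nearest_position(estimated_xy, position_means):
    """Pick the grid point closest to the estimated speaker position.

    Nearest-neighbour selection is an assumption made for this sketch.
    """
    ex, ey = estimated_xy
    return min(position_means, key=lambda p: (p[0] - ex) ** 2 + (p[1] - ey) ** 2)


def position_dependent_cmn(cepstra, estimated_xy, position_means):
    """Recognition stage: subtract the a priori mean of the selected grid point."""
    pos = nearest_position(estimated_xy, position_means)
    return cepstra - position_means[pos]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in data: two grid points with different channel offsets.
    training = {
        (0.0, 0.0): rng.normal(loc=1.0, size=(200, 12)),
        (1.0, 0.0): rng.normal(loc=2.0, size=(200, 12)),
    }
    means = estimate_position_means(training)
    utterance = rng.normal(loc=2.0, size=(50, 12))
    normalized = position_dependent_cmn(utterance, (0.9, 0.1), means)
    print(normalized.shape)
```

Because the mean is estimated beforehand, it can be subtracted from the first frame onward. Conventional CMN instead computes the mean from the utterance being recognized. The normalized cepstra would then be passed to the speaker models.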


doi: 10.21437/Interspeech.2005-622

Cite as: Wang, L., Kitaoka, N., Nakagawa, S. (2005) Robust distant speaker recognition based on position dependent cepstral mean normalization. Proc. Interspeech 2005, 1977-1980, doi: 10.21437/Interspeech.2005-622

@inproceedings{wang05k_interspeech,
  author={Longbiao Wang and Norihide Kitaoka and Seiichi Nakagawa},
  title={{Robust distant speaker recognition based on position dependent cepstral mean normalization}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1977--1980},
  doi={10.21437/Interspeech.2005-622}
}