7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Speaker Independent Speech Recognition Using Features Based on Glottal Sound Source

Norihide Kitaoka, Daisuke Yamada, Seiichi Nakagawa

Toyohashi University of Technology, Japan

We discuss the use of features based on the glottal sound source for speaker-independent speech recognition. Features such as pitch have been thought unable to contribute to speaker-independent speech recognition because they are dominated by speaker-dependent factors.

In this paper, we used pitch (F0), power, LPC residual power, voicing rate, and their regression coefficients as feature parameters for speaker-independent speech recognition. We found that the regression parameters of F0, power, and LPC residual power improved performance, especially when the covariances between each of these parameters and conventional MFCC were used. This showed that deriving the regression parameters reduced the speaker-dependent factor, which appeared as a bias in those features, and that the correlation between glottal source information and spectral envelope information (MFCC) was useful for recognition.
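
To illustrate why regression coefficients can suppress a speaker-dependent bias, here is a minimal sketch in Python. It assumes the common linear-regression (delta) formula over a window of +/- 2 frames with edge frames repeated, and it assumes the bias is an additive offset on a log-F0 track. The abstract does not give the formula, the window width, or the F0 scale, so all of these are assumptions. The function name and the sample values are hypothetical.

```python
def regression_coefficients(track, half_window=2):
    """Linear-regression (delta) coefficients of a 1-D feature track.

    Assumed formula (not taken from the paper):
        d[t] = sum_k k * (x[t+k] - x[t-k]) / (2 * sum_k k^2),  k = 1..half_window
    Frames beyond the ends are replaced by the first or last frame.
    """
    n = len(track)
    denom = 2.0 * sum(k * k for k in range(1, half_window + 1))
    deltas = []
    for t in range(n):
        acc = 0.0
        for k in range(1, half_window + 1):
            ahead = track[min(t + k, n - 1)]   # clamp at the last frame
            behind = track[max(t - k, 0)]      # clamp at the first frame
            acc += k * (ahead - behind)
        deltas.append(acc / denom)
    return deltas


# Hypothetical log-F0 contour for one speaker, and the same contour
# shifted by a constant offset to mimic a speaker with a higher pitch.
log_f0_a = [4.60, 4.65, 4.72, 4.80, 4.78, 4.70]
log_f0_b = [x + 0.7 for x in log_f0_a]

deltas_a = regression_coefficients(log_f0_a)
deltas_b = regression_coefficients(log_f0_b)

# The static values differ by the offset, but the regression
# coefficients agree up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(deltas_a, deltas_b))
```

Because each coefficient is built from differences between frames, a constant offset cancels, so `max_gap` is zero up to rounding. Speaker differences that are not a constant offset would not cancel in this way.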

We also tested the parameters on a large-vocabulary continuous speech recognition task and obtained a performance improvement there as well.

Bibliographic reference. Kitaoka, Norihide / Yamada, Daisuke / Nakagawa, Seiichi (2002): "Speaker independent speech recognition using features based on glottal sound source", in ICSLP-2002, 2125-2128.