7th International Conference on Spoken Language Processing
September 16-20, 2002
We discuss the use of features based on the glottal sound source for speaker-independent speech recognition. Features such as pitch have been thought unable to contribute to speaker-independent speech recognition because they are dominated by speaker-dependent factors.
In this paper, we use pitch (F0), power, LPC residual power, voicing rate, and their regression coefficients as feature parameters for speaker-independent speech recognition. We found that the regression parameters of F0, power, and LPC residual power improve performance, especially when the covariances between each parameter and the conventional MFCC are used. This shows that deriving the regression parameters reduces the speaker-dependent factor, which appears as a bias in those features, and that the correlation between glottal source information and spectral envelope information (MFCC) is useful for recognition.
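The bias argument above can be illustrated with a minimal sketch. It assumes the common linear-regression (delta) formula over a window of ±2 frames and a log-F0 contour. The window length, the F0 scale, and all names (`regression_coefficients`, `speaker_a`, `speaker_b`) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def regression_coefficients(track, window=2):
    """Per-frame regression (delta) coefficients of a 1-D feature track.

    Assumed formula (the standard delta computation, not confirmed by the paper):
        d[t] = sum_k k * (x[t+k] - x[t-k]) / (2 * sum_k k^2),  k = 1..window
    Edge frames are handled by repeating the first and last values.
    """
    track = np.asarray(track, dtype=float)
    n = len(track)
    padded = np.pad(track, window, mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    delta = np.zeros(n)
    for k in range(1, window + 1):
        ahead = padded[window + k : window + k + n]    # x[t+k]
        behind = padded[window - k : window - k + n]   # x[t-k]
        delta += k * (ahead - behind)
    return delta / denom

# Hypothetical log-F0 contour for one speaker (made-up values).
speaker_a = np.log(np.array([120.0, 125.0, 131.0, 128.0, 122.0, 118.0, 121.0]))
# A second speaker with the same contour shape but a constant offset (bias).
speaker_b = speaker_a + np.log(1.8)

delta_a = regression_coefficients(speaker_a)
delta_b = regression_coefficients(speaker_b)

# The static features differ by the bias; the regression coefficients do not.
print(np.allclose(speaker_a, speaker_b))
print(np.allclose(delta_a, delta_b))
```

Because each term of the formula is a difference between two frames of the same track, any constant per-speaker offset cancels, which is one way to read the claim that regression parameters reduce the speaker-dependent bias.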
We also tested the parameters on a large-vocabulary continuous speech recognition task and obtained a performance improvement.
Bibliographic reference. Kitaoka, Norihide / Yamada, Daisuke / Nakagawa, Seiichi (2002): "Speaker independent speech recognition using features based on glottal sound source", In ICSLP-2002, 2125-2128.