Sixth European Conference on Speech Communication and Technology
(EUROSPEECH'99)

Budapest, Hungary
September 5-9, 1999

Speaker and Channel-Normalized Set of Formant Parameters for Telephone Speech Recognition

Boris Lobanov, T. Levkovskaya, Igor E. Kheidorov

Institute of Engineering Cybernetics, National Academy of Sciences of Belarus, Minsk, Belarus

The speech parameters most commonly used today are cepstral coefficients derived from the FFT or LPC spectrum. An alternative approach that can potentially provide maximum speaker and channel independence is the estimation of articulatory-based features such as formant frequencies, formant amplitudes, and degree of voicing. This report describes a new method and algorithm for robust estimation of F1(t), F2(t), F3(t), A1(t), A2(t), A3(t), and V(t) from a telephone speech signal, together with procedures for normalizing these parameters against speaker and channel variability. The experimental results confirm that the proposed set of formant parameters is robust to speaker and channel variability. In these experiments it yields a significant improvement in recognition performance compared with cepstral parameters.
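
The abstract does not spell out the normalization procedures, so the following is only a generic illustration of what normalizing formant parameters against speaker and channel variability could look like, not the authors' algorithm. It assumes that speaker differences are reduced by standardizing each formant-frequency track to zero mean and unit variance, and that a fixed channel gain appears as a constant offset in a log-amplitude track and can be removed by mean subtraction. The function names and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_track(track_hz):
    """Standardize one formant-frequency track (Hz) to zero mean, unit variance.

    Assumption: statistics are computed over all frames of one speaker.
    """
    mu = mean(track_hz)
    sigma = pstdev(track_hz)
    if sigma == 0:
        # A constant track carries no variation; map it to zeros.
        return [0.0 for _ in track_hz]
    return [(x - mu) / sigma for x in track_hz]


def remove_channel_offset(track_db):
    """Subtract the mean from a log-amplitude track (dB).

    Assumption: the channel contributes a constant additive offset in dB.
    """
    mu = mean(track_db)
    return [a - mu for a in track_db]


# Hypothetical frame values for F1(t) and A1(t); not data from the paper.
f1_hz = [300.0, 500.0, 700.0]
a1_db = [60.0, 66.0, 63.0]

f1_norm = zscore_track(f1_hz)
a1_norm = remove_channel_offset(a1_db)
print(f1_norm)
print(a1_norm)
```

In this sketch the same two functions would be applied to F2(t), F3(t), A2(t) and A3(t) in the same way; how the paper treats V(t) is not stated in the abstract and is left out here.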



Bibliographic reference.  Lobanov, Boris / Levkovskaya, T. / Kheidorov, Igor E. (1999): "Speaker and channel-normalized set of formant parameters for telephone speech recognition", In EUROSPEECH'99, 331-334.