5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Estimation of Voice Source and Vocal Tract Parameters Using Combined Subspace-Based and Amplitude Spectrum-Based Algorithm

Chang-Sheng Yang, Hideki Kasuya

Utsunomiya University, Japan

This paper proposes a high-quality pole-zero speech analysis technique. The speech production process is represented by a source-filter model. For voiced speech, the voicing source waveform is approximated by a Rosenberg-Klatt model; for unvoiced speech, a white-noise source is assumed. The vocal tract transfer function is represented by a pole-zero filter. For voiced speech, the parameters of the source model are estimated jointly with those of the vocal tract filter. A combined algorithm is developed to estimate the vocal tract parameters, i.e., the formants and anti-formants, which are calculated from the poles and zeros of the filter. In this algorithm, the poles are estimated with a subspace-based method, while the zeros are estimated from the amplitude spectrum. For unvoiced speech, an AR model is assumed, whose parameters can be obtained by LPC analysis. An experiment using synthesized nasal sounds shows that the poles and zeros are estimated quite accurately.
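
As a rough illustration of the step in which formants and anti-formants are calculated from the poles and zeros of the filter, the following Python sketch converts z-plane roots to center frequencies and bandwidths using the standard relations F = (fs / 2π) arg z and B = -(fs / π) ln|z|. It is not the authors' implementation: the function names, the 10 kHz sampling rate, and the example pole and zero values are assumptions chosen for illustration, and the subspace-based and amplitude-spectrum-based estimation steps themselves are not shown.

```python
import numpy as np


def resonance_to_root(freq_hz, bw_hz, fs):
    """Build the upper-half-plane z-plane root for a given frequency and bandwidth."""
    radius = np.exp(-np.pi * bw_hz / fs)
    angle = 2.0 * np.pi * freq_hz / fs
    return radius * np.exp(1j * angle)


def roots_to_resonances(roots, fs):
    """Convert z-plane roots to (frequencies_hz, bandwidths_hz), sorted by frequency.

    Only roots with positive imaginary part are kept, so each complex-conjugate
    pair yields one formant (for poles) or one anti-formant (for zeros).
    """
    roots = np.asarray(roots, dtype=complex)
    upper = roots[roots.imag > 0]
    freqs = np.angle(upper) * fs / (2.0 * np.pi)
    bws = -np.log(np.abs(upper)) * fs / np.pi
    order = np.argsort(freqs)
    return freqs[order], bws[order]


# Hypothetical example values (not taken from the paper).
fs = 10000.0                               # assumed sampling rate in Hz
pole = resonance_to_root(500.0, 80.0, fs)  # one formant
zero = resonance_to_root(1200.0, 150.0, fs)  # one anti-formant

# Real-coefficient denominator (poles) and numerator (zeros) polynomials.
a = np.real(np.poly([pole, np.conj(pole)]))
b = np.real(np.poly([zero, np.conj(zero)]))

formants, formant_bw = roots_to_resonances(np.roots(a), fs)
antiformants, antiformant_bw = roots_to_resonances(np.roots(b), fs)
```

With these assumed values, the round trip recovers the 500 Hz formant and the 1200 Hz anti-formant together with their bandwidths, which is a convenient sanity check before applying the conversion to estimated filter coefficients.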

