Third European Conference on Speech Communication and Technology

Berlin, Germany
September 22-25, 1993


Speech Transients Analysis Using AR-Smoothed Wigner-Ville Distribution

Krzysztof Marasek

Universität Stuttgart, Institut für Maschinelle Sprachverarbeitung, Lehrstuhl für Experimentelle Phonetik, Stuttgart, Germany; Institute of Fundamental Technological Research, Warszawa, Poland

Joint Time-Frequency Representations (JTFRs) are the most promising techniques for the spectral analysis of nonstationary signals. In speech research, the potential advantages of these methods, and in particular of the most frequently used one, the Wigner-Ville Distribution (WVD), have been limited by significant drawbacks: inter-component interference, negative values, artifacts and spurious peaks. Recent investigations help to overcome these drawbacks, especially by smoothing, i.e. by proper selection of the window function in the time, frequency and autocorrelation-lag domains (Choi-Williams Distribution, Zhao-Atlas-Marks). Applying autoregressive (AR) modeling to the Pseudo-WVD (PWVD) in place of the FFT adds further benefits: a smoothed spectral envelope, a more "peaked" spectrum (more visible resonant frequencies) and better frequency resolution. The time-frequency distribution is derived from the AR coefficients of smoothed local autocorrelation sequences. Experiments on speech transient analysis, especially formant frequency tracking, were performed for various types of articulation, on both real and synthetic speech. The robustness of the method to added noise was also tested. The AR-smoothed PWVD accurately follows formant dynamics and variations, and it also works well at low SNR. The reduced number of parameters, which is sufficient to describe the spectral envelope, suggests using the method in speech recognition systems.
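The idea of deriving one time slice of the distribution from AR coefficients of a smoothed local autocorrelation sequence can be sketched in Python with NumPy. This is only an illustration, not the paper's implementation, and it rests on assumptions the abstract does not state: the function names (`analytic_signal`, `smoothed_local_autocorr`, `ar_spectrum`, `ar_pwvd_slice`), the Hann time-smoothing window, the Yule-Walker solution, the AR order and the small diagonal loading term are all hypothetical choices. Only lags 0..`order` are used, which plays the role of the Pseudo-WVD lag window here. Because the smoothed kernel is not guaranteed to be a valid (positive definite) autocorrelation, the resulting spectrum may occasionally take negative values.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal: zero the negative-frequency bins."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def smoothed_local_autocorr(z, t, max_lag, half_width):
    """Time-smoothed WVD kernel r_t(k) = sum_m g(m) z(t+m+k) conj(z(t+m-k))."""
    if t - half_width - max_lag < 0 or t + half_width + max_lag >= len(z):
        raise ValueError("analysis instant too close to the signal edges")
    m = np.arange(-half_width, half_width + 1)
    g = np.hanning(2 * half_width + 3)[1:-1]  # assumed Hann smoothing window
    g = g / g.sum()
    r = np.empty(max_lag + 1, dtype=complex)
    for k in range(max_lag + 1):
        r[k] = np.sum(g * z[t + m + k] * np.conj(z[t + m - k]))
    return r

def ar_spectrum(r, order, n_freq=512, loading=1e-3):
    """AR power spectrum from lags r[0..order] via the Yule-Walker equations."""
    r = np.asarray(r, dtype=complex).copy()
    r[0] = r[0].real * (1.0 + loading)  # assumed regularisation (noise floor)
    idx = np.arange(order)
    diff = idx[:, None] - idx[None, :]
    # Hermitian Toeplitz matrix: R[m, n] = r(m - n), with r(-k) = conj(r(k)).
    R = np.where(diff >= 0, r[np.abs(diff)], np.conj(r[np.abs(diff)]))
    a = np.linalg.solve(R, -r[1:order + 1])
    sigma2 = (r[0] + np.sum(a * np.conj(r[1:order + 1]))).real
    theta = 2 * np.pi * np.arange(n_freq) / n_freq
    k = np.arange(1, order + 1)
    A = 1 + np.exp(-1j * np.outer(theta, k)) @ a
    return sigma2 / np.abs(A) ** 2

def ar_pwvd_slice(z, t, fs, order=4, half_width=16, n_freq=512):
    """One time slice of an AR-smoothed pseudo-WVD of the analytic signal z."""
    r = smoothed_local_autocorr(z, t, max_lag=order, half_width=half_width)
    psd = ar_spectrum(r, order, n_freq)
    # The kernel advances two samples per lag, so its full frequency
    # circle [0, 2*pi) maps onto signal frequencies [0, fs/2).
    freqs = np.arange(n_freq) * (fs / 2) / n_freq
    return freqs, psd
```

Calling `ar_pwvd_slice(analytic_signal(x), t, fs)` for a sequence of instants `t` and stacking the returned spectra gives a time-frequency image; taking the largest peaks of each slice is one simple way to obtain formant candidates. The AR order controls how many resonances can be represented, so it would need to be raised for real speech.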


Bibliographic reference. Marasek, Krzysztof (1993): "Speech transients analysis using AR-smoothed Wigner-Ville distribution", in EUROSPEECH'93, 393-396.