Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Synthesis of Pathological Voice Based on a Stochastic Voice Source Model

Yasuo Endo, Hideki Kasuya

Faculty of Engineering, Utsunomiya University, Utsunomiya, Japan

This paper proposes a stochastic voice source model for synthesizing pathological as well as normal voice. The voice signal is assumed to consist of a harmonic signal and an additive laryngeal noise signal. Perturbation of the fundamental period, which is the key characteristic of pathological voice, is represented by an autoregressive moving average (ARMA) model. The suitability of the model is tested with a spectral flatness measure by applying the model to the normalized period sequences of 52 pathological voice samples. The relationship between the model parameters and perceived roughness is also investigated. Based on this model, we construct an analysis-conversion-synthesis system for pathological voice. The system is applicable not only to perceptual experiments that explore the acoustic correlates of pathological voice qualities, but also to simulating the changes in voice quality associated with treatment at voice clinics.
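The two ingredients named in the abstract, an ARMA model of period perturbation and a spectral flatness measure, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the model orders, the coefficient values, and the way the normalized sequence is mapped back to periods are all assumptions chosen for illustration. Spectral flatness is taken here in its usual sense, the ratio of the geometric mean to the arithmetic mean of the power spectrum, so a white sequence scores higher than a correlated one.

```python
import numpy as np

def arma_perturbation(n, ar, ma, sigma, rng):
    """Generate x[t] = sum_i ar[i]*x[t-1-i] + e[t] + sum_j ma[j]*e[t-1-j].

    ar, ma and sigma are illustrative values, not parameters from the paper.
    """
    e = rng.normal(0.0, sigma, n)  # white Gaussian driving noise
    x = np.zeros(n)
    for t in range(n):
        acc = e[t]
        for i, a in enumerate(ar):
            if t - 1 - i >= 0:
                acc += a * x[t - 1 - i]
        for j, b in enumerate(ma):
            if t - 1 - j >= 0:
                acc += b * e[t - 1 - j]
        x[t] = acc
    return x

def spectral_flatness(x):
    """Geometric mean over arithmetic mean of the power spectrum (DC bin dropped)."""
    power = np.abs(np.fft.rfft(x - np.mean(x))) ** 2
    power = np.maximum(power[1:], 1e-20)  # guard against log(0)
    return np.exp(np.mean(np.log(power))) / np.mean(power)

def periods_from_perturbation(mean_period, perturbation):
    """Assumed mapping: perturbation is the relative deviation from the mean period."""
    return mean_period * (1.0 + perturbation)

rng = np.random.default_rng(0)
white = rng.normal(0.0, 0.01, 2048)                          # uncorrelated reference
colored = arma_perturbation(2048, [0.9], [0.5], 0.01, rng)   # correlated perturbation
periods = periods_from_perturbation(0.005, colored)          # 5 ms mean period (200 Hz)

sf_white = spectral_flatness(white)
sf_colored = spectral_flatness(colored)
print(sf_white > sf_colored)
```

Under this reading, a model that fits a period sequence well would leave a residual whose flatness is close to that of the white reference, while a poor fit would leave visible spectral structure and a lower score.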

