EUROSPEECH '93
Third European Conference on Speech Communication and Technology

Berlin, Germany
September 22-25, 1993

Modelling the Glottal Pulse With a Self-Excited Threshold Auto-Regressive Model

Jean Schoentgen

Institute of Phonetics, CP110, Université Libre de Bruxelles, Brussels, & National Fund for Scientific Research, Belgium

A novel glottal pulse model is presented. Conventional glottal pulse models are concatenations of curves. For instance, the Liljencrants-Fant (LF) model(*) assumes that a typical pulse is made up of a raised cosine multiplied by a growing exponential (curve I) followed by a decaying exponential (curve II). The switch from curve I to curve II occurs at time instant T1, when the slope of the closing phase is at its most negative. The switch back occurs at pulse onset. Curve I is a solution of a second-order linear differential equation and curve II a solution of a first-order one. The LF-model can therefore be interpreted physically as a linear parametric oscillator, that is, an oscillator driven by periodically changing one or more of its parameters. In this article, we address two problems. First, the LF-model is at odds with established theory: the aerodynamic-myoelastic theory of vocal fold vibration maintains that the laryngeal oscillator is not driven but self-sustained. Second, the LF-model must be fitted pitch-synchronously. This means that before curves I and II can be adjusted to a given pulse, the pulse has to be preprocessed to determine the onsets of the opening and return phases. This is an annoying technical problem, because no algorithm exists that can reliably extract this information from the pulse under all circumstances. We show that both problems can be solved by turning the LF-model into a self-sustained oscillator. This is done by letting the switch from curve I to curve II and back depend on signal amplitude instead of time. In the statistics literature, such a model is known as a (S)elf-(E)xcited (T)hreshold (A)uto(R)egressive model. It was developed to represent time series output by nonlinear systems that sustain self-excited vibrations. We show the following. First, a SETAR-model can be fitted to a glottal pulse pitch-asynchronously.
Second, a SETAR-model exhibits self-sustained oscillations. Third, a SETAR-model can easily be turned into an LF-model. In other words, it is an intermediate step in an algorithm that fits an LF-model pitch-asynchronously. We fitted the SETAR-model to glottal pulses obtained by glottal inverse filtering. Results show that the quality of the fit is similar to that achieved by the LF-model.

(*) Strictly speaking, the original LF-model represents the derivative of the glottal pulse. It is obtained by differentiating curves I and II and by taking into account that α is much smaller than ω.
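
The amplitude-dependent switching described in the abstract can be illustrated with a minimal Python sketch. It is not the model fitted in the paper. Every name and number in it is an assumption chosen for illustration: both regimes are first-order (the paper's curve I is second-order), the threshold is taken as known, and the signal is synthetic rather than an inverse-filtered glottal pulse. The sketch shows two things: the recursion keeps oscillating with no external drive or clock, and each regime's coefficients can be recovered by least squares after sorting samples by amplitude alone, without marking any pulse onsets.

```python
# Illustrative two-regime SETAR sketch. All names and numbers are assumptions
# made for this example; they are not taken from the paper.

def simulate_setar(low, high, threshold, x0, n):
    """Iterate x[t] = c + phi * x[t-1], where (c, phi) is picked by comparing
    x[t-1] with the threshold. No external drive and no clock is involved."""
    x = [x0]
    for _ in range(n - 1):
        c, phi = low if x[-1] <= threshold else high
        x.append(c + phi * x[-1])
    return x


def fit_regime(pairs):
    """Ordinary least squares for y = c + phi * x over (x, y) pairs."""
    m = len(pairs)
    mean_x = sum(p[0] for p in pairs) / m
    mean_y = sum(p[1] for p in pairs) / m
    sxx = sum((p[0] - mean_x) ** 2 for p in pairs)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pairs)
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def fit_setar(x, threshold):
    """Sort consecutive sample pairs into regimes by amplitude, then fit each
    regime separately. No pulse onsets or other time markers are needed."""
    low_pairs = [(a, b) for a, b in zip(x, x[1:]) if a <= threshold]
    high_pairs = [(a, b) for a, b in zip(x, x[1:]) if a > threshold]
    return fit_regime(low_pairs), fit_regime(high_pairs)


LOW = (0.0, 2.0)   # assumed regime: growth while the signal is <= threshold
HIGH = (0.0, 0.1)  # assumed regime: rapid decay once the signal > threshold
signal = simulate_setar(LOW, HIGH, threshold=1.0, x0=0.3, n=200)
low_hat, high_hat = fit_setar(signal, threshold=1.0)

# Count regime changes: a sustained oscillation keeps crossing the threshold.
switches = sum((a <= 1.0) != (b <= 1.0) for a, b in zip(signal, signal[1:]))
```

With these assumed coefficients the signal stays between 0.1 and 2.0 and keeps crossing the threshold, and `fit_setar` returns intercept and slope estimates close to `LOW` and `HIGH`. In a realistic setting the threshold, the delay, and the regime orders would also have to be estimated.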

Bibliographic reference.  Schoentgen, Jean (1993): "Modelling the glottal pulse with a self-excited threshold auto-regressive model", In EUROSPEECH'93, 107-110.