EUROSPEECH '93

A novel glottal pulse model is presented. Conventional glottal pulse models are concatenations of curves. For instance, the Liljencrants-Fant (LF) model(*) assumes that a typical pulse is made up of a raised cosine multiplied by a growing exponential (curve I), followed by a decaying exponential (curve II). The switch from curve I to curve II occurs at time instant T1, when the slope of the closing phase is at its most negative. The switch back occurs at pulse onset. Curve I is a solution of a second-order, and curve II a solution of a first-order, linear differential equation. Therefore, the LF-model can be physically interpreted in terms of a linear parametric oscillator, that is, an oscillator driven by periodically changing one or more of its parameters. In this article, we address the following two problems. First, the LF-model is at odds with established theory: the aerodynamic-myoelastic theory of vocal fold vibration maintains that the laryngeal oscillator is not driven but self-sustained. Second, the LF-model must be fitted pitch-synchronously. This means that before curves I and II can be adjusted to a given pulse, the pulse has to be preprocessed to determine the onsets of the opening and return phases. This is a troublesome technical problem because no algorithm exists that can reliably extract this kind of information from the pulse under all circumstances. We show that these problems can be solved by turning the LF-model into a self-sustained oscillator. This is carried out by letting the switch from curve I to curve II and back depend on signal amplitude instead of time. In the statistics literature, such a model is known as a (S)elf-(E)xcited (T)hreshold (A)uto(R)egressive model. This model has been developed to represent time series that are output by nonlinear systems that sustain self-excited vibrations. We show that: First, a SETAR-model can be fitted pitch-asynchronously to a glottal pulse. Second, a SETAR-model exhibits self-sustained oscillations.
Third, a SETAR-model can easily be turned into an LF-model. In other words, it is an intermediate step in an algorithm that fits an LF-model pitch-asynchronously. We fitted the SETAR-model to glottal pulses obtained by glottal inverse filtering. Results show that the quality of the fit is similar to the one achieved by the LF-model.

(*) Strictly speaking, the original LF-model represents the derivative of the glottal pulse. It is obtained by differentiating curves I and II and by taking into account that α is much smaller than ω.
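
The amplitude-dependent switching described above can be illustrated with a toy two-regime SETAR recursion. This is only a sketch under stated assumptions, not the model fitted in the paper: the threshold R, the coefficients A1, A2, B1, and the function names simulate and fit are all hypothetical, chosen so that the arithmetic is exact. Regime I is a second-order autoregression with growing oscillation, regime II a first-order autoregression with exponential decay, and the regime is selected by the previous sample's amplitude rather than by time.

```python
# Toy two-regime SETAR sketch. All values below are illustrative assumptions,
# chosen for exact arithmetic; they are not fitted to speech data.
R = 1.0              # amplitude threshold (hypothetical)
A1, A2 = 2.0, -2.0   # regime I: second-order AR, growing oscillation
B1 = 0.5             # regime II: first-order AR, exponential decay


def simulate(n, x_prev2=0.0, x_prev1=1.0):
    """Generate n samples; the regime depends on the last amplitude, not on time."""
    x = [x_prev2, x_prev1]
    for _ in range(n):
        if x[-1] > R:
            x.append(B1 * x[-1])                 # regime II
        else:
            x.append(A1 * x[-1] + A2 * x[-2])    # regime I
    return x[2:]                                  # drop the two initial values


def fit(x, r):
    """Least-squares AR coefficients for each regime, given the threshold r.

    Samples are assigned to a regime by amplitude alone, so no pulse-onset
    markers are required.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0   # normal-equation sums for regime I
    num = den = 0.0                    # sums for regime II
    for n in range(2, len(x)):
        p1, p2, y = x[n - 1], x[n - 2], x[n]
        if p1 > r:
            num += p1 * y
            den += p1 * p1
        else:
            s11 += p1 * p1
            s12 += p1 * p2
            s22 += p2 * p2
            t1 += p1 * y
            t2 += p2 * y
    det = s11 * s22 - s12 * s12        # 2x2 normal equations, solved directly
    a1 = (t1 * s22 - t2 * s12) / det
    a2 = (t2 * s11 - t1 * s12) / det
    return (a1, a2), num / den


if __name__ == "__main__":
    series = simulate(40)
    print(series[:8])
    print(fit(series, R))
```

With the default initial values, the output repeats with period 8 (2, 1, -2, -6, -8, -4, 8, 4), so the oscillation neither dies out nor grows without bound although no external drive is applied. The waveform is not meant to resemble a glottal pulse. In this sketch the threshold is passed to fit as a known value; in practice it would have to be estimated as well, for example by trying several candidate thresholds and keeping the one with the smallest residual.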
Bibliographic reference. Schoentgen, Jean (1993): "Modelling the glottal pulse with a self-excited threshold autoregressive model", in EUROSPEECH'93, 107-110.