We present a new method for pitch-synchronous formant extraction, that is, the estimation of formant frequencies and bandwidths during the closed phase (CP) of a glottal cycle. Conventionally, the formants are obtained from a linear auto-regressive model fitted to the portion of the signal emitted during the closed phase of the glottis, so the positions and durations of the closed phases have to be determined beforehand. The problem with this approach is that methods that automatically detect the closed phase of the glottal cycle from the speech signal (or the laryngogram) are not fully reliable, and accurate measurements of CP durations cannot be obtained under all circumstances and for all kinds of signals. Here, we present a method that does not require the detection of the CPs before modelling. The method uses a compound auto-regressive (AR) model. When the model is fitted, error minimisation automatically determines both the positions of the closed and open phases of the glottis and the values of the model coefficients. This is achieved by fitting, inside a single analysis window, two linear constant-coefficient auto-regressive models at the same time: the first to the CP portions and the second to the rest of the signal. We present results obtained by applying this compound model to synthetic and natural vowels, and to natural sentences. The results show that the segmentation is robust and that the formant values obtained are typical of the closed-phase values of the speech signal.
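As an illustration of the conventional step described above, the following Python sketch fits a single AR model by least squares to the samples flagged as closed phase and converts its poles into formant frequencies and bandwidths. It is not the compound model of the paper, whose details the abstract does not give. The function names `fit_ar` and `formants`, the boolean `closed` mask, the sampling rate and the synthetic one-resonance signal are all assumptions made for this example.

```python
import numpy as np

def fit_ar(x, order, mask=None):
    """Least-squares AR fit; only samples where mask is True are predicted."""
    n = np.arange(order, len(x))
    if mask is not None:
        n = n[np.asarray(mask, dtype=bool)[order:]]
    # The row for sample n holds its predecessors x[n-1], ..., x[n-order].
    X = np.column_stack([x[n - k] for k in range(1, order + 1)])
    a, *_ = np.linalg.lstsq(X, x[n], rcond=None)
    return a  # model: x[n] ~ a[0]*x[n-1] + ... + a[order-1]*x[n-order]

def formants(a, fs):
    """Convert AR coefficients to formant frequencies and bandwidths in Hz."""
    poles = np.roots(np.concatenate(([1.0], -a)))
    poles = poles[poles.imag > 0]            # one pole per conjugate pair
    freqs = np.angle(poles) * fs / (2 * np.pi)
    bws = -np.log(np.abs(poles)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bws[idx]

# Assumed synthetic signal: impulse response of one resonance (500 Hz, 80 Hz wide).
fs, f_true, bw_true = 10000.0, 500.0, 80.0
r = np.exp(-np.pi * bw_true / fs)
theta = 2 * np.pi * f_true / fs
x = np.zeros(200)
x[0] = 1.0
x[1] = 2 * r * np.cos(theta) * x[0]
for i in range(2, len(x)):
    x[i] = 2 * r * np.cos(theta) * x[i - 1] - r**2 * x[i - 2]

# Assumed closed-phase mask: here simply the first 100 samples.
closed = np.zeros(len(x), dtype=bool)
closed[:100] = True

freqs, bws = formants(fit_ar(x, 2, closed), fs)
print(freqs, bws)
```

In this conventional scheme the `closed` mask has to come from a separate closed-phase detector, which is the dependency the compound model is designed to remove.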
Bibliographic reference. Schoentgen, Jean / Azami, Zoubir (1993): "Pitch-synchronous formant extraction by means of a compound auto-regressive model", In EUROSPEECH'93, 401-404.