Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

An Improved Speech Analysis-Synthesis Algorithm Based on the Autoregressive with Exogenous Input Speech Production Model

Takahiro Ohtsuka (1), Hideki Kasuya (2)

(1) Graduate School of Engineering, (2) Faculty of Engineering, Utsunomiya University, Japan

Ding et al. have explored a novel pitch-synchronous speech analysis-synthesis method [1] based on an autoregressive with exogenous input (ARX) speech production model. The method automatically estimates the vocal tract (formant) and voice source parameters from a speech utterance. It has two deficiencies, however: it has difficulty analyzing high-pitched voices, and it introduces click sounds at transitions between vocalic and weakly voiced consonantal segments. This paper proposes an improved ARX method that addresses both problems. Perceptual comparison experiments have shown that speech re-synthesized by the proposed method is of higher quality than speech re-synthesized by a well-known cepstral method.
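
As a rough illustration of what an ARX model is, the following sketch fits the generic form s(n) + a_1 s(n-1) + ... + a_p s(n-p) = b_0 u(n) + e(n) by ordinary least squares. It is not the algorithm of this paper or of [1]: it assumes the exogenous input u(n) is known, whereas the method described above estimates the voice source together with the vocal tract. The function names, the pulse-train input, and the second-order resonator are all hypothetical choices made for the example.

```python
import numpy as np

def simulate_arx(a, b0, u, noise_std=0.0, seed=0):
    """Generate s(n) = -sum_i a[i-1]*s(n-i) + b0*u(n) + e(n)."""
    rng = np.random.default_rng(seed)
    e = noise_std * rng.standard_normal(len(u))
    s = np.zeros(len(u))
    for n in range(len(u)):
        acc = b0 * u[n] + e[n]
        for i in range(1, len(a) + 1):
            if n - i >= 0:
                acc -= a[i - 1] * s[n - i]
        s[n] = acc
    return s

def estimate_arx(s, u, p):
    """Least-squares estimate of (a_1..a_p, b0), given output s and known input u."""
    rows, targets = [], []
    for n in range(p, len(s)):
        # Regressors: negated past outputs, then the current input sample.
        rows.append([-s[n - i] for i in range(1, p + 1)] + [u[n]])
        targets.append(s[n])
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta[:p], theta[p]

# Hypothetical test signal: one stable resonance driven by a pulse train.
# The pulse train is only a stand-in for a voice source waveform.
r, w = 0.95, 2 * np.pi * 0.1
a_true = np.array([-2 * r * np.cos(w), r * r])
b0_true = 1.5
u = np.zeros(400)
u[::80] = 1.0

s = simulate_arx(a_true, b0_true, u)
a_hat, b0_hat = estimate_arx(s, u, p=2)
```

With no noise added, the estimated coefficients should match the simulated ones to within numerical precision. A nonzero `noise_std` shows how the estimates degrade when the equation error is not zero.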

Reference

  1. Ding, W., Kasuya, H., and Adachi, S. Simultaneous estimation of vocal tract and voice source parameters based on an ARX model. IEICE Trans. Inf. & Syst., E78-D, 738-743, 1995.

Bibliographic reference.  Ohtsuka, Takahiro / Kasuya, Hideki (2000): "An improved speech analysis-synthesis algorithm based on the autoregressive with exogenous input speech production model", In ICSLP-2000, vol.2, 787-790.