ISCA Workshop on
Statistical And Perceptual Audition

Makuhari, Japan
September 25, 2010

A Statistical Model of Speech F0 Contours

Hirokazu Kameoka, Jonathan Le Roux, Yasunori Ohishi

NTT Communication Science Laboratories, NTT Corporation, Japan

This paper proposes a statistical model of speech fundamental frequency (F0) contours, formulated as a discrete-time stochastic process version of the Fujisaki model, a well-founded mathematical model of the control mechanism of vocal fold vibration. The statistical formulation has two main motivations. The first is to derive a general parameter estimation framework for the Fujisaki model that allows powerful statistical methods to be introduced. The second is to introduce, through an assumed probability distribution, a measure of speech naturalness in terms of the F0 contour, which can be incorporated into many statistical speech processing problems such as speech analysis, synthesis, separation, denoising, and dereverberation.

Index Terms: speech F0 contour, statistical model

Bibliographic reference. Kameoka, Hirokazu / Le Roux, Jonathan / Ohishi, Yasunori (2010): "A statistical model of speech F0 contours", In SAPA-2010, 43-48.