This paper proposes a statistical model of speech fundamental frequency (F0) contours, based on a discrete-time stochastic process formulation of the Fujisaki model, which is known as a well-founded mathematical model of the control mechanism of vocal fold vibration. There are two main motivations for this statistical formulation. The first is to derive a general parameter estimation framework for the Fujisaki model that allows powerful statistical methods to be introduced. The second is to introduce, through an assumed probability distribution, a measure of the naturalness of speech in terms of its F0 contour, which can be incorporated into many statistical speech processing problems such as speech analysis, synthesis, separation, denoising and dereverberation.
Index Terms: speech F0 contour, statistical model
Bibliographic reference. Kameoka, Hirokazu / Roux, Jonathan Le / Ohishi, Yasunori (2010): "A statistical model of speech F0 contours", In SAPA2010, 43-48.