ISCA Archive SpeechProsody 2008

Learning prosodic sequences using the fundamental frequency variation spectrum

Kornel Laskowski, Jens Edlund, Mattias Heldner

We investigate a recently introduced vector-valued representation of fundamental frequency variation, whose properties appear to be well suited to statistical sequence modeling. We show what the representation looks like, and apply hidden Markov models to learn prosodic sequences characteristic of higher-level turn-taking phenomena. Our analysis shows that the models learn exactly those characteristics that have been reported for these phenomena in the literature. Further refinements to the representation lead to a 12-17% relative improvement in speaker change prediction for conversational spoken dialogue systems.
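
As a rough illustration of how hidden Markov models can be used to compare prosodic sequences, the sketch below scores a sequence of frame-level feature vectors under two competing HMMs with the forward algorithm and picks the more likely one. Everything in it is an assumption made for illustration and is not taken from the paper: the two-dimensional toy frames standing in for the fundamental frequency variation representation, the diagonal-Gaussian emissions, the two-state left-to-right topology, and the model names "A" and "B".

```python
import math

NEG_INF = float("-inf")

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(vals):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    top = max(vals)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in vals))

def forward_loglik(frames, log_init, log_trans, means, variances):
    """Log-likelihood of a frame sequence under one HMM (forward algorithm)."""
    n = len(log_init)
    alpha = [log_init[s] + log_gauss(frames[0], means[s], variances[s])
             for s in range(n)]
    for x in frames[1:]:
        alpha = [logsumexp([alpha[r] + log_trans[r][s] for r in range(n)])
                 + log_gauss(x, means[s], variances[s])
                 for s in range(n)]
    return logsumexp(alpha)

def classify(frames, models):
    """Return the name of the model with the highest forward log-likelihood."""
    return max(models, key=lambda name: forward_loglik(frames, *models[name]))

# Hypothetical toy models: (log_init, log_trans, means, variances).
# Both start in state 0 and may stay there or move on to state 1.
left_to_right = [[math.log(0.5), math.log(0.5)], [NEG_INF, 0.0]]
unit_var = [[0.1, 0.1], [0.1, 0.1]]
models = {
    "A": ([0.0, NEG_INF], left_to_right, [[1.0, 0.0], [0.0, 1.0]], unit_var),
    "B": ([0.0, NEG_INF], left_to_right, [[0.0, 1.0], [1.0, 0.0]], unit_var),
}

# Toy frame sequence whose trajectory resembles the state means of model "A".
frames = [[0.9, 0.1], [1.0, 0.0], [0.1, 0.9], [0.0, 1.0]]
print(classify(frames, models))
```

In a real setting the model parameters would be estimated from labeled data rather than written by hand, and the competing models would correspond to the turn-taking classes of interest.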


Cite as: Laskowski, K., Edlund, J., Heldner, M. (2008) Learning prosodic sequences using the fundamental frequency variation spectrum. Proc. Speech Prosody 2008, 151-154

@inproceedings{laskowski08_speechprosody,
  author={Kornel Laskowski and Jens Edlund and Mattias Heldner},
  title={{Learning prosodic sequences using the fundamental frequency variation spectrum}},
  year=2008,
  booktitle={Proc. Speech Prosody 2008},
  pages={151--154}
}