ISCA Archive SpeechProsody 2008

Data-driven unsupervised adaptation of acoustic-prosodic models

Sankaranarayanan Ananthakrishnan, Shrikanth Narayanan

Training categorical prosody models for spoken language systems requires a significant amount of speech data annotated with the discrete labels of interest (such as boundary marks or word prominence information). In practice, the difficulty and expense incurred in producing corpora with rich prosodic transcriptions severely limit their integration within applications. In this paper, we explore the possibility of using a large, unlabeled corpus to adapt, in an unsupervised fashion, acoustic-prosodic models trained from a small, human-annotated seed dataset. Our experiments show that the proposed adaptation scheme improves the ability of the acoustic-prosodic model to distinguish between prosodic categories. On a test set derived from the Boston University Radio News Corpus, the adapted models reduced pitch accent detection error rate by 4.3% relative to the seed acoustic-prosodic models trained from the annotated data.


Cite as: Ananthakrishnan, S., Narayanan, S. (2008) Data-driven unsupervised adaptation of acoustic-prosodic models. Proc. Speech Prosody 2008, 161-164

@inproceedings{ananthakrishnan08_speechprosody,
  author={Sankaranarayanan Ananthakrishnan and Shrikanth Narayanan},
  title={{Data-driven unsupervised adaptation of acoustic-prosodic models}},
  year=2008,
  booktitle={Proc. Speech Prosody 2008},
  pages={161--164}
}