EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Identifying Speakers in Children's Stories for Speech Synthesis

Jason Y. Zhang (1), Alan W. Black (1), Richard Sproat (2)

(1) Carnegie Mellon University, USA
(2) AT&T Labs Research, USA

Choosing appropriate voices for synthesizing children's stories requires text analysis techniques that can identify which portions of the text should be read by which speakers. We present techniques that take raw story text and automatically identify the quoted speech, identify the characters in the story, and assign a character to each quote. The resulting marked-up story can then be rendered by a standard speech synthesizer using an appropriate voice for each character. This paper describes each basic stage of identification and the algorithms, both rule-driven and data-driven, used to carry it out. We test the system on a variety of story texts and present the results, together with a discussion of limitations and recommendations for improving speaker assignment on further texts.
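The stages named in the abstract (finding quoted speech, then assigning a character to each quote) can be illustrated with a minimal rule-driven sketch. This is not the authors' algorithm: the speech-verb list, the 60-character context window, the `UNKNOWN` label, and the function names are all assumptions made for illustration, and the character list is supplied by hand rather than identified automatically.

```python
import re

# Hypothetical cue words; the abstract does not say which cues the system uses.
SPEECH_VERBS = ("said", "asked", "replied", "cried")
VERB_RE = re.compile(r"\b(?:" + "|".join(SPEECH_VERBS) + r")\b")
QUOTE_RE = re.compile(r'"([^"]+)"')
WINDOW = 60  # assumed context size, in characters


def find_quotes(text):
    """Return (start, end, quote_text) for each double-quoted span."""
    return [(m.start(), m.end(), m.group(1)) for m in QUOTE_RE.finditer(text)]


def _nearest_name(window, characters, from_end):
    """Pick the character name closest to the quote, if a speech verb is present."""
    if not VERB_RE.search(window):
        return None
    if from_end:
        # Context precedes the quote: the last mention is the nearest one.
        hits = [(window.rfind(c), c) for c in characters if c in window]
        return max(hits)[1] if hits else None
    # Context follows the quote: the first mention is the nearest one.
    hits = [(window.find(c), c) for c in characters if c in window]
    return min(hits)[1] if hits else None


def assign_speakers(text, characters):
    """Return (quote_text, speaker) pairs; speaker is 'UNKNOWN' if no cue is found."""
    results = []
    for start, end, quote in find_quotes(text):
        # Text after the quote, stopping at the next quotation mark.
        after = text[end:end + WINDOW].split('"', 1)[0]
        # Text before the quote, starting after the previous quotation mark.
        before = text[max(0, start - WINDOW):start].rsplit('"', 1)[-1]
        speaker = (_nearest_name(after, characters, from_end=False)
                   or _nearest_name(before, characters, from_end=True)
                   or "UNKNOWN")
        results.append((quote, speaker))
    return results
```

Calling `assign_speakers(story_text, ["Wolf", "Red"])` yields one `(quote, speaker)` pair per quoted span, which could then be mapped to synthesizer voices. A heuristic this simple fails on quotes with no nearby attribution, which is one reason a data-driven component may be needed alongside rules.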

Bibliographic reference.  Zhang, Jason Y. / Black, Alan W. / Sproat, Richard (2003): "Identifying speakers in children's stories for speech synthesis", In EUROSPEECH-2003, 2041-2044.