12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Multi-View Approach for Speaker Turn Role Labeling in TV Broadcast News Shows

Géraldine Damnati, Delphine Charlet

Orange Labs, France

This paper addresses speaker role recognition in TV Broadcast News shows. Each speaker turn is assigned one of three roles: anchor, reporter, or other. A multi-view approach is proposed that exploits the complementarity of lexical cues obtained from Automatic Speech Recognition output and acoustic cues obtained from speech signal analysis. Early and late fusion are compared. A classification accuracy of 90.1% is obtained on automatically segmented speaker turns for a 6.5-hour test corpus of 14 shows mixing news and conversational speech. Further analyses are provided for "other" speaker turns, showing interesting perspectives towards finer-grained speaker role characterization.
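The difference between the early and late fusion compared in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions: the abstract does not specify the features, classifiers, or fusion rule, so the function names, the weighted-sum combination, and the `weight` parameter are all hypothetical.

```python
from typing import Dict, List, Sequence

# The three roles named in the abstract.
ROLES = ("anchor", "reporter", "other")


def early_fusion(lexical: Sequence[float], acoustic: Sequence[float]) -> List[float]:
    """Early fusion (illustrative): concatenate the lexical and acoustic
    feature vectors so that one classifier sees both views at once."""
    return list(lexical) + list(acoustic)


def late_fusion(
    lexical_scores: Dict[str, float],
    acoustic_scores: Dict[str, float],
    weight: float = 0.5,  # hypothetical interpolation weight, not from the paper
) -> str:
    """Late fusion (illustrative): each view is scored by its own classifier,
    and the per-role scores are combined with a weighted sum."""
    combined = {
        role: weight * lexical_scores[role] + (1.0 - weight) * acoustic_scores[role]
        for role in ROLES
    }
    # Return the role with the highest combined score.
    return max(combined, key=combined.get)


# Toy scores for one speaker turn (made-up numbers for illustration only).
lex = {"anchor": 0.6, "reporter": 0.3, "other": 0.1}
aco = {"anchor": 0.2, "reporter": 0.7, "other": 0.1}
fused_features = early_fusion([1.0, 2.0], [3.0])
fused_role = late_fusion(lex, aco)
```

In the early variant a single model is trained on the concatenated vector, whereas in the late variant the two views stay separate until their scores are merged.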

Bibliographic reference. Damnati, Géraldine / Charlet, Delphine (2011): "Multi-view approach for speaker turn role labeling in TV broadcast news shows", In INTERSPEECH-2011, 1285-1288.