This paper addresses speaker role recognition in TV broadcast news shows. Each speaker turn is assigned one of three roles: anchor, reporter, or other. A multi-view approach is proposed that exploits the complementarity of lexical cues, obtained from Automatic Speech Recognition (ASR) output, and acoustic cues, obtained from speech signal analysis. Early and late fusion are compared. A classification accuracy of 90.1% is obtained on automatically segmented speaker turns for a 6.5-hour test corpus of 14 shows mixing news and conversational speech. Further analyses are provided for speaker turns in the "other" category, showing interesting perspectives towards finer-grained speaker role characterization.
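To illustrate the distinction between early and late fusion mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes each speaker turn has already been reduced to a lexical feature vector and an acoustic feature vector, and that late fusion is a linear interpolation of per-role scores. The function names, the `weight` parameter, and the example scores are all hypothetical.

```python
from typing import Dict, List

# The three roles named in the abstract.
ROLES = ["anchor", "reporter", "other"]


def early_fusion(lexical: List[float], acoustic: List[float]) -> List[float]:
    """Early fusion: concatenate both views into one feature vector,
    which a single classifier would then consume."""
    return list(lexical) + list(acoustic)


def late_fusion(
    lexical_scores: Dict[str, float],
    acoustic_scores: Dict[str, float],
    weight: float = 0.5,
) -> str:
    """Late fusion: each view is scored by its own classifier, and the
    per-role scores are combined afterwards. Linear interpolation with a
    hypothetical `weight` is assumed here for illustration."""
    fused = {
        role: weight * lexical_scores[role]
        + (1.0 - weight) * acoustic_scores[role]
        for role in ROLES
    }
    # Return the role with the highest combined score.
    return max(fused, key=fused.get)


# Made-up scores for a single speaker turn.
lexical_view = {"anchor": 0.6, "reporter": 0.3, "other": 0.1}
acoustic_view = {"anchor": 0.2, "reporter": 0.7, "other": 0.1}
decision = late_fusion(lexical_view, acoustic_view)
```

With equal weights the acoustic view tips the decision to reporter in this made-up example, while a weight close to 1 would let the lexical view dominate.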
Bibliographic reference. Damnati, Géraldine / Charlet, Delphine (2011): "Multi-view approach for speaker turn role labeling in TV broadcast news shows", In INTERSPEECH-2011, 1285-1288.