ISCA Archive Interspeech 2008

Combining task-dependent information with auditory attention cues for prominence detection in speech

Ozlem Kalinli, Shrikanth S. Narayanan

Auditory attention is a highly complex mechanism that involves the processing of low-level acoustic features of sound together with higher-level cognitive rules. In this paper, a novel method is proposed that combines biologically inspired auditory attention cues with higher-level lexical and syntactic information to model task-dependent influences. Feature maps are extracted from sound at multiple scales by mimicking the processing stages in the human auditory system, and are converted to low-level auditory gist features. The auditory attention model then biases the gist features based on the task to maximize target detection. The top-down task-dependent influence of lexical and syntactic information is incorporated into the model using a probabilistic approach. The combined model is evaluated on detecting prominent syllables in speech using the BU Radio News Corpus. The model achieves 88% prominence detection accuracy at the syllable level, which is comparable to reported human performance on this task.
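
The abstract does not spell out the probabilistic combination, so the following is only a minimal sketch of one common way to fuse two evidence streams, not the authors' formulation. It assumes the acoustic (auditory gist) evidence and the lexical/syntactic evidence are conditionally independent given the syllable class, and that each stream already yields a posterior probability that a syllable is prominent. The function name `combine_posteriors` and its parameters are hypothetical.

```python
def combine_posteriors(p_acoustic: float, p_lexical: float, prior: float) -> float:
    """Fuse two posteriors for the 'prominent' class into one.

    Assumption (not from the paper): the two evidence streams are
    conditionally independent given the class, so
        P(c | A, L) is proportional to P(c | A) * P(c | L) / P(c).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")

    # Unnormalized score for the 'prominent' class.
    score_pos = p_acoustic * p_lexical / prior
    # Unnormalized score for the 'non-prominent' class.
    score_neg = (1.0 - p_acoustic) * (1.0 - p_lexical) / (1.0 - prior)

    total = score_pos + score_neg
    if total == 0.0:
        raise ValueError("the two streams assign zero probability to both classes")
    return score_pos / total


if __name__ == "__main__":
    # Hypothetical numbers: both streams lean towards 'prominent'.
    fused = combine_posteriors(p_acoustic=0.8, p_lexical=0.8, prior=0.5)
    print(f"fused posterior: {fused:.3f}")
```

Under this assumption, a stream that is uninformative (its posterior equals the prior) leaves the other stream's posterior unchanged, while two agreeing streams reinforce each other.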


doi: 10.21437/Interspeech.2008-329

Cite as: Kalinli, O., Narayanan, S.S. (2008) Combining task-dependent information with auditory attention cues for prominence detection in speech. Proc. Interspeech 2008, 1064-1067, doi: 10.21437/Interspeech.2008-329

@inproceedings{kalinli08_interspeech,
  author={Ozlem Kalinli and Shrikanth S. Narayanan},
  title={{Combining task-dependent information with auditory attention cues for prominence detection in speech}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={1064--1067},
  doi={10.21437/Interspeech.2008-329}
}