Our goal in this work was to develop an accurate method for identifying laughter segments, ultimately for use in speaker recognition. Our previous work used multilayer perceptrons (MLPs) to perform frame-level laughter detection using short-term features, including MFCCs and pitch, and achieved a 7.9% equal error rate (EER) on our test set. We improved upon those results by adding high-level and long-term features, applying median filtering, and performing segmentation with a hybrid MLP/HMM system with Viterbi decoding. With the long-term features and median filtering, the EER improved to 5.4% on our test set and 2.7% on an equal-prior test set used by others. With segmentation from the hybrid MLP/HMM system and Viterbi decoding, we obtained 78.5% precision and 85.3% recall on our test set. To our knowledge, these are the best laughter detection results reported on the ICSI Meeting Recorder Corpus to date.
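To illustrate the two post-processing steps named above, the following is a minimal sketch of median filtering per-frame laughter posteriors and then decoding a two-state (non-laughter/laughter) HMM with the Viterbi algorithm. It is not the authors' implementation: the function names, the window width, the self-transition probability `p_stay`, the class prior `prior_laugh`, the uniform initial state distribution, and the toy posterior values are all assumptions chosen for illustration. Emissions use the usual hybrid-system convention of dividing the MLP posterior by the class prior.

```python
import math
from statistics import median


def median_filter(values, width):
    """Sliding median over an odd-length window; edges use a truncated window."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def viterbi_two_state(posteriors, prior_laugh=0.5, p_stay=0.9):
    """Return the most likely 0/1 state path (1 = laughter).

    posteriors: per-frame P(laughter | features), e.g. MLP outputs.
    prior_laugh and p_stay are assumed values, not taken from the paper.
    """
    if not posteriors:
        return []
    eps = 1e-10  # guards against log(0)
    log_trans = [
        [math.log(p_stay), math.log(1.0 - p_stay)],
        [math.log(1.0 - p_stay), math.log(p_stay)],
    ]
    priors = [1.0 - prior_laugh, prior_laugh]

    def emission(p):
        # Scaled log-likelihood: posterior divided by class prior.
        probs = [1.0 - p, p]
        return [math.log(max(probs[s], eps) / priors[s]) for s in (0, 1)]

    score = emission(posteriors[0])  # assumes a uniform initial distribution
    back = []
    for p in posteriors[1:]:
        emit = emission(p)
        new_score, pointers = [], []
        for s in (0, 1):
            best_prev = max((0, 1), key=lambda r: score[r] + log_trans[r][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + emit[s])
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max((0, 1), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy posteriors (hypothetical): one isolated spike, then a sustained burst.
frame_posteriors = [0.1, 0.1, 0.9, 0.1, 0.1, 0.8, 0.9, 0.9, 0.8, 0.1, 0.1]
smoothed = median_filter(frame_posteriors, width=3)
segments = viterbi_two_state(smoothed)
print(smoothed)
print(segments)
```

In this sketch the median filter suppresses the isolated single-frame spike, while the transition penalty in the HMM discourages rapid switching between states, so the decoded path tends to form contiguous segments rather than frame-by-frame decisions.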
Bibliographic reference. Knox, Mary Tai / Morgan, Nelson / Mirghafori, Nikki (2008): "Getting the last laugh: automatic laughter segmentation in meetings", In INTERSPEECH-2008, 797-800.