Sixth European Conference on Speech Communication and Technology
We describe a speaker tracking and detection system for Switchboard conversations. It uses a two-speaker and silence hidden Markov model (HMM) with a minimum state duration constraint and Gaussian mixture model (GMM) state distributions adapted from a single gender- and handset-independent imposter model distribution. Speaker tracking is used to segment speakers for detection, which is carried out by averaging the frame scores along the Viterbi path and normalizing them with a novel parameter-interpolation extension of HNORM for use with files of arbitrary length. Duration statistics that augment the acoustic scores are also introduced via a nonlinear combination function. Results are reported on the NIST 1998 Multispeaker development evaluation dataset.
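As an illustration of the tracking step described above, the following is a minimal sketch of Viterbi decoding with a minimum state duration, followed by averaging the frame scores along the decoded path. It is not the authors' implementation. It assumes that per-frame log-likelihoods for each state (for example speaker A, speaker B, silence) have already been computed, and it enforces the duration constraint by state expansion, which is one common way to do so. The function name `min_duration_viterbi`, the `switch_logp` parameter, and the uniform transition penalty are hypothetical choices made for this example; GMM adaptation, HNORM, and the duration statistics are omitted.

```python
import math


def min_duration_viterbi(loglik, min_dur, switch_logp=0.0):
    """Decode the best state path where each visit lasts >= min_dur frames.

    loglik: list of frames, each a list of per-state log-likelihoods
            (assumed precomputed; hypothetical input format).
    min_dur: minimum number of frames per visit to a state. The final
             visit may be shorter because the file simply ends.
    switch_logp: log-probability added when switching states (assumption).
    Returns (path, average frame score along the path).
    """
    n_frames, n_states = len(loglik), len(loglik[0])
    last = min_dur - 1
    neg = -math.inf

    # Expanded state (s, d): in state s, with d + 1 frames spent so far,
    # where d is capped at min_dur - 1.
    score = [[neg] * min_dur for _ in range(n_states)]
    for s in range(n_states):
        score[s][0] = loglik[0][s]

    backptrs = []
    for t in range(1, n_frames):
        new = [[neg] * min_dur for _ in range(n_states)]
        ptr = [[None] * min_dur for _ in range(n_states)]
        for s in range(n_states):
            for d in range(min_dur):
                cands = []
                if d == 0:
                    # Enter s only from a state whose minimum stay is done.
                    cands += [(score[r][last] + switch_logp, (r, last))
                              for r in range(n_states) if r != s]
                else:
                    # Forced advance through the duration chain.
                    cands.append((score[s][d - 1], (s, d - 1)))
                if d == last:
                    # Self-loop once the minimum duration is satisfied.
                    cands.append((score[s][last], (s, last)))
                if not cands:
                    continue
                best, arg = max(cands, key=lambda c: c[0])
                new[s][d] = best + loglik[t][s]
                ptr[s][d] = arg
        score = new
        backptrs.append(ptr)

    # Pick the best final expanded state, then trace back.
    cur = max(((s, d) for s in range(n_states) for d in range(min_dur)),
              key=lambda sd: score[sd[0]][sd[1]])
    path = [cur[0]]
    for ptr in reversed(backptrs):
        cur = ptr[cur[0]][cur[1]]
        path.append(cur[0])
    path.reverse()

    avg = sum(loglik[t][path[t]] for t in range(n_frames)) / n_frames
    return path, avg
```

In this sketch, a one-frame burst that favors the other speaker is ignored when `min_dur` is larger than the burst, because switching would force several poorly matching frames onto the path. The returned average is the kind of per-segment score that a normalization step such as HNORM would then operate on.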
Bibliographic reference. Sönmez, Kemal / Heck, Larry / Weintraub, Mitchel (1999): "Speaker tracking and detection with multiple speakers", In EUROSPEECH'99, 2219-2222.