INTERSPEECH 2008
9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Multipitch Tracking Using a Factorial Hidden Markov Model

Michael Wohlmayr, Franz Pernkopf

Graz University of Technology, Austria

In this paper, we present an approach for tracking the pitch of two simultaneous speakers. Features are extracted with a well-known correlogram-based method, and the resulting observations are tracked using a factorial hidden Markov model (FHMM). In contrast to the recently developed multipitch determination algorithm [1], which is based on an HMM, our approach can accurately associate estimated pitch points with their corresponding source speakers. We evaluate our approach on the "Mocha-TIMIT" database [2] of speech utterances mixed at 0 dB, and compare the results to the multipitch determination algorithm [1], which serves as a baseline. Experiments show that our FHMM tracker yields good performance for both pitch estimation and correct speaker assignment.
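To illustrate how an FHMM can keep two pitch tracks separate, the following is a minimal sketch of joint Viterbi decoding over two independent Markov chains, one per speaker. It is not the authors' implementation: the abstract does not specify the state space, the transition model, or the correlogram-based observation model, so the function name `fhmm_viterbi` and all of its arguments are assumptions made for illustration. Each chain is assumed to have a small set of discrete pitch states, and the per-frame observation log-likelihoods are assumed to be precomputed.

```python
def fhmm_viterbi(log_obs, log_a1, log_a2, log_pi1, log_pi2):
    """Jointly decode two Markov chains (hypothetical sketch, one chain per speaker).

    log_obs[t][i][j]: log-likelihood of frame t given state i of chain 1
                      and state j of chain 2 (assumed precomputed)
    log_a1[p][i]:     log transition probability of chain 1 from state p to i
    log_a2[q][j]:     log transition probability of chain 2 from state q to j
    log_pi1, log_pi2: log initial state probabilities of the two chains
    Returns the most likely state sequences (path1, path2).
    """
    n_frames = len(log_obs)
    k1, k2 = len(log_pi1), len(log_pi2)

    # Initialisation: the joint prior factorises over the two chains.
    delta = [[log_pi1[i] + log_pi2[j] + log_obs[0][i][j] for j in range(k2)]
             for i in range(k1)]
    back = []

    for t in range(1, n_frames):
        # Step 1: maximise over the previous state p of chain 1.
        # step1[i][q] = (best score, best p) for new state i and old state q.
        step1 = [[max((delta[p][q] + log_a1[p][i], p) for p in range(k1))
                  for q in range(k2)]
                 for i in range(k1)]

        # Step 2: maximise over the previous state q of chain 2.
        new_delta = [[0.0] * k2 for _ in range(k1)]
        ptr = [[None] * k2 for _ in range(k1)]
        for i in range(k1):
            for j in range(k2):
                score, q = max((step1[i][q][0] + log_a2[q][j], q)
                               for q in range(k2))
                p = step1[i][q][1]
                new_delta[i][j] = score + log_obs[t][i][j]
                ptr[i][j] = (p, q)
        delta = new_delta
        back.append(ptr)

    # Backtrace from the best final joint state.
    _, (i, j) = max((delta[i][j], (i, j))
                    for i in range(k1) for j in range(k2))
    path1, path2 = [i], [j]
    for ptr in reversed(back):
        i, j = ptr[i][j]
        path1.append(i)
        path2.append(j)
    path1.reverse()
    path2.reverse()
    return path1, path2
```

Because the joint transition probability factorises into one term per chain, the maximisation over previous joint states can be done in two steps, one chain at a time, instead of enumerating every pair of previous states for every pair of current states. The two returned paths correspond to the two chains, which is what allows each pitch estimate to be attributed to a speaker in this kind of model.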

References

  1. Wu M., Wang D. and Brown G.J., "A Multipitch Tracking Algorithm for Noisy Speech", IEEE Transactions on Speech and Audio Processing, 11(3):229-241, 2003.
  2. Wrench A., "A multichannel/multispeaker articulatory database for continuous speech recognition research", Phonus, 5:3-17, 2000.

Bibliographic reference. Wohlmayr, Michael / Pernkopf, Franz (2008): "Multipitch tracking using a factorial hidden Markov model", in INTERSPEECH-2008, 147-150.