ISCA Archive Interspeech 2008

Multipitch tracking using a factorial hidden Markov model

Michael Wohlmayr, Franz Pernkopf

In this paper, we present an approach to track the pitch of two simultaneous speakers. Using a well-known feature extraction method based on the correlogram, we track the resulting data using a factorial hidden Markov model (FHMM). In contrast to the recently developed multipitch determination algorithm [1], which is based on an HMM, we can accurately associate estimated pitch points with their corresponding source speakers. We evaluate our approach on the "Mocha-TIMIT" database [2] of speech utterances mixed at 0 dB, and compare the results to the multipitch determination algorithm [1] used as a baseline. Experiments show that our FHMM tracker yields good performance for both pitch estimation and correct speaker assignment.
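
As a rough illustration of how an FHMM can keep two pitch tracks tied to their speakers, the sketch below runs Viterbi decoding over the joint state space of two independent Markov chains, one per speaker. It is not the authors' implementation: the abstract does not give the model parameters or the inference method, so the exhaustive joint search, the function name `fhmm_viterbi`, and all toy values (three pitch states per speaker, sticky transitions, synthetic observation scores standing in for correlogram features) are assumptions made for illustration only.

```python
import math
from itertools import product


def fhmm_viterbi(log_obs, log_trans1, log_trans2, log_init1, log_init2):
    """Joint Viterbi decoding for a two-chain factorial HMM (illustrative).

    log_obs[t][i][j] is the log-likelihood of frame t given that chain 1
    is in state i and chain 2 is in state j. Each chain has its own
    transition matrix, so the joint transition score is a sum of two terms.
    """
    n1, n2 = len(log_init1), len(log_init2)
    states = list(product(range(n1), range(n2)))

    # Initialisation: independent priors plus the first observation score.
    delta = {(i, j): log_init1[i] + log_init2[j] + log_obs[0][i][j]
             for i, j in states}
    backptrs = []

    for t in range(1, len(log_obs)):
        new_delta, ptr = {}, {}
        for i, j in states:
            # Best predecessor under the factorised transition model.
            prev = max(states, key=lambda s: delta[s]
                       + log_trans1[s[0]][i] + log_trans2[s[1]][j])
            new_delta[(i, j)] = (delta[prev] + log_trans1[prev[0]][i]
                                 + log_trans2[prev[1]][j] + log_obs[t][i][j])
            ptr[(i, j)] = prev
        delta = new_delta
        backptrs.append(ptr)

    # Backtrack from the best final joint state.
    path = [max(states, key=lambda s: delta[s])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return [s[0] for s in path], [s[1] for s in path]


# Toy setup (all values hypothetical): 3 pitch states per speaker.
N = 3
sticky = [[math.log(0.8) if a == b else math.log(0.1) for b in range(N)]
          for a in range(N)]
uniform = [math.log(1.0 / N)] * N

# Synthetic observation scores that favour one ordered state pair per frame.
targets = [(0, 2), (0, 2), (1, 2), (1, 2)]
log_obs = [[[0.0 if (i, j) == tgt else -10.0 for j in range(N)]
            for i in range(N)] for tgt in targets]

pitch1, pitch2 = fhmm_viterbi(log_obs, sticky, sticky, uniform, uniform)
print(pitch1, pitch2)
```

Because every joint state is an ordered pair, each decoded frame carries its speaker assignment with it. The exhaustive search costs on the order of (n1·n2)² operations per frame, so a realistic pitch resolution would likely need pruning or approximate inference.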

[1] Wu M., Wang D. and Brown G.J., "A Multipitch Tracking Algorithm for Noisy Speech", IEEE Transactions on Speech and Audio Processing, 11(3):229-241, 2003.

[2] Wrench A., "A multichannel/multispeaker articulatory database for continuous speech recognition research", Phonus, 5:3-17, 2000.

doi: 10.21437/Interspeech.2008-34

Cite as: Wohlmayr, M., Pernkopf, F. (2008) Multipitch tracking using a factorial hidden Markov model. Proc. Interspeech 2008, 147-150, doi: 10.21437/Interspeech.2008-34

  author={Michael Wohlmayr and Franz Pernkopf},
  title={{Multipitch tracking using a factorial hidden Markov model}},
  booktitle={Proc. Interspeech 2008},