ISCA Archive Interspeech 2009

2-d processing of speech for multi-pitch analysis

Tianyu T. Wang, Thomas F. Quatieri

This paper introduces a two-dimensional (2-D) processing approach for the analysis of multi-pitch speech sounds. Our framework invokes the short-space 2-D Fourier transform magnitude of a narrowband spectrogram, mapping harmonically-related signal components to multiple concentrated entities in a new 2-D space. First, localized time-frequency regions of the spectrogram are analyzed to extract pitch candidates. These candidates are then combined across multiple regions for obtaining separate pitch estimates of each speech-signal component at a single point in time. We refer to this as multi-region analysis (MRA). By explicitly accounting for pitch dynamics within localized time segments, this separability is distinct from that which can be obtained using short-time autocorrelation methods typically employed in state-of-the-art multi-pitch tracking algorithms. We illustrate the feasibility of MRA for multi-pitch estimation on mixtures of synthetic and real speech.
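The core idea of the first step, that harmonic structure in a localized spectrogram region maps to a concentrated peak in a 2-D Fourier magnitude, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single synthetic source with constant pitch, and every name and parameter value (sample rate, window length, patch size, the helper functions) is an assumption chosen for illustration. It also assumes that, for a stationary pitch, the harmonic spacing equals the patch height in Hz divided by the peak index along the frequency axis. The code is kept dependency-free (plain DFT sums) so it runs as-is.

```python
import cmath
import math

# All values below are illustrative assumptions, not taken from the paper.
FS = 8000                  # sample rate in Hz
F0 = 200.0                 # pitch of the synthetic source in Hz
WIN = 256                  # 32 ms Hamming window -> narrowband spectrogram
NFFT = 512                 # zero-padded DFT length; bin spacing FS / NFFT = 15.625 Hz
HOP = 40                   # 5 ms frame advance
BIN_LO, N_BINS = 32, 64    # patch starts at 500 Hz and spans 64 bins (1000 Hz)
N_FRAMES = 8               # patch width in frames


def synth_harmonic(f0, n_samples):
    """Sum of equal-amplitude harmonics of f0 below the Nyquist frequency."""
    n_harm = int((FS / 2 - 1) // f0)
    return [sum(math.cos(2 * math.pi * k * f0 * n / FS) for k in range(1, n_harm + 1))
            for n in range(n_samples)]


def spectrogram_patch(x):
    """Magnitude STFT restricted to the patch: patch[t][f] for frame t, bin f."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (WIN - 1)) for n in range(WIN)]
    patch = []
    for t in range(N_FRAMES):
        frame = [x[t * HOP + n] * window[n] for n in range(WIN)]
        patch.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * b * n / NFFT) for n in range(WIN)))
            for b in range(BIN_LO, BIN_LO + N_BINS)
        ])
    return patch


def dft2_magnitude(patch):
    """2-D DFT magnitude of the mean-removed patch.

    Returns mag[m][n], where m (0..N_BINS // 2) indexes modulation along the
    frequency axis and n (0..N_FRAMES - 1) indexes modulation along time.
    """
    mean = sum(map(sum, patch)) / (N_FRAMES * N_BINS)
    # DFT along the frequency axis of each frame.
    along_f = [[sum((patch[t][f] - mean) * cmath.exp(-2j * math.pi * m * f / N_BINS)
                    for f in range(N_BINS))
                for m in range(N_BINS // 2 + 1)]
               for t in range(N_FRAMES)]
    # DFT along the time axis, then magnitude.
    return [[abs(sum(along_f[t][m] * cmath.exp(-2j * math.pi * n * t / N_FRAMES)
                     for t in range(N_FRAMES)))
             for n in range(N_FRAMES)]
            for m in range(N_BINS // 2 + 1)]


def pitch_candidate(mag):
    """Locate the largest peak with m >= 1 and convert it to a harmonic spacing in Hz."""
    m_peak, n_peak = max(((m, n) for m in range(1, len(mag)) for n in range(N_FRAMES)),
                         key=lambda mn: mag[mn[0]][mn[1]])
    patch_height_hz = N_BINS * FS / NFFT
    return patch_height_hz / m_peak, m_peak, n_peak


if __name__ == "__main__":
    x = synth_harmonic(F0, WIN + HOP * (N_FRAMES - 1))
    f0_est, m_peak, n_peak = pitch_candidate(dft2_magnitude(spectrogram_patch(x)))
    print(f"peak at (m={m_peak}, n={n_peak}), pitch candidate {f0_est:.1f} Hz")
```

Because the pitch here is constant, the harmonic stripes in the patch are horizontal and the peak lies on the n = 0 axis. A changing pitch would tilt the stripes and move the peak off that axis, which is the kind of within-region pitch dynamics the abstract refers to. Combining candidates across regions and separating two concurrent speakers are not attempted in this sketch.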

doi: 10.21437/Interspeech.2009-722

Cite as: Wang, T.T., Quatieri, T.F. (2009) 2-d processing of speech for multi-pitch analysis. Proc. Interspeech 2009, 2827-2830, doi: 10.21437/Interspeech.2009-722

  author={Tianyu T. Wang and Thomas F. Quatieri},
  title={{2-d processing of speech for multi-pitch analysis}},
  booktitle={Proc. Interspeech 2009},