10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

2-D Processing of Speech for Multi-Pitch Analysis

Tianyu T. Wang, Thomas F. Quatieri


This paper introduces a two-dimensional (2-D) processing approach for the analysis of multi-pitch speech sounds. Our framework invokes the short-space 2-D Fourier transform magnitude of a narrowband spectrogram, mapping harmonically related signal components to multiple concentrated entities in a new 2-D space. First, localized time-frequency regions of the spectrogram are analyzed to extract pitch candidates. These candidates are then combined across multiple regions to obtain a separate pitch estimate for each speech-signal component at a single point in time. We refer to this as multi-region analysis (MRA). Because MRA explicitly accounts for pitch dynamics within localized time segments, the separability it provides is distinct from that obtainable with the short-time autocorrelation methods typically employed in state-of-the-art multi-pitch tracking algorithms. We illustrate the feasibility of MRA for multi-pitch estimation on mixtures of synthetic and real speech.
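The first step described above, taking the 2-D Fourier transform magnitude of a localized spectrogram region and reading a pitch candidate from it, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`harmonic_tone`, `narrowband_spectrogram`, `local_2d_spectrum`, `pitch_candidate`), the window lengths, the region size, the zero-padding factor, and the 80-400 Hz search range are all assumptions chosen for illustration. The sketch relies on one property: harmonic lines spaced f0 apart along the frequency axis of a region form a periodic pattern, so they concentrate at a peak in the 2-D transform whose vertical coordinate is inversely related to f0.

```python
import numpy as np

def harmonic_tone(f0, fs, dur=0.3, f_max=3800.0):
    """Synthetic constant-pitch source: equal-amplitude harmonics of f0 (hypothetical test signal)."""
    t = np.arange(int(fs * dur)) / fs
    n_harm = int(f_max // f0)
    return sum(np.cos(2 * np.pi * k * f0 * t) for k in range(1, n_harm + 1))

def narrowband_spectrogram(x, fs, win_ms=32.0, hop_ms=1.0, nfft=2048):
    """Magnitude STFT with a long (narrowband) Hamming window; shape (nfft//2 + 1, n_frames)."""
    win_len = int(fs * win_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)], axis=1
    )
    return np.abs(np.fft.rfft(frames, n=nfft, axis=0))

def local_2d_spectrum(spec, f_lo, f_hi, t_lo, t_hi, pad=4):
    """2-D Fourier transform magnitude of one windowed time-frequency region."""
    patch = spec[f_lo:f_hi, t_lo:t_hi]
    patch = patch - patch.mean()  # suppress the DC term of the region
    w2d = np.outer(np.hamming(patch.shape[0]), np.hamming(patch.shape[1]))
    shape = (pad * patch.shape[0], pad * patch.shape[1])  # zero-pad for a finer grid
    return np.abs(np.fft.fft2(patch * w2d, s=shape))

def pitch_candidate(mag2d, hz_per_bin, f0_min=80.0, f0_max=400.0):
    """Pitch candidate from the strongest peak whose vertical coordinate maps into [f0_min, f0_max]."""
    n_rows = mag2d.shape[0]
    # A pattern repeating every f0 Hz along frequency peaks at row m = hz_per_bin * n_rows / f0.
    m_lo = int(np.ceil(hz_per_bin * n_rows / f0_max))
    m_hi = int(np.floor(hz_per_bin * n_rows / f0_min))
    band = mag2d[m_lo : m_hi + 1, :]
    row, _ = np.unravel_index(np.argmax(band), band.shape)
    return hz_per_bin * n_rows / (m_lo + row)

fs = 8000
nfft = 2048
x = harmonic_tone(200.0, fs)
spec = narrowband_spectrogram(x, fs, nfft=nfft)
# Region covering roughly 0-2000 Hz and 40 frames (assumed sizes, not from the paper).
mag2d = local_2d_spectrum(spec, 0, 512, 100, 140)
f0_hat = pitch_candidate(mag2d, fs / nfft)
print(f"pitch candidate: {f0_hat:.1f} Hz")
```

The sketch handles a single source in a single region and keeps only the vertical coordinate of the peak. For a mixture, each source would be expected to produce its own peak in the same 2-D space. Combining candidates across regions, as MRA does, is not reproduced here.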

Bibliographic reference.  Wang, Tianyu T. / Quatieri, Thomas F. (2009): "2-d processing of speech for multi-pitch analysis", In INTERSPEECH-2009, 2827-2830.