INTERSPEECH 2012
13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

A Full-band Adaptive Harmonic Representation of Speech

Gilles Degottex, Yannis Stylianou

University of Crete, Computer Science Dept. and FORTH, Institute of Computer Science, Vasilika Vouton, Heraklion, Greece

In this paper we present a full-band Adaptive Harmonic Model (aHM) that is able to accurately reconstruct stationary and non-stationary parts of speech. The model requires neither a voiced/unvoiced decision nor an accurate estimation of the pitch contour. Its robustness is based on a previously suggested adaptive Quasi-Harmonic Model (aQHM), which provides a mechanism for frequency correction and for adapting its basis functions to the characteristics of the input signal. The suggested method overcomes limitations of the initial aQHM-based method in detecting frequency tracks over time, especially at mid and high frequencies, by employing a band-limited iterative procedure for the re-estimation of the fundamental frequency. Listening tests show that speech reconstructed using aHM is largely indistinguishable from the original signal, outperforming standard sinusoidal models (SM) and the aQHM-based method, while using fewer parameters for the reconstruction than SM.
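
As a rough illustration of what a harmonic basis that follows a time-varying fundamental frequency can look like, the Python sketch below sums cosines whose phases are integer multiples of the running integral of an f0 curve, keeping every harmonic below the Nyquist frequency. It is not the authors' implementation. The function name `synthesize_adaptive_harmonics`, the constant per-harmonic amplitudes, the zero initial phases, and the simple per-sample phase accumulation are all assumptions made for this example; the analysis side (frequency correction and iterative f0 re-estimation) is not shown.

```python
import math


def synthesize_adaptive_harmonics(f0_curve, amplitudes, fs):
    """Sum harmonics of a time-varying f0 (illustrative sketch only).

    f0_curve   : fundamental frequency in Hz, one value per output sample
    amplitudes : amplitude of harmonic k = 1, 2, ... (held constant here,
                 which is a simplifying assumption of this example)
    fs         : sampling rate in Hz
    """
    nyquist = fs / 2.0
    signal = []
    phase = 0.0  # running phase of the fundamental, in radians
    for f0 in f0_curve:
        sample = 0.0
        for k, amp in enumerate(amplitudes, start=1):
            if k * f0 >= nyquist:
                break  # harmonics at or above Nyquist are dropped
            # The k-th harmonic uses k times the fundamental's phase,
            # so the whole basis follows the f0 curve.
            sample += amp * math.cos(k * phase)
        signal.append(sample)
        # Advance the phase by the instantaneous frequency (numerical
        # integration of f0 over one sample period).
        phase += 2.0 * math.pi * f0 / fs
    return signal


# Hypothetical usage: 100 ms of a pitch glide starting at 100 Hz.
fs = 16000
f0_curve = [100.0 + 50.0 * n / fs for n in range(fs // 10)]
amplitudes = [1.0 / k for k in range(1, 41)]
frame = synthesize_adaptive_harmonics(f0_curve, amplitudes, fs)
```

Because every harmonic is tied to the same integrated phase, a change in the f0 curve moves all partials together, which is the property that distinguishes this kind of basis from a stationary sinusoidal one within a frame.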

Index Terms: Sinusoidal model, quasi-harmonic model, nonstationary basis, speech analysis.

Audio Examples (different voices and languages; for explanations see full paper)
Original    aHM-AIR    aQHNM    SM    (arctic_bdl1)
Original    aHM-AIR    aQHNM    SM    (arctic_slt1)
Original    aHM-AIR    aQHNM    SM    (nitech_jp_atr503_m001_j31)
Original    aHM-AIR    aQHNM    SM    (af049orgh)
Original    aHM-AIR    aQHNM    SM    (emodb_m_39)
Original    aHM-AIR    aQHNM    SM    (emodb_f_107)
Original    aHM-AIR    aQHNM    SM    (Luciano_K_It_m_s)
Original    aHM-AIR    aQHNM    SM    (Tiziana_C_It_f_s)
Original    aHM-AIR    aQHNM    SM    (XavierReference1.2)
Original    aHM-AIR    aQHNM    SM    (Christine.01_neutre)
Original    aHM-AIR    aQHNM    SM    (Kostas268)
Original    aHM-AIR    aQHNM    SM    (Maria263)

Bibliographic reference. Degottex, Gilles / Stylianou, Yannis (2012): "A full-band adaptive harmonic representation of speech", in INTERSPEECH-2012, 382-385.