13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

On the Modeling of Voiceless Stop Sounds of Speech using Adaptive Quasi-Harmonic Models

George P. Kafentzis (1,3), Olivier Rosec (1), Yannis Stylianou (2,3)

(1) Orange Labs, TECH/ASAP/VOICE, Lannion, France
(2) Institute of Computer Science, Foundation for Research and Technology Hellas, Greece
(3) Multimedia Informatics Lab, Computer Science Department, University of Crete, Greece

This paper evaluates the recently proposed adaptive signal models on voiceless stop sounds of speech. Stop sounds are transient parts of speech that are highly non-stationary in time. State-of-the-art sinusoidal models fail to model them accurately and efficiently, and they introduce an artifact known as the pre-echo effect. The adaptive Quasi-Harmonic Model (aQHM) and the extended adaptive Quasi-Harmonic Model (eaQHM) are tested against this effect, and it is shown that highly accurate, pre-echo-free representations of stop sounds are possible using adaptive schemes. Results on a large database of voiceless stops show that, on average, eaQHM improves the Signal to Reconstruction Error Ratio (SRER) obtained by the standard sinusoidal model by 100%.
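
The abstract reports its results as SRER but does not define it. As a minimal sketch, assuming the definition commonly used in the quasi-harmonic modeling literature, SRER = 20 log10(std(x) / std(x - x_hat)), where x is the original signal and x_hat its reconstruction, the metric could be computed as follows. The function names `srer_db` and `_std` and the toy signal are illustrative and do not come from the paper.

```python
import math


def _std(values):
    """Population standard deviation of a sequence of floats."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def srer_db(signal, reconstruction):
    """Signal-to-Reconstruction-Error Ratio in dB.

    Assumed definition (not stated in the abstract):
        SRER = 20 * log10(std(x) / std(x - x_hat))
    """
    if len(signal) != len(reconstruction):
        raise ValueError("signal and reconstruction must have the same length")
    error = [x - y for x, y in zip(signal, reconstruction)]
    sigma_x = _std(signal)
    sigma_e = _std(error)
    if sigma_x == 0.0:
        raise ValueError("SRER is undefined for a constant signal")
    if sigma_e == 0.0:
        return math.inf  # perfect reconstruction
    return 20.0 * math.log10(sigma_x / sigma_e)


# Toy example: a 200 Hz sinusoid sampled at 16 kHz, reconstructed with a
# 10% amplitude error. The error is 0.1 * x, so the SRER is close to 20 dB.
fs = 16000
x = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(160)]
x_hat = [0.9 * v for v in x]
print(round(srer_db(x, x_hat), 3))
```

Because the metric is logarithmic, a higher SRER means a smaller reconstruction error relative to the signal. The toy example only exercises the formula; it does not reproduce any result from the paper.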

Index Terms: Extended adaptive Quasi-Harmonic Model, Stop sounds, Speech analysis, Sinusoidal Modeling, Pre-echo effect

Bibliographic reference. Kafentzis, George P. / Rosec, Olivier / Stylianou, Yannis (2012): "On the modeling of voiceless stop sounds of speech using adaptive quasi-harmonic models", in INTERSPEECH-2012, 859-862.