Sixth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA 2009)

Florence, Italy
December 14-16, 2009

Speech Morphing Based on Biologically Relevant Signal Representations

Hideki Kawahara

Auditory Media Laboratory, Faculty of Systems Engineering, Wakayama University, Wakayama, Japan

Voice morphing based on a high-fidelity VOCODER is a unique strategy for exploring attributes that are closely related to the biological states of speakers. The method is based on a temporally stable power spectral representation and on spectral envelope recovery based on a new formulation of the sampling theory. The morphing algorithm itself is reformulated to enable extrapolation without introducing perceptual or objective breakdown. It is also extended to make temporally variable multi-aspect morphing possible. Graphical user interface (GUI) based tools are implemented to handle the complexities introduced by these extensions. For characterizing voicing, a bottom-up local repetition detector, a residual-based irregularity detector, and a group delay-based acoustic event detector with multi-resolution analysis are provided.

Index Terms. Spectrum, periodicity, speech perception, voicing, morphing

