EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Blind Inversion of Multidimensional Functions for Speech Enhancement

John Hogden (1), Patrick Valdez (1), Shigeru Katagiri (2), Erik McDermott (2)

(1) Los Alamos National Laboratory, USA
(2) NTT Corporation, Japan

We discuss speech production in terms of a mapping from a low-dimensional articulator space to a low-dimensional manifold embedded in a high-dimensional acoustic space. Our discussion highlights the advantages of using an articulatory representation of speech. We then summarize mathematical results showing that, because articulator motions are bandlimited, a large class of mappings from articulation to acoustics can be blindly inverted. We also present simulation results demonstrating the power of the inversion technique. One of the most interesting simulation results is that some many-to-one mappings can also be inverted. These results explain earlier experimental findings that the technique can recover articulator positions. We conclude that our technique has many advantages for speech processing, including invariance with respect to various nonlinearities and the ability to exploit context more easily.

Bibliographic reference.  Hogden, John / Valdez, Patrick / Katagiri, Shigeru / McDermott, Erik (2003): "Blind inversion of multidimensional functions for speech enhancement", In EUROSPEECH-2003, 1409-1412.