ITRW on Non-Linear Speech Processing (NOLISP 05)

Barcelona, Spain
April 19-22, 2005

A two-level Drive-Response Model of Instationary Speech Signals

Friedhelm R. Drepper

Zentralinstitut für Elektronik, Forschungszentrum Jülich GmbH, Jülich, Germany

The transmission protocol of voiced speech is hypothesized to be based on a fundamental excitation or drive process, which synchronizes the vocal tract excitation on the transmitter side and evokes the loudness and pitch perception on the receiver side. The fundamental drive can be extracted from the speech signal by using a voice-specific subband decomposition. When used as the fundamental drive of a two-level drive-response model with stationary coupling on both levels, the instationary drive is able to describe instationary speech as a secondary response. For simplicity, each subband-specific primary response is assumed to be restricted to a nonlinear synchronisation manifold. Whereas the extraction of a physiologically interpretable fundamental phase is limited to voiced sections of speech, the fundamental amplitude can also be used for the time scale separation of unvoiced sections.


Bibliographic reference. Drepper, Friedhelm R. (2005): "A two-level drive-response model of instationary speech signals", in NOLISP-2005, 58-69.