Interspeech'2005 - Eurospeech
A time-scale separation of voiced speech signals is introduced, which avoids the assumption of a frequency gap between the acoustic response and the prosodic drive. The non-stationary drive is extracted self-consistently from a voice-specific subband decomposition of the speech signal. When the band-limited prosodic drive is used as the fundamental drive of a two-level drive-response model, the voiced excitation can be reconstructed as a trajectory on a generalized synchronization manifold, which is suited to serve as a cue for phoneme recognition and as a fingerprint for speaker recognition.
Bibliographic reference. Drepper, F. R. (2005): "Voiced excitation as entrained primary response of a reconstructed glottal master oscillator", In INTERSPEECH-2005, 329-332.