7th International Conference on Spoken Language Processing
September 16-20, 2002
This paper describes a method for extracting time-varying features that is effective for speech signals with high fundamental frequencies. The proposed method adopts a speech production model that consists of a Time-Varying Auto-Regressive (TVAR) process for the articulatory filter and a Hidden Markov Model (HMM) for the excitation source. The model represents variations in waveform amplitude through a time-varying gain on the excitation source. The proposed algorithm extends the Viterbi algorithm so that it adaptively estimates the TVAR coefficients and the time-varying gain while decoding the state transitions of the excitation source HMM. We applied the proposed method to extract time-varying features from both synthetic and natural speech, and confirmed its feasibility.
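As a rough illustration of the production model summarized above (not the estimation algorithm itself), the following Python sketch synthesizes a toy signal by driving a time-varying AR filter with a gain-scaled excitation drawn from a small HMM. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `synthesize`, the AR order of 2, the cyclic pulse-train state layout with deterministic transitions, the Gaussian emissions, and the shapes of the gain envelope and resonance glide.

```python
import math
import random


def synthesize(n_samples, period=40, seed=0):
    """Toy signal: time-varying AR(2) filter driven by a cyclic excitation HMM."""
    rng = random.Random(seed)
    # Assumed excitation HMM: `period` states visited in a fixed cycle.
    # State 0 emits a pulse (mean 1.0); all other states emit near-zero noise.
    state_means = [1.0] + [0.0] * (period - 1)
    noise_std = 0.01
    past = [0.0, 0.0]  # the two most recent outputs, oldest first
    signal, states = [], []
    state = 0
    for t in range(n_samples):
        # Assumed time-varying gain: a slow half-sine amplitude envelope.
        gain = 0.5 + 0.5 * math.sin(math.pi * t / n_samples)
        # Assumed time-varying AR(2) coefficients: one resonance whose
        # frequency glides upward; pole radius < 1 at every instant.
        freq = 0.05 + 0.05 * t / n_samples  # cycles per sample
        radius = 0.95
        a1 = 2.0 * radius * math.cos(2.0 * math.pi * freq)
        a2 = -radius * radius
        # Excitation sample: state-dependent Gaussian scaled by the gain.
        excitation = gain * rng.gauss(state_means[state], noise_std)
        sample = a1 * past[1] + a2 * past[0] + excitation
        past = [past[1], sample]
        signal.append(sample)
        states.append(state)
        state = (state + 1) % period  # deterministic cyclic transition
    return signal, states


if __name__ == "__main__":
    signal, states = synthesize(400, period=40, seed=1)
    print(len(signal), states[:5])
```

In this sketch a shorter `period` corresponds to a higher fundamental frequency, so successive pulses arrive before the filter response to the previous pulse has decayed. An estimator for such a model would have to recover the AR coefficients, the gain, and the state sequence jointly from `signal` alone; that joint estimation is what the abstract attributes to the extended Viterbi algorithm, and it is not implemented here.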
Bibliographic reference. Sasou, Akira / Tanaka, Kazuyo (2002): "Adaptive estimation of time-varying features from high-pitched speech based on an excitation source HMM", In ICSLP-2002, 2161-2164.