INTERSPEECH 2010
11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Speech Inventory Based Discriminative Training for Joint Speech Enhancement and Low-Rate Speech Coding

Xiaoqiang Xiao (1), Robert M. Nickel (2)

(1) Pennsylvania State University, USA
(2) Bucknell University, USA

A significant extension to a novel inventory-based speech processing procedure published by the authors in 2009 and 2010 is presented. The method is based on a speech analysis and re-synthesis scheme for scenarios in which speaker enrollment and noise enrollment are feasible. The procedure jointly provides speech enhancement and high-quality low-rate speech encoding with a flexible rate of just below 1.5 kbits/sec on average. In this paper we present a significant improvement of the original approach that fosters intelligibility in lower SNR environments. We propose to augment the original, solely HMM-based analysis stage with a discriminative training algorithm that dramatically improves the accuracy of the employed inventory frame selection process. A comparison mean opinion score (CMOS) study shows that the new method leads to a significant gain in overall perceptual quality between the encoder input and the decoder output.

Bibliographic reference. Xiao, Xiaoqiang / Nickel, Robert M. (2010): "Speech inventory based discriminative training for joint speech enhancement and low-rate speech coding", In INTERSPEECH-2010, 2398-2401.