A significant extension to a novel inventory-based speech processing procedure, published by the authors in 2009 and 2010, is presented. The method is based on a speech analysis and re-synthesis scheme for scenarios in which both speaker enrollment and noise enrollment are feasible. The procedure jointly provides speech enhancement and high-quality low-rate speech encoding at a flexible rate of just below 1.5 kbit/s on average. In this paper we present an improvement of the original approach that fosters intelligibility in lower-SNR environments. We propose to augment the original, purely HMM-based analysis stage with a discriminative training algorithm that dramatically improves the accuracy of the inventory frame selection process. A comparison mean opinion score (CMOS) study shows that the new method leads to a significant gain in overall perceptual quality from the encoder input to the decoder output.
Bibliographic reference. Xiao, Xiaoqiang / Nickel, Robert M. (2010): "Speech inventory based discriminative training for joint speech enhancement and low-rate speech coding", In INTERSPEECH-2010, 2398-2401.