11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Recognizing Cochlear Implant-Like Spectrally Reduced Speech with HMM-Based ASR: Experiments with MFCCs and PLP Coefficients

Cong-Thanh Do, Dominique Pastor, Gaël Le Lan, André Goalic

Lab-STICC, France

In this paper, we investigate the recognition of cochlear implant-like spectrally reduced speech (SRS) using conventional speech features (MFCCs and PLP coefficients) and HMM-based ASR. For testing, the SRS was synthesized from subband temporal envelopes extracted from original clean speech, whereas the acoustic models were trained on a different set of original clean speech signals from the same speech database. Changing the bandwidth of the subband temporal envelopes had no significant effect on the ASR word accuracy. In contrast, increasing the number of frequency subbands of the SRS from 4 to 16 significantly improved the system performance. Furthermore, with both MFCC-based and PLP-based speech features, the ASR word accuracy attained with the original clean speech can be achieved by using the 16-, 24-, or 32-subband SRS. The experiments were carried out using the TI-digits speech database and the HTK speech recognition toolkit.
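
The idea of synthesizing SRS from subband temporal envelopes can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the log-spaced band edges between 100 Hz and 4000 Hz, the second-order band-pass filter, the rectify-and-smooth envelope extractor, the sinusoidal carriers at the band centers, and the function names band_edges, bandpass, envelope, and synthesize_srs. The parameters n_bands and env_cutoff stand in for the number of subbands and the envelope bandwidth discussed above.

```python
import math

def band_edges(n_bands, f_lo=100.0, f_hi=4000.0):
    """Log-spaced band edges in Hz (assumed spacing, not the paper's)."""
    ratio = (f_hi / f_lo) ** (1.0 / n_bands)
    return [f_lo * ratio ** k for k in range(n_bands + 1)]

def bandpass(x, fs, f1, f2):
    """Second-order band-pass biquad (0 dB peak gain) between f1 and f2."""
    f0 = math.sqrt(f1 * f2)              # geometric center frequency
    q = f0 / (f2 - f1)                   # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha               # b1 is zero for this filter type
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = (b0 * xn + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def envelope(x, fs, cutoff):
    """Temporal envelope: full-wave rectification, then one-pole low-pass."""
    a = math.exp(-2.0 * math.pi * cutoff / fs)
    env, state = [], 0.0
    for xn in x:
        state = (1.0 - a) * abs(xn) + a * state
        env.append(state)
    return env

def synthesize_srs(x, fs, n_bands=4, env_cutoff=50.0):
    """Sum of sinusoidal carriers, each modulated by one subband envelope."""
    edges = band_edges(n_bands)
    out = [0.0] * len(x)
    for f1, f2 in zip(edges[:-1], edges[1:]):
        env = envelope(bandpass(x, fs, f1, f2), fs, env_cutoff)
        fc = math.sqrt(f1 * f2)          # carrier at the band center
        for n, e in enumerate(env):
            out[n] += e * math.sin(2.0 * math.pi * fc * n / fs)
    return out
```

Calling synthesize_srs(samples, fs, n_bands=16, env_cutoff=50.0) on a list of float samples returns a signal of the same length in which only the subband envelopes of the input survive; the fine spectral structure is replaced by the carriers.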

Bibliographic reference. Do, Cong-Thanh / Pastor, Dominique / Le Lan, Gaël / Goalic, André (2010): "Recognizing cochlear implant-like spectrally reduced speech with HMM-based ASR: experiments with MFCCs and PLP coefficients", in INTERSPEECH-2010, 2634-2637.