In this paper, we investigate the recognition of cochlear implant-like spectrally reduced speech (SRS) using conventional speech features (MFCCs and PLP coefficients) and HMM-based ASR. The SRS used for testing was synthesized from subband temporal envelopes extracted from original clean speech, whereas the acoustic models were trained on a different set of original clean speech signals from the same speech database. Changing the bandwidth of the subband temporal envelopes had no significant effect on the ASR word accuracy, whereas increasing the number of frequency subbands of the SRS from 4 to 16 significantly improved system performance. Furthermore, the ASR word accuracy attained with the original clean speech, using either MFCC-based or PLP-based speech features, can also be achieved with the 16-, 24-, or 32-subband SRS. The experiments were carried out using the TI-digits speech database and the HTK speech recognition toolkit.
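To make the idea of synthesizing SRS from subband temporal envelopes more concrete, the following is a minimal sketch of a sine-carrier vocoder. It is not the authors' implementation. The log-spaced band edges, the FFT-mask filters, the rectify-and-low-pass envelope extraction, the sinusoidal carriers at the band centers, and the names `band_edges`, `synthesize_srs`, `n_bands`, and `env_bw` are all assumptions made for illustration. The sketch only exposes the two parameters discussed above: the number of subbands and the envelope bandwidth.

```python
import numpy as np


def band_edges(n_bands, f_lo=100.0, f_hi=3800.0):
    """Log-spaced band edges in Hz (assumed layout, not taken from the paper)."""
    return np.geomspace(f_lo, f_hi, n_bands + 1)


def synthesize_srs(x, fs, n_bands=16, env_bw=50.0):
    """Return a spectrally reduced version of x (hypothetical sketch).

    x       : 1-D speech signal
    fs      : sampling rate in Hz (must exceed twice the top band edge)
    n_bands : number of frequency subbands
    env_bw  : bandwidth of each subband temporal envelope, in Hz
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(x)
    t = np.arange(n) / fs
    edges = band_edges(n_bands)

    y = np.zeros(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Band-pass filtering with an ideal FFT mask (assumed filter type).
        in_band = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(np.where(in_band, spectrum, 0.0), n)

        # Temporal envelope: full-wave rectification, then low-pass at env_bw.
        rectified = np.abs(band)
        env_spec = np.fft.rfft(rectified)
        env = np.fft.irfft(np.where(freqs <= env_bw, env_spec, 0.0), n)
        env = np.maximum(env, 0.0)  # clip ripple that dips below zero

        # Modulate a sinusoid at the geometric center of the band.
        fc = np.sqrt(lo * hi)
        y += env * np.sin(2.0 * np.pi * fc * t)
    return y
```

For example, `synthesize_srs(x, fs=8000, n_bands=4)` and `synthesize_srs(x, fs=8000, n_bands=16)` would give the coarse and fine spectral resolutions compared in the abstract, and varying `env_bw` changes the envelope bandwidth.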
Bibliographic reference. Do, Cong-Thanh / Pastor, Dominique / Le Lan, Gaël / Goalic, André (2010): "Recognizing cochlear implant-like spectrally reduced speech with HMM-based ASR: experiments with MFCCs and PLP coefficients", in INTERSPEECH-2010, 2634-2637.