Third International Conference on Spoken Language Processing (ICSLP 94)
This paper presents experiments in phonetic classification conducted as part of a study on the effects of microphone variations on performance in speech recognition systems. The TIMIT corpus provides data recorded on a close-talking microphone, on a free-field microphone and over telephone lines. The study focuses on the unmatched training and testing conditions under which degradation is most severe. Analysis of baseline performance characterizes the effects of microphone variations. Downsampling is shown to significantly improve performance for bandlimited conditions at the cost of some degradation for non-bandlimited conditions. Comparative analysis of microphone-independent preprocessing techniques, including cepstral mean normalization, RASTA processing, spectral subtraction and codebook-dependent cepstral normalization, reveals the effects and tradeoffs of different compensation techniques.
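As an illustration of one of the techniques named above, cepstral mean normalization can be sketched as follows. This is a minimal sketch and not the authors' implementation: the function name `cepstral_mean_normalize` and the toy data are hypothetical, and normalization is assumed to be done per utterance. The idea is that a fixed linear channel, such as a microphone, appears as an approximately constant additive offset in the cepstral domain, so subtracting the mean of each coefficient over the utterance removes it.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-utterance mean from each cepstral coefficient.

    frames: list of equal-length lists of floats, one list per frame.
    Returns a new list of frames; the input is not modified.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient across all frames of the utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


# Toy data (hypothetical): the same utterance with and without a
# constant cepstral offset standing in for a microphone channel.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
channel_offset = [0.5, -1.5]
filtered = [[c + o for c, o in zip(frame, channel_offset)] for frame in clean]

norm_clean = cepstral_mean_normalize(clean)
norm_filtered = cepstral_mean_normalize(filtered)
```

In this toy case `norm_clean` and `norm_filtered` agree up to floating-point rounding, since the constant offset is absorbed into the mean. Real microphone mismatch also involves additive noise and bandlimiting, which mean subtraction alone does not address.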
Bibliographic reference. Chang, Jane / Zue, Victor W. (1994): "A study of speech recognition system robustness to microphone variations: experiments in phonetic classification", In ICSLP-1994, 995-998.