5th International Conference on Spoken Language Processing
This study presents a new approach to robust speech activity detection (SAD). Our framework is based on HMM recognition of speech versus silence. Speech is modeled as one of fourteen large phone classes, whereas silence is represented by a separate model. Individual test utterances are concatenated to simulate read continuous speech for testing. The HMM-based algorithm is compared with an energy-based and a speech-enhancement-based SAD algorithm on clean speech and at 5 dB and 0 dB SNR under white Gaussian noise (WGN), aircraft cockpit noise (AIR), and automobile highway noise (HWY). Our algorithm yields lower frame error rates than the other two methods, especially for HWY noise. Unlike other studies, we evaluate our algorithm on the core test set of the standard TIMIT database; hence, the results can serve as benchmarks for evaluating future systems.
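The evaluation setup described in the abstract (noise added at a fixed SNR, frame-level speech/silence decisions, frame error rate) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the non-overlapping frames, the fixed energy threshold, and the choice to measure SNR over the whole utterance are all assumptions made for illustration, and the detector shown is only a generic energy-based baseline, not the HMM-based method.

```python
import math
import random


def scale_noise_to_snr(speech, noise, snr_db):
    """Scale `noise` so that the speech-to-noise power ratio equals `snr_db`.

    Assumption: power is averaged over the whole utterance, silence included.
    """
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [gain * n for n in noise]


def frame_energies(signal, frame_len):
    """Mean energy of each non-overlapping frame (trailing partial frame dropped)."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def energy_sad(signal, frame_len, threshold):
    """Hypothetical energy-based baseline: a frame is speech if its energy exceeds `threshold`."""
    return [e > threshold for e in frame_energies(signal, frame_len)]


def frame_error_rate(predicted, reference):
    """Fraction of frames whose speech/silence label differs from the reference."""
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")
    errors = sum(p != r for p, r in zip(predicted, reference))
    return errors / len(reference)


# Synthetic stand-in for an utterance: silence, a sine burst, silence.
FRAME_LEN = 100
clean = [0.0] * 400 + [math.sin(2 * math.pi * 0.05 * n) for n in range(400)] + [0.0] * 400
reference = [False] * 4 + [True] * 4 + [False] * 4

# White Gaussian noise mixed in at 5 dB SNR (seeded for repeatability).
rng = random.Random(0)
noise = scale_noise_to_snr(clean, [rng.gauss(0.0, 1.0) for _ in clean], snr_db=5)
noisy = [s + n for s, n in zip(clean, noise)]

clean_fer = frame_error_rate(energy_sad(clean, FRAME_LEN, 0.1), reference)
noisy_fer = frame_error_rate(energy_sad(noisy, FRAME_LEN, 0.1), reference)
```

Comparing `clean_fer` with `noisy_fer` at several SNR values mirrors, in miniature, how competing detectors would be ranked by frame error rate; a real evaluation would use reference labels derived from the corpus transcriptions rather than a synthetic signal.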
Bibliographic reference. Sarikaya, Ruhi / Hansen, John H. L. (1998): "Robust speech activity detection in the presence of noise", In ICSLP-1998, paper 0922.