ISCA Archive ISCSLP 2002

Robust speech detection with heteroscedastic discriminant analysis applied to the time-frequency energy

Ye Tian, Zuoying Wang, Dajin Lu

In this paper, we propose a robust speech detection algorithm that applies Heteroscedastic Discriminant Analysis (HDA) to the Time-Frequency Energy (TFE). The TFE consists of the log energy in the time domain, the log energy in the fixed band 250-3500 Hz, and the log energies of the Mel-scale frequency bands. A bottom-up algorithm with automatic threshold adjustment is used for accurate word boundary detection. Compared to algorithms based on the energy in the time domain [1], the ATF parameter [2], and the energy combined with the LDA-MFCC parameter [3], the proposed algorithm shows better performance under different types of noise.
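To make the three TFE components concrete, the following is a minimal sketch of how such a feature vector might be computed for one frame. The abstract does not give the frame length, sampling rate, number of Mel bands, filter shape, or Mel formula, so `sample_rate=8000`, `n_mel=8`, the triangular filters, the common `2595 * log10(1 + f/700)` Mel mapping, and the function names are all assumptions made for illustration. The HDA projection and the bottom-up boundary search are not shown.

```python
import cmath
import math


def hz_to_mel(f):
    # Common Mel mapping; assumed here, not stated in the abstract.
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for a short demo frame)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def tfe_features(frame, sample_rate=8000, n_mel=8, floor=1e-10):
    """Return [log time-domain energy, log 250-3500 Hz energy, log Mel band energies...]."""
    n = len(frame)
    spec = power_spectrum(frame)
    freqs = [k * sample_rate / n for k in range(len(spec))]

    # 1) Log energy in the time domain.
    log_e_time = math.log(sum(x * x for x in frame) + floor)

    # 2) Log energy in the fixed band 250-3500 Hz.
    band = sum(p for f, p in zip(freqs, spec) if 250.0 <= f <= 3500.0)
    log_e_band = math.log(band + floor)

    # 3) Log energies of Mel-spaced bands (triangular filters, an assumption).
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_mel + 1)) for i in range(n_mel + 2)]
    log_mel = []
    for b in range(n_mel):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        e = 0.0
        for f, p in zip(freqs, spec):
            if lo < f <= mid:
                e += p * (f - lo) / (mid - lo)
            elif mid < f < hi:
                e += p * (hi - f) / (hi - mid)
        log_mel.append(math.log(e + floor))

    return [log_e_time, log_e_band] + log_mel


def sine_frame(freq_hz, n=256, sample_rate=8000):
    # Hypothetical helper that generates a test tone.
    return [math.sin(2.0 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


if __name__ == "__main__":
    feats = tfe_features(sine_frame(1000.0))
    print(len(feats), [round(v, 2) for v in feats])
```

In this sketch a 1 kHz tone falls inside the 250-3500 Hz band, so its band energy is far larger than that of a 100 Hz tone of the same amplitude. That is the kind of contrast a fixed-band energy term is meant to capture against low-frequency noise.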

References

[1] L. F. Lamel, L. R. Rabiner, A. E. Rosenberg, and J. G. Wilson, "An improved endpoint detector for isolated word recognition," IEEE Trans. Acoustics, Speech and Signal Processing, vol. 29, pp. 777-785, Aug. 1981.
[2] G. D. Wu and C. T. Lin, "Speech detection with mel-Scale frequency bank in noisy environment," IEEE Trans. Speech and Audio Processing, vol. 8, pp. 541-554, Sep. 2000.
[3] A. Martin, D. Charlet, and L. Mauuary, "Robust speech/non-speech detection using LDA applied to MFCC," Proceedings of ICASSP'2001, vol. 1, pp. 237-240, 2001.


Cite as: Tian, Y., Wang, Z., Lu, D. (2002) Robust speech detection with heteroscedastic discriminant analysis applied to the time-frequency energy. Proc. International Symposium on Chinese Spoken Language Processing, paper 88

@inproceedings{tian02_iscslp,
  author={Ye Tian and Zuoying Wang and Dajin Lu},
  title={{Robust speech detection with heteroscedastic discriminant analysis applied to the time-frequency energy}},
  year=2002,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={paper 88}
}