International Symposium on Chinese Spoken Language Processing
August 23-24, 2002
Robust Speech Detection with Heteroscedastic Discriminant Analysis Applied to the Time-Frequency Energy
Ye Tian, Zuoying Wang, Dajin Lu
Tsinghua University, Beijing, China
In this paper, we propose a robust speech detection algorithm
with Heteroscedastic Discriminant Analysis (HDA) applied to the
Time-Frequency Energy (TFE). The TFE consists of the log energy
in the time domain, the log energy in the fixed band 250-3500 Hz,
and the log energies of the Mel-scale frequency bands. A bottom-up
algorithm with automatic threshold adjustment is used for
accurate word boundary detection. Compared with the algorithms
based on the energy in the time domain, the ATF parameter,
the energy and the LDA-MFCC parameter, the proposed
algorithm shows better performance under different types
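
The three TFE components named in the abstract can be illustrated with a minimal per-frame sketch. This is not the authors' implementation: the abstract fixes only the 250-3500 Hz band, so the Hamming window, the triangular Mel filters, the Mel-scale formula, the number of Mel bands (`n_bands=8`), and all function names below are assumptions made for illustration. The HDA projection and the threshold-based boundary search are not shown.

```python
import numpy as np

def hz_to_mel(f):
    # Common Mel-scale formula (an assumption; the abstract does not specify one).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_bands, n_fft, sample_rate):
    """Triangular filters evenly spaced on the Mel scale, shape (n_bands, n_fft // 2 + 1)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_bands + 2))
    bank = np.zeros((n_bands, freqs.size))
    for i in range(n_bands):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def tfe_frame(frame, sample_rate, n_bands=8, eps=1e-10):
    """Return [log time-domain energy, log 250-3500 Hz energy, log Mel band energies]."""
    frame = np.asarray(frame, dtype=float)
    n_fft = frame.size
    # 1) Log energy computed directly from the time-domain samples.
    log_e_time = np.log(np.sum(frame ** 2) + eps)
    # Power spectrum of the windowed frame (window choice is an assumption).
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    # 2) Log energy restricted to the fixed 250-3500 Hz band.
    in_band = (freqs >= 250.0) & (freqs <= 3500.0)
    log_e_band = np.log(np.sum(power[in_band]) + eps)
    # 3) Log energy in each Mel-scale band.
    log_e_mel = np.log(mel_filterbank(n_bands, n_fft, sample_rate) @ power + eps)
    return np.concatenate(([log_e_time, log_e_band], log_e_mel))
```

With these assumed settings, a 256-sample frame at 8 kHz yields a 10-dimensional vector (2 + 8 Mel bands); a discriminant transform such as HDA would then be applied to vectors of this kind.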
- L. F. Lamel, L. R. Rabiner, A. E. Rosenberg, and J. G. Wilson,
"An improved endpoint detector for isolated word recognition,"
IEEE Trans. Acoustics, Speech, and Signal Processing, v29,
pp. 777-785, Aug. 1981.
- G. D. Wu and C. T. Lin, "Speech detection with mel-scale
frequency bank in noisy environment," IEEE Trans.
Speech and Audio Processing, v8, pp. 541-554, Sep. 2000.
- A. Martin, D. Charlet, and L. Mauuary,
"Robust speech/non-speech detection using LDA applied to MFCC,"
Proceedings of ICASSP'2001, v1, pp. 237-240, 2001.
TIAN, Ye / WANG, Zuoying / LU, Dajin (2002):
"Robust speech detection with heteroscedastic discriminant analysis applied to the time-frequency energy",
In ISCSLP 2002, paper 88.