Every speech recognition system contains a speech/non-speech detection stage, and only the segments detected as speech are passed on to the recognition stage. In a noisy environment, non-speech segments can be an important source of recognition errors. In this work, we introduce a new speech/non-speech detection system based on fractal dimension and prosodic features in addition to the commonly used MFCC features. We evaluated the system using neural network and SVM classifiers on the TIMIT speech database with an HMM-based speech recognizer. Experimental results show very good performance in speech/non-speech detection.
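
As an illustration of what a frame-level fractal dimension feature can look like, the following is a minimal sketch only. The abstract does not state which fractal dimension estimator is used; the Katz estimator here is an assumption chosen for brevity, and the names `katz_fd` and `frame_fd`, as well as the frame length and hop, are hypothetical.

```python
import math


def katz_fd(frame):
    """Katz fractal dimension of one frame (assumed estimator, not
    necessarily the paper's). Samples are treated as points (i, x_i)."""
    n = len(frame) - 1  # number of steps along the curve
    if n < 2:
        raise ValueError("frame needs at least three samples")
    # Total curve length: sum of distances between successive points.
    length = sum(math.hypot(1.0, frame[i + 1] - frame[i]) for i in range(n))
    # Extent: largest distance from the first point to any later point.
    extent = max(math.hypot(i, frame[i] - frame[0]) for i in range(1, n + 1))
    denom = math.log10(n) + math.log10(extent / length)
    if denom <= 0:
        raise ValueError("degenerate frame: Katz estimate is undefined")
    return math.log10(n) / denom


def frame_fd(signal, frame_len, hop):
    """One fractal dimension value per frame of a sampled signal."""
    last_start = len(signal) - frame_len
    return [katz_fd(signal[s:s + frame_len])
            for s in range(0, last_start + 1, hop)]


if __name__ == "__main__":
    ramp = [0.5 * i for i in range(64)]        # smooth, line-like frame
    zigzag = [(-1) ** i for i in range(64)]    # rapidly alternating frame
    print(frame_fd(ramp + zigzag, frame_len=64, hop=64))
```

A straight ramp yields a value of 1, while the alternating frame yields a larger value. In a detector of the kind described, such per-frame values would be concatenated with prosodic and MFCC features before classification; note that this estimator depends on amplitude scaling, so frames would need consistent normalization.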
Bibliographic reference. Shafiee, Soheil / Almasganj, Farshad / Jafari, Ayyoob (2008): "Speech/non-speech segments detection based on chaotic and prosodic features", In INTERSPEECH-2008, 111-114.