9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Speech/Non-Speech Segments Detection Based on Chaotic and Prosodic Features

Soheil Shafiee, Farshad Almasganj, Ayyoob Jafari

Amirkabir University of Technology, Iran

Every speech recognition system contains a speech/non-speech detection stage, and only the segments detected as speech are passed on to the recognition stage. In a noisy environment, non-speech segments can be an important source of error. In this work, we introduce a new speech/non-speech detection system based on fractal dimension and prosodic features in addition to the commonly used MFCC features. We evaluated the system's performance using neural network and SVM classifiers on the TIMIT speech database with an HMM-based speech recognizer. Experimental results show very good performance in speech/non-speech detection.
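
As an illustration of what a per-frame fractal dimension feature might look like, the following is a minimal sketch using the Katz estimator. The abstract does not state which fractal dimension estimator the authors used, so the choice of Katz, the function name `katz_fractal_dimension`, and the assumption that samples are scaled to roughly [-1, 1] with unit spacing between samples are all illustrative assumptions, not the paper's method.

```python
import math


def katz_fractal_dimension(frame):
    """Katz fractal dimension of one frame of samples (illustrative only).

    Assumption: samples are scaled to roughly [-1, 1] and the horizontal
    spacing between consecutive samples is 1. The estimator actually used
    in the paper is not given in the abstract.
    """
    if len(frame) < 3:
        raise ValueError("need at least three samples")

    # Treat the frame as a planar curve of points (index, amplitude).
    points = [(float(i), float(v)) for i, v in enumerate(frame)]

    # Total curve length: sum of distances between consecutive points.
    curve_length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))

    # Extent: largest distance from the first point to any later point.
    extent = max(math.dist(points[0], p) for p in points[1:])

    steps = len(points) - 1
    denominator = math.log10(steps) + math.log10(extent / curve_length)
    if denominator <= 0:
        # Degenerate case: the extent is no larger than the average step.
        raise ValueError("frame is too oscillatory for this estimator")

    return math.log10(steps) / denominator
```

A straight ramp gives a value of 1, while a more irregular frame gives a larger value. A feature like this would be computed per frame and combined with the prosodic and MFCC features before classification.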

Bibliographic reference.  Shafiee, Soheil / Almasganj, Farshad / Jafari, Ayyoob (2008): "Speech/non-speech segments detection based on chaotic and prosodic features", In INTERSPEECH-2008, 111-114.