EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Using Mutual Information to Design Class-Specific Phone Recognizers

Patricia Scanlon (1), Daniel P.W. Ellis (1), Richard Reilly (2)

(1) Columbia University, USA
(2) University College Dublin, Ireland

Information about the identity of subword units such as phones cannot easily be pinpointed because it is broadly distributed in time and frequency. Continuing earlier work, we use mutual information as a measure of the usefulness of individual time-frequency cells for various speech classification tasks, using the hand annotations of the TIMIT database as our ground truth. Because broad phonetic classes such as vowels and stops have very different temporal characteristics, we examine mutual information separately for each class, revealing structure that was not uncovered in earlier work. Further structure is revealed by aligning the time-frequency display of each phone at the center of its hand-marked segment, rather than averaging across all possible alignments within each segment. Based on these results, we evaluate a range of vowel classifiers over the TIMIT test set and show that selecting input features according to the mutual information criterion can provide a significant increase in classification accuracy.
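
As a rough illustration of the feature-selection idea described in the abstract, the sketch below estimates the mutual information between each time-frequency cell and the class label, then keeps the highest-scoring cells. It is not the authors' implementation: the histogram estimator, the equal-width binning, the bin count, and the names `mutual_information` and `select_cells` are all assumptions made for this example.

```python
import numpy as np

def mutual_information(feature, labels, n_bins=8):
    """Histogram estimate of I(feature; label) in bits for one cell.

    Equal-width binning with n_bins bins is an assumption of this sketch.
    """
    edges = np.histogram_bin_edges(feature, bins=n_bins)
    # Interior edges only, so bin indices fall in 0..n_bins-1.
    binned = np.digitize(feature, edges[1:-1])
    classes, label_idx = np.unique(labels, return_inverse=True)
    label_idx = label_idx.ravel()

    # Joint histogram of (feature bin, class), normalized to probabilities.
    joint = np.zeros((n_bins, len(classes)))
    np.add.at(joint, (binned, label_idx), 1)
    joint /= joint.sum()

    p_x = joint.sum(axis=1, keepdims=True)   # marginal over feature bins
    p_y = joint.sum(axis=0, keepdims=True)   # marginal over classes
    nz = joint > 0                           # skip empty cells: 0 * log 0 = 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (p_x @ p_y)[nz])))

def select_cells(patches, labels, k, n_bins=8):
    """Rank time-frequency cells by mutual information with the label.

    patches: array of shape (n_examples, n_freq, n_time), one patch per
             labeled example (hypothetical layout for this sketch).
    Returns the (n_freq, n_time) MI map and the (freq, time) indices of
    the k highest-scoring cells.
    """
    _, n_freq, n_time = patches.shape
    mi = np.array([[mutual_information(patches[:, f, t], labels, n_bins)
                    for t in range(n_time)]
                   for f in range(n_freq)])
    top = np.argsort(mi, axis=None)[::-1][:k]
    return mi, np.column_stack(np.unravel_index(top, mi.shape))
```

Under these assumptions, calling `select_cells(patches, labels, k)` on patches centered on each segment would give a per-class MI map and a candidate set of input cells for a classifier. Histogram estimates are biased when examples are few relative to the number of bins, so the bin count should be treated as a tunable choice.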

Bibliographic reference.  Scanlon, Patricia / Ellis, Daniel P.W. / Reilly, Richard (2003): "Using mutual information to design class-specific phone recognizers", In EUROSPEECH-2003, 857-860.