EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

A New Pitch Synchronous Time Domain Phoneme Recognizer Using Component Analysis and Pitch Clustering

Ramon Prieto, Jing Jiang, Chi-Ho Choi

Stanford University, USA

A new framework for time domain voiced phoneme recognition is presented. Each speech frame used for training and recognition is bounded by consecutive glottal closures. A pre-processing stage is designed and implemented to model pitch synchronous frames with Gaussian mixture models. Component analysis carried out on the data shows optimal performance with a very small number of components, requiring low computational power. We designed a new clustering technique that, using the pitch period, gives better results than other well-known clustering algorithms such as k-means.
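
As a rough illustration of the framing idea in the abstract, the following Python sketch cuts a signal into frames bounded by consecutive glottal closures and then groups the frames by pitch period (frame length in samples). It is only a minimal sketch under stated assumptions: the glottal closure instants are taken as given, the function names and the bin_width parameter are hypothetical, and the fixed-width binning is a simple stand-in, not the clustering technique proposed in the paper.

```python
from typing import Dict, List, Sequence


def pitch_synchronous_frames(signal: Sequence[float],
                             closures: Sequence[int]) -> List[List[float]]:
    """Cut one frame per pitch period: samples from one closure up to the next.

    `closures` holds sample indices of glottal closure instants, assumed to be
    already detected and sorted (detection itself is not shown here).
    """
    frames = []
    for start, end in zip(closures, closures[1:]):
        # Skip index pairs that do not describe a valid span of the signal.
        if 0 <= start < end <= len(signal):
            frames.append(list(signal[start:end]))
    return frames


def group_by_pitch_period(frames: List[List[float]],
                          bin_width: int = 10) -> Dict[int, List[List[float]]]:
    """Group frames whose lengths (pitch periods in samples) share a bin.

    Fixed-width binning is an illustrative assumption, not the paper's method.
    """
    groups: Dict[int, List[List[float]]] = {}
    for frame in frames:
        groups.setdefault(len(frame) // bin_width, []).append(frame)
    return groups


# Toy data: a synthetic signal and hypothetical glottal closure instants.
signal = [float(i % 7) for i in range(40)]
closures = [0, 8, 17, 25, 40]

frames = pitch_synchronous_frames(signal, closures)
groups = group_by_pitch_period(frames, bin_width=5)
```

Because every frame spans exactly one pitch period, frame length varies with the speaker's pitch, which is why grouping frames by period before any further modeling is a natural first step in a sketch like this.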

Bibliographic reference.  Prieto, Ramon / Jiang, Jing / Choi, Chi-Ho (2003): "A new pitch synchronous time domain phoneme recognizer using component analysis and pitch clustering", In EUROSPEECH-2003, 2481-2484.