11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Phoneme Classification and Lattice Rescoring Based on a k-NN Approach

Ladan Golipour, Douglas O'Shaughnessy

INRS-EMT, Canada

In this paper we propose a k-NN/SASH phoneme classification algorithm that competes favourably with state-of-the-art methods. We apply a similarity search algorithm (SASH) that has been used successfully for the classification of high-dimensional texts and images. Unlike that of other search algorithms, the computational time of SASH is not affected by the dimensionality of the data. We therefore generate fixed-length but high-dimensional feature vectors for phonemes from their underlying frames and those of their boundaries. The k-NN/SASH phoneme classifier is fast and efficient, and achieves a classification rate of 79.2% on the TIMIT test database. Finally, we apply this algorithm to rescore phoneme lattices generated by a GMM-HMM monophone recognizer for both context-independent and context-dependent tasks. In both cases, the k-NN/SASH classifier improves the recognition rate.
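The two ideas in the abstract, mapping a variable-length phoneme segment to a fixed-length vector and classifying it by its nearest neighbours, can be illustrated with a minimal sketch. It is not the authors' implementation. The feature layout (means of three sub-regions of the segment plus means around each boundary), the function names, and the parameters `context` and `k` are all assumptions made for illustration. An exhaustive Euclidean search stands in for SASH, which is an approximate similarity search structure and is not reproduced here.

```python
from collections import Counter
import math


def mean_vector(frames):
    """Average a non-empty list of equal-length frames component-wise."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def segment_to_vector(frames, start, end, context=1):
    """Map the phoneme segment frames[start:end] to a fixed-length vector.

    Assumed layout (not taken from the paper): the means of three roughly
    equal sub-regions of the segment, followed by the means of the frames
    within `context` frames of the left and right boundaries.
    Requires 0 <= start < end <= len(frames) and context >= 1.
    """
    seg = frames[start:end]
    n = len(seg)
    cuts = [0, n // 3, (2 * n) // 3, n]
    parts = []
    for i in range(3):
        lo = cuts[i]
        # Very short segments reuse a frame so that no sub-region is empty.
        hi = max(cuts[i + 1], lo + 1)
        parts.append(mean_vector(seg[lo:hi]))
    left = frames[max(0, start - context):start + context]
    right = frames[max(0, end - context):end + context]
    parts.append(mean_vector(left))
    parts.append(mean_vector(right))
    # Concatenate the five mean vectors: output length is 5 * frame dimension.
    return [x for part in parts for x in part]


def knn_classify(query, train_vectors, train_labels, k=3):
    """Majority vote among the k nearest training vectors.

    Exhaustive search is used here as a stand-in for SASH.
    """
    ranked = sorted(
        (math.dist(query, vec), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Every segment yields a vector of the same length regardless of its duration, so any two phonemes can be compared with a single distance computation. In a setting like the one described, each frame would typically be a cepstral feature vector, and the exhaustive search would be replaced by an approximate index once the training set is large.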


Bibliographic reference. Golipour, Ladan / O'Shaughnessy, Douglas (2010): "Phoneme classification and lattice rescoring based on a k-NN approach", in INTERSPEECH-2010, 1954-1957.