9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Phoneme Recognition Based on Hybrid Neural Networks with Inhibition/Enhancement of Distinctive Phonetic Feature (DPF) Trajectories

Mohammad Nurul Huda, Kouichi Katsurada, Tsuneo Nitta

Toyohashi University of Technology, Japan

In this paper, we introduce a novel distinctive phonetic feature (DPF) extraction method that incorporates inhibition/enhancement functionality by discriminating whether the dynamic pattern of each DPF trajectory is relevant or not. The trajectory of each DPF shows a convex pattern when the DPF is relevant and a concave pattern when it is irrelevant. The proposed algorithm enhances convex-type patterns and inhibits concave-type patterns. We implement the algorithm in a phoneme recognizer and evaluate it. The recognizer consists of two stages. The first stage extracts 45-dimensional DPF vectors from local features (LFs) of input speech using a hybrid neural network and applies an inhibition/enhancement network to obtain modified DPF patterns; the second stage orthogonalizes the DPF vectors and feeds them to an HMM-based classifier. The proposed phoneme recognizer significantly improves phoneme recognition accuracy with fewer mixture components by resolving coarticulation effects.
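The inhibition/enhancement idea can be illustrated with a minimal sketch. This is not the network described in the paper: it assumes that "convex" means a peak-like bump in a single DPF trajectory, detects it with the sign of the discrete second difference, and scales the frame value by fixed factors. The function name `inhibit_enhance` and the parameters `enhance` and `inhibit` are hypothetical and chosen only for illustration.

```python
def inhibit_enhance(trajectory, enhance=1.5, inhibit=0.5):
    """Scale frames of one DPF trajectory by local shape.

    Assumption (not from the paper): the local shape at frame t is
    judged by the discrete second difference
        x[t-1] - 2*x[t] + x[t+1].
    A negative value marks a peak-like ("convex") frame, which is
    enhanced; a positive value marks a dip-like ("concave") frame,
    which is inhibited. The first and last frames are left unchanged.
    """
    out = list(trajectory)
    for t in range(1, len(trajectory) - 1):
        curvature = trajectory[t - 1] - 2.0 * trajectory[t] + trajectory[t + 1]
        if curvature < 0:
            out[t] = trajectory[t] * enhance   # peak-like frame: enhance
        elif curvature > 0:
            out[t] = trajectory[t] * inhibit   # dip-like frame: inhibit
    return out


# Toy trajectories (hypothetical values, not data from the paper).
peak = [0.0, 0.25, 0.5, 0.25, 0.0]
dip = [1.0, 0.75, 0.5, 0.75, 1.0]

print(inhibit_enhance(peak))  # [0.0, 0.25, 0.75, 0.25, 0.0]
print(inhibit_enhance(dip))   # [1.0, 0.75, 0.25, 0.75, 1.0]
```

In this toy version only the center frame of each trajectory changes, because the flanking frames lie on straight segments and have zero second difference. In the recognizer described above, the modification is applied to 45-dimensional DPF vectors by a dedicated network, so this sketch should be read only as an intuition for the enhance/inhibit rule.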


Bibliographic reference. Huda, Mohammad Nurul / Katsurada, Kouichi / Nitta, Tsuneo (2008): "Phoneme recognition based on hybrid neural networks with inhibition/enhancement of distinctive phonetic feature (DPF) trajectories", in INTERSPEECH-2008, 1529-1532.