Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

A Neural Network Speech Recognizer Based on the Both Acoustic Steady Portions and Transitions

Seyyed Ali Seyyed Salehi

Biomedical Eng. Dept., Amirkabir Univ. of Technology and Research Center of Intelligent Signal Processing (RCISP), Tehran, Iran

Previous work on speech recognition with neural networks has often relied either on recognition through segmentation or on mapping representation trajectories to the phoneme space. In these approaches, information can be lost because of the way phoneme borders are labeled. Recent work has indicated, first, that phonetic borders and transitions have good potential to be recognized as acoustic units and, second, that recognizing fast transitions with neural networks, as fixed cues in time, yields high-performance detection and recognition of those events. This approach was implemented by recognizing basic units formed from the vowel-consonant (VC) and consonant-vowel (CV) borders in spoken Farsi (Persian). Analysis of the resulting errors revealed certain discrepancies between the theoretical linguistic view and the implementation outcome. In particular, the implementation results indicate that the CV, CVC and CVCC linguistic models for Farsi syllables do not always match the acoustic reality of the speech signal.

Bibliographic reference.  Seyyed Salehi, Seyyed Ali (2000): "A neural network speech recognizer based on the both acoustic steady portions and transitions", In ICSLP-2000, vol.2, 871-874.