Within the framework of an ANN/HMM hybrid system for phone recognition, three specialized ANNs were designed and evaluated. One detects the manner of articulation. The other two describe the speech signal in terms of place of articulation: one is used for plosive and nasal classification, the other for fricative classification. The design of these networks was inspired by acoustic-phonetic knowledge. Input parameters, ANN topology, and desired output representation were optimized for the specific task of each network. A main advantage of ANNs over statistical classifiers such as HMMs is seen in the possibility of using a large, unconstrained feature set that can be set up to contain all necessary information rather than to fulfill statistical constraints. Experiments are reported for the TIMIT database.
Cite as: Bengio, Y., De Mori, R., Flammia, G., Kompe, R. (1991) Phonetically motivated acoustic parameters for continuous speech recognition using artificial neural networks. Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991), 551-554, doi: 10.21437/Eurospeech.1991-137
@inproceedings{bengio91_eurospeech,
  author={Yoshua Bengio and Renato De Mori and Giovanni Flammia and Ralf Kompe},
  title={{Phonetically motivated acoustic parameters for continuous speech recognition using artificial neural networks}},
  year=1991,
  booktitle={Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991)},
  pages={551--554},
  doi={10.21437/Eurospeech.1991-137}
}