ISCA Archive Eurospeech 1999

Prosodic modeling of Mandarin speech and its application to lexical decoding

Wern-Jun Wang, Yuan-Fu Liao, Sin-Horng Chen

This paper proposes a new RNN-based prosodic modeling method for Mandarin speech recognition. The method operates as a post-processing stage after acoustic decoding and detects word boundaries to assist lexical decoding. It employs a simple RNN to learn the relationship between input prosodic features, extracted from the input utterance using the syllable boundaries provided by the preceding acoustic decoding, and output information related to word boundaries. The method was evaluated in simulations on a large single-speaker database. Experimental results showed that 71.9% of word tags and 95.3% of punctuation mark (PM) tags were correctly detected. Incorporating the prosodic model into an HMM-based continuous Mandarin speech recognition system increased the character recognition rate from 73.6% to 74.7% while reducing computational complexity by 17%. The proposed prosodic modeling method is therefore helpful for speech recognition.
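
The mapping described in the abstract, from per-syllable prosodic features to boundary-related output tags, can be illustrated with a minimal sketch of a simple (Elman-style) recurrent network forward pass. This is not the authors' implementation: the feature set (pitch, energy, duration, following pause), the three tag labels, the layer sizes, and the random untrained weights are all assumptions chosen only to show the data flow.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SimpleRNN:
    """Elman-style RNN: the hidden state carries context across syllables."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        self.w_in = make_matrix(n_hidden, n_in, rng)       # input -> hidden
        self.w_rec = make_matrix(n_hidden, n_hidden, rng)  # hidden -> hidden
        self.w_out = make_matrix(n_out, n_hidden, rng)     # hidden -> output

    def forward(self, sequence):
        """Return one tag distribution per syllable in the sequence."""
        hidden = [0.0] * self.n_hidden
        outputs = []
        for features in sequence:
            from_input = matvec(self.w_in, features)
            from_state = matvec(self.w_rec, hidden)
            hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
            outputs.append(softmax(matvec(self.w_out, hidden)))
        return outputs


# Hypothetical tag inventory; the paper's actual tag set is not given here.
TAGS = ["intra-word", "word-boundary", "PM-boundary"]

# Hypothetical per-syllable features:
# [normalized pitch mean, normalized log energy, duration (s), following pause (s)]
utterance = [
    [0.20, 0.50, 0.18, 0.00],
    [0.10, 0.45, 0.22, 0.00],
    [-0.30, 0.30, 0.27, 0.05],
    [0.40, 0.60, 0.16, 0.00],
    [-0.50, 0.20, 0.31, 0.40],
]

rnn = SimpleRNN(n_in=4, n_hidden=8, n_out=len(TAGS))
probs = rnn.forward(utterance)
tags = [TAGS[p.index(max(p))] for p in probs]

for syllable_index, (tag, dist) in enumerate(zip(tags, probs)):
    print(syllable_index, tag, [round(x, 3) for x in dist])
```

With untrained weights the predicted tags carry no meaning; in a real system the weights would be learned from labeled speech, and the resulting tag estimates would then be used to constrain the word hypotheses considered in lexical decoding.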


doi: 10.21437/Eurospeech.1999-180

Cite as: Wang, W.-J., Liao, Y.-F., Chen, S.-H. (1999) Prosodic modeling of Mandarin speech and its application to lexical decoding. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 743-746, doi: 10.21437/Eurospeech.1999-180

@inproceedings{wang99_eurospeech,
  author={Wern-Jun Wang and Yuan-Fu Liao and Sin-Horng Chen},
  title={{Prosodic modeling of Mandarin speech and its application to lexical decoding}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={743--746},
  doi={10.21437/Eurospeech.1999-180}
}