ISCA Archive Interspeech 2013

Context-dependent modeling and speaker normalization applied to reservoir-based phone recognition

Fabian Triefenbach, Azarakhsh Jalalvand, Kris Demuynck, Jean-Pierre Martens

Reservoir Computing (RC) has recently been introduced as an interesting alternative for acoustic modeling. For phone and continuous digit recognition, the reservoir approach obtained quite promising results. In this work, we further elaborate this concept by porting to Reservoir Computing some well-known techniques used to enhance the recognition rates of GMM-based models. In particular, we introduce context-dependent (CD) triphone states to model co-articulation and pronunciation mismatches arising from an imperfect lexicon. We also propose to incorporate two speaker normalization methods in the feature space, namely mean and variance normalization and vocal tract length normalization. The impact of the investigated techniques is studied in the context of phone recognition on the TIMIT corpus. Our CD-RC-HMM hybrid yields a speaker-independent phone error rate (PER) of 22% and a speaker-dependent PER of 20.5%. By combining GMM and RC-based likelihoods at the state level, these scores can be reduced further.
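As an illustration of the first feature-space normalization named in the abstract, the following is a minimal sketch of mean and variance normalization: each feature dimension is shifted to zero mean and scaled to unit variance over a set of frames. The abstract does not say whether statistics are gathered per speaker or per utterance, nor which features are used, so the function name `mean_variance_normalize`, the `eps` guard, and the assumption that all frames belong to one speaker are illustrative choices and not the authors' implementation.

```python
from math import sqrt


def mean_variance_normalize(frames, eps=1e-8):
    """Scale each feature dimension to zero mean and unit variance.

    frames: list of equal-length feature vectors (lists of floats).
    Assumption: all frames come from the same speaker (hypothetical setup).
    eps: small constant that avoids division by zero for constant dimensions.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # Per-dimension mean over all frames.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Per-dimension (population) standard deviation over all frames.
    stds = [
        sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    # Subtract the mean and divide by the standard deviation.
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]


if __name__ == "__main__":
    toy_frames = [[1.0, 10.0], [3.0, 30.0]]
    print(mean_variance_normalize(toy_frames))
```

Computing the statistics over all frames of a speaker removes speaker-specific offsets and scale differences before the features reach the acoustic model. Vocal tract length normalization, the second method mentioned, instead warps the frequency axis and is not covered by this sketch.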


doi: 10.21437/Interspeech.2013-739

Cite as: Triefenbach, F., Jalalvand, A., Demuynck, K., Martens, J.-P. (2013) Context-dependent modeling and speaker normalization applied to reservoir-based phone recognition. Proc. Interspeech 2013, 3342-3346, doi: 10.21437/Interspeech.2013-739

@inproceedings{triefenbach13_interspeech,
  author={Fabian Triefenbach and Azarakhsh Jalalvand and Kris Demuynck and Jean-Pierre Martens},
  title={{Context-dependent modeling and speaker normalization applied to reservoir-based phone recognition}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={3342--3346},
  doi={10.21437/Interspeech.2013-739}
}