ASR Inspired Syllable Stress Detection for Pronunciation Evaluation Without Using a Supervised Classifier and Syllable Level Features

Manoj Kumar Ramanathi, Chiranjeevi Yarra, Prasanta Kumar Ghosh


Automatic syllable stress detection is typically performed with a supervised classifier that uses manually annotated stress markings and features computed within syllable segments derived from phoneme transcriptions and their time-aligned boundaries. However, manual annotation is tedious, and errors in estimating the segmental information can degrade stress detection accuracy. To circumvent these problems, we propose to estimate stress markings in an automatic speech recognition (ASR) framework involving a finite-state transducer (FST), without using annotated stress markings or segmental information. For this, we train the ASR system on native English data together with a pronunciation lexicon containing canonical stress markings, and decode non-native utterances as pronunciations embedded with stress markings. During decoding, we use an FST that encodes pronunciations derived from the phoneme transcriptions and from the instructions followed in typical manual annotation. Experiments are conducted on polysyllabic words from the ISLE corpus, which contains utterances spoken by Italian and German speakers, using ASR models trained on three corpora. Across the three models, the highest stress detection accuracies of the proposed approach on Italian and German speakers are, respectively, 2.07% and 1.19% higher than, and comparable to, those of the two supervised classification approaches used as baselines.


DOI: 10.21437/Interspeech.2019-2091

Cite as: Ramanathi, M.K., Yarra, C., Ghosh, P.K. (2019) ASR Inspired Syllable Stress Detection for Pronunciation Evaluation Without Using a Supervised Classifier and Syllable Level Features. Proc. Interspeech 2019, 924-928, DOI: 10.21437/Interspeech.2019-2091.


@inproceedings{Ramanathi2019,
  author={Manoj Kumar Ramanathi and Chiranjeevi Yarra and Prasanta Kumar Ghosh},
  title={{ASR Inspired Syllable Stress Detection for Pronunciation Evaluation Without Using a Supervised Classifier and Syllable Level Features}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={924--928},
  doi={10.21437/Interspeech.2019-2091},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2091}
}