An Improved Goodness of Pronunciation (GoP) Measure for Pronunciation Evaluation with DNN-HMM System Considering HMM Transition Probabilities

Sweekar Sudhakara, Manoj Kumar Ramanathi, Chiranjeevi Yarra, Prasanta Kumar Ghosh


Goodness of pronunciation (GoP) is typically formulated with Gaussian mixture model-hidden Markov model (GMM-HMM) based acoustic models, considering HMM state transition probabilities (STPs) and GMM likelihoods of context-dependent phonemes. Deep neural network (DNN)-HMM based acoustic models, on the other hand, employ sub-phonemic (senone) posteriors instead of GMM likelihoods, along with STPs. However, each senone is shared across many states; thus, there is no one-to-one correspondence between senones and states. To circumvent this, most existing works have proposed modifications to the GoP formulation that consider only the posteriors and neglect the STPs. In this work, we derive a GoP formulation that involves both senone posteriors and STPs. Further, we illustrate the steps to implement the proposed GoP formulation in Kaldi, a state-of-the-art automatic speech recognition toolkit. Experiments are conducted on English data collected from Indian speakers, using acoustic models trained with native English data from the LibriSpeech and Fisher-English corpora. The correlation coefficient between the GoP scores and the expert ratings improves by up to 14.89% (relative) with the proposed approach over the best of the existing formulations that do not include STPs.
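
The evaluation described above compares GoP scores against expert ratings through a correlation coefficient and reports a relative improvement. A minimal sketch of that arithmetic follows, assuming a Pearson correlation (the abstract does not name the type of coefficient). The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def relative_improvement(baseline, proposed):
    """Relative improvement of `proposed` over `baseline`, in percent."""
    return 100.0 * (proposed - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical data for illustration only (not from the paper).
    expert_ratings = [1, 2, 2, 3, 4, 5]
    baseline_scores = [-4.1, -3.0, -3.6, -2.9, -2.2, -1.5]
    proposed_scores = [-4.3, -3.4, -3.3, -2.6, -1.9, -1.0]

    r_base = pearson(baseline_scores, expert_ratings)
    r_prop = pearson(proposed_scores, expert_ratings)
    print(f"baseline r = {r_base:.3f}, proposed r = {r_prop:.3f}")
    print(f"relative improvement = {relative_improvement(r_base, r_prop):.2f}%")
```

A relative figure of this kind is measured against the baseline correlation, so it should not be read as an absolute gain in the coefficient itself.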


DOI: 10.21437/Interspeech.2019-2363

Cite as: Sudhakara, S., Ramanathi, M.K., Yarra, C., Ghosh, P.K. (2019) An Improved Goodness of Pronunciation (GoP) Measure for Pronunciation Evaluation with DNN-HMM System Considering HMM Transition Probabilities. Proc. Interspeech 2019, 954-958, DOI: 10.21437/Interspeech.2019-2363.


@inproceedings{Sudhakara2019,
  author={Sweekar Sudhakara and Manoj Kumar Ramanathi and Chiranjeevi Yarra and Prasanta Kumar Ghosh},
  title={{An Improved Goodness of Pronunciation (GoP) Measure for Pronunciation Evaluation with DNN-HMM System Considering HMM Transition Probabilities}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={954--958},
  doi={10.21437/Interspeech.2019-2363},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2363}
}