Sixth International Conference on Spoken Language Processing (ICSLP 2000)

Beijing, China
October 16-20, 2000

Improved Lexicon Formation Through Removal of Co-articulation and Acoustic Recognition Errors

Philip Hanna, Darryl Stewart, Ji Ming, F. J. Smith

School of Computer Science, The Queen’s University of Belfast, Northern Ireland, UK

Speech recognition systems increasingly need an accurate lexicon that contains the word pronunciations likely to occur within a given domain. Given the growing size of speech databases, data-driven approaches appear best suited to deriving such pronunciations. At present, however, such approaches often introduce implausible pronunciations, which increases confusability within the decoder. In this paper, we outline a novel data-driven approach that aims to improve the quality of extracted word pronunciations by removing co-articulation effects and acoustic model misclassifications from the speech data. A number of selection constraints are additionally employed to exclude improbable pronunciation alternatives. Initial experiments show that the approach provides plausible pronunciation alternatives without introducing improbable ones.



Bibliographic reference. Hanna, Philip / Stewart, Darryl / Ming, Ji / Smith, F. J. (2000): "Improved lexicon formation through removal of co-articulation and acoustic recognition errors", in ICSLP-2000, vol. 1, 50-53.