5th International Conference on Spoken Language Processing
This paper explores techniques for using pools of untranscribed data to increase the training data available to automatic speech recognition systems. It has been well established that current speech recognition technology, especially in Large Vocabulary Conversational Speech Recognition (LVCSR), is largely language independent, and that the dominant factor in performance on a given language is the amount of available training data. The paper addresses this need for more training data by presenting ways to use untranscribed acoustic data to enlarge the training set and thus improve speech recognition performance.
Bibliographic reference. Zavaliagkos, George / Siu, Man-Hung / Colthurst, Thomas / Billa, Jayadev (1998): "Using untranscribed training data to improve performance", in ICSLP-1998, paper 1007.