This paper presents recent progress in developing speech-to-text (STT) and keyword spotting (KWS) systems for the 2014 IARPA-Babel evaluation. Systems have been developed for the limited language pack condition for four of the five development languages in this program phase: Assamese, Bengali, Haitian Creole and Zulu. The systems have several novel characteristics that support rapid development of KWS systems. On the STT side, different acoustic units based on phonemic or graphemic representations are explored, and system combination is used to improve STT performance. The acoustic models are trained on only 10 hours of manually transcribed speech, complemented by unsupervised training on additional untranscribed data. Both word and subword units (morphologically decomposed units, syllables, phonemes) are used for KWS. The KWS systems operate on multiple hypotheses, either produced by consensus network decoding or found by searching word lattices. The word error rates of the individual STT systems are on the order of 50-60%, and the KWS systems obtain Maximum Term Weighted Values ranging from 30-45% over all keywords, both in-vocabulary and out-of-vocabulary (OOV). Subword units are shown to be successful at locating some of the OOV keywords, and system combination improves system performance.
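The Maximum Term Weighted Value reported above can be illustrated with a minimal sketch. It is not the authors' scoring code; it assumes the usual definition from NIST keyword search evaluations, where TWV at a threshold is one minus the keyword-averaged sum of the miss probability and a weighted false-alarm probability, and MTWV is the best TWV over all thresholds. The weight `BETA = 999.9`, the `Detection` record and the function names are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass

# Assumed weight on false alarms (value commonly used in NIST KWS scoring).
BETA = 999.9


@dataclass
class Detection:
    """Hypothetical record for one hypothesized keyword occurrence."""
    keyword: str
    score: float
    correct: bool  # True if it matches a reference occurrence


def twv(detections, n_true, speech_seconds, threshold, beta=BETA):
    """Term Weighted Value at a single decision threshold.

    n_true maps each keyword to its number of reference occurrences;
    keywords with no reference occurrence are left out of the average.
    """
    keywords = [kw for kw, n in n_true.items() if n > 0]
    if not keywords:
        raise ValueError("no keyword has a reference occurrence")
    total = 0.0
    for kw in keywords:
        kept = [d for d in detections
                if d.keyword == kw and d.score >= threshold]
        n_correct = sum(1 for d in kept if d.correct)
        n_false_alarm = len(kept) - n_correct
        p_miss = 1.0 - n_correct / n_true[kw]
        # Non-target trials are approximated as one per second of speech.
        p_fa = n_false_alarm / (speech_seconds - n_true[kw])
        total += p_miss + beta * p_fa
    return 1.0 - total / len(keywords)


def mtwv(detections, n_true, speech_seconds, beta=BETA):
    """Maximum TWV over all thresholds that change the decision set."""
    thresholds = sorted({d.score for d in detections})
    thresholds.append(float("inf"))  # rejecting everything gives TWV = 0
    return max(twv(detections, n_true, speech_seconds, t, beta)
               for t in thresholds)
```

Because of the large false-alarm weight, a single spurious detection can cancel the gain from several correct ones, which is why the decision threshold matters so much for this metric.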
Bibliographic reference. Le, Viet-Bac / Lamel, Lori / Messaoudi, Abdel / Hartmann, William / Gauvain, Jean-Luc / Woehrling, Cécile / Despres, Julien / Roy, Anindya (2014): "Developing STT and KWS systems using limited language resources", In INTERSPEECH-2014, 2484-2488.