We investigate approaches for building large vocabulary continuous speech recognition (LVCSR) systems for new languages or new domains using limited amounts of transcribed training data. In these low resource conditions, the performance of conventional LVCSR systems degrades significantly. We propose to train low resource LVCSR systems with additional sources of information, such as annotated data from other languages (German and Spanish) and various acoustic feature streams (short-term and modulation features). We train multilayer perceptrons (MLPs) on these sources of information and use Tandem features derived from the MLPs for low resource LVCSR. In our experiments, the proposed system, trained using only one hour of English conversational telephone speech (CTS), provides a relative improvement of 11% over the baseline system.
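As an illustration of the Tandem idea mentioned above, the following is a minimal sketch of the generic recipe: per-frame MLP class posteriors are log-compressed and then decorrelated with a principal component (KLT) projection so that they suit a conventional Gaussian-mixture HMM back end. It is not the authors' exact pipeline. The function name `tandem_features`, the number of retained components, and the random stand-in posteriors are assumptions made only for this example.

```python
import numpy as np

def tandem_features(posteriors, n_components, eps=1e-10):
    """Turn per-frame MLP posteriors (frames x classes) into Tandem-style features.

    Hypothetical helper: log-compress the posteriors, then decorrelate them
    with a PCA/KLT projection estimated on the same frames.
    """
    # Log compression makes the skewed posterior distribution more Gaussian-like.
    log_post = np.log(np.clip(posteriors, eps, 1.0))
    # Center each dimension before estimating the covariance.
    centered = log_post - log_post.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # Eigenvectors of the symmetric covariance give the KLT basis.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the directions with the largest variance.
    order = np.argsort(eigvals)[::-1][:n_components]
    return centered @ eigvecs[:, order]

# Stand-in data (assumption): softmax over random logits, 200 frames, 8 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(200, 8))
posteriors = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

feats = tandem_features(posteriors, n_components=4)
print(feats.shape)
```

In a real system the projection would be estimated on training data and then applied unchanged to test data, and the resulting features could be used alone or appended to conventional acoustic features.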
Bibliographic reference. Thomas, Samuel / Ganapathy, Sriram / Hermansky, Hynek (2010): "Cross-lingual and multi-stream posterior features for low resource LVCSR systems", In INTERSPEECH-2010, 877-880.