8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Narrowband to Wideband Feature Expansion for Robust Multilingual ASR

Dušan Macho

Motorola Labs, USA

Building high-quality wideband acoustic models for automatic speech recognition (ASR) requires a large amount of wideband speech training data. For a particular language, however, a lot of narrowband data may be available but only a limited amount of wideband data. This paper addresses that situation and proposes a narrowband-to-wideband expansion algorithm that expands the ASR features of a narrowband signal to wideband ASR features. The algorithm is tested in two practical situations: one with a sufficient amount of original wideband training data and one with an insufficient amount. Tests show that combining wideband features with expanded features does not harm ASR performance when a sufficient amount of original wideband data is available, and significantly improves it when only a limited amount of wideband data is originally available. In the multilingual tests presented, a single expansion model is trained for four languages from the Speecon database. Different amounts of available wideband training data are considered, including the case where no wideband data is available. ASR experiments for each language confirm that adding expanded features to wideband model training improves the models and gives better results than using the limited wideband data alone. The ETSI standard noise-robust front-end is used in all tests.

Bibliographic reference. Macho, Dušan (2007): "Narrowband to wideband feature expansion for robust multilingual ASR", In INTERSPEECH-2007, 1118-1121.