Sixth International Conference on Spoken Language Processing
We describe the implementation of a cellular-phone-based speech translation system built without a telephone-quality speech database or special CT hardware. The purpose is to quickly build a prototype service system that can be used for data collection with real users. To train the acoustic model for the speech recognition system, available high-quality databases were made usable in two ways: (1) by appropriate downsampling and filtering, and (2) by piping, similar to the NTIMIT and CTIMIT paradigms. An evaluation of acoustic models with filtered, piped, and real cellular-phone data is given. Recognition rates are at the same level as for wideband speech.
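As an illustration of the first approach (downsampling and filtering), the following is a minimal sketch and not the authors' implementation. The abstract does not state sampling rates or filter characteristics, so the 16 kHz source rate, the 8 kHz target rate, the 300-3400 Hz pass band, the filter length, and all function names here are assumptions chosen for illustration.

```python
import math


def bandpass_fir(low_hz, high_hz, rate_hz, num_taps=101):
    """Hamming-windowed sinc band-pass FIR, built as the difference of two low-pass filters."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid

        def lowpass(cutoff_hz):
            # Ideal low-pass impulse response at offset k from the filter centre.
            x = 2 * cutoff_hz / rate_hz
            return x if k == 0 else math.sin(math.pi * x * k) / (math.pi * k)

        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append((lowpass(high_hz) - lowpass(low_hz)) * window)
    return taps


def simulate_telephone_band(samples, rate_hz=16000, factor=2,
                            low_hz=300.0, high_hz=3400.0):
    """Band-limit wideband samples and keep every `factor`-th output sample.

    Assumed values: 16 kHz input, decimation by 2 (8 kHz output), and a
    300-3400 Hz pass band. The upper band edge lies below the new Nyquist
    frequency, so the same filter also serves as the anti-aliasing filter.
    """
    taps = bandpass_fir(low_hz, high_hz, rate_hz)
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for j, tap in enumerate(taps):
            idx = i - j
            if 0 <= idx < len(samples):
                acc += tap * samples[idx]
        out.append(acc)
    return out
```

Applying `simulate_telephone_band` to each wideband training utterance would yield band-limited, lower-rate data for acoustic model training. Effects of the speech codec and the radio channel are not modelled by such a filter; the second approach (piping) is what captures them.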
Bibliographic reference. Gruhn, Rainer / Singer, Harald / Tsukada, Hajime / Naito, Masaki / Nishino, Atsushi / Nakamura, Atsushi / Sagisaka, Yoshinori / Nakamura, Satoshi (2000): "Cellular-phone based speech-to-speech translation system ATR-MATRIX", In ICSLP-2000, vol.4, 448-451.