Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Cellular-Phone Based Speech-To-Speech Translation System ATR-MATRIX

Rainer Gruhn, Harald Singer (1), Hajime Tsukada (2), Masaki Naito (3), Atsushi Nishino (4), Atsushi Nakamura (5), Yoshinori Sagisaka, Satoshi Nakamura

ATR Spoken Language Translation Research Laboratories, Seika-cho, Soraku-gun, Kyoto, Japan
(1) currently at SpeechWorks International, Boston, USA; (2) currently at NTT Service Integration Labs, Tokyo, Japan; (3) currently at KDD R&D Laboratories, Saitama, Japan; (4) currently at IMG, Osaka, Japan; (5) currently at NTT Communication Science Labs, Kyoto, Japan

We describe the implementation of a cellular-phone based speech translation system built without a telephone-quality speech database or special CT hardware. The purpose is to quickly build a prototype service system that can be used for data collection with real users. To train the acoustic model for the speech recognition system, available high-quality databases were made usable by (1) appropriate downsampling and filtering, and (2) piping, similar to the NTIMIT and CTIMIT paradigms. An evaluation of acoustic models with filtered, piped, and real cellular-phone data is given. Recognition rates are at the same level as for wideband speech.
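The downsampling-and-filtering step can be illustrated with a minimal sketch. The abstract does not give the sampling rates, the filter design, or the pass band, so everything specific here is an assumption made for illustration: a 16 kHz source, an 8 kHz target, a 300-3400 Hz band, and a Hamming-windowed FIR filter. The function names (`bandpass_fir`, `fir_filter`, `to_telephone_band`) are hypothetical and are not taken from the paper.

```python
import math


def bandpass_fir(low_hz, high_hz, rate_hz, num_taps=101):
    """Design a windowed-sinc band-pass FIR filter (Hamming window).

    The ideal response is the difference of two ideal low-pass filters
    with cutoffs high_hz and low_hz. num_taps must be odd.
    """
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd")
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2.0 * (high_hz - low_hz) / rate_hz
        else:
            ideal = (math.sin(2.0 * math.pi * high_hz * k / rate_hz)
                     - math.sin(2.0 * math.pi * low_hz * k / rate_hz)) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def fir_filter(signal, taps):
    """Convolve signal with taps; output has the same length as the input.

    Samples outside the signal are treated as zero, and the output is
    shifted so the filter's group delay is compensated.
    """
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, tap in enumerate(taps):
            idx = i + half - j
            if 0 <= idx < len(signal):
                acc += tap * signal[idx]
        out.append(acc)
    return out


def to_telephone_band(signal, rate_hz=16000, low_hz=300.0, high_hz=3400.0, factor=2):
    """Band-limit a wideband signal, then keep every factor-th sample.

    All default values are assumptions for illustration only.
    Returns (downsampled_signal, new_rate_hz).
    """
    taps = bandpass_fir(low_hz, high_hz, rate_hz)
    filtered = fir_filter(signal, taps)
    return filtered[::factor], rate_hz // factor
```

Filtering before discarding samples matters: with the assumed rates, any energy above 4 kHz would otherwise alias into the narrowband signal. A call such as `to_telephone_band(samples)` on a list of 16 kHz samples returns the band-limited samples together with the new rate. Piping, by contrast, sends the recordings through an actual transmission channel and cannot be reproduced by a filter alone.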


