International Symposium on Chinese Spoken Language Processing (ISCSLP 2002)

Taipei, Taiwan
August 23-24, 2002

Improving Performance of Telephone-Based Mandarin Speech Recognition

Huayun Zhang, Bo Xu, Taiyi Huang

Chinese Academy of Sciences, Beijing, China

Because the telephone is the only ubiquitous communication device in today's world, it is the largest potential application field for speech technology. Telephony speech recognition is a core technique for such telephone-based speech applications. It is well known that the bandwidth of a telephone line is limited to 300 to 3400 Hz and that there are many inherent variations within the telephone network. These factors make speech recognition over the telephone a more difficult task than its desktop counterpart. Additionally, because of the free speaking style required by real applications and the diverse background environments, a system that performs perfectly in the laboratory may become very vulnerable in the real world. Robustness is therefore a life-and-death issue for such commercial systems. In this paper, we introduce our recent progress in improving the performance of a Mandarin telephony speech recognition system. Our improvements include a more robust and straightforward feature extraction block for telephony speech and a novel dynamic channel compensation algorithm. We then focus our discussion on the strategy for dealing with out-of-vocabulary (OOV) utterances. With all these amendments, the system's performance improves noticeably in real applications.


Bibliographic reference. Zhang, Huayun / Xu, Bo / Huang, Taiyi (2002): "Improving performance of telephone-based Mandarin speech recognition", in ISCSLP 2002, paper 71.