INTERSPEECH 2004 - ICSLP
This paper proposes a new approach to recognizing speech in mutually unintelligible spoken Chinese regionalects, based on a unified three-layer framework and a one-stage search strategy. The framework comprises (1) a unified acoustic model for all the regionalects considered; (2) a multiple-pronunciation lexicon constructed by both rule-based and data-driven approaches; and (3) a one-stage search network whose nodes represent Chinese characters with their multiple pronunciations. Unlike traditional approaches, the new approach avoids searching for intermediate, locally optimal syllable sequences or lattices. Instead, by using Chinese characters as the search nodes, it searches directly for the globally optimal character sequence. Experiments are reported on two Chinese regionalects, Taiwanese and Mandarin. The results show that the unified framework deals efficiently with the multiple pronunciations found in spoken Chinese regionalects. Compared with the traditional two-stage scheme, the new approach achieves a character error reduction rate of 34.1%. Furthermore, the new approach is shown to be more robust when dealing with a poorly uttered speech database.
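The idea of searching over characters directly, rather than first committing to a syllable sequence, can be illustrated with a toy sketch. The code below is not the authors' decoder. It assumes a fixed segmentation with one syllable per character, invented acoustic log-scores, an invented character bigram table, and a three-entry lexicon whose tone-less romanizations are illustrative only. The names `LEXICON`, `BIGRAM`, `FRAMES`, and `decode` are hypothetical.

```python
# Toy one-stage search: the Viterbi states are characters, and each character
# is scored by its best-matching pronunciation in the current segment.
# All numbers and lexicon entries below are made up for illustration.

# Hypothetical multiple-pronunciation lexicon: character -> syllables
# (Mandarin and Taiwanese readings, tone-less and illustrative only).
LEXICON = {
    "我": ["wo", "gua"],
    "你": ["ni", "li"],
    "李": ["li"],
}

# Hypothetical character bigram log-probabilities; "<s>" marks sentence start.
BIGRAM = {
    ("<s>", "我"): -0.5,
    ("<s>", "你"): -1.0,
    ("<s>", "李"): -2.0,
    ("我", "你"): -0.5,
    ("我", "李"): -3.0,
}

# Hypothetical acoustic log-scores, one dict per pre-segmented syllable slot.
FRAMES = [
    {"gua": -1.0, "wo": -3.0},
    {"li": -1.0, "ni": -2.5},
]


def decode(frames, lexicon, bigram, unseen=-10.0):
    """Return the best character string and its total log-score."""
    chars = list(lexicon)

    def emit(frame, ch):
        # A character node absorbs all of its pronunciations: take the best.
        return max(frame.get(syl, unseen) for syl in lexicon[ch])

    # Initialise with the sentence-start transition.
    best = {c: bigram.get(("<s>", c), unseen) + emit(frames[0], c) for c in chars}
    backpointers = []

    for frame in frames[1:]:
        new_best, ptr = {}, {}
        for c in chars:
            # Pick the predecessor character that maximises the path score.
            prev = max(chars, key=lambda p: best[p] + bigram.get((p, c), unseen))
            new_best[c] = best[prev] + bigram.get((prev, c), unseen) + emit(frame, c)
            ptr[c] = prev
        best = new_best
        backpointers.append(ptr)

    # Trace back from the best final character.
    last = max(chars, key=lambda c: best[c])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return "".join(reversed(path)), best[last]


if __name__ == "__main__":
    text, score = decode(FRAMES, LEXICON, BIGRAM)
    print(text, score)
```

In this toy setting the second slot is acoustically closest to "li", which is a reading of two different characters. Because the search runs over characters, the language-model score decides between them within the same pass, instead of after a syllable sequence has already been fixed.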
Bibliographic reference. Lyu, Ren-Yuan / Lyu, Dau-Cheng / Liang, Min-Siong / Wang, Min-Hong / Chiang, Yuang-Chin / Hsu, Chun-Nan (2004): "A unified framework for large vocabulary speech recognition of mutually unintelligible Chinese 'regionalects'", in INTERSPEECH-2004, 1001-1004.