September 22-25, 1997
We describe new methods for speaker-independent, continuous Mandarin speech recognition based on the IBM HMM-based continuous speech recognition system (1-3). First, we treat tones in Mandarin as attributes of certain phonemes rather than of syllables. Second, instantaneous pitch is treated as a variable in the acoustic feature vector, in the same way as cepstra or energy. Third, by designing a set of word-segmentation rules to convert continuous Chinese text into segmented text, we train an effective trigram language model (4). Applying these methods, we demonstrate a speaker-independent, very-large-vocabulary continuous Mandarin dictation system. Decoding results show that its performance is similar to the best results for US English.
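The first two ideas can be illustrated with a minimal Python sketch. It is not the system described in the abstract: the function names, the small `INITIALS` table, the tone-numbered pinyin input, and the choice to attach the tone to the final part of the syllable are all assumptions made for illustration, and the sketch assumes a pitch value is available for every frame.

```python
from typing import List, Sequence

# Hypothetical initial (onset) inventory, for illustration only.
# Two-letter initials come first so that "zh" is matched before "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s")


def split_toned_syllable(syllable: str) -> List[str]:
    """Split tone-numbered pinyin (e.g. 'zhong1') into phone-like units.

    The tone digit is attached to the final only, so the tone becomes an
    attribute of one unit rather than of the whole syllable (an assumed
    reading of "attributes of certain phonemes").
    """
    base, tone = syllable[:-1], syllable[-1]
    if not base or not tone.isdigit():
        raise ValueError(f"expected pinyin with a trailing tone digit: {syllable!r}")
    for initial in INITIALS:
        if base.startswith(initial) and len(base) > len(initial):
            return [initial, base[len(initial):] + tone]
    # Syllable without an initial: the whole syllable carries the tone.
    return [base + tone]


def append_pitch(cepstra: Sequence[Sequence[float]],
                 energy: Sequence[float],
                 pitch: Sequence[float]) -> List[List[float]]:
    """Build one feature vector per frame: cepstra, then energy, then pitch."""
    if not (len(cepstra) == len(energy) == len(pitch)):
        raise ValueError("cepstra, energy and pitch must have the same number of frames")
    return [list(c) + [e, p] for c, e, p in zip(cepstra, energy, pitch)]


units = split_toned_syllable("zhong1")
frames = append_pitch([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.6], [120.0, 125.0])
```

Here pitch is simply one more dimension of each frame, alongside cepstra and energy, and only the tone-bearing unit needs separate models for each tone.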
Bibliographic reference. Chen, C. Julian / Gopinath, Ramesh A. / Monkowski, Michael D. / Picheny, Michael A. / Shen, Katherine (1997): "New methods in continuous Mandarin speech recognition", In EUROSPEECH-1997, 1543-1546.