ISCA Archive ICSLP 1998

The effect of fundamental frequency on Mandarin speech recognition

Sharlene Liu, Sean Doyle, Allen Morris, Farzad Ehsani

ABSTRACT We study the effects of modeling tone in Mandarin speech recognition. Including the neutral tone, there are 5 tones in Mandarin, and these tones are syllable-level phenomena. A direct acoustic manifestation of tone is the fundamental frequency (f0). We report on the effect of f0 on the acoustic recognition accuracy of a Mandarin recognizer. In particular, we put f0, its first derivative (f0'), and its second derivative (f0'') in separate streams of the feature vector. Stream weights are adjusted to investigate the individual effects of f0, f0', and f0'' on recognition accuracy. Our results show that incorporating the f0 feature negatively impacted accuracy, whereas f0' increased accuracy and f0'' seemed to have no effect.
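The stream setup described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the derivatives were computed or how the stream weights enter the score, so everything here is an assumption: the derivative is a simple central difference, the weights scale per-stream log-likelihoods (a common multi-stream convention), and the `spectral` stream, the f0 values, and all log-likelihood numbers are invented for illustration.

```python
def delta(track):
    """Central-difference derivative of a per-frame track, with edge frames replicated.

    Assumption: the paper does not specify its derivative computation.
    """
    n = len(track)
    return [(track[min(i + 1, n - 1)] - track[max(i - 1, 0)]) / 2.0 for i in range(n)]


def combine_streams(stream_loglikes, stream_weights):
    """Weighted sum of per-stream log-likelihoods for one frame and one model state.

    Assumption: weights act as exponents on stream likelihoods, i.e. as
    multipliers in the log domain.
    """
    return sum(stream_weights[name] * ll for name, ll in stream_loglikes.items())


# Made-up f0 contour in Hz (not data from the paper).
f0 = [120.0, 124.0, 130.0, 138.0, 140.0]
d_f0 = delta(f0)      # stands in for f0'
dd_f0 = delta(d_f0)   # stands in for f0''

# Hypothetical per-stream log-likelihoods for a single frame.
loglikes = {"spectral": -42.0, "f0": -3.0, "d_f0": -2.0, "dd_f0": -4.0}

# Setting a weight to 0 removes that stream's influence, which is one way
# to isolate the effect of each f0-related stream.
weights = {"spectral": 1.0, "f0": 0.0, "d_f0": 1.0, "dd_f0": 0.0}
score = combine_streams(loglikes, weights)
```

With these weights only the hypothetical spectral and f0' streams contribute to `score`. Varying one weight at a time while holding the others fixed mirrors the kind of per-stream comparison the abstract describes.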


doi: 10.21437/ICSLP.1998-761

Cite as: Liu, S., Doyle, S., Morris, A., Ehsani, F. (1998) The effect of fundamental frequency on Mandarin speech recognition. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0847, doi: 10.21437/ICSLP.1998-761

@inproceedings{liu98c_icslp,
  author={Sharlene Liu and Sean Doyle and Allen Morris and Farzad Ehsani},
  title={{The effect of fundamental frequency on Mandarin speech recognition}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0847},
  doi={10.21437/ICSLP.1998-761}
}