The present study compared the duration of Mandarin tones in three types of speech contexts: isolated monosyllables, formal text-reading passages, and casual conversations. A total of 156 adult speakers were recruited. The speech materials included 44 monosyllables recorded from each of 121 participants, 18 passages read by 2 participants, and 20 conversations conducted by 33 participants. The duration pattern of the four lexical tones in the isolated monosyllables was consistent with the pattern described in previous literature. However, in the text-reading and conversational speech, the four lexical tones became much shorter and their durations tended to converge toward that of the neutral tone (i.e., tone 0). Maximum-likelihood estimation showed that the durational cue contributed to tone recognition in the isolated monosyllables. With a single speaker, average tone recognition based on duration alone reached approximately 65% correct. As the number of speakers increased (e.g., ≥ 4), tone recognition performance dropped to approximately 45% correct. In conversational speech, maximum-likelihood estimation of tones based on duration cues was only 23% correct. Tone duration therefore provided little useful information for differentiating Mandarin tonal identity in everyday situations.
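To illustrate what maximum-likelihood tone recognition from duration alone can look like, the following is a minimal sketch rather than the authors' procedure. It assumes one Gaussian duration model per tone, which the abstract does not specify; the function names and the duration values (in milliseconds) are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def fit_gaussians(samples):
    """Fit one Gaussian (mean, standard deviation) per tone.

    samples maps a tone label to a list of durations in ms.
    The Gaussian form is an assumption made for this sketch.
    """
    return {tone: (mean(d), stdev(d)) for tone, d in samples.items()}


def log_likelihood(x, mu, sigma):
    """Log density of a Gaussian with mean mu and std sigma at x."""
    return -math.log(sigma * math.sqrt(2 * math.pi)) - 0.5 * ((x - mu) / sigma) ** 2


def classify(duration, params):
    """Return the tone whose duration model gives the highest likelihood."""
    return max(params, key=lambda tone: log_likelihood(duration, *params[tone]))


# Hypothetical, well-separated training durations (ms); not from the paper.
training = {
    0: [110, 120, 130, 125, 115],
    1: [240, 250, 260, 255, 245],
    2: [270, 280, 290, 285, 275],
    3: [330, 340, 350, 345, 335],
    4: [190, 200, 210, 205, 195],
}

params = fit_gaussians(training)
print(classify(122, params))
print(classify(338, params))
```

In a sketch like this, accuracy depends on how far apart the per-tone duration distributions sit. When the distributions overlap heavily, as the abstract reports for conversational speech, the estimator has little to separate the tones with.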
Cite as: Yang, J., Zhang, Y., Li, A., Xu, L. (2017) On the Duration of Mandarin Tones. Proc. Interspeech 2017, 1407-1411, doi: 10.21437/Interspeech.2017-29
@inproceedings{yang17_interspeech,
  author={Jing Yang and Yu Zhang and Aijun Li and Li Xu},
  title={{On the Duration of Mandarin Tones}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1407--1411},
  doi={10.21437/Interspeech.2017-29}
}