Sixth International Conference on Spoken Language Processing
This paper proposes a new algorithm for Consonant/Vowel (C/V) segmentation of Mandarin speech based on fractal theory. The method searches for the transient region between the consonant and vowel parts of a Mandarin syllable, which in general is a consonant followed by a vowel. The Multiscale Fractal Dimension set (MFD) is the set of fractal dimensions computed at multiple maximum resolutions. The r-variance of an MFD (the degree of difference among all of its elements) clearly distinguishes stable phonemes from their transient region, so the algorithm can directly take the speech frame with the minimum r-variance of MFD as the C/V segmentation boundary. The method achieves 95.2% segmentation accuracy on a clean test corpus and 82.3% accuracy in a noisy environment with an SNR of 10 dB. These results indicate that the new C/V segmentation algorithm is suitable for continuous Mandarin speech recognition.
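The search described above can be sketched roughly as follows. This is only an illustration under several assumptions that the abstract does not confirm: the fractal dimension is estimated here with Higuchi's method (the paper's estimator is not stated), the "maximum resolutions" are modeled as different `kmax` values, the r-variance is approximated by the ordinary variance of the MFD elements, and the frame length, hop size, and the names `higuchi_fd`, `mfd`, and `cv_boundary` are all hypothetical.

```python
import numpy as np

def higuchi_fd(x, kmax):
    """Higuchi fractal dimension estimate (an assumed stand-in estimator)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)      # subsampled curve starting at offset m
            steps = len(idx) - 1
            if steps < 1:
                continue
            dist = np.abs(np.diff(x[idx])).sum()
            # Normalized curve length at scale k
            lengths.append(dist * (n - 1) / (steps * k) / k)
        # Small constant avoids log(0) on perfectly flat frames
        log_len.append(np.log(np.mean(lengths) + 1e-12))
        log_inv_k.append(np.log(1.0 / k))
    slope, _ = np.polyfit(log_inv_k, log_len, 1)
    return slope

def mfd(frame, kmaxes=(4, 8, 16)):
    """Fractal dimensions of one frame at several maximum scales (assumed values)."""
    return np.array([higuchi_fd(frame, k) for k in kmaxes])

def cv_boundary(signal, frame_len=256, hop=128, kmaxes=(4, 8, 16)):
    """Return the index of the frame whose MFD has minimum variance, plus all variances.

    Plain variance is used as an assumed approximation of the r-variance.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    variances = np.array([
        np.var(mfd(signal[i * hop:i * hop + frame_len], kmaxes))
        for i in range(n_frames)
    ])
    return int(np.argmin(variances)), variances
```

Given one syllable as a 1-D sample array `syllable`, `idx, variances = cv_boundary(syllable)` yields a candidate boundary frame, and `idx * hop` gives its starting sample under the assumed framing.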
Bibliographic reference. Wang, Fan / Zheng, Fang / Wu, Wenhu (2000): "A c/v segmentation method for Mandarin speech based on multiscale fractal dimension", In ICSLP-2000, vol.4, 648-651.