First International Conference on Spoken Language Processing (ICSLP 90)
In this paper, we present two techniques for the automatic voiced/unvoiced/silence classification of spoken Korean, which is essential for high-quality speech synthesis and for speech recognition systems that take advantage of acoustic-phonetic information. The database in this study consists of five sentences spoken by five male and five female speakers. Each sentence was uttered twice by each speaker in a sound-treated room. (Almost all kinds of Korean unvoiced sounds are contained in these sentences.) One classification technique is based on a neural network that uses spectral and time-domain features such as spectral slope, energy, zero-crossing rate, and the autocorrelation coefficient at unit sample delay. The other adopts a conventional pattern classification technique and uses almost the same features. Both methods achieve a final classification accuracy of 96.2%. Finally, the results are compared and possible future extensions are briefly discussed.
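As an illustration of the time-domain features named in the abstract, the following is a minimal sketch of how frame energy, zero-crossing rate, and the normalized autocorrelation coefficient at unit sample delay are commonly computed. It is not the authors' implementation: the function name `frame_features`, the decibel scaling, the small floor added before the logarithm, and the frame lengths in the usage lines are all assumptions made for illustration. Spectral slope is omitted because the abstract does not say how it is defined.

```python
import math


def frame_features(frame):
    """Return (log_energy_db, zero_crossing_rate, r1) for one frame of samples.

    Hypothetical helper for illustration; the paper's exact feature
    definitions are not given in the abstract.
    """
    n = len(frame)
    if n < 2:
        raise ValueError("frame must contain at least two samples")

    # Short-time energy, reported in dB with a small floor to avoid log(0).
    energy = sum(x * x for x in frame)
    log_energy_db = 10.0 * math.log10(energy / n + 1e-12)

    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zero_crossing_rate = crossings / (n - 1)

    # Autocorrelation at a lag of one sample, normalized by the energy.
    if energy > 0:
        r1 = sum(a * b for a, b in zip(frame, frame[1:])) / energy
    else:
        r1 = 0.0

    return log_energy_db, zero_crossing_rate, r1


# Slowly varying frame (voiced-like): few sign changes, r1 close to +1.
slow = ([1.0] * 20 + [-1.0] * 20) * 6
# Rapidly alternating frame (unvoiced-like): many sign changes, r1 close to -1.
fast = [1.0, -1.0] * 120
# All-zero frame (silence-like): energy at the floor.
quiet = [0.0] * 240

for name, frame in (("slow", slow), ("fast", fast), ("quiet", quiet)):
    print(name, frame_features(frame))
```

In general, voiced speech tends to show high energy, a low zero-crossing rate, and a unit-delay autocorrelation near +1, while unvoiced speech shows the opposite tendency and silence shows low energy. A classifier such as those described above would take a vector of these per-frame values as input.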
Bibliographic reference. Hahn, Hee-Il / Hahn, Minsoo (1990): "Voiced/unvoiced/silence classification of spoken Korean", In ICSLP-1990, 461-464.