Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
This paper describes a system developed to provide either the segmentation of labeled speech or, when only the orthographic transcription is available, both the labeling and the segmentation of speech. The system is designed to provide a broad correspondence between the acoustic and phonetic levels of representation without requiring great human effort in the phonetic transcription and segmentation of a speech database. The technique is based on a Hidden Markov Model (HMM) recognizer of acoustic-phonetic units; both the recognizer and the segmentation system were designed using the DARPA-TIMIT acoustic-phonetic continuous speech database of American English. Segmentation and labeling experiments were conducted under different conditions to check the reliability of the resulting system. Satisfactory results were obtained when the system was trained with some manually presegmented material. The amount of this material is an important factor, so system performance has been evaluated with respect to this parameter.
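As an illustration of the general idea of HMM-based segmentation when the label sequence is already known, the following is a minimal sketch of Viterbi forced alignment. It is not the authors' implementation: it assumes one state per phonetic unit, no transition probabilities, and a precomputed table of per-frame log-likelihoods. The function name `forced_align` and the input layout `loglik[t][u]` are hypothetical choices made for this example.

```python
import math


def forced_align(loglik, n_units):
    """Align T frames to n_units units that must occur in the given order.

    loglik[t][u] is the (assumed, precomputed) log-likelihood of frame t
    under unit u. Returns a list of (unit, start_frame, end_frame) tuples,
    with end_frame exclusive.
    """
    n_frames = len(loglik)
    if n_frames < n_units:
        raise ValueError("need at least one frame per unit")

    neg_inf = -math.inf
    score = [[neg_inf] * n_units for _ in range(n_frames)]
    back = [[0] * n_units for _ in range(n_frames)]
    score[0][0] = loglik[0][0]  # the path must start in the first unit

    for t in range(1, n_frames):
        for u in range(n_units):
            stay = score[t - 1][u]
            move = score[t - 1][u - 1] if u > 0 else neg_inf
            if move > stay:
                score[t][u] = move + loglik[t][u]
                back[t][u] = u - 1
            else:
                score[t][u] = stay + loglik[t][u]
                back[t][u] = u

    # Backtrace from the last unit at the last frame.
    path = [n_units - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()

    # Collapse the frame-level path into segments.
    segments = []
    start = 0
    for t in range(1, n_frames + 1):
        if t == n_frames or path[t] != path[t - 1]:
            segments.append((path[t - 1], start, t))
            start = t
    return segments


# Toy input: 6 frames, 3 units; each frame clearly favors one unit.
toy = [
    [0, -5, -5],
    [0, -5, -5],
    [-5, 0, -5],
    [-5, 0, -5],
    [-5, 0, -5],
    [-5, -5, 0],
]
segments = forced_align(toy, 3)
```

In this toy case the unit boundaries fall at frames 2 and 5. A real system would use multi-state unit models with trained transition and emission probabilities, and would derive the unit sequence from the orthographic transcription when no phonetic labels are given.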
Bibliographic reference. Brugnara, F. / Falavigna, D. / Omologo, Maurizio (1992): "A HMM-based system for automatic segmentation and labeling of speech", In ICSLP-1992, 803-806.