ISCA Archive SIOA 1989

Tools for phonetic labeling and phonetic assessment

C. Bourjot, A. Boyer, Dominique Fohr, Jean-Paul Haton

A semi-automatic labeling system is necessary for labeling large speech databases, whether for acoustic-phonetic studies or for the assessment of automatic phonetic decoding. Such a system takes as input the standard phonetic transcription of a sentence and the corresponding speech signal uttered by a speaker. Its output is a segmented and labeled sentence that the user may correct if necessary. We propose an algorithm for semi-automatic labeling which yields the segmentation (beginning, end, center) into phonetic units. This algorithm operates in three successive steps: (1) coarse segmentation into macro-classes and determination of the main pronunciation of the sentence based on phonological rules; (2) matching of the phoneme sequence representing the sentence against the speech signal using a dynamic time warping (DTW) technique, controlled by confusion, insertion and omission matrices obtained from a small hand-labeled corpus; (3) optional correction of the output by the user, with reference to the speech signal and its spectrogram displayed synchronously. This interactive tool was tested on multispeaker corpora. A comparison between manual and semi-automatic labeling will be provided in terms of both accuracy and time. The assessment of phonetic decoding requires large transcribed databases made up either of semi-automatically labeled corpora or of corpora with only a standard phonetic transcription. The assessment method we propose is based on two algorithms, i.e. the semi-automatic labeling technique described above and the usual dynamic time warping between the phonetic transcription of a sentence and the speech signal. The dynamic time warping algorithm gives the "best" warping path between the two sequences of units according to a given criterion, but not necessarily the correct one.
To prevent such alignment errors from being counted as errors of the acoustic-phonetic decoder, we adapt the algorithm to the system in two steps: (1) determination of the characteristics of the phonetic decoder by running the DTW on a small labeled corpus; (2) assessment of the decoder using the DTW driven by these characteristics. The methodology is compared across both types of databases in terms of algorithmic complexity and multilingual adaptability.
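
As an illustration only, the alignment of a reference phoneme sequence against a decoder output, driven by confusion, insertion and omission costs, might be sketched as follows. The abstract does not give the recurrence or the form of the matrices, so everything here is an assumption: the function name `align`, the three cost callables, and the unit costs used in the example are hypothetical stand-ins for values that would be estimated from a small hand-labeled corpus.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# One aligned pair: (reference phoneme or None, decoded phoneme or None).
Pair = Tuple[Optional[str], Optional[str]]


def align(
    ref: Sequence[str],
    hyp: Sequence[str],
    sub_cost: Callable[[str, str], float],  # confusion cost (hypothetical)
    ins_cost: Callable[[str], float],       # insertion cost (hypothetical)
    del_cost: Callable[[str], float],       # omission cost (hypothetical)
) -> Tuple[float, List[Pair]]:
    """Return the minimum total cost and one lowest-cost alignment path."""
    n, m = len(ref), len(hyp)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back: List[List[Optional[Tuple[int, int]]]] = [
        [None] * (m + 1) for _ in range(n + 1)
    ]
    cost[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            # Match or confusion: consume one symbol from each sequence.
            if i > 0 and j > 0:
                c = cost[i - 1][j - 1] + sub_cost(ref[i - 1], hyp[j - 1])
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, (i - 1, j - 1)
            # Omission: a reference phoneme has no decoded counterpart.
            if i > 0:
                c = cost[i - 1][j] + del_cost(ref[i - 1])
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, (i - 1, j)
            # Insertion: a decoded phoneme has no reference counterpart.
            if j > 0:
                c = cost[i][j - 1] + ins_cost(hyp[j - 1])
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, (i, j - 1)

    # Trace the stored predecessors back from the end to recover the path.
    path: List[Pair] = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        path.append((ref[i - 1] if pi < i else None,
                     hyp[j - 1] if pj < j else None))
        i, j = pi, pj
    path.reverse()
    return cost[n][m], path


# Example with placeholder unit costs (not values from the paper).
total, path = align(
    ["b", "o", "n"],
    ["b", "n"],
    sub_cost=lambda r, h: 0.0 if r == h else 1.0,
    ins_cost=lambda h: 1.0,
    del_cost=lambda r: 1.0,
)
print(total, path)
```

With unit costs this reduces to an ordinary edit-distance alignment. Replacing the three callables with costs derived from the decoder's observed confusions, insertions and omissions is one plausible way to make the path reflect the decoder's characteristics rather than a generic criterion.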


Cite as: Bourjot, C., Boyer, A., Fohr, D., Haton, J.-P. (1989) Tools for phonetic labeling and phonetic assessment. Proc. Speech Input/Output Assessment and Speech Databases, Vol.2, 157 (abstract)

@inproceedings{bourjot89_sioa,
  author={C. Bourjot and A. Boyer and Dominique Fohr and Jean-Paul Haton},
  title={{Tools for phonetic labeling and phonetic assessment}},
  year=1989,
  booktitle={Proc. Speech Input/Output Assessment and Speech Databases},
  pages={Vol.2, 157 (abstract)}
}