Third International Conference on Spoken Language Processing (ICSLP 94)
This paper presents a new research project that aims at a statistical survey of the pronunciation of German, using an automatic system for the segmentation and labeling of speech data and a very large database of spoken German (GermAn Spoken in Public, GASP). The project mainly involves the development of two components: a) an automatic speech verification system (PHONSEG), which uses semi-continuous HMMs to produce a segmentation corresponding to a given input string of phonetic segment labels; b) a rule corpus of German pronunciation (PHONHYP), with which various possible pronunciation forms can be derived from a citation form in order to model the variability of speech at a relatively broad segmental level. The rules express well-known processes of German that can be observed in manual transcriptions such as those contained in the German PhonDat database. The input to the system as a whole is the speech signal together with either the orthographic representation of the utterance or the citation forms of its words, given as a string of phonetic symbols. PHONSEG matches every form derived by PHONHYP against the signal and evaluates its likelihood. The output is the segmentation with the highest overall likelihood, the corresponding labels, and, for each segment, the time-normalized log likelihood.
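The interplay of the two components can be illustrated with a minimal Python sketch. It is not the PHONHYP rule corpus or the PHONSEG aligner: the two rewrite rules, the SAM-PA-like phone symbols, and all function names are assumptions chosen for illustration, and the `score` callable is only a stand-in for an HMM-based alignment that would return an overall log likelihood for a candidate form.

```python
# Hypothetical optional rewrite rules over phone tuples (NOT the PHONHYP rules).
# Each rule maps a phone sequence to a replacement sequence.
RULES = [
    (("@", "n"), ("n",)),  # illustrative: schwa elision before /n/
    (("t",), ()),          # illustrative: /t/ deletion, deliberately unconstrained
]


def expand_variants(citation, rules):
    """Return the citation form plus every form reachable by optionally
    applying any rule at any matching position, repeatedly."""
    start = tuple(citation)
    variants = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for pattern, replacement in rules:
            n = len(pattern)
            for i in range(len(current) - n + 1):
                if current[i:i + n] == pattern:
                    new = current[:i] + replacement + current[i + n:]
                    if new not in variants:
                        variants.add(new)
                        frontier.append(new)
    return variants


def select_best(variants, score):
    """Pick the variant with the highest overall log likelihood.
    `score` is a placeholder for aligning the variant against the signal."""
    return max(variants, key=score)


def time_normalized(segments):
    """Divide each segment's log likelihood by its duration in frames.
    `segments` holds (label, start_frame, end_frame, log_likelihood),
    with end_frame exclusive."""
    return [(label, ll / (end - start)) for label, start, end, ll in segments]


if __name__ == "__main__":
    # Citation form as a list of phone symbols (illustrative input).
    for variant in sorted(expand_variants("h a t @ n".split(), RULES)):
        print(" ".join(variant))
```

Because both example rules only shorten the phone string, the expansion terminates. A real rule set with insertions or substitutions would need an explicit guard against rules that re-create their own input.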
Bibliographic reference. Wesenick, Maria-Barbara / Schiel, Florian (1994): "Applying speech verification to a large data base of German to obtain a statistical survey about rules of pronunciation", In ICSLP-1994, 279-282.