Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

High Performance "General Purpose" Phonetic Recognition for Italian

Piero Cosi (1), John-Paul Hosom (2)

(1) Istituto di Fonetica e Dialettologia - C.N.R., Padova, Italy
(2) Center for Spoken Language Understanding (CSLU-OGI), Oregon Graduate Institute, Portland, OR, USA

This paper describes the development of a speaker-independent "general purpose" phonetic recognizer for Italian, built and implemented with the CSLU Toolkit. The recognizer uses a frame-based hybrid HMM/ANN architecture trained on context-dependent categories to account for coarticulatory variation. It recognizes 38 phonemes (not including silence or closures) and distinguishes stressed from unstressed vowels as well as open from closed vowels. Training and testing used the APASCI corpus, which contains nearly 2500 sentences read by 100 speakers; the sentences were designed to maximize the number of phonemes occurring in different contexts. At the time of writing, phoneme-level accuracy is 82.90% on the development set and 80.53% on the test set. This accuracy is well above the state of the art on a similar English-language corpus (slightly better than 70%), and it is the best performance obtained so far on the APASCI corpus.

Bibliographic reference. Cosi, Piero / Hosom, John-Paul (2000): "High performance 'general purpose' phonetic recognition for Italian", in ICSLP-2000, vol. 2, 527-530.