INTERSPEECH 2004 - ICSLP
This paper addresses the use of a data-driven approach to determine a multidialectal phone set for an automatic speech recognition system for Spanish dialects. The approach is based on a decision tree clustering algorithm that clusters contextual units across dialects. This procedure avoids both the definition of a global phonetic inventory and a prior study of the similarity of sounds. The procedure is applied to Spanish as spoken in Spain, Colombia and Venezuela. Results reveal differences between phonemes that share the same SAMPA symbol in different dialects, and similarities between phonemes that are represented by different symbols in the dialectal variants. Recognition results using this multidialectal approach outperform the monodialectal ones.
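The general idea of clustering units from several dialects with a decision tree can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a common likelihood-gain splitting criterion with one 1-D Gaussian per node, and the `Unit` fields, dialect labels, question set, threshold and all numbers are invented purely for illustration.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Unit:
    dialect: str  # hypothetical dialect label, e.g. "ES", "CO", "VE"
    phone: str    # SAMPA-like symbol
    n: int        # number of frames (toy value)
    mean: float   # 1-D feature mean (toy value)
    var: float    # 1-D feature variance (toy value)

def pooled_loglik(units):
    """Log-likelihood of all frames under a single ML-fitted Gaussian."""
    n = sum(u.n for u in units)
    mean = sum(u.n * u.mean for u in units) / n
    second = sum(u.n * (u.var + u.mean ** 2) for u in units) / n
    var = max(second - mean ** 2, 1e-6)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def split(units, questions, min_gain):
    """Recursively split on the question with the largest likelihood gain."""
    base = pooled_loglik(units)
    best = None
    for ask in questions:
        yes = [u for u in units if ask(u)]
        no = [u for u in units if not ask(u)]
        if not yes or not no:
            continue  # question does not separate this node
        gain = pooled_loglik(yes) + pooled_loglik(no) - base
        if best is None or gain > best[0]:
            best = (gain, yes, no)
    if best is None or best[0] < min_gain:
        return [units]  # leaf: one shared unit of the multidialectal set
    return split(best[1], questions, min_gain) + split(best[2], questions, min_gain)

# Invented toy statistics, not data from the paper.
units = [
    Unit("ES", "s", 100, 0.0, 1.0),
    Unit("CO", "s", 100, 0.1, 1.0),
    Unit("VE", "s", 100, 3.0, 1.0),
    Unit("VE", "h", 100, 3.1, 1.0),
]
# Questions may ask about the dialect or about the phone symbol.
questions = [
    lambda u: u.dialect == "ES",
    lambda u: u.dialect == "CO",
    lambda u: u.dialect == "VE",
    lambda u: u.phone == "s",
    lambda u: u.phone == "h",
]
clusters = split(units, questions, min_gain=5.0)
for c in clusters:
    print(sorted((u.dialect, u.phone) for u in c))
```

With these invented numbers, units that share a symbol end up in different leaves when their statistics differ, and units with different symbols are merged when their statistics are close, which mirrors the kind of outcome the abstract describes.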
Bibliographic reference. Caballero, Monica / Moreno, Asuncion / Nogueiras, Albino (2004): "Data driven multidialectal phone set for Spanish dialects", In INTERSPEECH-2004, 837-840.