ISCA Workshop on Multilingual Speech and Language Processing (MULTILING 2006), Center for Language and Speech Technology, Stellenbosch University, Stellenbosch, South Africa
In this paper, a data-driven approach to the statistical modeling of pronunciation variation is proposed. It consists of learning stochastic pronunciation rules. The proposed method jointly models different rules that define the same transformation. The Hierarchic Grouping Rule Inference (HIEGRI) algorithm is proposed to generate this model based on graphs. The HIEGRI algorithm detects the common patterns in an initial set of rules and infers more general rules for each given transformation. A rule selection strategy is used to find rules that are as general as possible without losing modeling accuracy. The learned rules are applied to generate pronunciation variants in a recognizer based on context-dependent acoustic models. The pronunciation variation modeling method is evaluated on a Spanish recognizer framework.
Bibliographic reference. Caballero, Mónica / Moreno, Asunción (2006): "Statistical modeling of pronunciation variation by hierarchical grouping rule inference", In MULTILING-2006, paper 019.