Sixth International Conference on Spoken Language Processing (ICSLP 2000)
Beijing, China
October 16-20, 2000
A Comparison of Data-Derived and Knowledge-Based Modeling of Pronunciation Variation
Mirjam Wester (1,2), Eric Fosler-Lussier (1)
(1) International Computer Science Institute, Berkeley, CA, USA
(2) A2RT, Dept. of Language and Speech, University of Nijmegen, The Netherlands
This paper focuses on modeling pronunciation variation
in two different ways: data-derived and knowledge-based.
The knowledge-based approach consists of using
phonological rules to generate variants. The data-derived
approach consists of performing phone recognition,
followed by various pruning and smoothing methods to
alleviate some of the errors in the phone recognition.
Using phonological rules led to a small improvement in
word error rate (WER), whereas a data-derived approach in
which the phone recognition output was smoothed using simple
decision trees (d-trees) prior to lexicon generation led to a
significant improvement over the baseline.
Furthermore, we found that 10% of variants generated by
the phonological rules were also found using phone
recognition, and this increased to 23% when the phone
recognition output was smoothed by using d-trees. In
addition, we propose a metric to measure confusability in
the lexicon, and we found that using this metric to prune
variants yields roughly the same improvement as the
d-tree method.
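
As a minimal sketch of the knowledge-based idea described above, the following Python code applies optional rewrite rules to a canonical pronunciation to generate variants, and then computes the percentage of those variants that also appear in a data-derived set. The rules, the phone symbols, and the example words are hypothetical illustrations, not the rules or data used in the paper, and the function names are assumptions introduced here.

```python
# Hypothetical optional rules as (pattern, replacement) over phone tuples.
# These are illustrative only and are not taken from the paper.
RULES = [
    (("@", "n"), ("@",)),  # assumed /n/-deletion after schwa
    (("t",), ()),          # assumed /t/-deletion, context ignored for brevity
]


def rewrite_once(pron, pattern, replacement):
    """Yield each pronunciation obtained by applying one rule at one position."""
    n = len(pattern)
    for i in range(len(pron) - n + 1):
        if pron[i:i + n] == pattern:
            yield pron[:i] + replacement + pron[i + n:]


def generate_variants(canonical, rules):
    """Return the canonical form plus every variant reachable by optional rules.

    Assumes no rule lengthens the pronunciation, so the search terminates.
    """
    variants = {canonical}
    frontier = [canonical]
    while frontier:
        pron = frontier.pop()
        for pattern, replacement in rules:
            for new in rewrite_once(pron, pattern, replacement):
                if new not in variants:
                    variants.add(new)
                    frontier.append(new)
    return variants


def overlap_percentage(rule_variants, data_variants):
    """Percentage of rule-generated variants also found in the data-derived set."""
    if not rule_variants:
        return 0.0
    return 100.0 * len(rule_variants & data_variants) / len(rule_variants)


if __name__ == "__main__":
    canonical = ("p", "o", "s", "t", "@", "n")  # hypothetical lexicon entry
    rule_based = generate_variants(canonical, RULES)
    # Hypothetical variants as they might come out of phone recognition.
    data_derived = {("p", "o", "s", "@", "n"), ("p", "o", "z", "t", "@")}
    print(sorted(rule_based))
    print(overlap_percentage(rule_based, data_derived))
```

The overlap figure here is computed over the rule-generated set, which is one plausible reading of the percentages quoted in the abstract; the paper itself may define the comparison differently.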
Bibliographic reference.
Wester, Mirjam / Fosler-Lussier, Eric (2000):
"A comparison of data-derived and knowledge-based modeling of pronunciation variation",
In ICSLP-2000, vol.1, 270-273.