This paper reports on problems encountered in building LVCSR (Large Vocabulary Continuous Speech Recognition) corpora, based on an internal evaluation of the Polish database JURISDIC. The initial assumptions are discussed together with technical matters concerning the realization of the database and the annotation results. Providing rich database statistics, especially for the linguistic description, was considered crucial both for database evaluation and for the implementation of linguistic factors in acoustic models for speech recognition. The assumed principles for database construction are: low redundancy; acoustic-phonetic variability adequate for the dictation task; representativeness; and a balanced, heterogeneous structure enabling separate or combined modeling of phonetic-acoustic structures.
Cite as: Klessa, K., Demenko, G. (2009) Structure and annotation of Polish LVCSR speech database. Proc. Interspeech 2009, 1815-1818, doi: 10.21437/Interspeech.2009-529
@inproceedings{klessa09_interspeech,
  author={Katarzyna Klessa and Grażyna Demenko},
  title={{Structure and annotation of Polish LVCSR speech database}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1815--1818},
  doi={10.21437/Interspeech.2009-529}
}