ISCA Archive SLTU 2014

The development of new corpora for under-resourced languages using data available for well-resourced ones

Pavel Skrelin, Nina Volskaya, Karina Evgrafova, Riikka Ullakonoja

In this paper we propose to exploit existing corpora of well-resourced languages as a basis for developing similar corpora of under-resourced ones. Constructing corpora of this type will make it possible to find common patterns in the acoustic manifestation of similar functional states regardless of the language. Analysis of these corpora will also make it possible to investigate universal and language-specific features reflected in speech. Two pilot experiments that may contribute to the proposed strategy are presented.

Index Terms: under-resourced languages, parallel speech corpora, acoustics, intonation


Cite as: Skrelin, P., Volskaya, N., Evgrafova, K., Ullakonoja, R. (2014) The development of new corpora for under-resourced languages using data available for well-resourced ones. Proc. 4th Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2014), 243-246

@inproceedings{skrelin14_sltu,
  author={Pavel Skrelin and Nina Volskaya and Karina Evgrafova and Riikka Ullakonoja},
  title={{The development of new corpora for under-resourced languages using data available for well-resourced ones}},
  year=2014,
  booktitle={Proc. 4th Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2014)},
  pages={243--246}
}