12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

GorUp: An Ontology-Driven Audio Information Retrieval System that Suits the Requirements of Under-Resourced Languages

N. Barroso (1), K. López de Ipiña (2), A. Ezeiza (2), C. Hernández (2), N. Ezeiza (2), O. Barroso (1), U. Susperregi (1), S. Barroso (3)

(1) Irunweb, Spain
(2) Universidad del País Vasco, Spain
(3) Insima Teknologia, Spain

GorUp is an information retrieval system that provides information about the content of audio broadcast news in Basque, Spanish, and French. Because very few resources were available for Basque in general, and for this task in particular, data optimization methodologies had to be applied in several phases of development. Moreover, the agglutinative nature of Basque required the use of morphemes and other sub-word units. Keyword spotting and semantic methods were also applied so that information is retrieved properly. Most of the methods employed in this project could suit the requirements of many under-resourced languages, the ontology-based approach among them. This paper presents the system as a whole for Basque and emphasizes the techniques used to enhance it with a semantic ontology.


Bibliographic reference. Barroso, N. / López de Ipiña, K. / Ezeiza, A. / Hernández, C. / Ezeiza, N. / Barroso, O. / Susperregi, U. / Barroso, S. (2011): "GorUp: an ontology-driven audio information retrieval system that suits the requirements of under-resourced languages", in INTERSPEECH-2011, 3173-3176.