10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

SpLaSH (Spoken Language Search Hawk): Integrating Time-Aligned with Text-Aligned Annotations

Sara Romano, Elvio Cecere, Francesco Cutugno

Università di Napoli Federico II, Italy

In this work we present SpLaSH (Spoken Language Search Hawk), a toolkit for performing complex queries on spoken language corpora. SpLaSH provides tools for integrating time-aligned annotations (TMA), represented as annotation graphs, with text-aligned annotations (TXA), represented as generic XML files. SpLaSH imposes very few constraints on the data model design, so annotations developed separately, with no dependencies on one another, can be integrated within the same dataset. It also provides a GUI supporting three types of query: simple queries on TXA or TMA structures, sequence queries on TMA structures, and cross queries on the integrated TXA and TMA structures.
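As a rough illustration of the three query types, the following Python sketch models TMA data as a list of labelled arcs between time anchors and TXA data as a small XML document. It is not SpLaSH's actual data model or API: the `Arc` class, the tier names, the `token` and `pos` XML names, the sample data, and all three query functions are hypothetical. In particular, the sketch assumes that the i-th XML token corresponds to the i-th word arc, whereas the abstract does not say how SpLaSH links the two structures.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One annotation-graph arc: a labelled span between two time anchors."""
    start: float
    end: float
    tier: str
    label: str


# Hypothetical time-aligned (TMA) data: a word tier and a phone tier.
TMA = [
    Arc(0.00, 0.42, "word", "good"),
    Arc(0.42, 0.90, "word", "morning"),
    Arc(0.00, 0.15, "phone", "g"),
    Arc(0.15, 0.30, "phone", "U"),
    Arc(0.30, 0.42, "phone", "d"),
]

# Hypothetical text-aligned (TXA) data: tokens with no timing information.
TXA_XML = """
<text>
  <token id="t1" pos="ADJ">good</token>
  <token id="t2" pos="NOUN">morning</token>
</text>
"""


def simple_query(arcs, tier, label=None):
    """Simple query: arcs on one tier, optionally filtered by label."""
    return [a for a in arcs
            if a.tier == tier and (label is None or a.label == label)]


def sequence_query(arcs, tier, labels):
    """Sequence query: runs of consecutive arcs on a tier matching `labels`."""
    ordered = sorted(simple_query(arcs, tier), key=lambda a: a.start)
    n = len(labels)
    return [ordered[i:i + n]
            for i in range(len(ordered) - n + 1)
            if [a.label for a in ordered[i:i + n]] == list(labels)]


def cross_query(arcs, xml_text, pos):
    """Cross query: phones spanned by words whose TXA token has tag `pos`."""
    tokens = list(ET.fromstring(xml_text).iter("token"))
    words = sorted(simple_query(arcs, "word"), key=lambda a: a.start)
    phones = sorted(simple_query(arcs, "phone"), key=lambda a: a.start)
    results = []
    # Assumption: the i-th token corresponds to the i-th word arc.
    for token, word in zip(tokens, words):
        if token.get("pos") == pos:
            inside = [p.label for p in phones
                      if word.start <= p.start and p.end <= word.end]
            results.append((token.text, inside))
    return results
```

Here `simple_query(TMA, "word", "morning")` touches only the time-aligned structure, `sequence_query(TMA, "phone", ["U", "d"])` looks for adjacent phones, and `cross_query(TMA, TXA_XML, "ADJ")` selects on an XML attribute and answers with time-aligned material, which is the kind of question that needs both structures integrated.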


Bibliographic reference. Romano, Sara / Cecere, Elvio / Cutugno, Francesco (2009): "SpLaSH (Spoken Language Search Hawk): integrating time-aligned with text-aligned annotations", in INTERSPEECH-2009, 1487-1490.