8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

How to Access Audio Files of Large Data Bases Using In-car Speech Dialogue Systems

Sandra Mann, André Berton, Ute Ehrlich

DaimlerChrysler AG, Germany

A number of in-car speech interfaces that handle large vocabularies are available today. We propose an approach that gives uniform access to audio data stored on different media carriers and in various formats. This uniformity is achieved by retrieving audio data via metadata. Each audio file is annotated with machine-readable information in several categories (e.g. title, artist, genre). When searching for particular audio data, the user may pre-select one of these categories, thus restricting the search space. The categories are the same in the metadata of all connected media carriers. The user may directly address the contents of the categories by means of speakable text entries (text enrolments), irrespective of the media carrier or format. Alternatively, the user may search globally across all categories by speaking the complete name of a title, album, artist, genre or year, without having to navigate through complex hierarchies and long result lists. Uncertainties, which are very likely to occur given the amount of data that needs to be active, are resolved by the system.
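
The two search modes described in the abstract (category-restricted and global) can be illustrated with a minimal sketch. It is not the authors' implementation: the `AudioFile` record, the `search` and `matched_categories` helpers, the carrier labels and the sample entries are all invented for illustration, and the recognised utterance is assumed to arrive as plain text. The sketch only shows why a global query is more likely to produce competing matches than a pre-selected category; how the real system resolves such uncertainties is not specified here.

```python
from dataclasses import dataclass

# Categories shared by the metadata of every media carrier (taken from the abstract).
CATEGORIES = ("title", "album", "artist", "genre", "year")


@dataclass(frozen=True)
class AudioFile:
    """Hypothetical uniform metadata record; field names are assumptions."""
    carrier: str  # e.g. "CD", "USB", "SD card" (illustrative labels)
    title: str
    album: str
    artist: str
    genre: str
    year: str


def search(library, utterance, category=None):
    """Return (category, file) pairs whose metadata entry equals the utterance.

    With `category` set, only that category is searched (pre-selection).
    With `category=None`, all categories are searched (global search).
    """
    wanted = utterance.strip().lower()
    fields = (category,) if category else CATEGORIES
    hits = []
    for audio in library:
        for field in fields:
            if getattr(audio, field).lower() == wanted:
                hits.append((field, audio))
    return hits


def matched_categories(hits):
    """Categories in which the utterance matched; more than one means the
    system would have to ask the user which reading was intended."""
    return sorted({field for field, _ in hits})


# Invented sample data spanning three carriers.
LIBRARY = [
    AudioFile("CD", "Harbour Lights", "Night Drive", "The Example Band", "Rock", "2001"),
    AudioFile("USB", "Night Drive", "Night Drive", "The Example Band", "Rock", "2001"),
    AudioFile("SD card", "Slow Tide", "Open Water", "Sample Quartet", "Jazz", "2004"),
]

if __name__ == "__main__":
    print(matched_categories(search(LIBRARY, "Night Drive")))
    print([f.carrier for _, f in search(LIBRARY, "Night Drive", category="title")])
```

In this sketch, the global query for "Night Drive" matches both an album and a title, whereas pre-selecting the title category narrows the result to a single file regardless of the carrier it is stored on.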


Bibliographic reference. Mann, Sandra / Berton, André / Ehrlich, Ute (2007): "How to access audio files of large data bases using in-car speech dialogue systems", In INTERSPEECH-2007, 138-141.