In this paper, we describe a novel audio database recorded in home environments. The database contains continuous sound recorded from morning to evening, regardless of what the subject is doing, and includes some utterances intended to invoke speech recognition. It tells us how often the speech interface is used, how it is activated erroneously when it has not been called, and how people speak when they really want to use speech recognition. The database also features parallel recording with microphone arrays, which is expected to improve the performance of speech/non-speech detection and speech recognition under noisy conditions. Preliminary experiments show that the speech/non-speech detection performance of the trigger-initiated activation system is relatively high, but that of the automatic activation system is not satisfactory. Adopting array-based and F0-based detection algorithms raises the precision/recall curve slightly, but more research is necessary to realize a life with ubiquitous speech interfaces for home appliances, in which machines are always listening to you.
Bibliographic reference. Obuchi, Yasunari / Amano, Akio (2007): "Always listening to you: creating exhaustive audio database in home environments", In INTERSPEECH-2007, 566-569.