In this paper, we investigate multiple approaches to automatically detecting intoxicated speakers from samples of their speech. Intoxicated speech in a given language can be viewed simply as a different accent of that language; we therefore adapt our recent approach to dialect and accent recognition to detect intoxication. The system models phonetic structural differences between sober and intoxicated speakers. This approach employs a support vector machine (SVM) with a kernel function that computes similarities between adapted phone Gaussian mixture models (GMMs), which summarize speakers' phonetic characteristics across their utterances. We also investigate additional cues, such as prosodic events, phonotactics, and phonetic durations under intoxicated and sober conditions. We find that our phonetic-based system, when combined with phonotactic features, yields our best result on the official development set: an accuracy of 73% and an equal error rate of 26.3%, significantly better than the official baseline.
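The abstract's core technique, an SVM kernel over MAP-adapted phone GMMs, can be illustrated with a minimal sketch. This is not the authors' implementation: the UBM values, the adaptation relevance factor, and the supervector-style linear kernel below are illustrative assumptions in the spirit of standard GMM-supervector methods, simplified to diagonal covariances and mean-only adaptation.

```python
import math

def map_adapt_means(ubm, frames, r=16.0):
    """Simplified MAP adaptation: shift each UBM component mean toward the
    data it is responsible for. ubm is a list of (weight, means, variances)
    for diagonal-covariance Gaussians; r is a hypothetical relevance factor."""
    M, D = len(ubm), len(ubm[0][1])
    n = [0.0] * M                       # soft counts per component
    acc = [[0.0] * D for _ in range(M)]  # weighted sums of frames
    for x in frames:
        # unnormalized diagonal-Gaussian likelihoods per component
        lik = [w * math.exp(-sum((xi - mi) ** 2 / (2 * vi)
                                 for xi, mi, vi in zip(x, mu, var)))
               for w, mu, var in ubm]
        total = sum(lik) or 1.0
        for m, l in enumerate(lik):
            gamma = l / total           # posterior responsibility
            n[m] += gamma
            for d in range(D):
                acc[m][d] += gamma * x[d]
    adapted = []
    for m, (w, mu, var) in enumerate(ubm):
        alpha = n[m] / (n[m] + r)       # adaptation coefficient
        new_mu = [alpha * (acc[m][d] / n[m]) + (1 - alpha) * mu[d]
                  if n[m] > 0 else mu[d] for d in range(D)]
        adapted.append((w, new_mu, var))
    return adapted

def supervector_kernel(gmm_a, gmm_b):
    """Linear kernel on weight-scaled, variance-normalized mean supervectors
    of two GMMs adapted from the same UBM."""
    k = 0.0
    for (w, mu_a, var), (_, mu_b, _) in zip(gmm_a, gmm_b):
        k += w * sum((ma / math.sqrt(v)) * (mb / math.sqrt(v))
                     for ma, mb, v in zip(mu_a, mu_b, var))
    return k

# Usage: adapt a toy 1-D, 2-component UBM to two (fabricated) utterances
# and compute their kernel similarity; the resulting kernel matrix would
# then be handed to an SVM trainer.
ubm = [(0.5, [0.0], [1.0]), (0.5, [3.0], [1.0])]
gmm_a = map_adapt_means(ubm, [[0.2], [0.1], [2.9]])
gmm_b = map_adapt_means(ubm, [[0.8], [1.1], [3.6]])
similarity = supervector_kernel(gmm_a, gmm_b)
```

In the full system, one such adapted GMM would be built per phone type per utterance, and the per-phone kernel values combined into a single SVM kernel; the sketch shows only the per-GMM similarity step.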
Bibliographic reference. Biadsy, Fadi / Wang, William Yang / Rosenberg, Andrew / Hirschberg, Julia (2011): "Intoxication detection using phonetic, phonotactic and prosodic cues", in Proc. INTERSPEECH 2011, pp. 3209-3212.