INTERSPEECH 2007
8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

A Comparison of Speaker Clustering and Speech Recognition Techniques for Air Situational Awareness

Wade Shen, Douglas Reynolds

MIT, USA

In this paper we compare speaker clustering and speech recognition techniques applied to the problem of understanding patterns of air traffic control communications. For a given radio transmission, our goal is to identify the talker and to whom he/she is speaking. This information, in combination with knowledge of the roles (e.g., takeoff, approach, hand-off, taxi) of different radio frequencies within an air traffic control region, could allow tracking of pilots through various stages of flight, thus providing the potential to monitor the airspace in great detail. Both techniques must contend with degraded audio channels and significant non-native accents. We report results from experiments using the nn-MATC database [6] showing 9.3% and 32.6% clustering error for speaker clustering and ASR methods, respectively.

Bibliographic reference.  Shen, Wade / Reynolds, Douglas (2007): "A comparison of speaker clustering and speech recognition techniques for air situational awareness", In INTERSPEECH-2007, 2421-2424.