12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Building an Audio-Visual Corpus of Australian English: Large Corpus Collection with an Economical Portable and Replicable Black Box

Denis Burnham (1), Dominique Estival (1), Steven Fazio (1), Jette Viethen (2), Felicity Cox (2), Robert Dale (2), Steve Cassidy (2), Julien Epps (3), Roberto Togneri (4), Michael Wagner (5), Yuko Kinoshita (5), Roland Göcke (5), Joanne Arciuli (6), Marc Onslow (6), Trent Lewis (7), Andrew Butcher (7), John Hajek (8)

(1) University of Western Sydney, Australia
(2) Macquarie University, Australia
(3) University of New South Wales, Australia
(4) University of Western Australia, Australia
(5) University of Canberra, Australia
(6) University of Sydney, Australia
(7) Flinders University, Australia
(8) University of Melbourne, Australia

The Big Australian Speech Corpus project incorporates the strategic goals of 30 Chief Investigators from various speech science areas. Speech from 1000 geographically and socially diverse speakers is being recorded using a uniform, automated protocol and standardized hardware and software to produce a widely applicable and extensible database, AusTalk. Here we describe the project's major components and organization, share the lessons learnt from difficulties and challenges, and present the results achieved so far.

Bibliographic reference.  Burnham, Denis / Estival, Dominique / Fazio, Steven / Viethen, Jette / Cox, Felicity / Dale, Robert / Cassidy, Steve / Epps, Julien / Togneri, Roberto / Wagner, Michael / Kinoshita, Yuko / Göcke, Roland / Arciuli, Joanne / Onslow, Marc / Lewis, Trent / Butcher, Andrew / Hajek, John (2011): "Building an audio-visual corpus of Australian English: large corpus collection with an economical portable and replicable black box", In INTERSPEECH-2011, 841-844.