Accessing Information in Spoken Audio

April 19-20, 1999
Cambridge, UK

Recognition-Compatible Speech Compression for Stored Speech

Roger Tucker (1), Tony Robinson (2), James Christie (2), and Carl Seymour (2)

(1) Hewlett Packard Laboratories, Bristol, UK
(2) Cambridge University Engineering Dept., Cambridge, UK

Two important components of a speech archiving system are the compression scheme and the search facility. We investigate two ways of providing these components. The first is to run the recogniser directly from the compressed speech: we show that even with a 2.4 kbit/s codec it is possible to produce good recognition results, but the search is slow. The second is to preprocess the speech and store the extra data in compressed form alongside the speech. In the case of an RNN-HMM hybrid system, the posterior probabilities provide a suitable intermediate data format. Vector quantizing these at just 625 bit/s lets the search run at many times real time while maintaining good recognition accuracy.
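As a rough illustration of the second approach, the sketch below vector quantizes per-frame posterior vectors against a codebook and computes the resulting storage rate. The abstract does not give the frame rate, codebook size, number of phone classes, distance measure, or codebook training method, so every value and name here (`N_CLASSES`, the 32-entry random codebook, squared Euclidean distance, `quantize`, `bit_rate`) is an assumption chosen for illustration and not the authors' configuration.

```python
import random

def quantize(frames, codebook):
    """Map each posterior vector to the index of its nearest codeword.

    Squared Euclidean distance is an assumption; the abstract does not
    say which distortion measure was used.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda k: sq_dist(f, codebook[k]))
            for f in frames]

def bit_rate(frames_per_sec, codebook_size):
    """Bits per second when each frame is stored as one fixed-length index."""
    bits_per_index = (codebook_size - 1).bit_length()
    return frames_per_sec * bits_per_index

def random_posterior(n_classes, rng):
    """Stand-in for a network output: a random vector that sums to 1."""
    raw = [rng.random() for _ in range(n_classes)]
    total = sum(raw)
    return [r / total for r in raw]

rng = random.Random(0)
N_CLASSES = 45        # hypothetical number of phone classes
CODEBOOK_SIZE = 32    # hypothetical; a real codebook would be trained
FRAMES_PER_SEC = 100  # hypothetical frame rate

codebook = [random_posterior(N_CLASSES, rng) for _ in range(CODEBOOK_SIZE)]
frames = [random_posterior(N_CLASSES, rng) for _ in range(FRAMES_PER_SEC)]

indices = quantize(frames, codebook)     # what gets stored with the speech
decoded = [codebook[i] for i in indices] # what the search would read back

print(bit_rate(FRAMES_PER_SEC, CODEBOOK_SIZE), "bits per second")
```

With a scheme of this kind, the search reads back the stored indices and works from the decoded posteriors, so the acoustic model does not have to be run again at query time. That is the reason the abstract gives for the second approach being much faster than recognising from the compressed speech.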

Bibliographic reference.  Tucker, Roger / Robinson, Tony / Christie, James / Seymour, Carl (1999): "Recognition-compatible speech compression for stored speech", In Access-Audio-1999, 69-72.