2003 ISCA Workshop on
Multilingual Spoken Document Retrieval
(MSDR2003)
Hong Kong
April 4-5, 2003
Two Robust Methods for Cantonese Spoken Document Retrieval
Pui Yu Hui, Wai Kit Lo, Helen M. Meng
Human-Computer Communications Laboratory,
Department of Systems Engineering and Engineering Management,
The Chinese University of Hong Kong, China
This paper reports on two methods aimed at achieving robustness
for Cantonese spoken document retrieval. Our experimental
corpus contains 60 hours of Cantonese television news
broadcasts with over 1600 news stories. These spoken
documents are indexed by automatic speech recognition of
Cantonese base syllables. Recognition performance degrades
significantly as we migrate from anchor speech recorded in the
studio to reporter/interviewee speech recorded in the field.
Recognition errors affect retrieval performance. We devised two
robust methods to reduce the adverse effects of speech
recognition errors on retrieval: (1) developing techniques to
automatically extract studio speech from the audio tracks and
using only these in retrieval; and (2) using N-best recognition
hypotheses for document expansion prior to retrieval. Results
indicate that (i) the best method to automatically extract studio
speech segments fuses audio-based segmentation with
video-based segmentation; (ii) using only the studio speech
segments for our known-item retrieval task may not necessarily
bring about better retrieval performance since we are discarding
approximately three quarters of the audio in our corpus; and (iii) the
use of N-best recognition hypotheses for document expansion can
bring about further improvements in retrieval performance,
attaining an average inverse rank of 0.654.
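
The average inverse rank reported above is the usual figure of merit for known-item retrieval: each query has exactly one target document, the query scores 1/rank of that target in the returned list, and the scores are averaged over all queries. A minimal sketch follows, assuming a query whose target is never retrieved contributes 0 (the abstract does not say how such cases are handled). The function name, the data layout, and the toy document IDs are illustrative and are not taken from the paper.

```python
from typing import Dict, List


def average_inverse_rank(rankings: Dict[str, List[str]],
                         targets: Dict[str, str]) -> float:
    """Mean of 1/rank of the known item over all queries.

    rankings maps a query ID to its ranked list of document IDs (best first).
    targets maps a query ID to the single known-item document ID.
    Assumption: a target missing from the ranked list scores 0.
    """
    if not targets:
        raise ValueError("at least one query is required")
    total = 0.0
    for query_id, target in targets.items():
        ranked = rankings.get(query_id, [])
        if target in ranked:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)


# Toy data (hypothetical): three queries, one target each.
rankings = {
    "q1": ["d3", "d1", "d7"],  # target d1 at rank 2
    "q2": ["d2", "d5"],        # target d2 at rank 1
    "q3": ["d9", "d4"],        # target d8 not retrieved
}
targets = {"q1": "d1", "q2": "d2", "q3": "d8"}

air = average_inverse_rank(rankings, targets)
print(f"average inverse rank = {air:.3f}")
```

On this scale a value of 1.0 means every target was ranked first, so a score such as 0.654 indicates that targets typically appear at or near the top of the list.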
Bibliographic reference.
Hui, Pui Yu / Lo, Wai Kit / Meng, Helen M. (2003):
"Two robust methods for Cantonese spoken document retrieval",
In MSDR-2003, 7-12.