The point process model (PPM) for keyword search is a whole-word parametric modeling framework based on the timing of phonetic events rather than the evolution of frame-level phonetic likelihoods. Recent progress in PPM training and decoding algorithms has yielded state-of-the-art phonetic search performance in high-resource settings, in terms of both accuracy and computational efficiency. In this paper, we consider the application of PPMs to low-resource settings, where the amount of transcribed speech is severely limited and the pronunciation dictionary is incomplete. By using (i) state-of-the-art deep neural network acoustic models to generate phonetic events and (ii) grapheme-to-phoneme conversion to generate pronunciations for out-of-vocabulary (OOV) keywords, we find that the PPM system reaches state-of-the-art OOV search performance at a small computational cost. Moreover, because the PPM and the large vocabulary continuous speech recognition (LVCSR) baseline rely on complementary methodologies, combining PPM outputs with the LVCSR baseline produces average relative improvements in actual term-weighted value (ATWV) of 7% for in-vocabulary keywords and 50% for OOV keywords (16% overall).
Bibliographic reference. Liu, Chunxi / Jansen, Aren / Chen, Guoguo / Kintzley, Keith / Trmal, Jan / Khudanpur, Sanjeev (2014): "Low-resource open vocabulary keyword search using point process models", In INTERSPEECH-2014, 2789-2793.