ISCA Archive Interspeech 2008

A methodology and tool suite for evaluation of accuracy of interoperating statistical natural language processing engines

Uma Murthy, John F. Pitrelli, Ganesh Ramaswamy, Martin Franz, Burn L. Lewis

Evaluation of the accuracy of natural language processing (NLP) engines plays an important role in their development and improvement. Such evaluation usually takes place at a per-engine level. For example, there are established evaluation methods for engines such as speech recognition, machine translation, and story boundary detection. Many real-world applications require combinations of these functions, and such combinations have become feasible now that individual NLP engines are accurate enough to be chained for complex tasks. However, it is not evident how to evaluate the accuracy of the output of such aggregates of engines. We present an evaluation methodology to address this problem. The key contribution of our work is an extensible methodology that narrows down the possible combinations of machine outputs and ground truths to be compared at various stages in an aggregate of interoperating engines. We also describe two example evaluation modules that we developed following this methodology.
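
The paper's tool suite is not reproduced on this page; the following is a minimal sketch, under assumed (hypothetical) engine and scorer interfaces, of the general idea the abstract describes: each stage in a chain of interoperating engines is paired with the ground truth and metric appropriate to that stage, so that accuracy can be assessed within the aggregate rather than only at the final output.

```python
# Minimal sketch (not the authors' tool suite): pair each stage of an
# aggregate of interoperating engines with its own ground truth and metric.
# Engine names, scorers, and interfaces here are illustrative assumptions.

from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Stage:
    name: str
    engine: Callable[[Any], Any]         # e.g. ASR, MT, story boundary detection
    ground_truth: Any                    # reference output for this stage
    metric: Callable[[Any, Any], float]  # e.g. WER, BLEU, boundary F-measure


def evaluate_aggregate(source: Any, stages: list[Stage]) -> dict[str, float]:
    """Run the engine chain on `source` and score each stage's machine output
    against that stage's ground truth, so errors can be localized."""
    scores: dict[str, float] = {}
    current = source
    for stage in stages:
        current = stage.engine(current)  # machine output at this stage
        scores[stage.name] = stage.metric(current, stage.ground_truth)
    return scores
```

This illustrates only one of the output/ground-truth pairings the methodology enumerates; comparing a downstream engine's output produced from upstream machine output versus from upstream ground truth would be another such combination.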


doi: 10.21437/Interspeech.2008-536

Cite as: Murthy, U., Pitrelli, J.F., Ramaswamy, G., Franz, M., Lewis, B.L. (2008) A methodology and tool suite for evaluation of accuracy of interoperating statistical natural language processing engines. Proc. Interspeech 2008, 2066-2069, doi: 10.21437/Interspeech.2008-536

@inproceedings{murthy08_interspeech,
  author={Uma Murthy and John F. Pitrelli and Ganesh Ramaswamy and Martin Franz and Burn L. Lewis},
  title={{A methodology and tool suite for evaluation of accuracy of interoperating statistical natural language processing engines}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2066--2069},
  doi={10.21437/Interspeech.2008-536}
}