14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Weakly Supervised Parsing with Rules

C. Cerisara (1), A. Lorenzo (1), P. Kral (2)

(1) LORIA, France
(2) University of West Bohemia, Czech Republic

This work proposes a new research direction to address the lack of structure in traditional n-gram models. It is based on a weakly supervised dependency parser that can model speech syntax without relying on any annotated training corpus. Labeled data is replaced by a few hand-crafted rules that encode basic syntactic knowledge. Bayesian inference then samples the rules, disambiguating and combining them to create complex tree structures that maximize a discriminative model's posterior on a target unlabeled corpus. This posterior encodes sparse selectional preferences between a head word and its dependents. The model is evaluated on English and Czech newspaper texts, and is then validated on French broadcast news transcriptions.
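To make the idea of sampling over hand-crafted rules more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the toy corpus, the four attachment rules, and the function names `candidate_heads` and `gibbs_parse` are all assumptions made for illustration. The paper's discriminative model is also replaced here by a simpler stand-in, a collapsed Gibbs sampler over a Dirichlet-multinomial model of dependents given heads, in which a small concentration parameter `alpha` favors sparse head-dependent preferences.

```python
import random
from collections import Counter

# Toy corpus of (word, coarse tag) pairs; purely illustrative, not from the paper.
CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("eats", "VERB"), ("meat", "NOUN"),
     ("with", "PREP"), ("teeth", "NOUN")],
    [("the", "DET"), ("cat", "NOUN"), ("eats", "VERB"), ("fish", "NOUN"),
     ("with", "PREP"), ("sauce", "NOUN")],
    [("dog", "NOUN"), ("eats", "VERB"), ("meat", "NOUN")],
    [("cat", "NOUN"), ("sees", "VERB"), ("fish", "NOUN"),
     ("with", "PREP"), ("sauce", "NOUN")],
]


def candidate_heads(sent, i):
    """Hypothetical hand-crafted rules: head positions allowed for word i (-1 = root)."""
    tags = [t for _, t in sent]
    tag = tags[i]
    heads = []
    if tag == "VERB":
        heads = [-1]  # verbs are roots in this toy grammar
    elif tag == "DET":
        # a determiner attaches to the nearest noun on its right
        heads = [j for j in range(i + 1, len(sent)) if tags[j] == "NOUN"][:1]
    elif tag == "PREP":
        # ambiguous rule: a preposition may attach to any preceding noun or verb
        heads = [j for j in range(i) if tags[j] in ("NOUN", "VERB")]
    elif tag == "NOUN":
        # a noun governed by a preposition attaches to it, otherwise to any verb
        k = i - 1
        while k >= 0 and tags[k] == "DET":
            k -= 1
        if k >= 0 and tags[k] == "PREP":
            heads = [k]
        else:
            heads = [j for j, t in enumerate(tags) if t == "VERB"]
    return heads or [-1]


def gibbs_parse(corpus, sweeps=50, alpha=0.1, seed=0):
    """Resample each word's head among rule-licensed candidates (stand-in model)."""
    rng = random.Random(seed)
    vocab_size = len({w for sent in corpus for w, _ in sent})
    pair_counts = Counter()  # (head word, dependent word) -> count
    head_counts = Counter()  # head word -> number of dependents

    def head_word(sent, h):
        return "<root>" if h < 0 else sent[h][0]

    def update(sent, i, h, delta):
        hw = head_word(sent, h)
        pair_counts[(hw, sent[i][0])] += delta
        head_counts[hw] += delta

    # Random initialization among the candidates proposed by the rules.
    heads = []
    for sent in corpus:
        hs = [rng.choice(candidate_heads(sent, i)) for i in range(len(sent))]
        for i, h in enumerate(hs):
            update(sent, i, h, +1)
        heads.append(hs)

    for _ in range(sweeps):
        for s, sent in enumerate(corpus):
            for i, (word, _) in enumerate(sent):
                update(sent, i, heads[s][i], -1)  # remove the current choice
                cands = candidate_heads(sent, i)
                # Collapsed Dirichlet-multinomial predictive for each candidate head.
                weights = [
                    (pair_counts[(head_word(sent, h), word)] + alpha)
                    / (head_counts[head_word(sent, h)] + alpha * vocab_size)
                    for h in cands
                ]
                heads[s][i] = rng.choices(cands, weights=weights)[0]
                update(sent, i, heads[s][i], +1)
    return heads


if __name__ == "__main__":
    for sent, hs in zip(CORPUS, gibbs_parse(CORPUS)):
        print([(w, "<root>" if h < 0 else sent[h][0]) for (w, _), h in zip(sent, hs)])
```

In this sketch the rules only restrict which heads are possible, and the sampler resolves the remaining ambiguity (here, where each preposition attaches) from co-occurrence counts gathered on the unlabeled corpus itself. Nothing beyond the rules enforces well-formed trees; a fuller treatment would need an explicit check.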

Bibliographic reference. Cerisara, C. / Lorenzo, A. / Kral, P. (2013): "Weakly supervised parsing with rules", in INTERSPEECH-2013, 2192-2196.