Weakly supervised concept tagging: combining a generative and a discriminative approach

Title: Weakly supervised concept tagging: combining a generative and a discriminative approach
Publication Type: Talks
Authors: van de Loo, J., De Pauw, G., Gemmeke, J. F., & Daelemans, W.
Place Presented: Presented at the 25th Meeting of Computational Linguistics in the Netherlands (CLIN 2015), Antwerp, Belgium
Year of Publication: 2015
Date Presented: 06/02/2015

In previous work, we presented FramEngine, a system that learns to map utterances onto semantic frames in a weakly supervised way: the training data does not specify any alignments between sub-sentential units in the utterances and slots in the semantic frames. Moreover, the semantic frames used for training often contain redundant information that is not referred to in the utterance. FramEngine uses hierarchical hidden Markov models (HHMMs) to model the association between the slots in the semantic frame and the words in the utterance. The trained HHMMs can therefore be used to tag utterances with concepts, viz. slot values in the semantic frames, and the resulting slot value sequences can be converted into semantic frames with filled slots.
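The last step described above, turning a slot value sequence into a frame with filled slots, can be illustrated with a minimal sketch. This is not FramEngine's implementation: the `slot=value` tag format, the `O` filler tag, the example utterance, and the "first mention wins" policy are all assumptions made for illustration only.

```python
from typing import Dict, List

# Hypothetical tag for words that do not reference any slot (assumption).
FILLER = "O"


def tags_to_frame(tags: List[str]) -> Dict[str, str]:
    """Collapse a per-word concept tag sequence into a frame of filled slots.

    Each non-filler tag is assumed to have the form "slot=value".
    """
    frame: Dict[str, str] = {}
    for tag in tags:
        if tag == FILLER:
            continue  # filler words contribute nothing to the frame
        slot, value = tag.split("=", 1)
        # Assumed policy: the first value seen for a slot is kept.
        frame.setdefault(slot, value)
    return frame


# Invented example utterance with one concept tag per word.
words = ["uh", "move", "the", "red", "block", "to", "the", "left"]
tags = ["O", "action=move", "O", "colour=red", "object=block",
        "O", "O", "direction=left"]

frame = tags_to_frame(tags)
print(frame)
```

In this sketch the interjection "uh" simply receives the filler tag and is ignored when the frame is built.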

Previous experiments have shown that FramEngine achieves high semantic frame induction performance with small amounts of training data. In our current work, we show that its performance can be improved further by adding a retraining step with a discriminative concept tagger. Concept tagging in the retraining phase is performed in two substeps in order to incorporate particular generalisations that have already proven effective in FramEngine. In the presented experiments, we use orthographic transcriptions of spoken utterances as input. The improvements are most notable when the utterances contain disfluencies such as interjections or restarts, which makes the combined system particularly useful for speech-based input.

poster_clin25_janneke.pdf (836.21 KB)