
Inducing Syntactic Categories by Context Distribution Clustering

This paper addresses the automatic induction of syntactic categories from unannotated corpora. Previous techniques give good results but cope poorly with ambiguity and rare words. The paper presents an algorithm, context distribution clustering (CDC), which can be naturally extended to handle these problems.

Alexander Clark, Inducing Syntactic Categories by Context Distribution Clustering. In: Proceedings of CoNLL-2000 and LLL-2000, Lisbon, Portugal, 2000.

Last update: June 27, 2001. erikt@uia.ua.ac.be