
Unsupervised Personal Name Disambiguation

This paper presents a set of algorithms for distinguishing personal names with multiple real referents in text, using little or no supervision. The approach uses an unsupervised clustering technique over a rich feature space of biographic facts, which are automatically extracted via a language-independent bootstrapping process. The induced clustering of named entities is then partitioned and linked to the real referents via the automatically extracted biographic data. Performance is evaluated both on a test set of hand-labeled multi-referent personal names and on automatically generated pseudonames.
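The pseudoname evaluation mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pseudoname is built by replacing two unambiguous names with one shared token, and it substitutes plain bag-of-words features and centroid-based bottom-up clustering for the paper's extracted biographic features. The names `make_pseudoname`, `features`, `cosine`, `agglomerate`, the token `PSEUDO_NAME`, and the example sentences are all hypothetical.

```python
import math
import re
from collections import Counter

PSEUDO = "PSEUDO_NAME"  # hypothetical shared token standing in for both names


def make_pseudoname(docs, name_a, name_b, pseudo=PSEUDO):
    """Conflate two unambiguous names into one token, keeping the true referent as a label."""
    out = []
    for text in docs:
        for label, name in ((0, name_a), (1, name_b)):
            if name in text:
                out.append((text.replace(name, pseudo), label))
                break
    return out


def features(text, pseudo=PSEUDO):
    """Bag of lowercase tokens, excluding the pseudoname itself (a stand-in for biographic facts)."""
    tokens = re.findall(r"[a-z0-9_]+", text.lower())
    return Counter(t for t in tokens if t != pseudo.lower())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def agglomerate(vectors, k=2):
    """Bottom-up clustering: repeatedly merge the two clusters whose summed vectors are most similar."""
    clusters = [([i], Counter(v)) for i, v in enumerate(vectors)]
    while len(clusters) > k:
        _, i, j = max(
            (cosine(clusters[i][1], clusters[j][1]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = (clusters[i][0] + clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return sorted(sorted(members) for members, _ in clusters)


# Invented example contexts for two invented people.
docs = [
    "Alice Smith was born in 1950 in Boston and works as a physicist.",
    "The physicist Alice Smith, born in Boston, won a prize.",
    "Bob Jones is a guitarist born in 1975 in Dublin.",
    "Guitarist Bob Jones, born in Dublin, released an album.",
]
labeled = make_pseudoname(docs, "Alice Smith", "Bob Jones")
true_labels = [label for _, label in labeled]
clusters = agglomerate([features(text) for text, _ in labeled], k=2)
```

Because the true referent of each conflated mention is known by construction, the resulting clusters can be scored against `true_labels` without any hand labeling.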


Gideon S. Mann and David Yarowsky, Unsupervised Personal Name Disambiguation. In: Proceedings of CoNLL-2003, Edmonton, Canada, 2003, pp. 33-40.
Last update: June 11, 2003. erikt@uia.ua.ac.be