Fifth Computational Natural Language Learning Workshop

Toulouse, France, July 6-7, 2001


CoNLL is the yearly workshop organized by SIGNLL, the Association for Computational Linguistics Special Interest Group on Natural Language Learning (http://www.aclweb.org/signll/). Previous CoNLL meetings were held in Madrid (1997), Sydney (1998), Bergen (1999) and Lisbon (2000). The 2001 event will be held as a two-day workshop at the 39th Annual Meeting of the Association for Computational Linguistics (ACL), July 6-11, 2001 in Toulouse, France.

This year, a special theme will be the focus of the workshop:

Interaction and Automation in Language Learning Resources

Apart from this special theme, the workshop will accept contributions about language learning topics, including, but not limited to:

This year's workshop will also accept submissions for a shared task: segmenting a text into clauses ("clausing").

The Workshop

Main Session Theme: Interaction and Automation in Language Learning Resources

The purpose of the special theme is to present and discuss state-of-the-art learning mechanisms for the automated acquisition of language resources (dictionaries, ontologies, grammars) or the automated adaptation of natural language resources/processors to new domains or languages. The dimensions of learning that are of interest for this session include

Recent learning mechanisms use either large amounts of raw data or small sets of carefully constructed tagged training samples. Learning language can be construed as learning numbers or parameters for some statistical or symbolic system, as learning rules that assign structures to input data, or as a mix of the two. Learning can be done off-line, which introduces the problem of interpreting (if needed) the derived knowledge before its use in an NLP engine; or on-line, which raises user interaction problems. Different approaches are tailored to different kinds of problems, each subject to a different balance of requirements (large vs. small training set, tagged vs. untagged training data, results that need interpretation vs. results that can be used as is, etc.). While this session aims to present the broadest possible panorama of learning techniques, we encourage submission of work on semi-automated learning techniques that involve interaction with a human during the learning process or the intervention of a linguist to interpret results.

Special Session: Shared Task - Segmenting Text Into Clauses

We invite groups to take part in a shared task: Segmenting a Text Into Clauses (Clausing). Participating groups will be provided with the same training and testing material, and will all use the same evaluation criteria, thus allowing comparison between various learning technologies. After Chunking, the CoNLL-2000 shared task, Clausing can be seen as the next step towards full parsing.

More information on this shared task is available at:


Invited Session: Learning Computational Grammars

There will be a special session devoted to the presentation and discussion of results of the EU Learning Computational Grammars project (Coordinator: John Nerbonne). Project participants include the University of Groningen (The Netherlands, coordinator), the University of Antwerp (Belgium), the University of Tuebingen (Germany), SRI Cambridge (UK), University College Dublin (Ireland), the University of Geneva (Switzerland), and Xerox Grenoble (France).

Invited Speaker

Eric Brill, MSR.


Format for Paper Submissions for Main Session

Submit an abstract of at most 1500 words (Postscript, PDF or plain ASCII text) electronically to the address below by April 6, 2001. Authors of accepted abstracts will be invited to produce a full paper to be published in the proceedings of the workshop, which will be available to participants at the workshop and distributed afterwards by the ACL. Final submissions must follow the two-column format of ACL proceedings. We strongly recommend the use of the ACL LaTeX style files or Microsoft Word style files tailored for this year's conference. They are available from the ACL-2001 program committee Web site at http://acl2001.dfki.de/style/. More information on electronic submissions here. Submit main session abstracts to:

Walter Daelemans, walter.daelemans@uia.ua.ac.be
Centrum Nederlandse Taal en Spraak
Linguistics, Department of Germanic languages and literature
UIA, University of Antwerp
Universiteitsplein 1, B-2610 Wilrijk, Belgium


Rémi Zajac, zajac@crl.nmsu.edu
Computing Research Laboratory
New Mexico State University
PO Box 30001 Dept. 3CRL
Las Cruces NM 88003

Format For Shared Task Submissions

Submit an abstract of at most 1500 words describing the learning approach and your results on the test set to the address below (preferably by email) by April 6, 2001. A special section of the proceedings will be devoted to a comparison and analysis of the results and to a description of the approaches used. Send shared task submissions to:

Erik Tjong Kim Sang, erikt@uia.ua.ac.be
Centrum Nederlandse Taal en Spraak
Linguistics, Department of Germanic languages and literature
UIA, University of Antwerp
Universiteitsplein 1, B-2610 Wilrijk, Belgium

Important dates

