
Babelfish by Rod Lord (c) Rod Lord

General starting points for Computational Linguistics

You (almost) don’t even have to get up and go to the library to study computational linguistics; most material you will need is available on-line.

The ACL Anthology provides access to the computational linguistics research literature (journals, conferences, and workshops) from the 1960s onwards and is continually updated with new material.

Try Google Scholar if you don’t find what you need in the Anthology. Most researchers make their papers available on-line or in lab repositories, and Google indexes them and computes citation links. Given some concept, the papers returned are ranked roughly by impact (the number of citations weighs heavily in the ordering). This is useful for quickly finding the main papers in a subarea. For example, if you want to know more about shallow parsing, simply type “shallow parsing”, start with the top papers, then move on to the papers citing them (found by following the “cited by” link), and so on.
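The search strategy just described (start from the most-cited hits, then follow the “cited by” links) can be sketched as a small breadth-first walk. This is only an illustration under stated assumptions: the citation data is a made-up toy dictionary, the names `CITED_BY`, `citation_count` and `reading_list` are hypothetical, and nothing here queries Google Scholar or the ACL Anthology.

```python
from collections import deque

# Hypothetical toy data: each paper maps to the papers that cite it ("cited by").
CITED_BY = {
    "A": ["B", "C", "D"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": [],
    "E": ["A"],
}


def citation_count(paper):
    """Number of papers citing `paper` in the toy data."""
    return len(CITED_BY.get(paper, []))


def by_impact(papers):
    """Sort papers by descending citation count, ties broken alphabetically."""
    return sorted(papers, key=lambda p: (-citation_count(p), p))


def reading_list(hits, depth=1):
    """Order search hits by impact, then follow 'cited by' links
    breadth-first for at most `depth` steps, skipping papers already listed."""
    start = by_impact(hits)
    seen = set(start)
    order = list(start)
    queue = deque((paper, 0) for paper in start)
    while queue:
        paper, level = queue.popleft()
        if level == depth:
            continue  # do not expand beyond the requested depth
        for citing in by_impact(CITED_BY.get(paper, [])):
            if citing not in seen:
                seen.add(citing)
                order.append(citing)
                queue.append((citing, level + 1))
    return order


if __name__ == "__main__":
    print(reading_list(["B", "A"], depth=1))
```

In practice you would do this by hand in the browser; the sketch only makes explicit that the most-cited papers are read first and that each “cited by” step widens the list by one level.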

The following recent (or recently updated) textbooks and handbooks provide a more structured introduction to the field. The websites provide links to many software and other resources for learning and research:

Jurafsky & Martin, Speech and Language Processing. An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Second Edition. 2008. Website.

Bird, Klein & Loper, Natural Language Processing with Python. 2009. NLTK website. Free on-line version of the book.

Clark, Fox, Lappin, The Handbook of Computational Linguistics and Natural Language Processing (Blackwell Handbooks in Linguistics). 2010.

Baayen, Analyzing Linguistic Data: A Practical Introduction to Statistics using R (Cambridge University Press). 2008.

How the field is structured

The Flemish National Science Foundation FWO sponsors a scientific research community on computational linguistics and language and speech technology (CLIF). You can find more information there on which computational linguistics research groups exist in Flanders.

Flanders and the Netherlands work together intensively in the field of computational linguistics. There is a yearly workshop, CLIN, which has existed since 1990 and where researchers and advanced students meet to present their research, and a lot of research is done in transnational collaborations. Since 2008, STIL has awarded a 1000 euro computational linguistics thesis prize at CLIN. See the computational linguistics website of De Nederlandse Taalunie for more information about these collaborations and about computational linguistics in the Low Countries.

Internationally, the main organization coordinating the field is the Association for Computational Linguistics (ACL), which has chapters for North America (NAACL) and Europe (EACL) and cooperates closely with the Asian Federation of Natural Language Processing and the International Committee on Computational Linguistics (ICCL). These associations organize conferences regularly. ACL has a number of active Special Interest Groups on specific subareas of computational linguistics (e.g. computational phonology and morphology, parsing, natural language learning, semantics) that organize their own workshops and conferences.

Most of the research is published in the proceedings of conferences organized by ACL, NAACL, EACL, etc. Acceptance rates at these conferences are low, and the papers are relatively long and carefully peer-reviewed.

The main journals are:

Computational Linguistics
Natural Language Engineering
Machine Translation
Language Resources and Evaluation
Literary and Linguistic Computing
ACM Transactions on Speech and Language Processing

The development of resources for computational linguistics (corpora, reusable software, lexical databases) and of accurate evaluation measures for comparing or evaluating systems and resources are important aspects of the field. The LREC conferences are the main venue for presenting research in this area.

Additional resources in Dutch

Searching for the keyword Stevin on the www.kennislink.nl website gives a list of all Kennislink articles (in Dutch) on language and speech technology: click here.

A general introduction to language and speech technology (in Dutch) can also be found there.


Last modified: Mon Aug 2 12:57:09 2010