Naive discrimination learning as a framework for modeling aspects of language processing

Tuesday, November 19, 2013 - 11:00 - 12:00
University of Antwerp, room R007 (Rodestraat 14, 2000 Antwerp)
Harald Baayen

Naive discrimination learning is an approach to language processing that is inspired by information theory (Shannon, 1948) on the one hand, and by learning theory in psychology (Wagner & Rescorla, 1972) on the other. Instead of understanding grammar as a formal calculus with an alphabet of symbols and rules for combining elementary symbols into well-formed strings, we think of the grammar of a language as comprising a code and overt signals. The signal (speech, writing, gesture, whistling, ...) is not necessarily decompositional in the item-and-arrangement sense. Instead, cues distributed over the signal are allowed to be jointly predictive (thereby questioning the hypothesis of the dual articulation of language). The code that language users share, albeit only approximately due to variation in life experience, allows them to encode experience into the signal, or to decode experience from the signal. Crucially, we see the signal as targeting a reduction in uncertainty about the encoded experience. Experiences, in all their richness, carry far more information (in bits) than a short speech signal could ever encode. For instance, the words "pride and prejudice" will bring to mind a certain book and film adaptations of that book, experiences far richer than what less than a second of speech (or a few centimeters of writing) can encode.
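The learning theory invoked here rests on the Rescorla-Wagner update rule, in which the association weights from cues to an outcome are strengthened when the outcome occurs and weakened when it does not, in proportion to the prediction error. A minimal sketch of one such update, using hypothetical letter-bigram cues of my own invention (not the speaker's materials):

```python
# Rescorla-Wagner update: cue->outcome weights move toward the outcome
# when it is present and away from it when absent, scaled by prediction error.

def rw_update(weights, cues, outcome_present, alpha=0.1, lam=1.0):
    """One learning event: adjust cue->outcome weights by prediction error."""
    prediction = sum(weights.get(c, 0.0) for c in cues)
    target = lam if outcome_present else 0.0
    error = target - prediction
    for c in cues:
        weights[c] = weights.get(c, 0.0) + alpha * error
    return weights

# Hypothetical example: letter bigrams as cues for the outcome "hand".
# "#h" and "ha" are unique to "hand"; "an", "nd", "d#" are shared with "sand".
w = {}
for _ in range(100):
    rw_update(w, ["#h", "ha", "an", "nd", "d#"], outcome_present=True)
    rw_update(w, ["#s", "sa", "an", "nd", "d#"], outcome_present=False)
# Cues unique to "hand" end up with higher weights than shared cues,
# because shared cues are repeatedly discounted when "hand" is absent.
```

The shared cues are pulled down during the negative learning events, which is the mechanism behind the discriminative prediction, mentioned below, that cues unique to a word become its strongest predictors.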

This general approach to language will be illustrated with three examples. The first example addresses the myth that our cognitive faculties decline as we age. I will show that the psychological tests supposedly documenting cognitive decline all make use of language materials in a way that is uninformed by the consequences of prolonged experience with language as we grow older. The second example addresses the question of morphological processing in reading. Naive discrimination learning predicts that cues unique to a complex word are the most predictive for that word. Data from eye-tracking experiments indicate that this prediction is correct. The final example focuses on syntax and semantics. Various algorithms have been developed for estimating the semantic similarity of words from word co-occurrences across documents (LSA) or within windows of text (HAL, HiDEx). I will present ongoing research in our group indicating that a discrimination learning approach in which words serve as cues for words makes predictions about word-to-word semantic similarity that are very similar to those of HiDEx. This suggests that naive discrimination theory may provide a straightforward rationale for why semantic vector space models work, based on simple, locally restricted learning events.
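The words-as-cues-for-words idea can be sketched as follows. In this toy construction of my own (not the actual NDL or HiDEx implementations), the words in a small preceding window serve as cues for the next word; each word's learned cue-weight vector can then be compared to others with cosine similarity:

```python
# Toy discrimination learning over text: preceding-window words are cues,
# the current word is the outcome; Rescorla-Wagner-style updates are applied
# to every outcome (present reinforced, absent discounted).

from math import sqrt

def train(corpus, alpha=0.05, lam=1.0, window=2, epochs=20):
    vocab = sorted(set(corpus))
    W = {w: {v: 0.0 for v in vocab} for w in vocab}  # W[outcome][cue]
    for _ in range(epochs):
        for i, outcome in enumerate(corpus):
            cues = corpus[max(0, i - window):i]
            if not cues:
                continue
            for target in vocab:
                t = lam if target == outcome else 0.0
                pred = sum(W[target][c] for c in cues)
                err = t - pred
                for c in cues:
                    W[target][c] += alpha * err
    return W

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Invented corpus: "cat" and "dog" occur in the same contexts, "sky" does not.
corpus = ("a furry cat sat on the mat a furry dog sat on the mat "
          "the blue sky was clear the blue sky was bright").split()
W = train(corpus)
```

On this corpus, the weight vectors for "cat" and "dog" end up far more similar to each other than either is to "sky", purely from local learning events, which is the intuition behind the claimed convergence with window-based semantic vector space models.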
