Stylometry is the quantitative study of writing style. The field has many linguistic applications that investigate the correlation between a text's writing style and its metadata. Empirical studies have shown, for instance, that the gender of an author can be predicted fairly reliably from their writing style. Other applications include authorship attribution and the prediction of age and personality.


Project information

The dominant approach in stylometry relies on superficial linguistic characteristics, e.g. frequencies of words and character sequences. Although such features have been shown to work well on various stylometric tasks, the issue of explanation often arises: it can be difficult to explain why certain superficial features perform well, for example in a complex task such as authorship attribution. Moreover, such shallow features can be difficult to interpret from a linguistic point of view.
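To make the notion of superficial features concrete, here is a minimal Python sketch of the two feature types mentioned above: relative word frequencies and relative frequencies of character sequences (character n-grams). The function names, the simple tokenisation rule and the default n-gram length of 3 are assumptions chosen for illustration, not the feature extraction used in this project.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Relative frequency of each lower-cased word in the text.

    Tokenisation here is a deliberately naive assumption:
    runs of ASCII letters and apostrophes count as words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    counts = Counter(words)
    return {word: count / len(words) for word, count in counts.items()}


def char_ngram_frequencies(text, n=3):
    """Relative frequency of each overlapping character n-gram.

    The default n=3 is an illustrative choice, not a recommendation.
    """
    ngrams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not ngrams:
        return {}
    counts = Counter(ngrams)
    return {gram: count / len(ngrams) for gram, count in counts.items()}


if __name__ == "__main__":
    sample = "the cat and the dog"
    print(word_frequencies(sample))
    print(char_ngram_frequencies(sample, n=3))
```

Features of this kind are cheap to compute and language-independent, but a high weight on a character sequence such as "the" says little about why two authors differ, which is the interpretability problem described above.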


In this project, we will explore the use of deeper linguistic features in computational stylometry. Following a line of recent research, we hypothesise that more complex features will provide complementary information about writing style. We will propose methods for constructing new features (i.e. finding quantifiable aspects of the text) related to the semantics and the discourse of the text, two types of linguistic knowledge that are currently under-researched in stylometry.


Project Leader(s): 
Walter Daelemans
01/10/2014 - 30/09/2018

FWO Research Foundation - Flanders
