Conversation Level Constraints on Pedophile Detection in Chat Rooms

Year of Publication: 2012
Authors: Peersman, C., Vaassen, F., Van Asch, V., & Daelemans, W.
Secondary Authors: Forner, P., Karlgren, J., & Womser-Hacker, C.
Conference Name: CLEF 2012 Conference and Labs of the Evaluation Forum - Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN)
Conference Location: Rome, Italy
In this paper we present a new approach for detecting online pedophiles in chat rooms that combines predictions made at three levels: the individual post, the user, and the entire conversation. We describe the results of this three-stage system in the PAN 2012 competition. We also describe a resampling strategy and a filtering strategy to circumvent issues caused by the unbalanced dataset. Finally, we describe the creation of a dictionary of words and expressions relating to predators' grooming stages, which we used to identify which posts in the predators' conversations were most distinctive of their grooming behavior.

