Domain Adaptation of Simulated Data for Cyberbullying Research

Publication Type: Talks
Authors: Emmery, C., Verhoeven, B., De Pauw, G., & Daelemans, W.
Place Presented: ATILA 2015, Antwerp, Belgium
Year of Publication: 2015
Date Presented: 16-10-2015

Scarcity of publicly available datasets with sensitive content is a common problem in many social applications of Natural Language Processing. Progress in these fields is hampered mostly because access to real data is restricted for the sake of privacy. This is equally true for cyberbullying research, where the relevant content of messages can generally be traced back to a particular person with little effort. We argue that simulating real-life scenarios might yield plausible data that can serve to circumvent this limitation.

