Seminar in Computational Linguistics

  • Date:
  • Location:
  • Lecturer: Sandra Kuebler
  • Contact person: Gongbo Tang
  • Seminar

Abusive Language Detection: A Closer Look

Abusive language detection is a popular topic with important real-world applications. In this talk, I want to take a closer look at what we are really doing when we use existing data sets. I will start by explaining the task definition, the existing data sets, and a big-picture overview of current methods.


We will then delve deeper into three questions: 1) Abusive language detection has mostly been investigated for English. If we want to develop an approach for a different language, can we simply use the insights from English to get similar results for the new language? 2) All data sets have been up-sampled to increase the number of abusive posts, i.e., they have a sampling bias. How does this bias affect classification results? 3) How reliable are the annotations in one of the existing data sets? In other words, what exactly is our classifier learning? For example, should self-abuse be considered abusive language?