Seminar in Computational Linguistics

  • Date: –14:30
  • Location: Zoom: https://uu-se.zoom.us/j/68234661148
  • Lecturer: Sidsel Boldsen
  • Contact person: Gongbo Tang
  • Seminar

Modelling language change within historical text corpora

A challenge when working with historical text corpora in language technology is that language changes over time. When building and applying statistical models to diachronic data, one must therefore take into account that language is non-stationary and varies across time periods. In my PhD thesis, I explore how to model temporal language variation in digital corpora in NLP. This field has recently gained attention through the LChange'19 ACL workshop and the SemEval-2020 shared task on lexical semantic change detection. In this talk, I will present my work on three topics: (1) how language change can be modelled within the task of temporal text classification, (2) how language change can be identified, and (3) some preliminary work on injecting time into neural language models.