Seminar in Computational Linguistics

  • Lecturer: Murathan Kurfali
Vad sa du?: Towards Universal Direct Speech Recognition

In prose fiction, there are broadly two kinds of voice: that of the narrator (narration) and that of the characters (dialogue, or direct speech). Distinguishing these two narrative modes is essential for many literary and computational studies. Yet, contrary to what one might expect, telling narration apart from direct speech is not straightforward. Among other things, the typographical conventions for delimiting direct speech vary significantly across languages. Presumably as a result, most previous approaches have been monolingual and rely on language-specific features.

In this talk, I will present the initial findings of our ongoing effort towards universal direct speech tagging. Specifically, our work explores three main questions: (i) Is it possible to gather high-quality training data automatically? (ii) Can direct speech recognition be achieved without relying on typography? (iii) If so, can it be multilingual? To this end, we fine-tune a multilingual sentence embedder (m-BERT) on training data without any quotation marks, forcing the model to learn the linguistic characteristics of direct speech instead. The results show that direct speech recognition is not only possible in this way, but that it also retains most of its performance in the zero-shot setting.
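
As a rough illustration of how quotation-free training data might be gathered automatically, the sketch below labels each token according to whether it sat inside quotation marks and then drops the marks themselves. It is only a sketch under stated assumptions: the function name label_and_strip, the set of quote characters, and the whitespace tokenization are all hypothetical, and the abstract does not describe the actual procedure. Languages that mark dialogue with a leading dash rather than paired quotes would need different handling.

```python
import re

# Assumed opening -> closing quotation marks (illustrative, not exhaustive).
QUOTE_PAIRS = {'"': '"', '“': '”', '«': '»'}
QUOTE_CHARS = set(QUOTE_PAIRS) | set(QUOTE_PAIRS.values())

# A token is either a single quote character or a run of characters
# that are neither whitespace nor quote characters.
TOKEN_RE = re.compile(r'["“”«»]|[^\s"“”«»]+')


def label_and_strip(text):
    """Return (token, label) pairs with quotation marks removed.

    label is 1 for tokens that appeared inside quotation marks
    (treated here as direct speech) and 0 otherwise (narration).
    """
    labeled = []
    closing = None  # closing mark we are currently waiting for, if any
    for tok in TOKEN_RE.findall(text):
        if closing is None and tok in QUOTE_PAIRS:
            closing = QUOTE_PAIRS[tok]   # enter direct speech
        elif tok == closing:
            closing = None               # leave direct speech
        elif tok in QUOTE_CHARS:
            continue                     # stray or nested mark: drop it
        else:
            labeled.append((tok, 1 if closing is not None else 0))
    return labeled


if __name__ == "__main__":
    for token, label in label_and_strip('She smiled. "Vad sa du?" he asked.'):
        print(label, token)
```

The resulting pairs contain no typographical cue, so a classifier trained on them would have to rely on the wording alone, which is the constraint the talk describes.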