Lectures on Natural Language Processing
- Date: –11:05
- Location: Engelska parken (English Park Campus), Room 22-0031
- Lecturer: Jörg Tiedemann
- Contact person: Ahmed Ruby
Lost in Translation Models – Messing Around with Massively Multilingual Data
Multilingual data sets contain a lot of interesting information. The question is how we can make the best possible use of it. In the FoTran project, we are interested in developing a framework based on modular and multilingual machine translation to push the limits of language coverage and linguistic diversity. The purpose is not only automatic translation but also representation learning with cross-lingual signals. In this talk, I will introduce MAMMOTH, our massively multilingual and modular translation toolkit, and the challenges that come with its implementation. I will also discuss some findings from along the way and the additional resources we have created. I will probably get lost in some details, so I look forward to your help and feedback.