Publications in Computational Linguistics

  • Basirat, Ali; Allassonnière-Tang, Marc; Berdicevskis, Aleksandrs

    An empirical study on the contribution of formal and semantic features to the grammatical gender of nouns

    Part of Linguistics Vanguard, 2021.

  • Megyesi, Beáta; Tudor, Crina

    Transcription of Historical Ciphers and Keys: Guidelines, version 2.0

    2021.

  • Basirat, Ali; Nivre, Joakim

    Real-valued syntactic word vectors

    Part of Journal of experimental and theoretical artificial intelligence (Print), p. 557-579, 2020.

    Open access
  • Hubková, Helena; Král, Pavel; Pettersson, Eva

    Czech Historical Named Entity Corpus v 1.0

    Part of 12th Conference on Language Resources and Evaluation (LREC 2020), p. 4458-4465, 2020.

  • Nivre, Joakim

    Multilingual Dependency Parsing from Universal Dependencies to Sesame Street

    Part of Text, Speech, and Dialogue (TSD 2020), p. 11-29, 2020.

  • Stymne, Sara; Östman, Carin

    SLäNDa: An Annotated Corpus of Narrative and Dialogue in Swedish Literary Fiction

    Part of Proceedings of the 12th Language Resources and Evaluation Conference, p. 826-834, 2020.

    Open access
  • Stymne, Sara

    Cross-Lingual Domain Adaptation for Dependency Parsing

    Part of Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories (TLT), p. 62-69, 2020.

    Open access
  • Veeman, Hartger; Allassonnière-Tang, Marc; Berdicevskis, Aleksandrs; Basirat, Ali

    Cross-lingual Embeddings Reveal Universal and Lineage-Specific Patterns in Grammatical Gender Assignment

    Part of Proceedings of the 24th Conference on Computational Natural Language Learning, p. 265-275, 2020.

    Open access
  • Rizal, Arra’Di Nur; Stymne, Sara

    Evaluating Word Embeddings for Indonesian–English Code-Mixed Text Based on Synthetic Data

    Part of Proceedings of the 4th Workshop on Computational Approaches to Code Switching, p. 26-35, 2020.

    Open access
  • Ramisch, Carlos; Savary, Agata; Guillaume, Bruno; Waszczuk, Jakub et al.

    Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions

    Part of Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, p. 107-118, 2020.

    Open access
  • Moghe, Nikita; Hardmeier, Christian; Bawden, Rachel

    The University of Edinburgh-Uppsala University's Submission to the WMT 2020 Chat Translation Task

    Part of Proceedings of the 5th Conference on Machine Translation (WMT), p. 473-478, 2020.

    Open access
  • Loáiciga, Sharid; Hardmeier, Christian; Sayeed, Asad

    Exploiting Cross-lingual Hints to Discover Event Pronouns

    Part of Proceedings of the 12th Conference on Language Resources and Evaluation (LREC), p. 99-103, 2020.

    Open access
  • Lapshinova-Koltunski, Ekaterina; Krielke, Marie-Pauline; Hardmeier, Christian

    Coreference Strategies in English-German Translation

    Part of Proceedings of the 3rd Workshop on Computational Models of Reference, Anaphora and Coreference, p. 139-153, 2020.

    Open access
  • Della Corte, Giuseppe; Stymne, Sara

    IESTAC: English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation

    Part of Proceedings of the First International Workshop on Natural Language Processing Beyond Text, p. 41-50, 2020.

  • Tang, Gongbo

    Understanding Neural Machine Translation: An investigation into linguistic phenomena and attention mechanisms

    Open access
  • Costa-jussà, Marta R.; Hardmeier, Christian; Radford, Will; Webster, Kellie

    Proceedings of the First Workshop on Gender Bias in Natural Language Processing

    2020.

    Open access
  • Braud, Chloé; Hardmeier, Christian; Li, Junyi Jessy; Louis, Annie et al.

    Proceedings of the First Workshop on Computational Approaches to Discourse

    2020.

    Open access
  • Costa-jussà, Marta R.; Hardmeier, Christian; Radford, Will; Webster, Kellie

    Proceedings of the Second Workshop on Gender Bias in Natural Language Processing

    2020.

    Open access
  • Chen, Shifei; Basirat, Ali

    Cross-lingual Word Embeddings beyond Zero-shot Machine Translation

    2020.

    Open access
  • Veeman, Hartger; Basirat, Ali

    An Exploration of the Encoding of Grammatical Gender in Word Embeddings

    2020.

    Open access
  • Dahlqvist, Bengt

    Text Processing Procedures for Analysing a Corpus with Medieval Marian Miracle Tales in Old Swedish

    Part of Proceedings of the 12th International Conference on Agents and Artificial Intelligence, p. 452-458, 2020.

    Open access
  • Dahlqvist, Bengt

    Marian Miracles in Old Swedish Texts

    Part of Les miracles de Notre-Dame du Moyen Âge à nos jours, p. 179-190, 2020.

  • Tang, Gongbo; Sennrich, Rico; Nivre, Joakim

    Understanding Pure Character-Based Neural Machine Translation: The Case of Translating Finnish into English

    Part of Proceedings of the 28th International Conference on Computational Linguistics, p. 4251-4262, 2020.

  • Lasry, George; Megyesi, Beáta; Kopal, Nils

    Deciphering Papal Ciphers from the 16th to the 18th Century

    Part of Cryptologia, 2020.

    Open access
  • Volodina, Elena; Mohammed, Yousuf Ali; Matsson, Arild; Derbring, Sandra et al.

    Towards Privacy by Design in Learner Corpora Research: A Case of On-the-fly Pseudonymization of Swedish Learner Essays

    Part of Proceedings of the 28th International Conference on Computational Linguistics. COLING 2020, p. 357-369, 2020.

  • Hershcovich, Daniel; de Lhoneux, Miryam; Kulmizev, Artur; Pejhan, Elham et al.

    Kopsala: Transition-Based Graph Parsing via Efficient Training and Effective Encoding

    Part of 16th International Conference on Parsing Technologies and IWPT 2020 Shared Task on Parsing Into Enhanced Universal Dependencies, p. 236-244, 2020.

  • Megyesi, Beáta

    Transcription of Historical Ciphers and Keys

    Part of Proceedings of the 3rd International Conference on Historical Cryptology, p. 106-115, 2020.

    Open access
  • Tudor, Crina; Megyesi, Beáta; Láng, Benedek

    Automatic Key Structure Extraction

    Part of Proceedings of the 3rd International Conference on Historical Cryptology, p. 146-152, 2020.

    Open access
  • Arentzen, Thomas; Johnsén, Henrik Rydell; Westergren, Andreas

    Rubenson on the Move: A Biographical Journey

    Part of Wisdom on the Move, p. 247-250, 2020.

  • Megyesi, Beáta

    Transcription of Historical Ciphers and Keys: Guidelines

    2020.

    Open access
  • Dahllöf, Mats

    Classification of Medieval Documents: Determining the Issuer, Place of Issue, and Decade for Old Swedish Charters

    Part of DHN 2020 Digital Humanities in the Nordic Countries, p. 12-23, 2020.

    Open access
  • Chen, Jialuo; Souibgui, Mohamed Ali; Fornés, Alicia; Megyesi, Beáta

    A Web-based Interactive Transcription Tool for Encrypted Manuscripts

    Part of Proceedings of the 3rd International Conference on Historical Cryptology HistoCrypt 2020, 2020.

    Open access
  • Megyesi, Beáta

    Proceedings of the 3rd International Conference on Historical Cryptology

    2020.

    Open access
  • Megyesi, Beáta; Esslinger, Bernhard; Fornés, Alicia; Kopal, Nils et al.

    Decryption of historical manuscripts: the DECRYPT project

    Part of Cryptologia, 2020.

    Open access
  • Her, One-Soon; Tang, Marc

    A Statistical Explanation of the Distribution of Sortal Classifiers in Languages of the World via Computational Classifiers

    Part of Journal of Quantitative Linguistics, p. 93-113, 2020.

    Open access
  • Lapshinova-Koltunski, Ekaterina; Loáiciga, Sharid; Hardmeier, Christian; Krielke, Pauline

    Cross-lingual Incongruences in the Annotation of Coreference

    Part of Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC), p. 26-34, 2019.

    Open access
  • Kunz, Jenny; Hardmeier, Christian

    Entity Decisions in Neural Language Modelling: Approaches and Problems

    Part of Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC), p. 15-19, 2019.

    Open access
  • Popescu-Belis, Andrei; Loáiciga, Sharid; Hardmeier, Christian; Xiong, Deyi

    Proceedings of the Fourth Workshop on Discourse in Machine Translation

    2019.

    Open access
  • Basirat, Ali; Tang, Marc

    Linguistic information in word embeddings

    Part of Agents and Artificial Intelligence, p. 492-513, 2019.

  • Basirat, Ali; de Lhoneux, Miryam; Kulmizev, Artur; Kurfalı, Murathan et al.

    Polyglot Parsing for One Thousand and One Languages (And Then Some)

    2019.

    Open access
  • Berglund, Karl; Dahllöf, Mats; Määttä, Jerry

    Apples and Oranges? Large-Scale Thematic Comparisons of Contemporary Swedish Popular and Literary Fiction

    Part of Samlaren, p. 228-260, 2019.

    Open access
  • de Lhoneux, Miryam; Stymne, Sara; Nivre, Joakim

    What Should/Do/Can LSTMs Learn When Parsing Auxiliary Verb Constructions?

    Part of CoRR, 2019.

  • Arentzen, Thomas; Westergren, Andreas

    Prolog

    Part of Patristica Nordica Annuaria, p. 3-4, 2019.

  • Nivre, Joakim; Ginter, Filip; Oepen, Stephan; Tiedemann, Jörg

    DL4NLP 2019. Proceedings of the First NLPL Workshop on Deep Learning for Natural Language Processing

    2019.

  • Webster, Kellie; Costa-jussà, Marta R.; Hardmeier, Christian; Radford, Will

    Gendered Ambiguous Pronouns (GAP) Shared Task at the Gender Bias in NLP Workshop 2019

    Part of Gender Bias In Natural Language Processing (GEBNLP 2019), p. 1-7, 2019.

  • Tang, Gongbo; Sennrich, Rico; Nivre, Joakim

    Understanding Neural Machine Translation by Simplification: The Case of Encoder-Free Models

    Part of Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), p. 1186-1193, 2019.

  • Fano, Elena; Karlgren, Jussi; Nivre, Joakim

    Uppsala University and Gavagai at CLEF eRISK: Comparing Word Embedding Models

    Part of Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum, Lugano, Switzerland, September 9-12, 2019, 2019.

  • Meechan-Maddon, Ailsa; Nivre, Joakim

    How to Parse Low-Resource Languages: Cross-Lingual Parsing, Target Language Annotation, or Both?

    Part of Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, SyntaxFest 2019), p. 112-120, 2019.

  • Kulmizev, Artur; de Lhoneux, Miryam; Gontrum, Johannes; Fano, Elena et al.

    Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited

    Part of Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), p. 2755-2768, 2019.

  • Tang, Gongbo; Sennrich, Rico; Nivre, Joakim

    Encoders Help You Disambiguate Word Senses in Neural Machine Translation

    Part of Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), p. 1429-1435, 2019.

  • Volodina, Elena; Granstedt, Lena; Matsson, Arild; Megyesi, Beáta et al.

    The SweLL Language Learner Corpus: From Design to Annotation

    Part of Northern European Journal of Language Technology (NEJLT), p. 67-104, 2019.

    Open access
  • Arentzen, Thomas

    Mary Retold

    Part of Ancient Jew Review, 2019.

  • Tang, Marc; Her, One-Soon

    Insights on the Greenberg-Sanches-Slobin generalization: Quantitative typological data on classifiers and plural markers

    Part of Folia linguistica, p. 297-331, 2019.

  • Ahrenberg, Lars; Megyesi, Beáta

    Proceedings of the Workshop on NLP and Pseudonymisation

    2019.

    Open access
  • Pettersson, Eva; Megyesi, Beáta

    Matching Keys and Encrypted Manuscripts

    Part of Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa '19), 2019.

    Open access
  • Dahllöf, Mats; Berglund, Karl

    Faces, Fights, and Families: Topic Modeling and Gendered Themes in Two Corpora of Swedish Prose Fiction

    Part of DHN 2019 Copenhagen, Proceedings of 4th Conference of The Association Digital Humanities in the Nordic Countries Copenhagen, March 6-8 2019, p. 92-111, 2019.

    Open access
  • Megyesi, Beáta; Volodina, Elena

    Pseudonymization of Language Learner Data

    Part of Workshop om pseudonymisering av textdata, 2019.

    Open access
  • Baró, Arnau; Chen, Jialuo; Fornés, Alicia; Megyesi, Beáta

    Towards a Generic Unsupervised Method for Transcription of Encoded Manuscripts

    Part of Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, 2019.

  • Megyesi, Beáta; Palmér, Anne; Näsman, Jesper

    SWEGRAM: Annotering och analys av svenska texter

    2019.

    Open access
  • Megyesi, Beáta; Blomqvist, Nils; Pettersson, Eva

    The DECODE Database: Collection of Historical Ciphers and Keys

    Part of Proceedings of the 2nd International Conference on Historical Cryptology, p. 69-78, 2019.

    Open access
  • Dahllöf, Mats

    Clustering writing components from medieval manuscripts

    Part of Proceedings of the Workshop on Computational Methods in the Humanities 2018, p. 23-32, 2019.

    Open access
  • Tang, Marc

    A typology of classifiers and gender: From description to computation

    Open access
  • Basirat, Ali; Tang, Marc

    Lexical and Morpho-syntactic Features in Word Embeddings: A Case Study of Nouns in Swedish

    Part of Proceedings of the 10th International Conference on Agents and Artificial Intelligence, p. 663-674, 2018.

  • Stymne, Sara; de Lhoneux, Miryam; Smith, Aaron; Nivre, Joakim

    Parser Training with Heterogeneous Treebanks

    Part of Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), p. 619-625, 2018.

    Open access
  • Šoštarić, Margita; Hardmeier, Christian; Stymne, Sara

    Discourse-Related Language Contrasts in English-Croatian Human and Machine Translation

    Part of Proceedings of the Third Conference on Machine Translation: Research Papers, p. 36-48, 2018.

    Open access
  • Volodina, Elena; Granstedt, Lena; Megyesi, Beáta; Prentice, Julia et al.

    Annotation of learner corpora: first SweLL insights

    Part of Abstracts of SLTC 2018, p. 86-89, 2018.

    Open access
  • Dubremetz, Marie; Nivre, Joakim

    Rhetorical Figure Detection: Chiasmus, Epanaphora, Epiphora

    Part of Frontiers in Digital Humanities, 2018.

  • Dahllöf, Mats

    Automatic Scribe Attribution for Medieval Manuscripts

    Part of Digital Medievalist, p. 1-26, 2018.

    Open access
  • Tang, Gongbo; Müller, Mathias; Rios, Annette; Sennrich, Rico

    Why Self-Attention?: A Targeted Evaluation of Neural Machine Translation Architectures

    Part of Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, p. 4263-4272, 2018.

  • Schuster, Sebastian; Nivre, Joakim; Manning, Christopher D.

    Sentences with Gapping: Parsing and Reconstructing Elided Predicates

    Part of Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, p. 1156-1168, 2018.

  • Nivre, Joakim; Marongiu, Paola; Ginter, Filip; Kanerva, Jenna et al.

    Enhancing Universal Dependency Treebanks: A Case Study

    Part of Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), p. 102-107, 2018.

  • Bouma, Gosse; Hajič, Jan; Haug, Dag; Nivre, Joakim et al.

    Expletives in Universal Dependency Treebanks

    Part of Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), p. 18-26, 2018.

  • Tang, Gongbo; Sennrich, Rico; Nivre, Joakim

    An analysis of Attention Mechanism: The Case of Word Sense Disambiguation in Neural Machine Translation

    Part of Proceedings of the Third Conference on Machine Translation, p. 26-35, 2018.

  • Smith, Aaron; Bohnet, Bernd; de Lhoneux, Miryam; Nivre, Joakim et al.

    82 Treebanks, 34 Models: Universal Dependency Parsing with Multi-Treebank Models

    Part of Proceedings of the CoNLL 2018 Shared Task, p. 113-123, 2018.

  • Smith, Aaron; de Lhoneux, Miryam; Stymne, Sara; Nivre, Joakim

    An Investigation of the Interactions Between Pre-Trained Word Embeddings, Character Models and POS Tags in Dependency Parsing

    Part of Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, p. 2711-2720, 2018.

  • Tang, Gongbo; Cap, Fabienne; Pettersson, Eva; Nivre, Joakim

    An evaluation of neural machine translation models on historical spelling normalization

    Part of Proceedings of the 27th International Conference on Computational Linguistics, p. 1320-1331, 2018.

  • Megyesi, Beáta; Granstedt, Lena; Johansson, Sofia; Prentice, Julia et al.

    Learner Corpus Anonymization in the Age of GDPR: Insights from the Creation of a Learner Corpus of Swedish

    Part of Proceedings of the 7th NLP4CALL, 2018.

    Open access
  • Søgaard, Anders; de Lhoneux, Miryam; Augenstein, Isabelle

    Nightmare at test time: How punctuation prevents parsers from generalizing

    Part of Proceedings of the 2018 EMNLP Workshop BlackboxNLP, p. 25-29, 2018.

    Open access
  • de Lhoneux, Miryam; Bjerva, Johannes; Augenstein, Isabelle; Søgaard, Anders

    Parameter sharing between dependency parsers for related languages

    Part of Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, p. 4992-4997, 2018.

    Open access
  • Dahllöf, Mats

    Clustering Writing Components from Medieval Manuscripts

    Part of COMHUM 2018: Book of Abstracts for the Workshop on Computational Methods in the Humanities 2018, p. 11-13, 2018.

    Open access
  • Megyesi, Beáta

    Proceedings of the 1st International Conference on Historical Cryptology: HistoCrypt 2018

    2018.

    Open access
  • Basirat, Ali

    Principal Word Vectors

    Open access
  • Shao, Yan; Hardmeier, Christian; Nivre, Joakim

    Universal Word Segmentation: Implementation and Interpretation

    Part of Transactions of the Association for Computational Linguistics, p. 421-435, 2018.

    Open access
  • Pettersson, Eva; Megyesi, Beáta

    The HistCorp Collection of Historical Corpora and Resources

    Part of DHN 2018, p. 306-320, 2018.

    Open access
  • Shao, Yan

    Segmenting and Tagging Text with Neural Networks

    Open access
  • Zarei, F.; Basirat, Ali; Faili, H.; Mirain, M.

    A bootstrapping method for development of Treebank

    Part of Journal of experimental and theoretical artificial intelligence (Print), p. 19-42, 2017.

  • Ide, Nancy; Calzolari, Nicoletta; Eckle-Kohler, Judith; Gibbon, Dafydd et al.

    Community Standards for Linguistically-Annotated Resources

    Part of Handbook of Linguistic Annotation, p. 113-165, 2017.

  • Hammarström, Harald; Virk, Shafqat Mumtaz; Forsberg, Markus

    Poor Man’s OCR Post-Correction: Unsupervised Recognition of Variant Spelling Applied to a Multilingual Document Collection

    Part of Proceedings of the Digital Access to Textual Cultural Heritage (DATeCH) conference, p. 71-75, 2017.

  • Dubremetz, Marie; Nivre, Joakim

    Machine Learning for Rhetorical Figure Detection: More Chiasmus with Less Annotation

    Part of Proceedings of the 21st Nordic Conference of Computational Linguistics, p. 37-45, 2017.

  • Virk, Shafqat Mumtaz; Borin, Lars; Saxena, Anju; Hammarström, Harald

    Automatic extraction of typological linguistic features from descriptive grammars

    Part of Text, Speech, and Dialogue, p. 111-119, 2017.

  • Shao, Yan; Hardmeier, Christian; Tiedemann, Jörg; Nivre, Joakim

    Character-based Joint Segmentation and POS Tagging for Chinese using Bidirectional RNN-CRF

    Part of Proceedings of the 8th International Joint Conference on Natural Language Processing, p. 173-183, 2017.

    Open access
  • Shao, Yan; Hardmeier, Christian; Nivre, Joakim

    Recall is the Proper Evaluation Metric for Word Segmentation

    Part of Proceedings of the 8th International Joint Conference on Natural Language Processing, p. 86-90, 2017.

    Open access
  • de Lhoneux, Miryam; Shao, Yan; Basirat, Ali; Kiperwasser, Eliyahu et al.

    From raw text to Universal Dependencies: look, no tags!

    Part of Proceedings of the CoNLL 2017 Shared Task, p. 207-217, 2017.

    Open access
  • Shao, Yan

    Cross-lingual Word Segmentation and Morpheme Segmentation as Sequence Labelling

    Part of Proceedings of MLP 2017, p. 75-80, 2017.

    Open access
  • Parks, Magdalena; Karlgren, Jussi; Stymne, Sara

    Plausibility Testing for Lexical Resources

    Part of Proceedings of CLEF 2017, p. 132-137, 2017.

  • Adams, Allison; Stymne, Sara

    Learning with learner corpora: Using the TLE for native language identification

    Part of Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition, p. 1-7, 2017.

    Open access
  • Stymne, Sara

    The Effect of Translationese on Tuning for Statistical Machine Translation

    Part of Proceedings of the 21st Nordic Conference on Computational Linguistics, p. 241-246, 2017.

    Open access
  • Loáiciga, Sharid; Stymne, Sara; Nakov, Preslav; Hardmeier, Christian et al.

    Findings of the 2017 DiscoMT Shared Task on Cross-lingual Pronoun Prediction

    Part of Proceedings of the Third Workshop on Discourse in Machine Translation, 2017.

    Open access
  • Stymne, Sara; Loáiciga, Sharid; Cap, Fabienne

    A BiLSTM-based System for Cross-lingual Pronoun Prediction

    2017.

  • Padilla López, Rebeca; Cap, Fabienne

    Did you ever read about Frogs drinking Coffee?: Investigating the Compositionality of Multi-Emoji Expressions

    Part of Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, p. 113-117, 2017.

    Open access