Publications in Computational Linguistics

  • Basirat, Ali; Allassonnière-Tang, Marc; Berdicevskis, Aleksandrs

    An empirical study on the contribution of formal and semantic features to the grammatical gender of nouns

    In Linguistics Vanguard, 2021.

  • Megyesi, Beáta; Tudor, Crina

    Transcription of Historical Ciphers and Keys: Guidelines, version 2.0

    2021.

  • Basirat, Ali; Nivre, Joakim

    Real-valued syntactic word vectors

    In Journal of experimental and theoretical artificial intelligence (Print), pp. 557-579, 2020.

    Open access
  • Hubková, Helena; Král, Pavel; Pettersson, Eva

    Czech Historical Named Entity Corpus v 1.0

    In 12th Conference on Language Resources and Evaluation (LREC 2020), pp. 4458-4465, 2020.

  • Nivre, Joakim

    Multilingual Dependency Parsing from Universal Dependencies to Sesame Street

    In Text, Speech, and Dialogue (TSD 2020), pp. 11-29, 2020.

  • Stymne, Sara; Östman, Carin

    SLäNDa: An Annotated Corpus of Narrative and Dialogue in Swedish Literary Fiction

    In Proceedings of the 12th Language Resources and Evaluation Conference, pp. 826-834, 2020.

    Open access
  • Stymne, Sara

    Cross-Lingual Domain Adaptation for Dependency Parsing

    In Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories (TLT), pp. 62-69, 2020.

    Open access
  • Veeman, Hartger; Allassonnière-Tang, Marc; Berdicevskis, Aleksandrs; Basirat, Ali

    Cross-lingual Embeddings Reveal Universal and Lineage-Specific Patterns in Grammatical Gender Assignment

    In Proceedings of the 24th Conference on Computational Natural Language Learning, pp. 265-275, 2020.

    Open access
  • Rizal, Arra’Di Nur; Stymne, Sara

    Evaluating Word Embeddings for Indonesian–English Code-Mixed Text Based on Synthetic Data

    In Proceedings of the 4th Workshop on Computational Approaches to Code Switching, pp. 26-35, 2020.

    Open access
  • Ramisch, Carlos; Savary, Agata; Guillaume, Bruno; Waszczuk, Jakub et al.

    Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions

    In Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, pp. 107-118, 2020.

    Open access
  • Moghe, Nikita; Hardmeier, Christian; Bawden, Rachel

    The University of Edinburgh-Uppsala University's Submission to the WMT 2020 Chat Translation Task

    In Proceedings of the 5th Conference on Machine Translation (WMT), pp. 473-478, 2020.

    Open access
  • Loáiciga, Sharid; Hardmeier, Christian; Sayeed, Asad

    Exploiting Cross-lingual Hints to Discover Event Pronouns

    In Proceedings of the 12th Conference on Linguistic Resources and Evaluation (LREC), pp. 99-103, 2020.

    Open access
  • Lapshinova-Koltunski, Ekaterina; Krielke, Marie-Pauline; Hardmeier, Christian

    Coreference Strategies in English-German Translation

    In Proceedings of the 3rd Workshop on Computational Models of Reference, Anaphora and Coreference, pp. 139-153, 2020.

    Open access
  • Della Corte, Giuseppe; Stymne, Sara

    IESTAC: English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation

    In Proceedings of the First International Workshop on Natural Language Processing Beyond Text, pp. 41-50, 2020.

  • Tang, Gongbo

    Understanding Neural Machine Translation: An investigation into linguistic phenomena and attention mechanisms

    Open access
  • Costa-jussà, Marta R.; Hardmeier, Christian; Radford, Will; Webster, Kellie

    Proceedings of the First Workshop on Gender Bias in Natural Language Processing

    2020.

    Open access
  • Braud, Chloé; Hardmeier, Christian; Li, Junyi Jessy; Louis, Annie et al.

    Proceedings of the First Workshop on Computational Approaches to Discourse

    2020.

    Open access
  • Costa-jussà, Marta R.; Hardmeier, Christian; Radford, Will; Webster, Kellie

    Proceedings of the Second Workshop on Gender Bias in Natural Language Processing

    2020.

    Open access
  • Chen, Shifei; Basirat, Ali

    Cross-lingual Word Embeddings beyond Zero-shot Machine Translation

    2020.

    Open access
  • Veeman, Hartger; Basirat, Ali

    An Exploration of the Encoding of Grammatical Gender in Word Embeddings

    2020.

    Open access
  • Dahlqvist, Bengt

    Text Processing Procedures for Analysing a Corpus with Medieval Marian Miracle Tales in Old Swedish

    In Proceedings of the 12th International Conference on Agents and Artificial Intelligence, pp. 452-458, 2020.

    Open access
  • Dahlqvist, Bengt

    Marian Miracles in Old Swedish Texts

    In Les miracles de Notre-Dame du Moyen Âge à nos jours, pp. 179-190, 2020.

  • Tang, Gongbo; Sennrich, Rico; Nivre, Joakim

    Understanding Pure Character-Based Neural Machine Translation: The Case of Translating Finnish into English

    In Proceedings of the 28th International Conference on Computational Linguistics, pp. 4251-4262, 2020.

  • Lasry, George; Megyesi, Beáta; Kopal, Nils

    Deciphering Papal Ciphers from the 16th to the 18th Century

    In Cryptologia, 2020.

    Open access
  • Volodina, Elena; Mohammed, Yousuf Ali; Matsson, Arild; Derbring, Sandra et al.

    Towards Privacy by Design in Learner Corpora Research: A Case of On-the-fly Pseudonymization of Swedish Learner Essays

    In Proceedings of the 28th International Conference on Computational Linguistics. COLING 2020, pp. 357-369, 2020.

  • Hershcovich, Daniel; de Lhoneux, Miryam; Kulmizev, Artur; Pejhan, Elham et al.

    Kopsala: Transition-Based Graph Parsing via Efficient Training and Effective Encoding

    In 16th International Conference on Parsing Technologies and IWPT 2020 Shared Task on Parsing Into Enhanced Universal Dependencies, pp. 236-244, 2020.

  • Megyesi, Beáta

    Transcription of Historical Ciphers and Keys

    In Proceedings of the 3rd International Conference on Historical Cryptology, pp. 106-115, 2020.

    Open access
  • Tudor, Crina; Megyesi, Beáta; Láng, Benedek

    Automatic Key Structure Extraction

    In Proceedings of the 3rd International Conference on Historical Cryptology, pp. 146-152, 2020.

    Open access
  • Arentzen, Thomas; Johnsén, Henrik Rydell; Westergren, Andreas

    Rubenson on the Move: A Biographical Journey

    In Wisdom on the Move, pp. 247-250, 2020.

  • Megyesi, Beáta

    Transcription of Historical Ciphers and Keys: Guidelines

    2020.

    Open access
  • Dahllöf, Mats

    Classification of Medieval Documents: Determining the Issuer, Place of Issue, and Decade for Old Swedish Charters

    In DHN 2020 Digital Humanities in the Nordic Countries, pp. 12-23, 2020.

    Open access
  • Chen, Jialuo; Souibgui, Mohamed Ali; Fornés, Alicia; Megyesi, Beáta

    A Web-based Interactive Transcription Tool for Encrypted Manuscripts

    In Proceedings of the 3rd International Conference on Historical Cryptology HistoCrypt 2020, 2020.

    Open access
  • Megyesi, Beáta

    Proceedings of the 3rd International Conference on Historical Cryptology

    2020.

    Open access
  • Megyesi, Beáta; Esslinger, Bernhard; Fornés, Alicia; Kopal, Nils et al.

    Decryption of historical manuscripts: the DECRYPT project

    In Cryptologia, 2020.

    Open access
  • Her, One-Soon; Tang, Marc

    A Statistical Explanation of the Distribution of Sortal Classifiers in Languages of the World via Computational Classifiers

    In Journal of Quantitative Linguistics, pp. 93-113, 2020.

    Open access
  • Lapshinova-Koltunski, Ekaterina; Loáiciga, Sharid; Hardmeier, Christian; Krielke, Pauline

    Cross-lingual Incongruences in the Annotation of Coreference

    In Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC), pp. 26-34, 2019.

    Open access
  • Kunz, Jenny; Hardmeier, Christian

    Entity Decisions in Neural Language Modelling: Approaches and Problems

    In Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC), pp. 15-19, 2019.

    Open access
  • Popescu-Belis, Andrei; Loáiciga, Sharid; Hardmeier, Christian; Xiong, Deyi

    Proceedings of the Fourth Workshop on Discourse in Machine Translation

    2019.

    Open access
  • Basirat, Ali; Tang, Marc

    Linguistic information in word embeddings

    In Agents and Artificial Intelligence, pp. 492-513, 2019.

  • Basirat, Ali; de Lhoneux, Miryam; Kulmizev, Artur; Kurfalı, Murathan et al.

    Polyglot Parsing for One Thousand and One Languages (And Then Some)

    2019.

    Open access
  • Berglund, Karl; Dahllöf, Mats; Määttä, Jerry

    Apples and Oranges? Large-Scale Thematic Comparisons of Contemporary Swedish Popular and Literary Fiction

    In Samlaren, pp. 228-260, 2019.

    Open access
  • de Lhoneux, Miryam; Stymne, Sara; Nivre, Joakim

    What Should/Do/Can LSTMs Learn When Parsing Auxiliary Verb Constructions?

    In CoRR, 2019.

  • Arentzen, Thomas; Westergren, Andreas

    Prolog

    In Patristica Nordica Annuaria, pp. 3-4, 2019.

  • Nivre, Joakim; Ginter, Filip; Oepen, Stephan; Tiedemann, Jörg

    DL4NLP 2019. Proceedings of the First NLPL Workshop on Deep Learning for Natural Language Processing

    2019.

  • Webster, Kellie; Costa-jussà, Marta R.; Hardmeier, Christian; Radford, Will

    Gendered Ambiguous Pronouns (GAP) Shared Task at the Gender Bias in NLP Workshop 2019

    In Gender Bias In Natural Language Processing (GEBNLP 2019), pp. 1-7, 2019.

  • Tang, Gongbo; Sennrich, Rico; Nivre, Joakim

    Understanding Neural Machine Translation by Simplification: The Case of Encoder-Free Models

    In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pp. 1186-1193, 2019.

  • Fano, Elena; Karlgren, Jussi; Nivre, Joakim

    Uppsala University and Gavagai at CLEF eRISK: Comparing Word Embedding Models

    In Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum, Lugano, Switzerland, September 9-12, 2019.

  • Meechan-Maddon, Ailsa; Nivre, Joakim

    How to Parse Low-Resource Languages: Cross-Lingual Parsing, Target Language Annotation, or Both?

    In Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, SyntaxFest 2019), pp. 112-120, 2019.

  • Kulmizev, Artur; de Lhoneux, Miryam; Gontrum, Johannes; Fano, Elena et al.

    Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited

    In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 2755-2768, 2019.

  • Tang, Gongbo; Sennrich, Rico; Nivre, Joakim

    Encoders Help You Disambiguate Word Senses in Neural Machine Translation

    In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 1429-1435, 2019.

  • Volodina, Elena; Granstedt, Lena; Matsson, Arild; Megyesi, Beáta et al.

    The SweLL Language Learner Corpus: From Design to Annotation

    In Northern European Journal of Language Technology (NEJLT), pp. 67-104, 2019.

    Open access
  • Arentzen, Thomas

    Mary Retold

    In Ancient Jew Review, 2019.

  • Tang, Marc; Her, One-Soon

    Insights on the Greenberg-Sanches-Slobin generalization: Quantitative typological data on classifiers and plural markers

    In Folia linguistica, pp. 297-331, 2019.

  • Ahrenberg, Lars; Megyesi, Beáta

    Proceedings of the Workshop on NLP and Pseudonymisation

    2019.

    Open access
  • Pettersson, Eva; Megyesi, Beáta

    Matching Keys and Encrypted Manuscripts

    In Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa '19), 2019.

    Open access
  • Dahllöf, Mats; Berglund, Karl

    Faces, Fights, and Families: Topic Modeling and Gendered Themes in Two Corpora of Swedish Prose Fiction

    In DHN 2019 Copenhagen, Proceedings of 4th Conference of The Association Digital Humanities in the Nordic Countries Copenhagen, March 6-8 2019, pp. 92-111, 2019.

    Open access
  • Megyesi, Beáta; Volodina, Elena

    Pseudonymization of Language Learner Data

    In Workshop om pseudonymisering av textdata, 2019.

    Open access
  • Baró, Arnau; Chen, Jialuo; Fornés, Alicia; Megyesi, Beáta

    Towards a Generic Unsupervised Method for Transcription of Encoded Manuscripts

    In Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, 2019.

  • Megyesi, Beáta; Palmér, Anne; Näsman, Jesper

    SWEGRAM: Annotering och analys av svenska texter

    2019.

    Open access
  • Megyesi, Beáta; Blomqvist, Nils; Pettersson, Eva

    The DECODE Database: Collection of Historical Ciphers and Keys

    In Proceedings of the 2nd International Conference on Historical Cryptology, pp. 69-78, 2019.

    Open access
  • Dahllöf, Mats

    Clustering writing components from medieval manuscripts

    In Proceedings of the Workshop on Computational Methods in the Humanities 2018, pp. 23-32, 2019.

    Open access
  • Tang, Marc

    A typology of classifiers and gender: From description to computation

    Open access
  • Basirat, Ali; Tang, Marc

    Lexical and Morpho-syntactic Features in Word Embeddings: A Case Study of Nouns in Swedish

    In Proceedings of the 10th International Conference on Agents and Artificial Intelligence, pp. 663-674, 2018.

  • Stymne, Sara; de Lhoneux, Miryam; Smith, Aaron; Nivre, Joakim

    Parser Training with Heterogeneous Treebanks

    In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 619-625, 2018.

    Open access
  • Šoštarić, Margita; Hardmeier, Christian; Stymne, Sara

    Discourse-Related Language Contrasts in English-Croatian Human and Machine Translation

    In Proceedings of the Third Conference on Machine Translation: Research Papers, pp. 36-48, 2018.

    Open access
  • Volodina, Elena; Granstedt, Lena; Megyesi, Beáta; Prentice, Julia et al.

    Annotation of learner corpora: first SweLL insights

    In Abstracts of SLTC 2018, pp. 86-89, 2018.

    Open access
  • Dubremetz, Marie; Nivre, Joakim

    Rhetorical Figure Detection: Chiasmus, Epanaphora, Epiphora

    In Frontiers in Digital Humanities, 2018.

  • Dahllöf, Mats

    Automatic Scribe Attribution for Medieval Manuscripts

    In Digital Medievalist, pp. 1-26, 2018.

    Open access
  • Tang, Gongbo; Müller, Mathias; Rios, Annette; Sennrich, Rico

    Why Self-Attention?: A Targeted Evaluation of Neural Machine Translation Architectures

    In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 4263-4272, 2018.

  • Schuster, Sebastian; Nivre, Joakim; Manning, Christopher D.

    Sentences with Gapping: Parsing and Reconstructing Elided Predicates

    In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, pp. 1156-1168, 2018.

  • Nivre, Joakim; Marongiu, Paola; Ginter, Filip; Kanerva, Jenna et al.

    Enhancing Universal Dependency Treebanks: A Case Study

    In Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), pp. 102-107, 2018.

  • Bouma, Gosse; Hajič, Jan; Haug, Dag; Nivre, Joakim et al.

    Expletives in Universal Dependency Treebanks

    In Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), pp. 18-26, 2018.

  • Tang, Gongbo; Sennrich, Rico; Nivre, Joakim

    An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation

    In Proceedings of the Third Conference on Machine Translation, pp. 26-35, 2018.

  • Smith, Aaron; Bohnet, Bernd; de Lhoneux, Miryam; Nivre, Joakim et al.

    82 Treebanks, 34 Models: Universal Dependency Parsing with Multi-Treebank Models

    In Proceedings of the CoNLL 2018 Shared Task, pp. 113-123, 2018.

  • Smith, Aaron; de Lhoneux, Miryam; Stymne, Sara; Nivre, Joakim

    An Investigation of the Interactions Between Pre-Trained Word Embeddings, Character Models and POS Tags in Dependency Parsing

    In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2711-2720, 2018.

  • Tang, Gongbo; Cap, Fabienne; Pettersson, Eva; Nivre, Joakim

    An evaluation of neural machine translation models on historical spelling normalization

    In Proceedings of the 27th International Conference on Computational Linguistics, pp. 1320-1331, 2018.

  • Megyesi, Beáta; Granstedt, Lena; Johansson, Sofia; Prentice, Julia et al.

    Learner Corpus Anonymization in the Age of GDPR: Insights from the Creation of a Learner Corpus of Swedish

    In Proceedings of the 7th NLP4CALL, 2018.

    Open access
  • Søgaard, Anders; de Lhoneux, Miryam; Augenstein, Isabelle

    Nightmare at test time: How punctuation prevents parsers from generalizing

    In Proceedings of the 2018 EMNLP Workshop BlackboxNLP, pp. 25-29, 2018.

    Open access
  • de Lhoneux, Miryam; Bjerva, Johannes; Augenstein, Isabelle; Søgaard, Anders

    Parameter sharing between dependency parsers for related languages

    In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 4992-4997, 2018.

    Open access
  • Dahllöf, Mats

    Clustering Writing Components from Medieval Manuscripts

    In COMHUM 2018: Book of Abstracts for the Workshop on Computational Methods in the Humanities 2018, pp. 11-13, 2018.

    Open access
  • Megyesi, Beáta

    Proceedings of the 1st International Conference on Historical Cryptology: HistoCrypt 2018

    2018.

    Open access
  • Basirat, Ali

    Principal Word Vectors

    Open access
  • Shao, Yan; Hardmeier, Christian; Nivre, Joakim

    Universal Word Segmentation: Implementation and Interpretation

    In Transactions of the Association for Computational Linguistics, pp. 421-435, 2018.

    Open access
  • Pettersson, Eva; Megyesi, Beáta

    The HistCorp Collection of Historical Corpora and Resources

    In DHN 2018, pp. 306-320, 2018.

    Open access
  • Shao, Yan

    Segmenting and Tagging Text with Neural Networks

    Open access
  • Zarei, F.; Basirat, Ali; Faili, H.; Mirain, M.

    A bootstrapping method for development of Treebank

    In Journal of experimental and theoretical artificial intelligence (Print), pp. 19-42, 2017.

  • Ide, Nancy; Calzolari, Nicoletta; Eckle-Kohler, Judith; Gibbon, Dafydd et al.

    Community Standards for Linguistically-Annotated Resources

    In Handbook of Linguistic Annotation, pp. 113-165, 2017.

  • Hammarström, Harald; Virk, Shafqat Mumtaz; Forsberg, Markus

    Poor Man’s OCR Post-Correction: Unsupervised Recognition of Variant Spelling Applied to a Multilingual Document Collection

    In Proceedings of the Digital Access to Textual Cultural Heritage (DATeCH) conference, pp. 71-75, 2017.

  • Dubremetz, Marie; Nivre, Joakim

    Machine Learning for Rhetorical Figure Detection: More Chiasmus with Less Annotation

    In Proceedings of the 21st Nordic Conference of Computational Linguistics, pp. 37-45, 2017.

  • Virk, Shafqat Mumtaz; Borin, Lars; Saxena, Anju; Hammarström, Harald

    Automatic extraction of typological linguistic features from descriptive grammars

    In Text, Speech, and Dialogue, pp. 111-119, 2017.

  • Shao, Yan; Hardmeier, Christian; Tiedemann, Jörg; Nivre, Joakim

    Character-based Joint Segmentation and POS Tagging for Chinese using Bidirectional RNN-CRF

    In Proceedings of the 8th International Joint Conference on Natural Language Processing, pp. 173-183, 2017.

    Open access
  • Shao, Yan; Hardmeier, Christian; Nivre, Joakim

    Recall is the Proper Evaluation Metric for Word Segmentation

    In Proceedings of the 8th International Joint Conference on Natural Language Processing, pp. 86-90, 2017.

    Open access
  • de Lhoneux, Miryam; Shao, Yan; Basirat, Ali; Kiperwasser, Eliyahu et al.

    From raw text to Universal Dependencies: look, no tags!

    In Proceedings of the CoNLL 2017 Shared Task, pp. 207-217, 2017.

    Open access
  • Shao, Yan

    Cross-lingual Word Segmentation and Morpheme Segmentation as Sequence Labelling

    In Proceedings of MLP 2017, pp. 75-80, 2017.

    Open access
  • Parks, Magdalena; Karlgren, Jussi; Stymne, Sara

    Plausibility Testing for Lexical Resources

    In Proceedings of CLEF 2017, pp. 132-137, 2017.

  • Adams, Allison; Stymne, Sara

    Learning with learner corpora: Using the TLE for native language identification

    In Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition, pp. 1-7, 2017.

    Open access
  • Stymne, Sara

    The Effect of Translationese on Tuning for Statistical Machine Translation

    In Proceedings of the 21st Nordic Conference on Computational Linguistics, pp. 241-246, 2017.

    Open access
  • Loáiciga, Sharid; Stymne, Sara; Nakov, Preslav; Hardmeier, Christian et al.

    Findings of the 2017 DiscoMT Shared Task on Cross-lingual Pronoun Prediction

    In Proceedings of the Third Workshop on Discourse in Machine Translation, 2017.

    Open access
  • Stymne, Sara; Loáiciga, Sharid; Cap, Fabienne

    A BiLSTM-based System for Cross-lingual Pronoun Prediction

    2017.

  • Padilla López, Rebeca; Cap, Fabienne

    Did you ever read about Frogs drinking Coffee?: Investigating the Compositionality of Multi-Emoji Expressions

    In Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pp. 113-117, 2017.

    Open access