Publications in Computational Linguistics
-
What Was Encoded in Historical Cipher Keys in the Early Modern Era?
Part of Proceedings of the 5th International Conference on Historical Cryptology. HistoCrypt 2022, 2022.
-
Identifying Cleartext in Historical Ciphers
Part of Proceedings of the Workshop on Language Technologies for Historical and Ancient Languages. LT4HALA 2022, 2022.
-
The DECODE Database of Historical Ciphers and Keys: Version 2
Part of Proceedings of the 5th International Conference on Historical Cryptology. HistoCrypt 2022, p. 111-114, 2022.
-
Lost in Transcription of Graphic Signs in Ciphers
Part of Proceedings of the 5th International Conference on Historical Cryptology. HistoCrypt 2022, p. 153-158, 2022.
-
Whit’s the Richt Pairt o Speech: PoS tagging for Scots
Part of Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), p. 39-48, 2021.
-
Investigation of Transfer Languages for Parsing Latin: Italic Branch vs. Hellenic Branch
Part of Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), p. 315-320, 2021.
-
Survey and reproduction of computational approaches to dating of historical texts
Part of Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), p. 145-156, 2021.
-
Uppsala NLP at SemEval-2021 Task 2: Multilingual Language Models for Fine-tuning and Feature Extraction in Word-in-Context Disambiguation
Part of Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), p. 150-156, 2021.
-
Swedish FrameNet++ and comparative linguistics
Part of The Swedish FrameNet++, p. 139-166, 2021.
-
Audiobook stylistics: Comparing print and audio in the bestselling segment
Part of Journal of Cultural Analytics, p. 1-30, 2021.
-
Universal Dependencies
Part of Computational Linguistics, p. 255-308, 2021.
-
Universals of Linguistic Idiosyncrasy in Multilingual Computational Linguistics
Part of Dagstuhl Reports, p. 89-138, 2021.
-
Syntactic Nuclei in Dependency Parsing – A Multilingual Exploration
Part of Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, p. 1376-1387, 2021.
-
Attention Can Reflect Syntactic Structure (If You Let It)
Part of Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, p. 3031-3045, 2021.
-
Revisiting Negation in Neural Machine Translation
Part of Transactions of the Association for Computational Linguistics, p. 740-755, 2021.
-
Bidirectional Domain Adaptation Using Weighted Multi-Task Learning
Part of IWPT 2021, p. 93-105, 2021.
-
Unsupervised Alphabet Matching in Historical Encrypted Manuscript Images
Part of Proceedings of the 4th International Conference on Historical Cryptology HistoCrypt 2021, 2021.
-
Key Design in the Early Modern Era in Europe
Part of Proceedings of the 4th International Conference on Historical Cryptology (HistoCrypt 2021), 2021.
-
Czech Historical Named Entity Corpus v 1.0
Part of Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), p. 4458-4465, 2020.
-
What Should/Do/Can LSTMs Learn When Parsing Auxiliary Verb Constructions?
Part of Computational linguistics - Association for Computational Linguistics (Print), p. 763-784, 2020.
-
The DReaM Corpus: A Multilingual Annotated Corpus of Grammars for the World's Languages
Part of Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), p. 878-884, 2020.
-
Exploiting Cross-lingual Hints to Discover Event Pronouns
Part of Proceedings of the 12th Conference on Language Resources and Evaluation (LREC), p. 99-103, 2020.
-
A Tale of Three Parsers: Towards Diagnostic Evaluation for Meaning Representation Parsing
Part of Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), p. 1902-1909, 2020.
-
SLäNDa: An Annotated Corpus of Narrative and Dialogue in Swedish Literary Fiction
Part of Proceedings of the 12th Language Resources and Evaluation Conference, p. 826-834, 2020.
-
A bird’s-eye view on South Asian languages through LSI
Part of Journal of South Asian languages and linguistics, p. 203-237, 2020.
-
Understanding Pure Character-Based Neural Machine Translation: The Case of Translating Finnish into English
Part of Proceedings of the 28th International Conference on Computational Linguistics, p. 4251-4262, 2020.
-
Real-valued syntactic word vectors
Part of Journal of experimental and theoretical artificial intelligence (Print), p. 557-579, 2020.
-
Multilingual Dependency Parsing from Universal Dependencies to Sesame Street
Part of Text, Speech, and Dialogue (TSD 2020), p. 11-29, 2020.
-
Cross-Lingual Domain Adaptation for Dependency Parsing
Part of Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories (TLT), p. 62-69, 2020.
-
Cross-lingual Embeddings Reveal Universal and Lineage-Specific Patterns in Grammatical Gender Assignment
Part of Proceedings of the 24th Conference on Computational Natural Language Learning, p. 265-275, 2020.
-
Evaluating Word Embeddings for Indonesian–English Code-Mixed Text Based on Synthetic Data
Part of Proceedings of the 4th Workshop on Computational Approaches to Code Switching, p. 26-35, 2020.
-
Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions
Part of Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, p. 107-118, 2020.
-
The University of Edinburgh-Uppsala University's Submission to the WMT 2020 Chat Translation Task
Part of Proceedings of the 5th Conference on Machine Translation (WMT), p. 473-478, 2020.
-
Coreference Strategies in English-German Translation
Part of Proceedings of the 3rd Workshop on Computational Models of Reference, Anaphora and Coreference, p. 139-153, 2020.
-
IESTAC: English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation
Part of Proceedings of the First International Workshop on Natural Language Processing Beyond Text, p. 41-50, 2020.
-
Tang, Gongbo
Understanding Neural Machine Translation: An investigation into linguistic phenomena and attention mechanisms
-
Text Processing Procedures for Analysing a Corpus with Medieval Marian Miracle Tales in Old Swedish
Part of Proceedings of the 12th International Conference on Agents and Artificial Intelligence, p. 452-458, 2020.
-
Marian Miracles in Old Swedish Texts
Part of Les miracles de Notre-Dame du Moyen Âge à nos jours, p. 179-190, 2020.
-
Towards Privacy by Design in Learner Corpora Research: A Case of On-the-fly Pseudonymization of Swedish Learner Essays
Part of Proceedings of the 28th International Conference on Computational Linguistics. COLING 2020, p. 357-369, 2020.
-
Kopsala: Transition-Based Graph Parsing via Efficient Training and Effective Encoding
Part of 16th International Conference on Parsing Technologies and IWPT 2020 Shared Task on Parsing Into Enhanced Universal Dependencies, p. 236-244, 2020.
-
Transcription of Historical Ciphers and Keys
Part of Proceedings of the 3rd International Conference on Historical Cryptology, p. 106-115, 2020.
-
Automatic Key Structure Extraction
Part of Proceedings of the 3rd International Conference on Historical Cryptology, p. 146-152, 2020.
-
Rubenson on the Move: A Biographical Journey
Part of Wisdom on the Move, p. 247-250, 2020.
-
Classification of Medieval Documents: Determining the Issuer, Place of Issue, and Decade for Old Swedish Charters
Part of DHN 2020 Digital Humanities in the Nordic Countries, p. 12-23, 2020.
-
A Web-based Interactive Transcription Tool for Encrypted Manuscripts
Part of Proceedings of the 3rd International Conference on Historical Cryptology HistoCrypt 2020, 2020.
-
A Statistical Explanation of the Distribution of Sortal Classifiers in Languages of the World via Computational Classifiers
Part of Journal of Quantitative Linguistics, p. 93-113, 2020.
-
Linguistic information in word embeddings
Part of Agents and Artificial Intelligence, p. 492-513, 2019.
-
Cross-lingual Incongruences in the Annotation of Coreference
Part of Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC), p. 26-34, 2019.
-
Entity Decisions in Neural Language Modelling: Approaches and Problems
Part of Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC), p. 15-19, 2019.
-
What Should/Do/Can LSTMs Learn When Parsing Auxiliary Verb Constructions?
Part of CoRR, 2019.
-
Prolog
Part of Patristica Nordica Annuaria, p. 3-4, 2019.
-
Gendered Ambiguous Pronouns (GAP) Shared Task at the Gender Bias in NLP Workshop 2019
Part of Gender Bias In Natural Language Processing (GEBNLP 2019), p. 1-7, 2019.
-
Understanding Neural Machine Translation by Simplification: The Case of Encoder-Free Models
Part of Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), p. 1186-1193, 2019.
-
Uppsala University and Gavagai at CLEF eRISK: Comparing Word Embedding Models
Part of Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum, Lugano, Switzerland, September 9-12, 2019, 2019.
-
How to Parse Low-Resource Languages: Cross-Lingual Parsing, Target Language Annotation, or Both?
Part of Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, SyntaxFest 2019), p. 112-120, 2019.
-
Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited
Part of Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), p. 2755-2768, 2019.
-
Encoders Help You Disambiguate Word Senses in Neural Machine Translation
Part of Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), p. 1429-1435, 2019.
-
The SweLL Language Learner Corpus: From Design to Annotation
Part of Northern European Journal of Language Technology (NEJLT), p. 67-104, 2019.
-
Mary Retold
Part of Ancient Jew Review, 2019.
-
Matching Keys and Encrypted Manuscripts
Part of Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa '19), 2019.
-
Faces, Fights, and Families: Topic Modeling and Gendered Themes in Two Corpora of Swedish Prose Fiction
Part of DHN 2019 Copenhagen, Proceedings of 4th Conference of The Association Digital Humanities in the Nordic Countries Copenhagen, March 6-8 2019, p. 92-111, 2019.
-
Pseudonymization of Language Learner Data
Part of Workshop om pseudonymisering av textdata, 2019.
-
Towards a Generic Unsupervised Method for Transcription of Encoded Manuscripts
Part of Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, 2019.
-
The DECODE Database: Collection of Historical Ciphers and Keys
Part of Proceedings of the 2nd International Conference on Historical Cryptology, p. 69-78, 2019.
-
Clustering writing components from medieval manuscripts
Part of Proceedings of the Workshop on Computational Methods in the Humanities 2018, p. 23-32, 2019.
-
Medical Entity Corpus with PICO Elements and Sentiment Analysis
Part of Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), p. 292-296, 2018.
-
ParCorFull: a Parallel Corpus Annotated with Full Coreference
Part of Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), p. 423-428, 2018.
-
Automatic Scribe Attribution for Medieval Manuscripts
Part of Digital Medievalist Medieval Book Hand Scribes, p. 1-26, 2018.
-
Lexical and Morpho-syntactic Features in Word Embeddings: A Case Study of Nouns in Swedish
Part of Proceedings of the 10th International Conference on Agents and Artificial Intelligence, p. 663-674, 2018.
-
Parser Training with Heterogeneous Treebanks
Part of Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), p. 619-625, 2018.
-
Discourse-Related Language Contrasts in English-Croatian Human and Machine Translation
Part of Proceedings of the Third Conference on Machine Translation: Research Papers, p. 36-48, 2018.
-
Annotation of learner corpora: first SweLL insights
Part of Abstracts of SLTC 2018, p. 86-89, 2018.
-
Rhetorical Figure Detection: Chiasmus, Epanaphora, Epiphora
Part of Frontiers in Digital Humanities, 2018.
-
Why Self-Attention?: A Targeted Evaluation of Neural Machine Translation Architectures
Part of Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, p. 4263-4272, 2018.
-
Sentences with Gapping: Parsing and Reconstructing Elided Predicates
Part of Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, p. 1156-1168, 2018.
-
Enhancing Universal Dependency Treebanks: A Case Study
Part of Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), p. 102-107, 2018.