Computational Linguistics Resources

Software and linguistic data (corpora of various kinds) are indispensable for research, application development, and teaching in computational linguistics and language technology. Below are several freely available resources developed by researchers in the department, in collaboration with researchers at other institutions:

  • MaltParser is a system for data-driven dependency parsing.
  • OPUS is a collection of parallel corpora.
  • Swedish Treebank is a syntactically annotated corpus of Swedish.
  • Uplug is a collection of tools for linguistic corpus processing, word alignment, and term extraction from parallel corpora.