Seminar in Computational Linguistics

  • Date: –14:30
  • Location: Engelska parken 22-1017
  • Lecturer: Roni Katzir
  • Contact person: Miryam de Lhoneux
  • Seminar

Representing and learning phonological knowledge

The representation and learning of linguistic knowledge are closely connected. I will discuss two aspects of this connection, focusing on phonological knowledge. Starting from representations, I show how fixing an explicit scheme for storing phonological knowledge yields an evaluation criterion, based on the principle of Minimum Description Length (MDL), that allows the learner to compare competing hypotheses regarding the grammar. When combined with a general-purpose optimization procedure (e.g., simulated annealing or a genetic algorithm), this criterion supports an unsupervised learner for phonology. I show how an implemented MDL learner of this kind succeeds in learning complex phonological patterns that have challenged other approaches to learning.
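
As a rough illustration of the idea in the paragraph above, here is a minimal Python sketch of an MDL criterion paired with simulated annealing. It is not the learner presented in the talk. Everything in it is an assumption made for illustration: the "grammar" is only a segment inventory, `UNIVERSAL` is a hypothetical universal segment set, and the encoding scheme, function names, and annealing parameters are invented here. The score of a hypothesis is the number of bits needed to state the grammar plus the number of bits needed to encode the data given that grammar; the search keeps the lowest-scoring hypothesis it visits.

```python
import math
import random

# Hypothetical universal segment set (an assumption for this toy example).
UNIVERSAL = "abcdefghijklmnopqrstuvwxyz"


def grammar_bits(inventory):
    """Cost of stating the grammar: each listed segment is chosen from UNIVERSAL."""
    return len(inventory) * math.log2(len(UNIVERSAL))


def data_bits(inventory, corpus):
    """Cost of the data given the grammar.

    Each position encodes one inventory segment or an end-of-word marker.
    A grammar that cannot generate the corpus gets infinite cost.
    """
    if any(ch not in inventory for word in corpus for ch in word):
        return math.inf
    per_symbol = math.log2(len(inventory) + 1)
    return sum(len(word) + 1 for word in corpus) * per_symbol


def mdl(inventory, corpus):
    """Total description length: |G| + |D given G|, in bits."""
    return grammar_bits(inventory) + data_bits(inventory, corpus)


def anneal(corpus, steps=2000, temp=50.0, cooling=0.995, seed=0):
    """Simulated annealing over inventories, using mdl() as the objective."""
    rng = random.Random(seed)
    current = set(UNIVERSAL)
    current_cost = mdl(current, corpus)
    best, best_cost = set(current), current_cost
    for _ in range(steps):
        # Neighbour hypothesis: add or remove a single segment.
        neighbour = current ^ {rng.choice(UNIVERSAL)}
        cost = mdl(neighbour, corpus)
        delta = cost - current_cost
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = neighbour, cost
            if cost < best_cost:
                best, best_cost = set(neighbour), cost
        temp *= cooling
    return best, best_cost


corpus = ["banana", "bandana", "nab"]
print(mdl(set(UNIVERSAL), corpus))  # overly permissive hypothesis
print(mdl(set("abnd"), corpus))     # tighter hypothesis, shorter description
print(anneal(corpus))
```

The sketch shows only the division of labour: the representational scheme fixes how grammar and data are measured, and the optimizer is generic. A realistic phonological learner would search over rules or constraints and a lexicon rather than a bare inventory.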


Turning to the opposite direction, I show how fixing a general learning approach such as MDL makes it possible to evaluate competing hypotheses regarding the correct representational scheme for grammars. I illustrate this with a case study concerning the phonological lexicon. Early work in phonology made use of language-specific constraints on the lexicon, while more recent work (especially within Optimality Theory) has rejected such constraints. I show that if language-specific constraints are banned, an MDL learner fails to learn aspects of phonological knowledge that human speakers acquire, whereas if such constraints are allowed, the learner succeeds. This constitutes an argument in favor of allowing language-specific constraints on the lexicon.