Generating a lexicon of Scandinavian modals

Gunnar Hrafn Hrafnbjargarson, The Text Laboratory, University of Oslo
Keywords: Danish, Faroese, Icelandic, Norwegian, parallel corpora, Scandinavian modal auxiliaries, Swedish, The Sophie Treebank


Morphology and phonology are often enough to work out which words correspond to one another across the Scandinavian languages. For instance, it is fairly easy to see which Norwegian personal pronoun corresponds to which Danish one, and even to which Icelandic or Faroese one. For prepositions and modal verbs, however, we cannot rely on morphology or phonology alone. Norwegian and Danish måtte, for example, do not always have the same meaning, and Icelandic vilja is not used as a future modal the way Norwegian and Danish ville are. Instead of relying on morphology or phonology, we can use parallel corpora. Unfortunately, few parallel corpora include all of the Scandinavian languages, and those that exist may not be large enough to give reliable results. Nevertheless, to get a picture of what such a lexicon could look like, the Danish, Faroese, Icelandic, Norwegian, Swedish, and English parts of a small treebank, The Sophie Treebank, were used to find out which modal verbs correspond to which across the languages.
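
To illustrate the general idea of reading modal correspondences off a parallel corpus, the following is a minimal sketch in Python. It is not the procedure used for The Sophie Treebank: it assumes sentence-aligned, already lemmatized input and simply counts which modal lemmas co-occur in aligned sentence pairs. The modal lists, the function name modal_correspondences, and the three toy sentence pairs are all invented for illustration and are not taken from the treebank.

```python
from collections import Counter, defaultdict

# Hypothetical, non-exhaustive modal lemma lists (an assumption of this sketch).
MODALS = {
    "no": {"ville", "måtte", "kunne", "skulle"},
    "is": {"vilja", "munu", "mega", "geta", "skulu"},
}


def modal_correspondences(pairs, src, tgt):
    """Count co-occurrences of source and target modals in aligned sentence pairs.

    `pairs` is an iterable of (source_lemmas, target_lemmas) tuples,
    one tuple per aligned sentence pair.
    """
    counts = defaultdict(Counter)
    for src_lemmas, tgt_lemmas in pairs:
        src_modals = MODALS[src] & set(src_lemmas)
        tgt_modals = MODALS[tgt] & set(tgt_lemmas)
        # Every source modal is paired with every target modal in the same
        # sentence pair; a real system would use word alignment instead.
        for s in src_modals:
            for t in tgt_modals:
                counts[s][t] += 1
    return counts


# Invented toy data (Norwegian lemmas, Icelandic lemmas), not corpus data.
toy_pairs = [
    (["jeg", "ville", "komme"], ["ég", "munu", "koma"]),
    (["hun", "ville", "ha", "kaffe"], ["hún", "vilja", "kaffi"]),
    (["det", "ville", "regne"], ["það", "munu", "rigna"]),
]

counts = modal_correspondences(toy_pairs, "no", "is")
for src_modal, targets in counts.items():
    print(src_modal, targets.most_common())
```

Sentence-level co-occurrence is a deliberately crude stand-in for word alignment. With a corpus as small as the one described above, the resulting counts would be suggestive at best, which is the size limitation the abstract points out.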