FinUgRevita: Developing Language Technology Tools for Udmurt and Mansi

  • Veronika Vincze, University of Szeged, Department of Informatics
  • Ágoston Nagy, University of Szeged, Institute of English-American Studies
  • Csilla Horváth, University of Szeged, Institute of English-American Studies
  • Norbert Szilágyi, University of Szeged, Department of Finno-Ugric Studies
  • István Kozmács, University of Szeged, Department of Finno-Ugric Studies
  • Edit Bogár, University of Szeged, Institute of English-American Studies
  • Anna Fenyvesi, University of Szeged, Institute of English-American Studies

Abstract

Digital language use, such as reading and writing e-mails, chats, messages, weblogs and comments on websites and social media platforms like Facebook and Twitter, has increased the amount of written language that most users produce. It is therefore especially important for speakers of minority languages to be able to use their own languages in the digital world too. The FinUgRevita project aims to provide computational language tools for endangered indigenous Finno-Ugric languages in Russia, helping their speakers use these languages in the digital space. Currently, we are working on two Finno-Ugric minority languages, namely Udmurt and Mansi. In the project, we have been developing electronic dictionaries for both languages and creating corpora from a substantial number of texts collected from sources including literature, newspaper articles and social media. We have also been implementing morphological analyzers for both languages, exploiting the lexical entries of our dictionaries. We believe that the results of the FinUgRevita project will contribute to the revitalization of Udmurt and Mansi, and that the tools under development will help these languages establish their presence in the digital space as well.

Published
2015-06-17