Creating a corpus for Kven, a minority language in Norway




Corpus linguistics, revitalisation, minority language, Kven


Language documentation, including the development and use of corpora, is frequently linked to revitalisation. This is also the case for Kven, a minoritised Finnic language traditionally spoken in the two northernmost counties of Norway. Kven is a recognised minority language in Norway, protected by the European Charter for Regional or Minority Languages. This status led to increased efforts to document Kven, including the development of the Ruija Corpus, which consists of recordings of interviews in Kven. The corpus was an important tool for the standardisation of Kven. In this article, we describe how the corpus was developed, account for its search functions, and discuss its limitations. We also discuss the role of corpora and other online tools in language revitalisation, with a particular focus on the standardisation of Kven. We conclude by reflecting on how expertise also resides with the speakers of an endangered language, who have a right to be involved in efforts of language documentation and revitalisation.


Austin, Peter. 2020. Language documentation and revitalisation. In Revitalizing Endangered Languages: A Practical Guide, edited by Justyna Olko and Julia Sallabank, pp. 199–219. Cambridge University Press, Cambridge.

Costa, James, Haley De Korne and Pia Lane. 2017. Standardising minority languages: Reinventing peripheral languages in the 21st century. In Standardizing Minority Languages: Competing Ideologies of Authority and Authenticity in the Global Periphery, edited by Pia Lane, James Costa and Haley De Korne, pp. 1–23. Routledge, New York.

Eira, Christine. 2007. Addressing the ground of language endangerment. In Working Together for Endangered Languages: Research Challenges and Social Impacts – Proceedings of Foundation for Endangered Languages Conference XI Kuala Lumpur October 26–28 2007, edited by Maya K. David, Nicholas Ostler and Caesar Dealwis, pp. 82–90. Foundation for Endangered Languages.

Evans, Nicholas and Alan Dench. 2006. Introduction: Catching language. In Catching Language: The Standing Challenge of Grammar Writing, edited by Felix Ameka, Alan Dench and Nicholas Evans, pp. 1–39. Mouton de Gruyter, Berlin, New York.

Gal, Susan. 2006. Contradictions of standard language in Europe: Implications for the study of publics and practices. Social Anthropology 14(2): 163–181.


Haugen, Einar. 1972. The ecology of language: Essays by Einar Haugen. Selected and introduced by Anwar S. Dil. Stanford University Press, Stanford.

Hill, Jane. 2002. “Expert rhetorics” in advocacy for endangered languages: who is listening, and what do they hear? Journal of Linguistic Anthropology 12(2): 119–133.

Hinton, Leanne. 2018. Approaches to and strategies for language revitalization. In The Oxford Handbook of Endangered Languages, edited by Kenneth Rehg and Lyle Campbell, pp. 443–465. Oxford University Press, New York.

Hinton, Leanne, Leena Huss and Gerald Roche. 2018. Language revitalization as a growing field of study and practice. In The Routledge Handbook of Language Revitalization, edited by Leanne Hinton, Leena Huss and Gerald Roche, pp. xxi–xxx. Routledge, New York.

Hyltenstam, Kenneth and Tommaso M. Milani. 2003. Kvenskans Status: Rapport for Kommunal- og regionaldepartementet og Kultur- og Kirkedepartementet i Norge. Oslo.

IMS Open Corpus Workbench:

Johannessen, Janne Bondi, Lars Nygaard, Joel Priestley and Anders Nøklestad. 2008. Glossa: a multilingual, multimodal, configurable user interface. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), edited by Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis and Daniel Tapias, pp. 617–621. European Language Resources Association (ELRA), Paris.

Kaplan, Robert, Richard Baldauf, Anthony Liddicoat, Pauline Bryant, Marie-Thérèse Barbaux and Martin Pütz. 2000. Current issues in language planning. Current Issues in Language Planning 1(1): 1–10.

Keränen, Mari. 2018. Language maintenance through corpus planning – the case of Kven. Acta Borealia 35(2): 176–191.

Kloss, Heinz. 1967. Bilingualism and nationalism. Journal of Social Issues 23(2): 39–47.

Kven online dictionary, Nettidigisanat

Latomaa, Sirkku and Pirkko Nuolijärvi. 2005. The language situation in Finland. In Language Planning and Policy in Europe, Vol. 1. Hungary, Finland and Sweden, edited by Robert B. Kaplan and Richard B. Baldauf, pp. 125–232. Multilingual Matters, Clevedon.

Lane, Pia. 2011. The birth of the Kven language in Norway: Emancipation through state recognition. International Journal of the Sociology of Language 209: 57–74.

Lane, Pia. 2015. Minority language standardisation and the role of users. Language Policy 14: 263–283.

Lane, Pia. 2016. Standardising Kven: Participation and the role of users. Sociolinguistica 30: 105–124.

Lane, Pia. 2017. Language standardisation as frozen mediated actions – the materiality of language standardization. In Standardizing Minority Languages: Competing Ideologies of Authority and Authenticity in the Global Periphery, edited by Pia Lane, James Costa and Haley De Korne, pp. 101–117. Routledge, New York.

Lane, Pia and Miki Makihara. 2017. Indigenous peoples and their languages. In The Oxford Handbook of Language and Society, edited by Ofelia García, Nelson Flores and Massimiliano Spotti, pp. 299–230. Oxford University Press, New York.

Leonard, Wesley. 2017. Producing language reclamation by decolonising ‘language’. Language Documentation and Description 14: 15–36.

Milroy, James and Leslie Milroy. 1999. Authority in Language: Investigating Standard English. Routledge, London.

Niemi, Einar. 1995. The Finns in northern Scandinavia and minority policy. In Ethnicity and Nation Building in the Nordic World, edited by Sven Tägil, pp. 145–178. Hurst and Co., London.

Niemi, Einar. 2003. Regimeskifte, innvandrere og fremmede. In Norsk innvandringshistorie. I nasjonalstatens tid 1814–1940, edited by Einar Niemi, Jan Eivind Myhre and Knut Kjeldstadli, pp. 11–47. Pax forlag, Valdres.

Nøklestad, Anders, Kristin Hagen, Janne Bondi Johannessen, Michal Kosek and Joel Priestley. 2017. A modernised version of the Glossa corpus search system. In Proceedings of the 21st Nordic Conference on Computational Linguistics (NoDaLiDa), edited by Jörg Tiedemann and Nina Tahmasebi, pp. 251–254. Association for Computational Linguistics, Gothenburg.

O’Rourke, Bernadette and Joan Pujolar. 2015. New speakers of minority languages: the challenging opportunity – Foreword. International Journal of the Sociology of Language 231: 1–20.

Pietikäinen, Sari, Leena Huss, Sirkka Laihiala-Kankainen, Ulla Aikio-Puoskari and Pia Lane. 2010. Regulating multilingualism. Acta Borealia 27(1): 1–23.

Ruija Corpus:

Sundelin, Egil. 1998. Kvenene – en nasjonal minoritet i Nord-Troms og Finnmark? In Kvenenes historie og kultur, edited by Helge Guttormsen, pp. 35–48. Nord-Troms historielag, Skjervøy.

Söderholm, Eira. 2014. Kainun Kielen Grammatikki. Suomalaisen Kirjallisuuden Seura, Helsinki.

Söderholm, Eira. 2017. Kvensk Grammatikk. Cappelen Damm Akademisk, Oslo.

Søfteland, Åshild, Anders Nøklestad, Joel Priestley and Kristin Hagen. 2020. Glossa som forskningsverktøy. Hva folk søker etter og hva resultatene brukes til. Oslo Studies in Language 11(2): 449–464.

Trosterud, Sindre Reino, Trond Trosterud, Anna-Kaisa Räisänen, Leena Niiranen, Mervi Haavisto and Kaisa Maliniemi. 2017. A morphological analyser for Kven. In Proceedings of the Third Workshop on Computational Linguistics for Uralic Languages, pp. 76–88. Association for Computational Linguistics, St. Petersburg.

Trosterud, Trond. 2019. Kva bruker vi minoritetsspråksordbøker til? Ein studie av brukarloggane for tolv tospråklege ordbøker. LexicoNordica 26: 177–198.

Woolard, Kathryn. 2008. Language and identity choice in Catalonia: The interplay of contrasting ideologies of linguistic authority. In Lengua, Nación e Identidad: La Regulación del Plurilingüismo en España y América Latina, edited by Kirsten Süselbeck, Ulrike Mühlschlegel and Peter Masson, pp. 303–323. Vervuert, Frankfurt am Main / Iberoamericana, Madrid.

Wright, Sue. 2004. Language Policy and Language Planning: From Nationalism to Globalisation. Palgrave Macmillan, Basingstoke.

Östman, Jan-Ola. 2000. Ethics and appropriation – with special reference to Hwalbáy. In Issues of Minority Peoples, edited by Frances Karttunen and Jan-Ola Östman, pp. 37–60. Department of General Linguistics, University of Helsinki.