CLARINO language resource DSpace repository at the University of Bergen Library
DOI: https://doi.org/10.7557/5.3649
Abstract
CLARINO is a Norwegian infrastructure project jointly funded by the Research Council of Norway and a consortium of Norwegian universities and research institutions. Its goal is to implement the Norwegian part of CLARIN. Every CLARIN Centre type B is required to run a dedicated language resource data repository in accordance with certain criteria for good practice and compatibility in the CLARIN infrastructure. The University of Bergen Library (UBL), which participates in CLARINO, was assigned the task of implementing and running a repository primarily to manage the resources at the University of Bergen. The repository is also open to other partners in CLARINO and to the whole CLARIN community.
In 2013 the University of Bergen decided to use the open source software application DSpace, as modified by the Institute of Formal and Applied Linguistics at Charles University in Prague for their CLARIN/LINDAT repository.
This poster and demo present the architecture of the CLARINO Bergen repository. We describe the motivation for using the LINDAT repository as a model and the process of setting up the required functions in Bergen, adapting the LINDAT software where needed.
UBL had previous experience with DSpace from the implementation of the Bergen Open Research Archive. This experience showed that DSpace is a functional and stable platform which is open source and well maintained by an active user community. It provides long-term storage and linking, suitable authentication mechanisms, handling of licenses for downloading resources, and an OAI-PMH endpoint from which metadata can be harvested.
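As a rough illustration of harvesting from an OAI-PMH endpoint, the sketch below only builds a `ListRecords` request URL. The verb `ListRecords`, the `metadataPrefix` and `set` arguments, and the `oai_dc` prefix come from the OAI-PMH protocol itself. The base URL and the helper name `build_list_records_url` are hypothetical, and the prefix under which a given repository exposes CMDI records is an assumption that would need to be checked against that repository.

```python
from typing import Optional
from urllib.parse import urlencode


def build_list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, used here only to show the shape of the request.
url = build_list_records_url("https://repository.example.org/oai/request")
```

A harvester would fetch this URL and follow any resumption tokens returned in the response; that part is omitted here.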
Our poster presents the Bergen CLARINO repository as a good example of sharing technical solutions. We will show the main features which have been added in LINDAT in order to adapt DSpace for use in CLARIN, namely CMDI metadata integration and a method for license handling which adds the possibility of signing licenses.
We will point out remaining technical challenges. A major one has to do with the metadata flow. Our system is able to generate CMDI metadata, as required by CLARIN, from DSpace internal metadata fields. However, there are differences between the Norwegian (CLARINO) and Czech (LINDAT) metadata profiles. Furthermore, DSpace cannot accommodate arbitrary new CMDI profiles. A principal limitation is that DSpace local metadata have a flat structure, while CMDI has a hierarchical structure in which components may be embedded in other components.
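The mismatch between flat and hierarchical metadata can be illustrated with a minimal sketch. It is not the LINDAT or CLARINO implementation: the field names, the component names, and the mapping table below are all hypothetical and do not correspond to any actual CMDI profile. The sketch assumes each flat field maps to one fixed path of nested components.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat, DSpace-style metadata: one value per dotted field name.
FLAT = {
    "dc.title": "Example corpus",
    "dc.language.iso": "nob",
    "contact.person.name": "A. Researcher",
    "contact.person.email": "a.researcher@example.org",
}

# Hypothetical mapping from each flat field to a path of nested components.
FIELD_TO_PATH = {
    "dc.title": ("ResourceInfo", "Title"),
    "dc.language.iso": ("ResourceInfo", "Language", "ISO639"),
    "contact.person.name": ("ResourceInfo", "Contact", "Person", "Name"),
    "contact.person.email": ("ResourceInfo", "Contact", "Person", "Email"),
}


def flat_to_tree(flat, mapping, root_tag="Components"):
    """Build a nested XML tree from flat fields using a fixed path mapping."""
    root = ET.Element(root_tag)
    for field, value in flat.items():
        path = mapping.get(field)
        if path is None:
            continue  # fields without a mapping are silently dropped
        node = root
        # Reuse an existing intermediate component or create it.
        for tag in path[:-1]:
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        leaf = ET.SubElement(node, path[-1])
        leaf.text = value
    return root


tree = flat_to_tree(FLAT, FIELD_TO_PATH)
xml_text = ET.tostring(tree, encoding="unicode")
```

A fixed table of this kind serves one profile only, so each new profile needs a new table. It also cannot tell which name belongs with which email if a record has two contact persons, because the flat fields carry no grouping information.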