@article{Eglen_Nüst_2019,
  place = {Tromsø, Norway},
  title = {CODECHECK: An open-science initiative to facilitate sharing of computer programs and results presented in scientific publications},
  url = {https://septentrio.uit.no/index.php/SCS/article/view/4910},
  DOI = {10.7557/5.4910},
  abstractNote = {Analysis of data and computational modelling is central to most scientific disciplines. The underlying computer programs are complex and costly to design. However, these computational techniques are rarely checked during review of the corresponding papers, nor shared upon publication. Instead, the primary method for sharing data and computer programs today is for authors to state "data available upon reasonable request", although the actual code and data are the only sufficiently detailed description of a computational workflow that allows reproduction and reuse. Despite best intentions, these programs and data can quickly disappear from laboratories. Furthermore, there is a reluctance to share: only 8% of papers in recent top-tier AI conferences shared code relating to their publications (Gundersen et al. 2018). This low rate of code sharing is seen in other fields, e.g. computational physics (Stodden et al. 2018). Given that code and data are rich digital artefacts that can be shared relatively easily, and that funders and journal publishers increasingly mandate sharing of resources, we should share more and follow best practices for data and software publication. The permanent archival of valuable code and datasets would allow other researchers to make use of these resources in their work, and improve both the reliability of reporting and the quality of tools.
We are building a computational platform, called CODECHECK (http://www.codecheck.org.uk), to enhance the availability, discovery and reproducibility of published computational research. Researchers who provide code and data will have their code independently run to ensure that the computational parts of a workflow can be reproduced. The results from our independent run will then be shared freely post-publication in an open repository. The reproduction is attributed to the person performing the check. Our independent runs will act as a "certificate of reproducible computation". These certificates will be of use to several parties at different times during the generation of a scientific publication. (1) Prior to peer review, the researchers themselves can check that their code runs on a separate platform. (2) During peer review, editors and reviewers can check whether the figures in the certificate match those presented in manuscripts under review, without cumbersome download and installation procedures. (3) Once published, any interested reader can download the software and even the data used to generate the results shown in the certificate.
The code and results from papers are shared according to the principles we recently outlined (Eglen et al. 2017). To ensure that our system scales to large numbers of papers and is trustworthy, it will be as automated as possible, fully open itself, and rely on open source software and open scholarly infrastructure.
This presentation will discuss the challenges faced to date in building the system and connecting it with existing peer-review principles, as well as plans for links with open-access journals.
Acknowledgements: This work has been funded by the UK Software Sustainability Institute, a Mozilla Open Science mini-grant and the German Research Foundation (DFG) under project number PE 1632/17-1.},
  number = {1},
  journal = {Septentrio Conference Series},
  author = {Eglen, Stephen and Nüst, Daniel},
  year = {2019},
  month = {Sep.}
}