M. S. Krafczyk, A. Shi, A. Bhaskar, D. Marinov, V. Stodden.
Abstract
We carry out efforts to reproduce computational results for seven published articles and identify barriers to computational reproducibility. We then derive three principles to guide the practice and dissemination of reproducible computational research: (i) Provide transparency regarding how computational results are produced; (ii) When writing and releasing research software, aim for ease of (re-)executability; (iii) Make any code upon which the results rely as deterministic as possible. We then exemplify these three principles with 12 specific guidelines for their implementation in practice. We illustrate the three principles of reproducible research with a series of vignettes from our experimental reproducibility work. We define a novel Reproduction Package, a formalism that specifies a structured way to share computational research artifacts that implements the guidelines generated from our reproduction efforts to allow others to build, reproduce and extend computational science. We make our reproduction efforts in this paper publicly available as exemplar Reproduction Packages. This article is part of the theme issue 'Reliability and reproducibility in computational science: implementing verification, validation and uncertainty quantification in silico'.
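Principle (iii) above calls for making result-producing code as deterministic as possible. A minimal, illustrative sketch of one common tactic, pinning all randomness to an explicit seed, is shown below; the function name `simulate` and the use of Python's standard `random` module are assumptions for the example and are not taken from the paper:

```python
import random

def simulate(seed):
    """Toy 'simulation' whose output depends only on the seed.

    Using a dedicated random.Random instance (rather than the module-level
    global RNG) keeps the state isolated, so repeated runs with the same
    seed yield byte-identical results.
    """
    rng = random.Random(seed)
    return [rng.random() for _ in range(5)]

# Same seed -> identical output; a prerequisite for reproducing figures.
assert simulate(42) == simulate(42)
```

In practice, the same idea extends to library-level seeds (e.g. NumPy generators) and to recording the chosen seed alongside the results.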
Keywords: code packaging; open code; open data; reproducibility; software testing; verification
Year: 2021 PMID: 33775145 PMCID: PMC8059663 DOI: 10.1098/rsta.2020.0069
Source DB: PubMed Journal: Philos Trans A Math Phys Eng Sci ISSN: 1364-503X Impact factor: 4.226
Origin of featured articles in this study.
| DOI | origin |
|---|---|
| | JCP Study |
| | JCP Study |
| | JCP Study |
| | JCP Study |
| | JCP Study |
| | found while trying to reproduce [ |
| | found while trying to reproduce [ |
Figure 1. Our efforts to reproduce each article took varying amounts of time (up to 40 h) and yielded varying levels of success. We tracked our progress on each article by noting when significant events occurred and how much of the article we had reproduced at each of these points. Progress is measured by first enumerating the computed numbers within tables and the plots within figures, referred to as 'assets'. We rate our completion of each article as the percentage of assets that have been reproduced. Details about which assets were completed, and when, are recorded in each Reproduction Package in a file named notes.txt. We present the completion percentage over time for each article in this study. The solid line is for article [18], the dotted line is for [14], the long dashed line is for [17], the short dashed line is for [16], the long dash dotted line is for [13], the short dash dotted line is for [15] and the long dash double dotted line is for [12]. For each figure, the x-axis is time measured in hours; the y-axis is the completion measured in percentage. Occasionally, we reached insurmountable roadblocks, terminating our reproduction effort prematurely (before 40 h had been spent). (Online version in colour.)
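The completion metric described in the caption is a simple ratio of reproduced assets to total assets. A small sketch of that calculation, using hypothetical asset counts (the numbers below are not from the paper):

```python
def completion_pct(reproduced, total):
    """Percentage of an article's assets (table entries and figure plots)
    that have been successfully reproduced."""
    return 100.0 * reproduced / total

# Hypothetical article with 24 enumerated assets, 18 reproduced so far.
print(completion_pct(18, 24))  # 75.0
```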
Figure 2. Reproduction of figure 5c from article [15] after the transformation, with conspicuous empty spaces. (Online version in colour.)
Figure 3. Reproduction of figure 5c from article [15] after the transformation and with conspicuous empty spaces removed. (Online version in colour.)