Abstract
Reproducibility is not only essential for the integrity of scientific research but is also a prerequisite for model validation and refinement for the future application of predictive algorithms. However, reproducible research is becoming increasingly challenging, particularly in high-dimensional genomic data analyses with complex statistical or algorithmic techniques. Given that most biomedical and statistical journals do not require authors to provide the original data, analytical source code, or other relevant materials for publication, accessibility to these supplements naturally suggests greater credibility of the published work. In this study, we performed a reproducibility assessment of the notable paper by Gerstung et al. (Nat Genet 49:332-340, 2017) by rerunning the analysis using their original code and data, which are publicly accessible. Despite an open science setting, it was challenging to reproduce the entire research project; reasons included incomplete data and documentation, suboptimal code readability, coding errors, limited portability of intensive computing performed on a specific platform, and an R computing environment that could no longer be re-established. We learn that the availability of code and data does not guarantee transparency and reproducibility of a study; paradoxically, the source code remains liable to error and obsolescence, essentially due to methodological and computational complexity, a lack of reproducibility checking at submission, and updates to software and operating environments. Complex code may also hide problematic methodological aspects of the proposed research. Building on the experience gained, we discuss the best programming and software engineering practices that could have been employed to improve reproducibility, and propose practical criteria for the conduct and reporting of reproducibility studies for future researchers.
Year: 2022 PMID: 35429300 PMCID: PMC9360099 DOI: 10.1007/s00439-022-02455-8
Source DB: PubMed Journal: Hum Genet ISSN: 0340-6717 Impact factor: 5.881
Reproducibility checklist
| Aspect | Item No | Item | Note |
|---|---|---|---|
| Accessibility (yes/partially/no) | 1a | Data | Are the FAIR principles respected? Is there sufficient metadata to understand the data? Is the data provided in an easy-to-use format? |
| | 1b | Is the data (if available) original, processed, anonymized, or simulated? | Simulated data is usually provided when the original data is confidential |
| | 1c | Data dictionary | A collection of names, definitions, descriptions, etc., of variables in the dataset(s) of the research project |
| | 2 | Source code | Is the source code a plain text script or presented as a dynamic report? |
| | 3 | Documentation of the project | Is there a README file, technical note, and/or study protocol? |
| | 4 | Statistical software | Is the specific version, e.g. R (v.3.1.2), accessible? |
| | 5 | Software extensions | Are the specific versions, e.g. survival (v.3.2-13), accessible? |
| | 6 | Operating system and hardware layer | Does the reproducer have access to the same computing platform, e.g. Debian GNU/Linux 11? |
| | 7 | Can dependencies be set up easily on the reproducer's computing platform? | e.g. By running simple commands |
| | 8 | If not, are there any compatibility issues hindering the setup process? | |
| Clarity (yes/partially/no) | 9 | Description of methods | e.g. Theoretical concepts, analytical strategies, algorithmic considerations |
| | 10 | Code readability | Is the code self-explanatory, regardless of comments? Does the code follow any style guide? Are compiled languages like C or C++ used? |
| | 11 | Inline comments | Are the comments comprehensible? Are there unnecessary obsolete code lines? |
| | 12 | Documentation of custom packages and functions, if applicable | e.g. R package vignette |
| Code execution | 13 | Is any form of testing on the functions/packages performed? | e.g. |
| | 14 | (Running analysis code) On mouse-clicks / Minor modifications required / Major modifications with expertise required (e.g. reverse engineering of results) / Impossible to rerun | |
| Implementation of the theoretical methods | 15 | Consistent / Largely consistent / Largely inconsistent / Unable to identify | Does the code reflect the methods described in the paper? |
| Matching of outputs | 16 | Format of the results | e.g. tables, figures, GUI (graphical user interface) |
| | 17 | Identical with exactly the same results / Same interpretation with deviations in numbers / Inconsistent conclusions / Unable to reproduce the results | |
| Overall reproducibility | 18 | Reproducible / Partially reproducible / Irreproducible | |
| | 19 | Background of researcher(s) performing the assessment | e.g. Clinician, epidemiologist, bioinformatician, (bio)statistician, engineer |
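Checklist items 4 and 5 ask whether the exact software and package versions can be recovered. In R this is usually captured with `sessionInfo()` or a lockfile; as a minimal, language-agnostic sketch of the same idea (hypothetical, not from the assessed paper), one can record the interpreter, operating system, and package versions in a machine-readable snapshot that a reproducer can later diff against their own environment:

```python
import json
import platform
from importlib import metadata


def snapshot_environment(packages):
    """Record interpreter, OS, and package versions so a reproducer
    can later compare them against the original computing environment."""
    record = {
        "language_version": platform.python_version(),
        "os": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            record["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Missing packages are flagged rather than silently skipped.
            record["packages"][name] = None
    return record


if __name__ == "__main__":
    # "pip" is only an illustrative package name.
    print(json.dumps(snapshot_environment(["pip"]), indent=2))
```

Shipping such a snapshot alongside the code (checklist item 3) lets a reproducer detect version drift before rerunning the analysis, rather than discovering it through obscure runtime failures.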
Reproducibility assessment of the study by Gerstung et al.
| Aspect | Item No | Item | Assessment | Note |
|---|---|---|---|---|
| Accessibility (yes/partially/no) | 1a | Data | Partially | Most data available; TCGA clinical data not provided |
| | 1b | Is the data (if available) original, processed, anonymized, or simulated? | Anonymized | |
| | 1c | Data dictionary | Partially | Data dictionary incomplete |
| | 2 | Source code | Yes | Documented in the accompanying Supplementary Note |
| | 3 | Documentation of the project | Yes | Supplementary Note and README files provided |
| | 4 | Statistical software | Yes | R (v.3.1.2) |
| | 5 | Software extensions | Yes | Version information listed in the Supplementary Note |
| | 6 | Operating system and hardware layer | Partially | The authors have no access to the LSF environment used by Gerstung et al. |
| | 7 | Can dependencies be set up easily on the reproducer's computing platform? | No | Dockerfile is not complete enough to allow a rebuild of the original computing environment |
| | 8 | If not, are there any compatibility issues hindering the setup process? | Yes | Today's R (v.3.1.2) does not support many of the packages used; conflicts between the OS and R (v.3.1.2) occurred |
| Clarity (yes/partially/no) | 9 | Description of methods | Yes | Documented in the Supplementary Note |
| | 10 | Code readability | Partially | Some variable names not self-explanatory or consistent; coding errors spotted; R style guide seemingly not followed; C++ used via the Rcpp package |
| | 11 | Inline comments | Partially | Inline comments are helpful but not sufficient; several code lines are commented out but not deleted |
| | 12 | Documentation of custom packages and functions, if applicable | Yes | Two custom packages with documentation provided |
| Code execution | 13 | Is any form of testing on the functions/packages performed? | Partially | |
| | 14 | (Running analysis code) On mouse-clicks / Minor modifications required / Major modifications with expertise required (e.g. reverse engineering of results) / Impossible to rerun | Major modifications with expertise required | Coding errors, limited cross-platform portability |
| Implementation of the theoretical methods | 15 | Consistent / Largely consistent / Largely inconsistent / Unable to identify | Consistent | |
| Matching of outputs | 16 | Format of the results | Main paper: figures; Supplementary: tables and figures | |
| | 17 | Identical with exactly the same results / Same interpretation with deviations in numbers / Inconsistent conclusions / Unable to reproduce the results | Same interpretation with deviations in numbers / Unable to reproduce the results | Minor deviations spotted among the results that could be regenerated, see |
| Overall reproducibility | 18 | Reproducible / Partially reproducible / Irreproducible | Partially reproducible | |
| | 19 | Background of researcher(s) performing the assessment | A clinical epidemiologist and a mathematician | |
Fig. 1 Six-layered structure of an R computing environment. The Dockerfile from Gerstung et al. constructs layers upon a pre-built Docker-based R (v.3.1.2), which was itself built from the instructions in the Rocker project's Dockerfile; however, dependency conflicts occur across these layers
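To illustrate the layered structure described in Fig. 1, a minimal Dockerfile could pin each layer explicitly — base image, OS libraries, R packages, and analysis code. This is a hypothetical sketch, not the actual Dockerfile from Gerstung et al.; the package names and snapshot repository URL are illustrative only:

```dockerfile
# Layers 1-2: pinned base image from the Rocker project (fixed R version)
FROM rocker/r-ver:3.1.2

# Layer 3: OS-level libraries required to compile R package dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        libcurl4-openssl-dev libxml2-dev \
    && rm -rf /var/lib/apt/lists/*

# Layers 4-5: R packages installed from a dated snapshot repository,
# so the package versions match those used in the original analysis
# (the URL below is a placeholder, not a real repository)
RUN Rscript -e 'install.packages("survival", repos = "https://cran.example.org/snapshot/2017-01-01")'

# Layer 6: the analysis code itself
COPY analysis/ /home/analysis/
```

Pinning every layer in this way is what would have allowed the original computing environment to be rebuilt; an incomplete Dockerfile (checklist item 7) leaves later layers to drift with current package and OS versions.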
Fig. 2 Flow chart showing treatments and prognoses of AML patients in the knowledge bank database
Fig. 3 Predicted 3-year mortality reduction from allograft in CR1, as opposed to salvage allograft after relapse (y-axis), versus predicted 3-year mortality under standard chemotherapy only (x-axis). The date of complete remission is the starting point. Individual patients are denoted by points and colored by ELN risk classification; the curve shows the fitted population-average mortality. a The KBA was based on the entire cohort of 1540 patients, while predictions were calculated for the 995 patients eligible for allograft at CR1; adapted from Gerstung et al. (2017). b Modified KBA based on the 995 eligible patients