Leslie D McIntosh1, Anthony Juehne2, Cynthia R H Vitale3, Xiaoyan Liu1, Rosalia Alcoser1, J Christian Lukas1, Bradley Evanoff4.
Abstract
BACKGROUND: The reproducibility of research is essential to rigorous science, yet significant concerns about the reliability and verifiability of biomedical research have recently been highlighted. Ongoing efforts across several domains of science and policy are working to clarify the fundamental characteristics of reproducibility and to enhance the transparency and accessibility of research.
Keywords: Accessibility; EHR; Electronic health records; Replication; Reproducibility; Secondary data re-use; Transparency
Year: 2017 PMID: 28923006 PMCID: PMC5604503 DOI: 10.1186/s12874-017-0377-6
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Fig. 1 Workflow to identify elements needed to reproduce studies
Categories for research reproducibility (RR) variables
| Axes of Research Reproducibility | Example | Categories |
|---|---|---|
| Transparency is the robust write-up or description of research, such that it is clear and explicit. | All data collection processes are described clearly within publication methods and metadata. | data collection, data cleaning/preparation, data integration, data analysis, data sharing, code (cleaning, integration, analysis), data, software, documentation |
| Accessibility is a multi-faceted term encompassing both sharing and discoverability. Shared information such as a research dataset or analysis code must be discoverable, in a form that people can use, and available. Discoverability is defined as being in a location that enables the finding of the data and supplemental materials. | A query script used in data collection procedures is shared in a freely accessible and easily discoverable database. |
RepeAT framework variables where inter-rater reliability could be calculated using Cohen’s kappa
| RepeAT Framework Variable | Cohen’s Kappa | Kappa Bounds (lower–upper) | var Rater 1 | var Rater 2 | Percent Agreement (%) |
|---|---|---|---|---|---|
| Publication states database(s) source(s) of data? | 0.320 | (0.060–0.580) | 0.095 | 0.250 | 70.6 |
| Does the publication clearly state process(es) for validating data mined via NLP and/or queried from a database? | 0.440 | (0.019–0.860) | 0.182 | 0.069 | 85.7 |
| Does the author state any clear process documented for accounting for missing data? | 0.520 | (0.140–0.890) | 0.115 | 0.261 | 83.3 |
| Does the research involve natural language processing or text mining? | 0.870 | (0.630–1.100) | 0.134 | 0.107 | 97.1 |
| Does the author indicate the software used to develop the analysis code? | 0.880 | (0.710–1.000) | 0.236 | 0.243 | 94.1 |
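As a rough illustration of the two agreement statistics reported in the table, the sketch below computes percent agreement and Cohen's kappa for two raters answering the same Yes/No item. It uses the standard definition, kappa = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function names and the sample ratings are hypothetical and are not taken from the study data, and this is not necessarily the procedure the authors used.

```python
from collections import Counter


def percent_agreement(rater1, rater2):
    """Share of items (in percent) on which both raters gave the same response."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater1, rater2))
    return 100.0 * matches / len(rater1)


def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    n = len(rater1)
    p_o = percent_agreement(rater1, rater2) / 100.0
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # Chance agreement: product of each rater's marginal proportions, summed over labels.
    labels = set(counts1) | set(counts2)
    p_e = sum(counts1[label] * counts2[label] for label in labels) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical ratings of one framework item across four publications.
rater_1 = ["Yes", "Yes", "No", "No"]
rater_2 = ["Yes", "No", "No", "No"]
print(percent_agreement(rater_1, rater_2))  # 75.0
print(cohens_kappa(rater_1, rater_2))       # 0.5
```

The example shows why the two columns can diverge: the raters agree on 75% of items, but half of that agreement is expected by chance given their marginal response rates, so kappa is only 0.5.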
RepeAT framework categories and concepts
| Reproducibility Category | Major Concepts |
|---|---|
| Research Design and Aim | Recording administrative and study information |
| Database and Data Collection Methods | Clarifying study data source(s) and methods of collection |
| Data Mining and Data Cleaning | Describing process for cleaning, merging, and validating data |
| Data Analysis | Clarifying methods and materials for data analysis |
| Data Sharing and Documentation | Making relevant research data and documentation shared, accessible, and intelligible |
Abbreviated RepeAT framework with example variablesᵃ
| Variable | Response type or example value |
|---|---|
| Publication Overview and Bibliographic Information (21 items) | |
| Article Title | Text |
| DOI | Text |
| Is the research hypothesis-driven or hypothesis-generating? | Hypothesis Driven |
| Database and Data Collection (63 items) | |
| Publication states database(s) source(s) of data? | Yes/No |
| ᵇ Publication states database(s) source(s) of data in the following location: | Not Stated |
| Query methodology | Manual extraction |
| ᵇ Does the shared query script for database contain comments and/or notations for ease of reproducibility? | Yes/No |
| Methods: Data Mining and Cleaning (19 items) | |
| Does the research involve natural language processing or text mining? | Yes/No |
| ᵇ Please list all software applications used for text mining: | Text |
| ᵇ Is the text mining software application proprietary or open? | 1. Proprietary |
| Methods: Data Analysis (15 items) | |
| Does the author state analysis methodology and process? | Yes/No |
| Does the author indicate the software used to develop the analysis code? | Yes/No |
| ᵇ Is the analysis software proprietary or open? | Proprietary |
| Data Sharing and Data Documentation (36 items) | |
| Is the finalized dataset shared? | Yes |
| ᵇ Where is the finalized dataset shared? | Affiliated Research Center Website |
| Is there a clear process for requesting the data? | Yes |

ᵃ The full framework can be found in Additional file 4: Appendix, as well as online in this project's GitHub repository (https://github.com/CBMIWU/Research_Reproducibility/tree/master/DataDictionary) and our project's Open Science Framework project management tool (Additional file 3: https://osf.io/ppnwa/).
ᵇ Indicates items shown through skip-logic, that is, only if a specific response to another item has been selected.
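
The skip-logic behind the items marked b can be sketched as a small data dictionary in which a dependent item names the parent item and the response that reveals it. This is a minimal illustration only: the `Item` class, the item keys, and the assumption that a "Yes" response triggers the follow-up questions are hypothetical, and none of it is taken from the project's actual data dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Item:
    key: str                                    # hypothetical short identifier
    prompt: str                                 # question text shown to the rater
    show_if: Optional[Tuple[str, str]] = None   # (parent key, response that reveals this item)


# Three items from the Data Mining and Cleaning section; the trigger value "Yes" is assumed.
ITEMS = [
    Item("uses_nlp", "Does the research involve natural language processing or text mining?"),
    Item("nlp_software", "Please list all software applications used for text mining:",
         show_if=("uses_nlp", "Yes")),
    Item("nlp_license", "Is the text mining software application proprietary or open?",
         show_if=("uses_nlp", "Yes")),
]


def visible_items(items: List[Item], responses: Dict[str, str]) -> List[str]:
    """Return keys of items to display, given the responses recorded so far."""
    shown = []
    for item in items:
        if item.show_if is None:
            shown.append(item.key)   # unconditional item, always displayed
            continue
        parent_key, required = item.show_if
        if responses.get(parent_key) == required:
            shown.append(item.key)   # dependent item revealed by the parent response
    return shown


print(visible_items(ITEMS, {"uses_nlp": "Yes"}))  # ['uses_nlp', 'nlp_software', 'nlp_license']
print(visible_items(ITEMS, {"uses_nlp": "No"}))   # ['uses_nlp']
```

Keeping the condition on the dependent item, rather than on the parent, lets new follow-up questions be added without editing the question that triggers them.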