Joshua D Wallach, Kevin W Boyack, John P A Ioannidis.
Abstract
There is growing interest in ensuring the transparency and reproducibility of the published scientific literature. According to a previous evaluation of 441 biomedical journal articles published in 2000–2014, the biomedical literature largely lacked transparency in important dimensions. Here, we surveyed a random sample of 149 biomedical articles published between 2015 and 2017 and determined the proportion reporting sources of public and/or private funding and conflicts of interest, sharing protocols and raw data, and undergoing rigorous independent replication and reproducibility checks. We also investigated what can be learned about reproducibility and transparency indicators from open access data provided on PubMed. The majority of the 149 studies disclosed some information regarding funding (103, 69.1% [95% confidence interval, 61.0% to 76.3%]) or conflicts of interest (97, 65.1% [56.8% to 72.6%]). Among the 104 articles with empirical data in which protocols or data sharing would be pertinent, 19 (18.3% [11.6% to 27.3%]) discussed publicly available data; only one (1.0% [0.1% to 6.0%]) included a link to a full study protocol. Among the 97 articles in which replication in studies with different data would be pertinent, there were five replication efforts (5.2% [1.9% to 12.2%]). Although clinical trial identification numbers and funding details were often provided on PubMed, only two of the articles without a full-text article in PubMed Central that discussed publicly available data at the full-text level also contained information related to data sharing on PubMed; none had a conflicts of interest statement on PubMed.
Our evaluation suggests that although there have been improvements over the last few years in certain key indicators of reproducibility and transparency, opportunities exist to improve reproducible research practices across the biomedical literature and to make features related to reproducibility more readily visible in PubMed.
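The bracketed intervals in the abstract can be recomputed from the raw counts. The abstract does not name the interval method. The sketch below assumes a Wilson score interval with continuity correction, which matches the reported 61.0% to 76.3% for 103 of 149 to one decimal place, but it may not be the method the authors actually used. The function name `wilson_cc_interval` is illustrative and not from the source.

```python
from math import sqrt
from statistics import NormalDist


def wilson_cc_interval(successes, n, level=0.95):
    """Wilson score interval with continuity correction for a proportion.

    Assumption: the source does not say which interval it used; this
    variant is chosen only because it agrees with the reported figures.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    p = successes / n
    denom = 2 * (n + z * z)

    lower = 0.0
    if successes > 0:
        root = sqrt(z * z - 2 - 1 / n + 4 * p * (n * (1 - p) + 1))
        lower = (2 * n * p + z * z - 1 - z * root) / denom

    upper = 1.0
    if successes < n:
        root = sqrt(z * z + 2 - 1 / n + 4 * p * (n * (1 - p) - 1))
        upper = (2 * n * p + z * z + 1 + z * root) / denom

    return max(0.0, lower), min(1.0, upper)


if __name__ == "__main__":
    # Funding disclosure: 103 of 149 articles (counts from the abstract).
    low, high = wilson_cc_interval(103, 149)
    print(f"{103 / 149:.1%} [{low:.1%} to {high:.1%}]")
```

The same call with other counts from the abstract (for example 19 of 104 for data sharing) gives a quick consistency check on the remaining intervals.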
Year: 2018 PMID: 30457984 PMCID: PMC6245499 DOI: 10.1371/journal.pbio.2006930
Source DB: PubMed Journal: PLoS Biol ISSN: 1544-9173 Impact factor: 8.029
Data sharing characteristics among 19 biomedical articles with a data sharing statement.
| PMID | Data statement | Category (a) | On PubMed (b) | Functioning (c) |
|---|---|---|---|---|
| 26484203 | “Gene Expression Omnibus (GEO) database repository with the dataset identifier GSE63072.” | Identifiers/accession numbers | Yes | Yes |
| 27096608 | “All data are made available on a public repository (OpenfMRI, accession number ds000202). All other relevant data are added to the text as supplementary material.” | Identifiers/accession numbers | No | Yes |
| 27348411 | “NCBI Sequence Read Archive: TCR sequence data, | Identifiers/accession numbers; excel data | Yes | Yes |
| 27617276 | “Raw data derived from this analysis have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PRIDE: PXD002768.” | Identifiers/accession numbers | Yes | Yes |
| 28970499 | “The dataset generated during and/or analysed during the current study are available from the corresponding author on reasonable request.” | Upon request | No | N/A |
| 28632753 | “All relevant data are within the paper and its Supporting Information files.” | Excel data | No | Yes |
| 28241009 | “All relevant data are within the paper and its Supporting Information files. The accession codes for LSSmCherry1 and RDSmCherry1 are KX638424 and KX638425, which can be viewed here: | Identifiers/accession numbers; excel data | No | Yes |
| 28886694 | “Sequence data of 15 RNA-seq have been uploaded to the NCBI database, and the SRA number was SRX2843778.” | Identifiers/accession numbers | No | No |
| 27214551 | “Supplemental material available online with this article.” | Primers used for qPCR analyses; excel data | No | Yes |
| 27791002 | “The data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, | Identifiers/accession numbers | No | Yes |
| 26238763 | “The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD001593 and 10.6019/PXD001592.” | Identifiers/accession numbers | No | Yes |
| 27768894 | “The accession number for the coordinates and structures factors of CB1_AM6538 reported in this paper is PDB: 5TGZ.” | Identifiers/accession numbers | No | Yes |
| 25252277 | “We demonstrate SCoTMI on publicly available resting-state fMRI data from the Human Connectome Project.” | Public data | No | No |
| 26639818 | “ | Identifiers/accession numbers | No | Yes |
| 27871817 | “Collection data and GenBank accession numbers for | Identifiers/accession numbers | Yes | N/A |
| 28349993 | “The genotype data of the 1000 Genomes Project Phase 1 based on 1,092 healthy subjects—525 male (48.1%) and 567 female (51.9%; | Data link | No | Yes |
| 28528644 | “Gene expression array data will be provided or personal research purposes through the corresponding author; residual tissues from the studies may be applied for through the Tayside Tissue Bank, Dundee, Scotland.” | Upon request | No | N/A |
| 28412520 | “The accession number for the RNA-sequencing and whole-genome sequencing data reported in this paper is Sequence Read Archive: SRP100435.” | Identifiers/accession numbers | No | Yes |
| 27108998 | “The complete genome sequence of SAIBK2 obtained in this study was submitted to Genbank database under the accession number of KU317090.” | Identifiers/accession numbers | No | Yes |
* PubMed Central Open Access (PMCOA) articles.
a Category of data sharing.
b Data sharing information available at the abstract/PubMed level.
c Were the links, identifiers, or accession numbers functioning?
Abbreviations: N/A, not applicable; PMID, PubMed identification number.
Articles in the PMCOA versus non-PMCOA and PMCID versus non-PMCID categories.
| Variable (a) | PMCOA, n | PMCOA, % | Non-PMCOA, n | Non-PMCOA, % | P value (b) | PMCID, n | PMCID, % | Non-PMCID, n | Non-PMCID, % | P value (b) |
|---|---|---|---|---|---|---|---|---|---|---|
| * | | | | | | | | | | |
| No Mention | 10 | 27.0 | 36 | 32.1 | | 13 | 20.3 | 33 | 38.8 | |
| No Funding | 2 | 5.4 | 8 | 7.1 | | 2 | 3.1 | 8 | 9.4 | |
| Public | 12 | 32.4 | 43 | 38.4 | | 31 | 48.4 | 24 | 28.2 | |
| Private | 0 | 0.0 | 3 | 2.7 | | 0 | 0.0 | 3 | 3.5 | |
| Other | 5 | 13.5 | 6 | 5.4 | | 5 | 7.8 | 6 | 7.1 | |
| Some combination of Public, Private, or Other | 8 | 21.6 | 16 | 14.3 | | 13 | 20.3 | 11 | 12.9 | |
| No Statement | 6 | 16.2 | 46 | 41.1 | * | 19 | 29.7 | 33 | 38.8 | |
| Statement, No Conflict Exists | 28 | 75.7 | 59 | 52.7 | | 39 | 60.9 | 48 | 56.5 | |
| Statement, Conflict Exists | 3 | 8.1 | 7 | 6.2 | | 6 | 9.4 | 4 | 4.7 | |
| Full Protocol | 0 | 0.0 | 1 | 1.4 | | 0 | 0.0 | 1 | 1.8 | |
| No Protocol | 29 | 100.0 | 74 | 98.6 | | 47 | 100.0 | 56 | 98.2 | |
| Some Data Sharing | 9 | 31.0 | 10 | 13.3 | * | 12 | 25.5 | 7 | 12.3 | |
| No Data Sharing | 20 | 69.0 | 65 | 86.7 | | 35 | 74.5 | 50 | 87.7 | |
| Novel Findings | 14 | 53.8 | 42 | 59.2 | | 27 | 61.4 | 29 | 54.7 | |
| Replication | 0 | 0.0 | 5 | 7.0 | | 0 | 0.0 | 5 | 9.4 | |
| Novel Findings and Replication | 2 | 7.7 | 8 | 11.3 | | 5 | 11.4 | 5 | 9.4 | |
| No Statement on Novelty or Replication | 10 | 38.5 | 16 | 22.5 | | 12 | 27.3 | 14 | 26.4 | |
| No Citing Article | 26 | 100.0 | 69 | 97.2 | | 44 | 100.0 | 51 | 96.2 | |
| At Least One Citing Article | 0 | 0.0 | 2 | 2.8 | | 0 | 0.0 | 2 | 3.8 | |
| No Citing Article | 26 | 100.0 | 70 | 98.6 | | 44 | 100.0 | 52 | 98.1 | |
| At Least One Citing Article, No Data Included | 0 | 0.0 | 1 | 1.4 | | 0 | 0.0 | 1 | 1.9 | |
| At Least One Citing Article, Data Excluded | 0 | 0.0 | 0 | 0.0 | | 0 | 0.0 | 0 | 0.0 | |
| At Least One Citing Article, Data Included | 0 | 0.0 | 0 | 0.0 | | 0 | 0.0 | 0 | 0.0 | |
a Funding, Statement of conflict, Protocol availability, and Data availability determined using the full text of articles. Replication determined using the abstract and/or introduction.
b P values based on Fisher’s exact test *<0.05 and **0.05 to 0.005.
Abbreviations: PMCID, PubMed Central reference number; PMCOA, PubMed Central Open Access.
Comparison of 2000–2014 and 2015–2017 samples.
| Variable (a) | 2000–2014, n | 2000–2014, % | 2015–2017, n | 2015–2017, % | P value (b) |
|---|---|---|---|---|---|
| No Mention | 226 | 51.3 | 46 | 30.9 | 1.4 × 10⁻⁵ |
| No Funding | 12 | 2.7 | 10 | 6.7 | |
| Public | 87 | 19.7 | 55 | 36.9 | |
| Private | 19 | 4.3 | 3 | 2.0 | |
| Other | 29 | 6.6 | 11 | 7.4 | |
| Some combination of Public, Private, or Other | 68 | 15.4 | 24 | 16.1 | |
| No Statement | 305 | 69.2 | 52 | 34.9 | 2.5 × 10⁻¹³ |
| Statement, No Conflict Exists | 110 | 24.9 | 87 | 58.4 | |
| Statement, Conflict Exists | 26 | 5.9 | 10 | 6.7 | |
| Any Protocol | 1 | 0.4 | 1 | 1.0 | 0.48 |
| No Protocol | 267 | 99.6 | 103 | 99.0 | |
| Some Data Sharing | 5 (c) | 1.9 | 19 | 18.3 | 9.7 × 10⁻⁸ |
| No Data Sharing | 263 | 98.1 | 85 | 81.7 | |
| Novel Findings | 133 | 51.4 | 56 | 57.7 | 3.0 × 10⁻⁴ |
| Replication | 5 | 1.9 | 5 | 5.2 | |
| Novel Findings and Replication | 5 | 1.9 | 10 | 10.3 | |
| No Statement on Novelty or Replication | 111 | 42.9 | 26 | 26.8 | |
| No Abstract | 5 | 1.9 | 0 | 0.0 | |
| No Citing Article | 251 | 96.9 | 95 | 97.9 | 0.73 |
| At Least One Citing Article | 8 | 3.1 | 2 | 2.1 | |
| No Citing Article | 221 | 85.3 | 96 | 99.0 | 8.9 × 10⁻⁴ |
| At Least One Citing Article, No Data Included | 19 | 7.3 | 1 | 1.0 | |
| At Least One Citing Article, Data Excluded | 3 | 1.2 | 0 | 0.0 | |
| At Least One Citing Article, Data Included | 16 | 6.2 | 0 | 0.0 | |
| PMCID: Yes | 87 | 19.3 | 64 | 42.9 | 6.7 × 10⁻⁸ |
| PMCID: No | 354 | 80.7 | 85 | 57.1 | |
| PMCOA: Yes | 33 | 7.5 | 37 | 24.8 | 1.2 × 10⁻⁷ |
| PMCOA: No | 408 | 92.5 | 112 | 75.2 | |
a Funding, statement of conflict, protocol availability, and data availability determined using the full text of articles. Replication determined using the abstract and/or introduction.
b Based on Fisher’s exact test.
c All were partial data sharing or data sharing statements (complete data set upon request statement, supplementary PDB file provided, demographic and clinical features of patients provided in supplementary table, GenBank links provided, and a link to an original data source used for the study provided).
Abbreviations: PMCID, PubMed Central reference number; PMCOA, PubMed Central Open Access.
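The P values in the comparison table are based on Fisher's exact test. As a hedged illustration, the sketch below implements the two-sided test for a 2×2 table only, using the common rule of summing the probabilities of all tables that are no more likely than the observed one. The source does not say which software or two-sided convention was used, and the multi-category rows (for example funding) would need an r×c version that is not shown. The function name `fisher_exact_2x2` is illustrative.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]].

    Assumption: "two-sided" means summing the hypergeometric probabilities
    of every table (with the same margins) that is no more likely than the
    observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / total

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so ties are not lost to rounding.
    p_value = sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-7)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Data sharing, counts from the table: 5 vs 263 (2000-2014)
    # and 19 vs 85 (2015-2017).
    print(f"P = {fisher_exact_2x2(5, 263, 19, 85):.2g}")
```

Any other two-row comparison in the table (protocol availability, PMCID, PMCOA) can be passed to the same function by supplying its four counts.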
Fig 1. Proportion of articles with funding or COI statements, 2000–2017 (3-year moving proportion).
Underlying data for Fig 1 can be found at https://osf.io/3ypdn/. COI, conflicts of interest.
Fig 2. Proportion of articles with data sharing statement, 2000–2017 (3-year moving proportion).
Underlying data for Fig 2 can be found at https://osf.io/3ypdn/.
Fig 3. Proportion of articles reporting novel, replication, or unclear findings, 2000–2017 (3-year moving proportion).
Underlying data for Fig 3 can be found at https://osf.io/3ypdn/.
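The figure captions refer to a 3-year moving proportion without defining it. One plausible reading, sketched below as an assumption, pools the numerators and denominators over each window of three consecutive years and labels the result with the window's last year. The function name `moving_proportion` and the counts in the usage example are invented for illustration and are not the study's data.

```python
def moving_proportion(counts_by_year, window=3):
    """Pooled proportion over a sliding window of consecutive years.

    counts_by_year maps year -> (articles_with_feature, articles_total).
    Assumption: years are contiguous and each window is labelled by its
    final year; the source does not specify either choice.
    """
    years = sorted(counts_by_year)
    result = {}
    for i in range(window - 1, len(years)):
        span = years[i - window + 1 : i + 1]
        hits = sum(counts_by_year[y][0] for y in span)
        total = sum(counts_by_year[y][1] for y in span)
        # None marks a window with no articles at all.
        result[span[-1]] = hits / total if total else None
    return result


if __name__ == "__main__":
    # Hypothetical per-year counts, for illustration only.
    example = {2000: (1, 10), 2001: (2, 10), 2002: (3, 10), 2003: (6, 10)}
    for year, share in moving_proportion(example).items():
        print(year, f"{share:.1%}")
```

Pooling counts, rather than averaging three yearly percentages, keeps years with few sampled articles from dominating a window.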
Number of articles across 3-year periods.
| Years | Number of articles |
|---|---|
| 2000–2002 | 79 |
| 2003–2005 | 89 |
| 2006–2008 | 79 |
| 2009–2011 | 91 |
| 2012–2014 | 103 |