Ester M Eckert, Andrea Di Cesare, Diego Fontaneto, Thomas U Berendonk, Helmut Bürgmann, Eddie Cytryn, Despo Fatta-Kassinos, Andrea Franzetti, D G Joakim Larsson, Célia M Manaia, Amy Pruden, Andrew C Singer, Nikolina Udikovic-Kolic, Gianluca Corno.
Abstract
Have you ever sought to use metagenomic DNA sequences reported in scientific publications? Were you successful? Here, we reveal that metagenomes from no fewer than 20% of the papers found in our literature search, published between 2016 and 2019, were not deposited in a repository or were simply inaccessible. The proportion of inaccessible data within the literature has been increasing year-on-year. Noncompliance with Open Data is best predicted by the scientific discipline of the journal. The number of citations, journal type (e.g., Open Access or subscription journals), and publisher are not good predictors of data accessibility. However, many publications in high-impact-factor journals do display a higher likelihood of accessible metagenomic data sets. Twenty-first century science demands compliance with the ethical standard of data sharing of metagenomes and DNA sequence data more broadly. Data accessibility must become one of the routine and mandatory components of manuscript submissions, a requirement that should be applicable across the increasing number of disciplines using metagenomics. Compliance must be ensured and reinforced by funders, publishers, editors, reviewers, and, ultimately, the authors.
Year: 2020 PMID: 32243442 PMCID: PMC7159239 DOI: 10.1371/journal.pbio.3000698
Source DB: PubMed Journal: PLoS Biol ISSN: 1544-9173 Impact factor: 8.029
Fig 1. Main trends in accessibility of metagenomic data.
(A) Proportion of papers with accessible metagenomic data and type of inaccessible data (grey) (B) in B, M, or T journals. (C) Temporal trends divided by B, M, and T journals. B, biological; M, multidisciplinary; T, technological.
Potential drivers of the proportion of papers with accessible data for each journal.
| Driver | LR Chisq | df | p-value | Effect |
|---|---|---|---|---|
| | 16.9 | 2 | 0.0002 | - |
| | 10.8 | 1 | 0.0010 | Negative |
| | 7.1 | 1 | 0.0075 | Positive |
| | 24.1 | 14 | 0.0447 | - |
| | 2.6 | 1 | 0.1094 | - |
| | 0.4 | 1 | 0.8378 | Negative |
The model accounts for the following drivers: the differences among biological, multidisciplinary, and technological journals (B-M-T), the impact factor (in log scale), the number of papers (in log scale), the publisher, Open Access, and the age (in log scale). The table is an analysis of deviance from a binomial GLM, reporting the LR Chisq, df, p-value, and, for continuous variables, the direction of the effect.
Abbreviations: B-M-T, biological, multidisciplinary, and technological journal; df, degrees of freedom; GLM, generalised linear model; LR Chisq, likelihood ratio chi-squared test
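To make the likelihood ratio chi-squared statistic in the table more concrete, the following is a minimal sketch of an LR test for a single categorical driver in a binomial model. It is not the authors' analysis: their GLM fits several drivers jointly, whereas this sketch handles one factor only, where the fitted probabilities reduce to the per-group proportions. The function names and the counts in the usage example are hypothetical, chosen purely for illustration.

```python
import math


def binomial_loglik(successes, totals, probs):
    """Log-likelihood of grouped binomial counts (binomial coefficients omitted,
    since they cancel in the likelihood ratio)."""
    ll = 0.0
    for k, n, p in zip(successes, totals, probs):
        # Skip terms with zero counts to avoid evaluating log(0).
        if k > 0:
            ll += k * math.log(p)
        if n - k > 0:
            ll += (n - k) * math.log(1.0 - p)
    return ll


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable.
    Closed forms are used, so only df = 1 and df = 2 are supported here."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch supports df = 1 or df = 2 only")


def lr_test_single_factor(successes, totals):
    """LR chi-squared test of one categorical driver.

    Null model: one shared probability of accessible data for all groups.
    Full model: a separate probability per group (the group proportions).
    """
    pooled = sum(successes) / sum(totals)
    ll_null = binomial_loglik(successes, totals, [pooled] * len(totals))
    ll_full = binomial_loglik(
        successes, totals, [k / n for k, n in zip(successes, totals)]
    )
    # Clamp at zero to absorb tiny negative values from rounding.
    lr = max(2.0 * (ll_full - ll_null), 0.0)
    df = len(totals) - 1
    return lr, df, chi2_sf(lr, df)


if __name__ == "__main__":
    # Hypothetical counts of papers with accessible data in three journal groups.
    accessible = [90, 70, 50]
    papers = [100, 100, 100]
    lr, df, p = lr_test_single_factor(accessible, papers)
    print(f"LR Chisq = {lr:.1f}, df = {df}, p-value = {p:.2g}")
```

A three-level factor such as B-M-T contributes 2 degrees of freedom, which is why the sketch returns `df = len(totals) - 1`. For a model with several drivers, a statistics package that fits GLMs would be needed instead of these closed-form group proportions.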