Mike Thelwall, Marcus Munafò, Amalia Mas-Bleda, Emma Stuart, Meiko Makita, Verena Weigert, Chris Keene, Nushrat Khan, Katie Drax, Kayvan Kousha.
Abstract
Primary data collected during a research study is often shared and may be reused for new studies. To assess the extent of data sharing in favourable circumstances and whether data sharing checks can be automated, this article investigates summary statistics from primary human genome-wide association studies (GWAS). This type of data is highly suitable for sharing because it is a standard research output, is straightforward to use in future studies (e.g., for secondary analysis), and may be already stored in a standard format for internal sharing within multi-site research projects. Manual checks of 1799 articles from 2010 and 2017 matching a simple PubMed query for molecular epidemiology GWAS were used to identify 314 primary human GWAS papers. Of these, only 13% reported the location of a complete set of GWAS summary data, increasing from 3% in 2010 to 23% in 2017. Whilst information about whether data was shared was typically located clearly within a data availability statement, the exact nature of the shared data was usually unspecified. Thus, data sharing is the exception even in suitable research fields with relatively strong data sharing norms. Moreover, the lack of clear data descriptions within data sharing statements greatly complicates the task of automatically characterising shared data sets.
Year: 2020 PMID: 32084240 PMCID: PMC7034915 DOI: 10.1371/journal.pone.0229578
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Availability of summary statistics in published primary GWAS articles from 2010 and 2017, according to the article text.
| GWAS summary statistics availability | 2010 | % | 2017 | % | Total | % |
|---|---|---|---|---|---|---|
| On request to the authors | 0 | 0% | 15 | 9% | 15 | 5% |
| On request via dbGaP | 3 | 2% | 5 | 3% | 8 | 3% |
| On request via EGA | 1 | 1% | 2 | 1% | 3 | 1% |
| On request via another portal | 0 | 0% | 3 | 2% | 3 | 1% |
| Free online without login, plain text | 0 | 0% | 12 | 7% | 12 | 4% |
| Broken link or not findable | 3 | 2% | 3 | 2% | 6 | 2% |
| Not stated in article | 145 | 95% | 122 | 75% | 267 | 85% |
| Articles checked | 867 | | 932 | | 1799 | |
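
The percentages in the table and the headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are copied from the table, while the variable names, the `pct` helper and the grouping of categories are assumptions made for this example. In particular, it assumes that "reported the location" covers every category except "Not stated in article" and "Broken link or not findable", which is an interpretation that happens to reproduce the 3%, 23% and 13% quoted in the abstract, not a definition given in the source.

```python
# Counts copied from the table: (2010, 2017) for each availability category.
counts = {
    "On request to the authors": (0, 15),
    "On request via dbGaP": (3, 5),
    "On request via EGA": (1, 2),
    "On request via another portal": (0, 3),
    "Free online without login, plain text": (0, 12),
    "Broken link or not findable": (3, 3),
    "Not stated in article": (145, 122),
}

# Assumption for this sketch: these two categories do not count as
# "reporting the location" of the summary statistics.
NOT_REPORTED = {"Broken link or not findable", "Not stated in article"}


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


# Primary GWAS articles per year: column sums over all categories.
totals = [sum(c[i] for c in counts.values()) for i in (0, 1)]

# Articles per year that reported a usable location (under the assumption above).
shared = [
    sum(c[i] for name, c in counts.items() if name not in NOT_REPORTED)
    for i in (0, 1)
]

for year, s, t in zip((2010, 2017), shared, totals):
    print(f"{year}: {s}/{t} = {pct(s, t)}%")
print(f"Total: {sum(shared)}/{sum(totals)} = {pct(sum(shared), sum(totals))}%")
```

The column sums (152 and 162) add up to the 314 primary human GWAS papers mentioned in the abstract, which suggests the percentage columns use primary GWAS papers as the denominator rather than the 1799 articles checked.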