Heather A Piwowar, Roger S Day, Douglas B Fridsma.
Abstract
BACKGROUND: Sharing research data provides benefit to the general scientific community, but the benefit is less obvious for the investigator who makes his or her data available.
Year: 2007 PMID: 17375194 PMCID: PMC1817752 DOI: 10.1371/journal.pone.0000308
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Characteristics of Eligible Trials by Data Sharing.
| | Number of Articles | | | Odds Ratio (95% confidence interval) |
| | Total | Data Shared | Data Not Shared | |
| | 12 | 12 (100%) | 0 (0%) | ∞ (3.8 to ∞) |
| | 73 | 29 (40%) | 44 (60%) | |
| | 6 | 5 (83%) | 1 (17%) | 6.0 (0.6 to 288.5) |
| | 79 | 36 (46%) | 43 (54%) | |
| | 56 | 35 (63%) | 21 (38%) | 6.4 (2.0 to 21.9) |
| | 29 | 6 (21%) | 23 (79%) | |
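The odds ratios in this table can be checked against the article counts. The following is a minimal sketch, assuming each reported ratio compares the row it appears on with the row directly beneath it. The function name `odds_ratio` is illustrative and does not come from the article, and the confidence intervals are not reproduced here.

```python
import math


def odds_ratio(shared_a, not_shared_a, shared_b, not_shared_b):
    """Sample odds ratio of data sharing for group A relative to group B."""
    # Cross-product form of (shared_a / not_shared_a) / (shared_b / not_shared_b).
    numerator = shared_a * not_shared_b
    denominator = not_shared_a * shared_b
    if denominator == 0:
        # A zero cell makes the ratio unbounded (or undefined if both are zero).
        return math.inf if numerator > 0 else math.nan
    return numerator / denominator


# Counts copied from the table rows above.
print(round(odds_ratio(5, 1, 36, 43), 1))    # 6.0
print(round(odds_ratio(35, 21, 6, 23), 1))   # 6.4
print(odds_ratio(12, 0, 29, 44))             # inf
```

The infinite estimate in the first pair arises because no article in that group withheld its data, so one cell of the 2×2 table is zero.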
Figure 1. Distribution of 2004–2005 citation counts of 85 trials by data availability.
The 41 clinical trial publications which publicly shared their microarray data received more citations, in general, than the 44 publications which did not share their microarray data. In this plot of the distribution of citation counts received by each publication, the extent of the box encompasses the interquartile range of the citation counts, whiskers extend to 1.5 times the interquartile range, and lines within the boxes represent medians.
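The box plot elements described in this caption can be computed directly. The sketch below rests on two assumptions: that whiskers are drawn to the most extreme observation lying within 1.5 times the interquartile range of the box edges (a common convention), and that quartiles use the inclusive interpolation method. The citation counts in the usage example are hypothetical, not data from the article.

```python
import statistics


def box_plot_stats(values):
    """Return quartiles, whisker ends, and outliers for a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme points that are still inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical citation counts, for illustration only.
example_counts = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(box_plot_stats(example_counts))
```

In this hypothetical sample the box spans 3 to 7, the median is 5, the whiskers reach 1 and 8, and the value 100 is plotted separately as an outlier.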
Multivariate regression on citation count for 85 publications
| | Percent increase in citation count (95% confidence interval) | p-value |
| Publish in a journal with twice the impact factor | 84% (59 to 109%) | <0.001 |
| Increase the publication date by a month | −3% (−5 to −2%) | <0.001 |
| Include a US author | 38% (1 to 89%) | 0.049 |
We calculated a multivariate linear regression over the citation counts, including covariates for journal impact factor, date of publication, US authorship, and data availability. The coefficients and p-values for each of the covariates are shown here, representing the contribution of each covariate to the citation count, independent of other covariates.
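Percent changes like those in the table are commonly derived from a regression fitted to log-transformed counts, where a coefficient β corresponds to a change of (e^β − 1) × 100%. This excerpt does not say that the citation counts were log-transformed, so the sketch below treats that as an assumption. The function names are illustrative.

```python
import math


def percent_change(beta):
    """Percent change in the outcome implied by a coefficient on the log scale."""
    return (math.exp(beta) - 1.0) * 100.0


def coefficient_from_percent(pct):
    """Inverse of percent_change: log-scale coefficient for a given percent change."""
    return math.log(1.0 + pct / 100.0)


# Assumption: the reported 38% increase for a US author maps to a log-scale coefficient.
beta_us_author = coefficient_from_percent(38)
print(round(beta_us_author, 3))

# On the log scale, effects add, so a per-month change compounds over a year.
beta_month = coefficient_from_percent(-3)
print(round(percent_change(12 * beta_month), 1))
```

Under this assumption, the reported −3% per month would compound to roughly a 31% lower citation count for a publication dated one year later, with the other covariates held fixed.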
Figure 2. Distribution of 2004–2005 citation counts of the 70 lower-profile trials by data availability.
For trials which were published after 2000 and in journals with an impact factor less than 25, the 27 clinical trial publications which publicly shared their microarray data received more citations, in general, than the 43 publications which did not share their microarray data. In this plot of the distribution of citation counts received by each publication, the extent of the box encompasses the interquartile range of the citation counts, whiskers extend to 1.5 times the interquartile range, and lines within the boxes represent medians.
Exploratory regressions on citation count for the 41 publications with shared data
| | Number of articles (% of total) | Number of citations (% of total) | Percent increase in citation count | p-value |
| Trial size > 25 patients | 26 (63%) | 3704 (69%) | 122% | <0.001 |
| Clinical endpoint | 18 (44%) | 3404 (64%) | 79% | 0.01 |
| Affymetrix platform | 22 (54%) | 2735 (51%) | 18% | 0.43 |
| In GEO database | 6 (15%) | 939 (18%) | −52% | 0.02 |
| In SMD database | 6 (15%) | 1114 (21%) | 24% | 0.48 |
| Raw data available | 20 (49%) | 2437 (46%) | −2% | 0.91 |
| Pub mentions Suppl. Data | 35 (85%) | 4854 (91%) | 11% | 0.73 |
| Has Oncomine profile | 35 (85%) | 4884 (92%) | 19% | 0.54 |
The coefficient and p-value for each covariate in the table were calculated from separate multivariate linear regressions over the citation count, including covariates for journal impact factor, date of publication, and US authorship.