Mark Ziemann, Yotam Eren, Assam El-Osta.
Abstract
The spreadsheet software Microsoft Excel, when used with default settings, is known to convert gene names to dates and floating-point numbers. A programmatic scan of leading genomics journals reveals that approximately one-fifth of papers with supplementary Excel gene lists contain erroneous gene name conversions.
Keywords: Gene symbol; Microsoft Excel; Supplementary data
Year: 2016 PMID: 27552985 PMCID: PMC4994289 DOI: 10.1186/s13059-016-1044-7
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
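The kind of conversion described in the abstract can be screened for with a short script. The following is a minimal sketch, not the authors' scanning code: the regular expressions, the function names `classify` and `find_converted`, and the sample column are all illustrative assumptions. It flags cells that look like Excel-style dates (for example `2-Sep`) or scientific-notation numbers (for example `2.31E+13`) in a column that is supposed to hold gene symbols.

```python
import re

# Assumed patterns for illustration; the original scan may have used different rules.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
# Date-like cells such as "2-Sep" or "Sep-2".
DATE_LIKE = re.compile(
    rf"^(\d{{1,2}}-({MONTHS})|({MONTHS})-\d{{1,2}})$", re.IGNORECASE
)
# Float-like cells in scientific notation such as "2.31E+13".
FLOAT_LIKE = re.compile(r"^\d+(\.\d+)?E\+\d+$", re.IGNORECASE)


def classify(cell):
    """Return 'date', 'float', or None for a single cell value."""
    text = cell.strip()
    if DATE_LIKE.match(text):
        return "date"
    if FLOAT_LIKE.match(text):
        return "float"
    return None


def find_converted(cells):
    """Return (cell, kind) pairs for cells that look like converted gene names."""
    flagged = []
    for cell in cells:
        kind = classify(cell)
        if kind is not None:
            flagged.append((cell, kind))
    return flagged


# Hypothetical gene-symbol column containing three converted entries.
column = ["TP53", "2-Sep", "1-Mar", "2.31E+13", "BRCA1", "DEC1"]
flagged = find_converted(column)
for cell, kind in flagged:
    print(f"{cell!r} looks like a converted gene name ({kind})")
```

A pattern-based check like this can only detect values that have already been converted; it cannot recover the original symbol, so flagged cells still need to be checked against the source data.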
Results of the systematic screen of supplementary Excel files for gene name conversion errors
| Journal^a | Number of Excel files screened | Number of gene lists found | Number of papers with gene lists | Number of supplementary files affected | Number of papers affected | Number of gene names converted |
|---|---|---|---|---|---|---|
|  | 7783 | 2202 | 994 | 220 | 170 | 4240 |
|  | 11464 | 1650 | 801 | 218 | 158 | 4932 |
|  | 2607 | 580 | 251 | 114 | 68 | 3180 |
|  | 2117 | 540 | 315 | 88 | 67 | 1661 |
|  | 2678 | 664 | 257 | 97 | 63 | 1878 |
|  | 932 | 395 | 190 | 75 | 55 | 1593 |
|  | 980 | 372 | 168 | 48 | 27 | 1724 |
|  | 482 | 150 | 74 | 27 | 23 | 1375 |
|  | 1790 | 235 | 152 | 26 | 21 | 534 |
|  | 569 | 127 | 77 | 20 | 15 | 1341 |
|  | 264 | 70 | 37 | 12 | 9 | 178 |
|  | 731 | 112 | 67 | 11 | 6 | 339 |
|  | 177 | 79 | 32 | 6 | 6 | 46 |
|  | 143 | 54 | 29 | 7 | 5 | 206 |
|  | 995 | 112 | 79 | 7 | 4 | 56 |
|  | 172 | 36 | 19 | 7 | 3 | 451 |
|  | 490 | 32 | 25 | 2 | 2 | 121 |
|  | 801 | 57 | 30 | 2 | 2 | 6 |
| Total | 35175 | 7467 | 3597 | 987 | 704 | 23861 |

^a The 18 journals investigated are ordered by the number of papers affected by gene name conversion errors
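The "approximately one-fifth" figure in the abstract can be reproduced from the table. The sketch below is only an arithmetic check on the published counts: it sums the per-journal rows for two columns, confirms that they match the final row, and computes the share of papers with gene lists that are affected. The variable names are illustrative.

```python
# Per-journal counts copied from the table, in row order:
# (papers with gene lists, papers affected)
rows = [
    (994, 170), (801, 158), (251, 68), (315, 67), (257, 63), (190, 55),
    (168, 27), (74, 23), (152, 21), (77, 15), (37, 9), (67, 6),
    (32, 6), (29, 5), (79, 4), (19, 3), (25, 2), (30, 2),
]

papers_with_lists = sum(with_lists for with_lists, _ in rows)
papers_affected = sum(affected for _, affected in rows)

# The sums should agree with the final row of the table.
assert papers_with_lists == 3597
assert papers_affected == 704

# Share of papers with supplementary gene lists that contain conversions.
affected_share = papers_affected / papers_with_lists
print(f"{affected_share:.1%}")  # 19.6%
```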
Fig. 1 Prevalence of gene name errors in supplementary Excel files. a Percentage of published papers with supplementary gene lists in Excel files affected by gene name errors. b Increase in gene name errors by year