Alison Callahan, Rainer Winnenburg, Nigam H Shah.
Abstract
Measuring the usage of informatics resources such as software tools and databases is essential to quantifying their impact, value and return on investment. We have developed a publicly available dataset of informatics resource publications and their citation network, along with an associated metric (u-Index) to measure informatics resources' impact over time. Our dataset differentiates the context in which citations occur to distinguish between 'awareness' and 'usage', and uses a citing universe of open access publications to derive citation counts for quantifying impact. Resources with a high ratio of usage citations to awareness citations are likely to be widely used by others and have a high u-Index score. We have pre-calculated the u-Index for nearly 100,000 informatics resources. We demonstrate how the u-Index can be used to track informatics resource impact over time. The method of calculating the u-Index metric, the pre-computed u-Index values, and the dataset we compiled to calculate the u-Index are publicly available.
Year: 2018 PMID: 29557976 PMCID: PMC5859919 DOI: 10.1038/sdata.2018.43
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Details of the PubMed Central Open Access articles used as a citing universe.
| | Research, all | Research, with PMIDs | Other, all | Other, with PMIDs |
|---|---|---|---|---|
| Articles (n = 1,152,905) | 691,527 | | 461,378 | |
| Articles with references | 669,714 (97%) | | 329,969 (72%) | |
| # References | 27,218,513 | 21,130,489 | 12,594,333 | 9,563,297 |
| Average # references, all articles | 39.4 | 30.6 | 27.3 | 20.7 |
| Average # references, articles with references | 40.6 | 31.6 | 38.2 | 29.0 |
| Articles with references in methods | 596,627 (89%) | | 18,401 (6%) | |
| # References in methods | 4,734,310 | 3,230,941 | 193,310 | 138,650 |
| Average # references in methods, all articles | 6.8 | 4.7 | 0.4 | 0.3 |
| Average # references in methods, articles with refs. in methods | 7.9 | 5.4 | 10.5 | 7.5 |
Recall of the PubMed query for informatics resource articles published in Bioinformatics Application Notes, the NAR Database special issue, and Oxford Database.
| Source | Articles | Retrieved by query | Recall |
|---|---|---|---|
| Bioinformatics Application Notes | 2,913 | 2,540 | 0.87 |
| NAR Database | 2,029 | 2,024 | 0.99 |
| Oxford Database | 516 | 408 | 0.79 |
| Total | 5,458 | 4,972 | 0.91 |
Precision, recall, and specificity of the PubMed query for informatics resource articles, for the 250 articles with the most usage citations in the PubMed Central Open Access subset.
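| Evaluation set | True positives | False positives | True negatives | False negatives | Precision | Recall | Specificity |
|---|---|---|---|---|---|---|---|
| 250 articles with the most usage citations in PMC | 116 | 3 | 97 | 34 | 0.97 | 0.77 | 0.97 |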
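The three reported scores can be rechecked from the four counts in the row above. The sketch below assumes the counts are, in order, true positives, false positives, true negatives and false negatives. The source row does not label them, but this reading sums to 250 and reproduces the reported values. The class and method names are illustrative only and are not part of the published dataset or its tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary retrieval evaluation (assumed column order)."""
    tp: int  # informatics resource articles retrieved by the query
    fp: int  # other articles retrieved by the query
    tn: int  # other articles not retrieved
    fn: int  # informatics resource articles not retrieved

    def precision(self) -> float:
        # Share of retrieved articles that are informatics resources.
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        # Share of informatics resource articles that were retrieved.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of other articles that were correctly left out.
        return self.tn / (self.tn + self.fp)


# Counts taken from the table row; the TP/FP/TN/FN assignment is an assumption.
counts = ConfusionCounts(tp=116, fp=3, tn=97, fn=34)

print(f"precision   = {counts.precision():.2f}")
print(f"recall      = {counts.recall():.2f}")
print(f"specificity = {counts.specificity():.2f}")
```

Under this reading, the lower recall (0.77) comes from the 34 informatics resource articles among the top 250 that the query missed, while the three false positives keep precision and specificity high.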
Figure 1. The proportion of usage citations for informatics resources across the domains of biology, medicine, bioinformatics and medical informatics.
The X-axis shows the four domains and the Y-axis shows the proportion of papers in which the tool is used in comparison to the total number of citations of this tool. The size of the bubble is proportional to the total citations for each tool. Each color is tool specific and conserved across the four domains, e.g. BLAST is consistently shown in green.
Figure 2. u-Index values over time for multiple sequence alignment tools and text mining tools.
Multiple sequence alignment tool u-Index values are shown in panel (a), and text mining tool u-Index values are shown in panel (b). The X-axis shows the years a tool has been cited and used and the Y-axis shows the u-Index for a tool, based on the cumulative usage and awareness citations up to and including that year. Three tools dominate among the multiple sequence alignment tools: Muscle, Clustal and MAFFT. Text mining tools, on the other hand, have lower u-Index values, with the most reused tool being LINNAEUS, used to identify species names in text.
Headings used to filter the PubMed query to retrieve informatics resource articles.
| addendum/addenda | erratum/errata |
| brief communication | interview |
| clinical overview | opinion |
| column | perspective |
| comment | reply |
| correction | report |
| editorial/editor | review |
Number of articles and journals assigned to the four categories in the citing universe.
| Category | Articles | Journals |
|---|---|---|
| Medicine | 253,729 | 1,934 |
| Biology | 157,567 | 1,021 |
| Bioinformatics | 12,289 | 18 |
| Medical Informatics | 3,151 | 23 |
| Total | 691,527 | 2,548 |