Isabella Peters, Peter Kraker, Elisabeth Lex, Christian Gumpenberger, Juan Gorraiz.
Abstract
In this study, we explore the citedness of research data, its distribution over time, and its relation to the availability of a digital object identifier (DOI) in the Thomson Reuters database Data Citation Index (DCI). We investigate whether cited research data "impacts" the (social) web, as reflected by altmetrics scores, and whether there is any relationship between the number of citations and the sum of altmetrics scores from various social media platforms. Three tools are used to collect altmetrics scores, namely PlumX, ImpactStory, and Altmetric.com, and the corresponding results are compared. We found that, of the three altmetrics tools, PlumX has the best coverage. Our experiments revealed that research data remain mostly uncited (about 85 %), although there has been an increase in citing data sets published since 2008. The percentage of cited research data with a DOI in DCI has decreased in recent years. Only nine repositories are responsible for research data with DOIs and two or more citations. The number of cited research data with altmetrics "footprints" is even lower (4–9 %) but shows a higher coverage of research data from the last decade. In our study, we also found no correlation between the number of citations and the total number of altmetrics scores. Yet, certain data types (i.e. survey, aggregate data, and sequence data) are more often cited and also receive higher altmetrics scores. Additionally, we performed citation and altmetric analyses of all research data published between 2011 and 2013 in four different disciplines covered by the DCI. In general, these results correspond very well with the ones obtained for research data cited at least twice and also show low numbers in citations and in altmetrics. Finally, we observed that there are disciplinary differences in the availability and extent of altmetrics scores.
Keywords: Altmetrics; Citation analysis; Citedness; Co-citation analysis; Data Citation Index; Research data
Year: 2016 PMID: 27122647 PMCID: PMC4833815 DOI: 10.1007/s11192-016-1887-4
Source DB: PubMed Journal: Scientometrics ISSN: 0138-9130 Impact factor: 3.238
General description of the citation and altmetrics analyses performed in DCI for the last five and a half decades (n = 3,984,028 items)
| Data Citation Index | 1960–1969 | 1970–1979 | 1980–1989 | 1990–1999 | 2000–2009 | 2010–2014 |
|---|---|---|---|---|---|---|
| Total # items | 6040 | 23,712 | 43,620 | 186,965 | 2,096,023 | 1,627,668 |
| Uncited (%) | 99.9 % | 82.3 % | 82.8 % | 76.6 % | 88.6 % | 86.6 % |
| # Items with at least 1 citation | 5 | 4207 | 7519 | 43,749 | 239,867 | 218,440 |
| # Items with ≥2 citations | 5 | 110 | 360 | 956 | 4727 | 4777 |
| Items with ≥2 citations and DOI | 4 | 107 | 343 | 846 | 1381 | 226 |
| % with ≥2 citations and DOI | 80.00 % | 97.27 % | 95.28 % | 88.49 % | 29.22 % | 4.73 % |
| Thereof with data in PlumX | 1 | 5 | 14 | 40 | 114 | 20 |
| % thereof with data in PlumX | 25.0 % | 4.7 % | 4.1 % | 4.7 % | 8.3 % | 8.8 % |
| Items with ≥2 citations and URL only | 1 | 3 | 17 | 110 | 3346 | 4551 |
| % with ≥2 citations and URL only | 20.00 % | 2.73 % | 4.72 % | 11.51 % | 70.78 % | 95.27 % |
| Thereof with data in PlumX | 1 | 1 | 8 | 11 | 54 | 33 |
| % thereof with data in PlumX | 100.0 % | 33.3 % | 47.1 % | 10.0 % | 1.6 % | 0.7 % |
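
The percentage rows in the table above follow directly from the item counts. As a minimal sketch (the helper name `share` and the dictionary keys are illustrative and not part of the study), the DOI and URL-only shares for the two most recent periods can be recomputed from the counts of items with at least two citations:

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to two decimals."""
    return round(100.0 * part / whole, 2)

# Counts copied from the table above (last two columns).
cited_twice = {"2000s": 4727, "2010s": 4777}    # items with >= 2 citations
with_doi = {"2000s": 1381, "2010s": 226}        # ... that also have a DOI
with_url_only = {"2000s": 3346, "2010s": 4551}  # ... that have a URL only

doi_share = {p: share(with_doi[p], cited_twice[p]) for p in cited_twice}
url_share = {p: share(with_url_only[p], cited_twice[p]) for p in cited_twice}

for period in cited_twice:
    print(period, doi_share[period], url_share[period])
```

Because every item with at least two citations has either a DOI or only a URL, the two shares for a period add up to 100 %.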
Fig. 1 Evolution of uncitedness in DCI in the last 14 years (n = 3,723,691 items)
Citation distribution of Sample 1 (n = 10,934 items)
| Items with at least 2 citations | Document type | # Items | Total citations | Mean citations | Maximum citations | SD | Variance |
|---|---|---|---|---|---|---|---|
| All | Data set | 5641 | 17,984 | 3.19 | 121 | 3.38 | 11.46 |
| | Data study | 5242 | 91,623 | 17.48 | 1236 | 50.22 | 2521.67 |
| | Repository | 51 | 10,076 | 197.57 | 3193 | 618.73 | 382,824.45 |
| | Total | 10,934 | 119,683 | 10.95 | 3193 | 56.39 | 3179.49 |
| With DOI | Data set | 342 | 977 | 2.86 | 52 | 3.86 | 14.93 |
| | Data study | 2565 | 53,293 | 20.78 | 1236 | 63.44 | 4024.45 |
| | Total | 2907 | 54,270 | 18.67 | 1236 | 59.88 | 3585.92 |
| With URL only | Data set | 5299 | 17,007 | 3.21 | 121 | 3.35 | 11.23 |
| | Data study | 2677 | 38,330 | 14.32 | 272 | 32.59 | 1062.31 |
| | Repository | 51 | 10,076 | 197.57 | 3193 | 618.73 | 382,824.45 |
| | Total | 8027 | 65,413 | 8.15 | 3193 | 54.80 | 3003.30 |
Analysis of Sample 1 by sources (repositories) (n = 10,934 items)
| Repository (with DOI) | # Items | # Citations | Repository (with URL only) | # Items | # Citations |
|---|---|---|---|---|---|
| Inter-university Consortium for Political and Social Research | 2530 | 53,041 | miRBase | 3456 | 10,209 |
| Worldwide Protein Data Bank | 229 | 458 | Cancer Models Database | 864 | 2698 |
| Oak Ridge National Laboratory Distributed Active Archive Center for Biogeochemical Dynamics | 108 | 508 | UK Data Archive | 836 | 25,479 |
| Archaeology Data Service | 21 | 75 | European Nucleotide Archive | 361 | 1346 |
| 3TU.Datacentrum | 8 | 22 | Gene Expression Omnibus | 353 | 754 |
| SHARE—Survey of Health, Ageing and Retirement in Europe | 4 | 151 | National Snow and Ice Data Center | 298 | 2796 |
| World Agroforestry Centre | 3 | 6 | Australian Data Archive | 264 | 2469 |
| Dryad | 2 | 4 | Australian Antarctic Data Centre | 249 | 1621 |
| GigaDB | 2 | 5 | nmrshiftdb2 | 219 | 445 |
| | | | Finnish Social Science Data Archive | 183 | 913 |
Analysis of Sample 1 by data types (manually merged), top 10 types (n = 10,934 items)
| Data types (with DOI) | # Items | # Citations | Data types (with URL only) | # Items | # Citations |
|---|---|---|---|---|---|
| Survey data | 1734 | 43,686 | Sequence data | 3408 | 10,458 |
| Administrative records data | 302 | 3326 | Profiling by array, gen, etc | 352 | 752 |
| Aggregate data | 274 | 9440 | Individual (micro) level | 240 | 9024 |
| Event/transaction data | 210 | 2400 | Numeric data | 216 | 4317 |
| Clinical data | 118 | 3469 | Structured questionnaire | 155 | 673 |
| Census/enumeration data | 109 | 1019 | Survey data | 127 | 1315 |
| Protein structure | 95 | 190 | Seismic:Reflection:MCS | 47 | 185 |
| Observational data | 30 | 575 | Statistical data | 41 | 1352 |
| Program source code | 10 | 116 | Digital media | 40 | 290 |
| Roll call voting data | 8 | 236 | EXCEL | 25 | 101 |
Analysis of Sample 1 by research areas and document types, top 10 areas (n = 10,934 items)
| Research area (with DOI) | # Items (data set) | # Items (data study) | # Citations (data set) | # Citations (data study) | Research area (with URL only) | # Items (data set) | # Items (data study) | # Citations (data set) | # Citations (data study) |
|---|---|---|---|---|---|---|---|---|---|
| Criminology and Penology | | 471 | | 4403 | Genetics and Heredity | 4658 | 159 | 14,024 | 571 |
| Sociology | | 432 | | 7930 | Meteorology and Atmospheric Sciences | 91 | 298 | 493 | 2796 |
| Government and Law | | 352 | | 10,399 | Biochemistry and Molecular Biology; Genetics and Heredity | | 353 | | 754 |
| Demography | | 317 | | 9178 | Sociology | | 286 | | 1994 |
| Health Care Sciences and Services | | 290 | | 8170 | Physics | 5 | 214 | 10 | 435 |
| Biochemistry and Molecular Biology | 229 | | 458 | | Business and Economics; Sociology | | 143 | | 12,665 |
| Business and Economics | | 204 | | 3083 | Biochemistry and Molecular Biology; Spectroscopy | 129 | | 383 | |
| Environmental Sciences and Ecology; Geology | 108 | | 508 | | Oceanography; Geology | 114 | | 353 | |
| Education and Educational Research | | 69 | | 1881 | Demography; Sociology | | 103 | | 5673 |
| Family Studies | | 68 | | 2268 | Sociology; Demography; Communication | | 84 | | 393 |
| Sum | 337 | 2203 | 966 | 47,312 | Sum | 4997 | 1640 | 15,263 | 25,281 |
Citation and altmetrics results of Sample 2 (n = 301 items) according to document type
| Document type | # Items | Total citations | Mean citations | Maximum citations | SD | Variance |
|---|---|---|---|---|---|---|
| With DOI | ||||||
| Data set | 15 | 173 | 11.53 | 52 | 13.75 | 189.12 |
| Data study | 179 | 6716 | 37.52 | 1135 | 107.36 | 11,525.43 |
| Total | 194 | 6889 | 35.51 | 1135 | 103.40 | 10,691.82 |
* 8 items with URL that were found in PlumX could not properly be identified (broken URL, wrong item, etc.)
Citation and altmetrics overview of Sample 2 (n = 301 items) according to their data type
| Data type (with DOI) | # Items | Total citations | Mean citations | Total scores | Mean scores | Data type (with URL only) * | # Items | Total citations | Mean citations | Total scores | Mean scores |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Survey data | 110 | 5276 | 47.96 | 353 | 3.21 | miRNA sequence data | 15 | 71 | 4.73 | 21 | 1.40 |
| Aggregate data | 26 | 793 | 30.50 | 80 | 3.08 | FITS images; spectra; calibrations; redshifts | 4 | 248 | 62 | 16 | 4.00 |
| Event/transaction data | 19 | 414 | 21.79 | 43 | 2.26 | Statistical data | 3 | 333 | 111 | 22 | 7.33 |
| Administrative records data | 13 | 125 | 9.62 | 58 | 4.46 | Expression profiling by array | 3 | 6 | 2 | 4 | 1.33 |
| Clinical data | 11 | 314 | 28.55 | 26 | 2.36 | Sensor data; survey data | 2 | 51 | 25.5 | 10 | 5.00 |
| Census/enumeration data | 8 | 90 | 11.25 | 14 | 1.75 | Quantitative | 2 | 35 | 17.5 | 10 | 5.00 |
| Observational data | 4 | 99 | 24.75 | 7 | 1.75 | Images | 1 | 20 | 20 | 3 | 3.00 |
| Longitudinal data; Panel Data; Micro data | 2 | 79 | 39.50 | 46 | 23.00 | Images; spectra | 1 | 4 | 4 | 102 | 102.00 |
| Roll call voting data | 2 | 178 | 89.00 | 3 | 1.50 | Table | 1 | 9 | 9 | 1 | 1.00 |
| Machine-readable text | 1 | 5 | 5.00 | 1 | 1.00 | Redshifts; spectra | 1 | 5 | 5 | 213 | 213.00 |
| Program source code | 1 | 2 | 2.00 | 1 | 1.00 | Images; spectra; astrometry | 1 | 2 | 2 | 90 | 90.00 |
* Field DY; no aggregated counts, without consideration of the “document type” “repository” = 34 items
Citation and altmetrics overview of Sample 2 (n = 301 items) according to their subject area
| Subject areas (with DOI) | # Items | # Citations | # Scores | Subject areas (with URL only) | # Items | # Citations | # Scores |
|---|---|---|---|---|---|---|---|
| Sociology | 35 | 1226 | 213 | Genetics and Heredity | 26 | 492 | 654 |
| Government and Law | 28 | 793 | 53 | Meteorology and Atmospheric Sciences | 15 | 166 | 28 |
| Criminology and Penology | 22 | 317 | 42 | Astronomy and Astrophysics | 9 | 933 | 427 |
| Health Care Sciences and Services | 14 | 1498 | 70 | Biochemistry and Molecular Biology; Genetics and Heredity | 5 | 22 | 557 |
| Environmental Sciences and Ecology; Geology | 14 | 171 | 33 | Cell Biology | 4 | 13 | 383 |
| Demography | 12 | 433 | 28 | Health Care Sciences and Services; Business and Economics | 3 | 335 | 68 |
| Family Studies | 10 | 166 | 26 | Genetics and Heredity; Biochemistry and Molecular Biology | 2 | 27 | 36 |
| Archaeology | 10 | 47 | 139 | Business and Economics | 2 | 35 | 10 |
| Education and Educational Research | 9 | 661 | 40 | Health Care Sciences and Services | 2 | 423 | 2 |
| International Relations | 9 | 384 | 46 | Communication; Sociology; Telecommunications | 2 | 51 | 10 |
PlumX altmetrics scores for all document types in Sample 2 (n = 301 items) with or without DOI
| Category | Statistic | With DOI: Data set | With DOI: Data study | With DOI: Total | With URL only: Data set | With URL only: Data study | With URL only: Repository | With URL only: Total |
|---|---|---|---|---|---|---|---|---|
| # Items | | 15 | 179 | 194 | 24 | 31 | 44 | 99 |
| Captures | Sum | 32 | 471 | 503 | 0 | 0 | 30 | 30 |
| | Mean | 2.13 | 2.63 | 2.59 | 0.00 | 0.00 | 0.68 | 0.28 |
| | Max | 6 | 48 | 48 | 0 | 0 | 23 | 23 |
| Social media | Sum | 1 | 220 | 221 | 407 | 281 | 3060 | 3890 |
| | Mean | 0.07 | 1.23 | 1.14 | 16.96 | 9.06 | 69.55 | 36.36 |
| | Max | 1 | 58 | 58 | 366 | 119 | 1008 | 1008 |
| Mentions | Sum | 1 | 13 | 14 | 13 | 62 | 433 | 629 |
| | Mean | 0.07 | 0.07 | 0.07 | 0.54 | 2.00 | 9.84 | 5.88 |
| | Max | 1 | 4 | 4 | 12 | 31 | 119 | 120 |
| Usage | Sum | 0 | 6 | 6 | 8 | 321 | 438 | 770 |
| | Mean | 0.00 | 0.03 | 0.03 | 0.33 | 10.35 | 9.95 | 7.20 |
| | Max | 0 | 6 | 6 | 4 | 187 | 92 | 187 |
| Total entries | | 34 | 710 | 744 | 428 | 664 | 3961 | 5319 |
| % Captures | | 94.1 | 66.3 | 67.6 | 0.0 | 0.0 | 0.8 | 0.6 |
| % Social media | | 2.9 | 31.0 | 29.7 | 95.1 | 42.3 | 77.3 | 73.1 |
| % Mentions | | 2.9 | 1.8 | 1.9 | 3.0 | 9.3 | 10.9 | 11.8 |
| % Usage | | 0.0 | 0.8 | 0.8 | 1.9 | 48.3 | 11.1 | 14.5 |
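
The "Total entries" row and the four percentage rows are derived from the category sums. A short sketch for the "With DOI, Total" column, under the assumption that each percentage is the category sum divided by the total entries (variable names are illustrative):

```python
# Category sums for the "With DOI: Total" column, copied from the table above.
with_doi_sums = {"captures": 503, "social media": 221, "mentions": 14, "usage": 6}

# Total entries is the sum over the four PlumX categories.
total_entries = sum(with_doi_sums.values())

# Share of each category in percent, rounded to one decimal as in the table.
shares = {
    category: round(100.0 * count / total_entries, 1)
    for category, count in with_doi_sums.items()
}
print(total_entries, shares)
```

The same calculation applies to the other columns by swapping in their category sums.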
Top 10 research data with URL only according to the total scores as reported in PlumX
| Title | PY | Data type | Total captures | Total mentions | Total social media | Total usage | Total scores | Total citations |
|---|---|---|---|---|---|---|---|---|
| DrugBank | 2006 | Repository | 0 | 119 | 1008 | 23 | 1150 | 3 |
| | 2002 | Repository | 0 | 91 | 379 | 68 | 538 | 11 |
| WVS Database | 1981 | Repository | 0 | 19 | 358 | 7 | 384 | 3193 |
| The Cell: An Image Library—Image CIL:12654 | 2012 | Data set | 0 | 12 | 366 | 0 | 378 | 2 |
| Home \| 1000 Genomes | 2008 | Repository | 0 | 32 | 222 | 92 | 346 | 344 |
| CDC—BRFSS—Behavioral Risk Factor Surveillance System | 1984 | Repository | 0 | 21 | 160 | 68 | 249 | 13 |
| BOSS: Dark Energy and the Geometry of Space—SDSS-III | 2011 | Data study | 0 | 31 | 119 | 63 | 213 | 5 |
| | | | 0 | 120 | 81 | 0 | 201 | |
| Genotype information for Agrostis chloroplast SSR, matK, and Agrostis nuclear SSR markers | 2012 | Data study | 0 | 0 | 0 | 0 | 187 | 2 |
| Human Metabolome Database | 2005 | Repository | 0 | 17 | 134 | 16 | 167 | 3 |
Citation numbers for research data published between 2011 and 2013 in four selected disciplines (Sample 3; n = 4054 items)
| Subject category | Subset | Document type (DT) | # Items | # Citations | Citations/item | Max | SD |
|---|---|---|---|---|---|---|---|
| Astronomy and Astrophysics | All DTs | Data set | 1162 | 2 | 0.00 | 1 | 0.041 |
| | | Data study | 106 | 84 | 0.79 | 5 | 0.765 |
| | | Repository | 8 | 0 | 0.00 | 0 | 0.000 |
| | | Total | 1276 | 86 | 0.07 | 5 | 0.312 |
| | With DOI | Data study | 4 | 1 | 0.25 | 1 | 0.500 |
| | | Total | 4 | 1 | 0.25 | 1 | 0.500 |
| | Without DOI | Data set | 1162 | 2 | 0.00 | 1 | 0.041 |
| | | Data study | 102 | 83 | 0.81 | 5 | 0.767 |
| | | Repository | 8 | 0 | 0.00 | 0 | 0.000 |
| | | Total | 1272 | 85 | 0.07 | 5 | 0.311 |
| Chemistry | All DTs | Data study | 990 | 22 | 0.02 | 1 | 0.147 |
| | | Repository | 1 | 0 | 0.00 | 0 | |
| | | Total | 991 | 22 | 0.02 | 1 | 0.147 |
| | With DOI | Data study | 373 | 22 | 0.06 | 1 | 0.236 |
| | | Total | 373 | 22 | 0.06 | 1 | 0.236 |
| | Without DOI | Data study | 617 | 0 | 0.00 | 0 | 0.000 |
| | | Repository | 1 | 0 | 0.00 | 0 | |
| | | Total | 618 | 0 | 0.00 | 0 | 0.000 |
| Mathematics | All DTs | Data set | 120 | 0 | 0.00 | 0 | 0.000 |
| | | Data study | 5 | 1 | 0.20 | 1 | 0.447 |
| | | Total | 125 | 1 | 0.01 | 1 | 0.089 |
| | With DOI | Data set | 12 | – | 0.00 | 0 | 0.000 |
| | | Data study | 5 | 1 | 0.20 | 1 | 0.447 |
| | | Total | 17 | 1 | 0.06 | 1 | 0.243 |
| | Without DOI | Data set | 108 | 0 | 0 | 0 | |
| | | Total | 108 | 0 | 0 | 0 | |
| Sociology | All DTs | Data set | 881 | 12 | 0.01 | 4 | 0.165 |
| | | Data study | 781 | 181 | 0.23 | 41 | 1.645 |
| | | Total | 1662 | 193 | 0.12 | 41 | 1.139 |
| | With DOI | Data set | 117 | 0 | 0 | 0 | |
| | | Data study | 56 | 46 | 0.82 | 5 | 1.177 |
| | | Total | 173 | 46 | 0.27 | 5 | 0.769 |
| | Without DOI | Data set | 764 | 12 | 0.02 | 4 | 0.177 |
| | | Data study | 725 | 135 | 0.19 | 41 | 1.668 |
| | | Total | 1489 | 147 | 0.10 | 41 | 1.173 |
Altmetrics scores (PlumX) for research data published between 2011 and 2013 in four selected disciplines (Sample 3)
| Subject category | Items with scores | Data type | With DOI | Total captures | Total mentions | Total usage | Total social media | Total scores | Total citations |
|---|---|---|---|---|---|---|---|---|---|
| Astronomy and Astrophysics | 1 | Data set* | No | 0 | 114 | 32 | 477 | 623 | 0 |
| | 2 | Data set | No | 0 | 31 | 125 | 106 | 262 | 0 |
| | 3 | Repository* | No | 0 | 31 | 63 | 119 | 213 | 0 |
| | 4 | Data set* | No | 0 | 10 | 54 | 38 | 102 | 4 |
| | 5 | Data set* | No | 0 | 7 | 7 | 75 | 89 | 0 |
| | 6 | Data study | No | 0 | 0 | 7 | 0 | 7 | 0 |
| | 7 | Data study | Yes | 0 | 0 | 0 | 3 | 3 | 1 |
| | 8 | Data study | No | 0 | 0 | 0 | 2 | 2 | 0 |
| | 9 | Data study | No | 0 | 0 | 0 | 2 | 2 | 0 |
| | 10 | Data study | No | 0 | 0 | 0 | 1 | 1 | 0 |
| | 11 | Data study | Yes | 0 | 0 | 0 | 1 | 1 | 0 |
| | 12 | Data set | No | 0 | 0 | 0 | 1 | 1 | 0 |
| Chemistry | 0 | n.a. | n.a. | 0 | 0 | 0 | 0 | 0 | 0 |
| Mathematics | 1 | Data set | No | 0 | 0 | 0 | 2 | 2 | 0 |
| Sociology | 1 | Data set* | No | 0 | 11 | 0 | 0 | 11 | 0 |
| | 2 | Data study | No | 0 | 4 | 0 | 0 | 4 | 0 |
| | 3 | Data set* | No | 0 | 4 | 0 | 0 | 4 | 0 |
| | 4 | Data set* | No | 0 | 4 | 0 | 0 | 4 | 0 |
| | 5 | Data set* | No | 0 | 2 | 0 | 0 | 2 | 0 |
| | 6 | Data set* | No | 0 | 2 | 0 | 0 | 2 | 0 |
| | 7 | Data set* | No | 0 | 2 | 0 | 0 | 2 | 0 |
| | 8 | Data study* | No | 0 | 2 | 0 | 0 | 2 | 0 |
| | 9 | Data study* | No | 0 | 1 | 0 | 0 | 1 | 0 |
| | 10 | Data set* | No | 0 | 1 | 0 | 0 | 1 | 0 |
* The matching of source information from the DCI (i.e. URL and title of research data) with results from PlumX is not necessarily correct because of missing or changed information in altmetrics search results. Since URLs, unlike DOIs, are not permanent identifiers, URLs as indexed in the DCI may have disappeared or changed; thus, PlumX might not have retrieved exactly the same content as indexed by the DCI.