Abstract
Modeling distributions of citations to scientific papers is crucial for understanding how science develops. However, there is considerable empirical controversy over which statistical model fits citation distributions best. This paper is concerned with rigorous empirical detection of power-law behaviour in the distribution of citations received by the most highly cited scientific papers. We have used a large, novel data set on citations to scientific papers published between 1998 and 2002 drawn from Scopus. The power-law model is compared with a number of alternative models using a likelihood ratio test. We have found that the power-law hypothesis is rejected for around half of the Scopus fields of science. For these fields of science, the Yule, power-law with exponential cut-off and log-normal distributions seem to fit the data better than the pure power-law model. On the other hand, when the power-law hypothesis is not rejected, it is usually empirically indistinguishable from most of the alternative models. The pure power-law model seems to be the best model only for the most highly cited papers in "Physics and Astronomy". Overall, our results seem to support theories implying that the most highly cited scientific papers follow the Yule, power-law with exponential cut-off or log-normal distribution. Our findings also suggest that power laws in citation distributions, when present, account only for a very small fraction of the published papers (less than 1% for most fields of science) and that the power-law scaling parameter (exponent) is substantially higher (from around 3.2 to around 4.7) than found in the older literature.
Keywords: Citation distribution; Power law; Scopus; Statistical modelling
Year: 2015 PMID: 25821280 PMCID: PMC4365275 DOI: 10.1007/s11192-014-1524-z
Source DB: PubMed Journal: Scientometrics ISSN: 0138-9130 Impact factor: 3.238
Definitions of alternative discrete distributions
| Distribution name | Probability distribution function |
|---|---|
| Exponential | |
| Stretched exponential (Weibull) | |
| Log-normal | |
| Tsallis | |
| Yule | |
| Digamma | |
| Power law with exponential cut-off | |
The distributions have been normalized to ensure that the total probability in the domain is 1. The discrete log-normal distribution is approximated by rounding continuous log-normally distributed reals to the nearest integers. For the Tsallis distribution, we use the parametrization considered by Shalizi (2007).
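The rounding and normalization described in this note can be illustrated with a minimal sketch. It assumes that "the domain" means all integers at or above some lower bound; the function names and the parameters `mu`, `sigma` and `x_min` are hypothetical and do not come from the paper.

```python
import math

def normal_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rounded_lognormal_pmf(x, mu, sigma, x_min=1):
    """P(X = x | X >= x_min) for a log-normal variate rounded to the nearest integer.

    Assumption: x and x_min are positive integers, so x - 0.5 > 0.
    """
    if x < x_min:
        return 0.0

    def cdf(t):
        # CDF of the continuous log-normal distribution at t > 0.
        return normal_cdf((math.log(t) - mu) / sigma)

    # Every real in [x - 0.5, x + 0.5) rounds to the integer x.
    mass = cdf(x + 0.5) - cdf(x - 0.5)
    # Renormalize so that the probabilities over x >= x_min sum to 1.
    tail = 1.0 - cdf(x_min - 0.5)
    return mass / tail
```

Dividing by the tail mass is what makes the truncated probabilities sum to 1, which can be checked numerically by summing the function over a long range of integers starting at `x_min`.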
Descriptive statistics for citation distributions, Scopus, 1998–2002, 5-year citation window
| Scopus subject area of science | Total number of papers | No. of papers in the sample | % of all papers in the sample | Mean no. of citations | Std. Dev. of citations | Max. no. of citations |
|---|---|---|---|---|---|---|
| Agricultural and Biological Sciences | 372,575 | 99,804 | 26.8 | 15.17 | 14.36 | 628 |
| Arts and Humanities | 47,191 | 47,074 | 99.8 | 1.256 | 3.357 | 91 |
| Biochemistry, Genetics and Molecular Biology | 636,421 | 99,819 | 15.7 | 49.09 | 46.29 | 3,118 |
| Business, Management and Accounting | 61,211 | 61,156 | 99.9 | 3.452 | 7.273 | 287 |
| Chemical Engineering | 158,673 | 98,989 | 62.4 | 7.232 | 9.236 | 344 |
| Chemistry | 416,660 | 99,398 | 23.9 | 21.07 | 21.17 | 1,065 |
| Computer Science | 134,179 | 99,933 | 74.5 | 6.44 | 18.13 | 2,737 |
| Decision Sciences | 27,409 | 27,393 | 99.9 | 3.467 | 5.496 | 143 |
| Earth and Planetary Sciences | 228,197 | 99,788 | 43.7 | 14.1 | 17.03 | 1,195 |
| Economics, Econometrics and Finance | 49,645 | 49,559 | 99.8 | 4.652 | 8.653 | 287 |
| Energy | 67,076 | 66,378 | 99.0 | 2.553 | 5.596 | 334 |
| Engineering | 439,719 | 99,765 | 22.7 | 11.77 | 15.83 | 971 |
| Environmental Science | 186,898 | 99,847 | 53.4 | 10.72 | 11.27 | 730 |
| Immunology and Microbiology | 195,339 | 99,858 | 51.1 | 22.11 | 25.11 | 926 |
| Materials Science | 331,310 | 99,591 | 30.1 | 12.48 | 14.49 | 697 |
| Mathematics | 193,740 | 99,922 | 51.6 | 6.912 | 11.38 | 929 |
| Medicine | 1,191,154 | 99,823 | 8.4 | 48.55 | 60.14 | 4,365 |
| Neuroscience | 445,181 | 99,886 | 22.4 | 18.97 | 20.39 | 771 |
| Nursing | 51,283 | 50,464 | 98.4 | 5.274 | 12.07 | 518 |
| Pharmacology, Toxicology and Pharmaceutics | 179,427 | 99,757 | 55.6 | 12.19 | 12.28 | 347 |
| Physics and Astronomy | 541,328 | 99,817 | 18.4 | 24.75 | 31.64 | 3,118 |
| Psychology | 104,449 | 99,736 | 95.5 | 7.446 | 11.55 | 377 |
| Social Sciences | 215,410 | 99,890 | 46.4 | 6.148 | 8.055 | 519 |
| Veterinary | 53,203 | 53,117 | 99.8 | 3.637 | 5.843 | 128 |
| Dentistry | 27,470 | 27,437 | 99.9 | 4.943 | 6.736 | 115 |
| Health Professions | 75,491 | 75,414 | 99.9 | 7.272 | 11.49 | 348 |
| Multidisciplinary | 50,287 | 50,226 | 99.9 | 30.38 | 76.08 | 5,187 |
| All Sciences | 6,480,926 | 2,203,841 | 34.0 | 14.92 | 27.74 | 5,187 |
Power-law fits to citation distributions, Scopus, 1998–2002, 5-year citation window
| Scopus subject area of science | Lower bound of power-law range | Power-law exponent | No. of power-law papers | % of total papers | p value |
|---|---|---|---|---|---|
| Agricultural and Biological Sciences | 92 (15.1) | 4.19 (0.25) | 488 | 0.1 | 0.566 |
| Arts and Humanities | 14 (5.4) | 3.46 (0.47) | 655 | 1.4 | 0.005 |
| Biochemistry, Genetics and Molecular Biology | 148 (28.0) | 3.72 (0.13) | 2,813 | 0.4 | 0.175 |
| Business, Management and Accounting | 24 (10.1) | 3.4 (0.38) | 1,339 | 2.2 | 0.000 |
| Chemical Engineering | 38 (6.7) | 4.01 (0.19) | 1,418 | 0.9 | 0.099 |
| Chemistry | 41 (7.1) | 3.4 (0.05) | 8,193 | 2.0 | 0.110 |
| Computer Science | 26 (10.6) | 2.78 (0.11) | 3,989 | 3.0 | 0.000 |
| Decision Sciences | 12 (4.0) | 3.36 (0.24) | 1,596 | 5.8 | 0.000 |
| Earth and Planetary Sciences | 36 (8.9) | 3.37 (0.09) | 5,834 | 2.6 | 0.000 |
| Economics, Econometrics and Finance | 21 (10.2) | 3.13 (0.36) | 1,995 | 4.0 | 0.000 |
| Energy | 32 (5.4) | 3.91 (0.22) | 356 | 0.5 | 0.825 |
| Engineering | 26 (9.4) | 3.14 (0.09) | 7,986 | 1.8 | 0.000 |
| Environmental Science | 63 (10.3) | 4.33 (0.22) | 624 | 0.3 | 0.506 |
| Immunology and Microbiology | 78 (13.6) | 3.48 (0.10) | 2,713 | 1.4 | 0.049 |
| Materials Science | 43 (8.9) | 3.47 (0.11) | 2,687 | 0.8 | 0.193 |
| Mathematics | 24 (4.0) | 3.11 (0.06) | 4,152 | 2.1 | 0.012 |
| Medicine | 59 (16.3) | 3.07 (0.04) | 20,163 | 1.7 | 0.000 |
| Neuroscience | 135 (28.4) | 4.69 (0.41) | 423 | 0.1 | 0.896 |
| Nursing | 60 (15.7) | 3.68 (0.40) | 439 | 0.9 | 0.256 |
| Pharmacology, Toxicology and Pharmaceutics | 56 (6.8) | 4.1 (0.12) | 1,215 | 0.7 | 0.865 |
| Physics and Astronomy | 61 (6.5) | 3.35 (0.04) | 5,034 | 0.9 | 0.797 |
| Psychology | 52 (8.8) | 3.9 (0.17) | 1,060 | 1.0 | 0.812 |
| Social Sciences | 24 (6.4) | 3.56 (0.15) | 2,963 | 1.4 | 0.007 |
| Veterinary | 23 (4.0) | 4.09 (0.27) | 858 | 1.6 | 0.017 |
| Dentistry | 20 (2.4) | 3.89 (0.18) | 1,012 | 3.7 | 0.011 |
| Health Professions | 49 (10.2) | 3.85 (0.24) | 942 | 1.2 | 0.352 |
| Multidisciplinary | 209 (40.4) | 3.24 (0.14) | 1,147 | 2.8 | 0.100 |
| All Sciences | 186 (46.3) | 3.45 (0.10) | 6,364 | 0.2 | 0.076 |
Standard errors are given in parentheses.
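As a rough illustration of how a power-law exponent can be estimated once a lower bound is fixed, the sketch below uses a well-known closed-form approximation to the discrete maximum likelihood estimator. This record does not state the estimator or the standard-error method used for the table, so the formula, the function name `approx_discrete_alpha` and the parameter `x_min` are all assumptions. The standard error shown is a crude large-sample approximation borrowed from the continuous case and need not match the values in parentheses above.

```python
import math

def approx_discrete_alpha(citations, x_min):
    """Approximate MLE of the exponent for integer data at or above x_min.

    Assumption: alpha ~ 1 + n / sum(ln(x_i / (x_min - 0.5))), which is
    reasonable only when x_min is not too small (roughly x_min >= 6).
    """
    tail = [x for x in citations if x >= x_min]
    n = len(tail)
    if n == 0:
        raise ValueError("no observations at or above x_min")
    # Each term is positive because x >= x_min > x_min - 0.5.
    log_sum = sum(math.log(x / (x_min - 0.5)) for x in tail)
    alpha = 1.0 + n / log_sum
    # Rough standard error; it ignores uncertainty in x_min itself.
    std_err = (alpha - 1.0) / math.sqrt(n)
    return alpha, std_err, n
```

The third returned value is the number of observations in the tail, which corresponds to the count of "power-law papers" for a given lower bound.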
Fig. 1 The complementary cumulative distribution functions (blue circles) and best power-law fits (dashed black line) for citation distributions that did not pass the goodness-of-fit test, Scopus, 1998–2002, 5-year citation window
Model selection tests for citation distributions, Scopus, 1998–2002, 5-year citation window
| Scopus subject area of science | Power law p value | Exponential LR | Exponential p | Weibull LR | Weibull p | Log-normal LR | Log-normal p | Tsallis LR | Tsallis p | Yule LR | Yule p | Digamma LR | Digamma p | PL with cut-off NLR | PL with cut-off p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Agricultural and Biological Sciences | 0.566 | 20.740 | 0.009 | 0.338 | 0.779 | −0.096 | 0.782 | 0.054 | 0.890 | −0.011 | 0.858 | 1.048 | 0.295 | −0.268 | 0.464 |
| Arts and Humanities | 0.005 | 6.287 | 0.457 | −6.93 | 0.023 | −6.56 | 0.025 | −4.325 | 0.189 | −1.38 | 0.000 | −1.991 | 0.046 | −7.37 | 0.000 |
| Biochemistry, Genetics and Mol. Biol. | 0.175 | 204.5 | 0.000 | 1.22 | 0.758 | −1.12 | 0.473 | −1.227 | 0.479 | −0.155 | 0.108 | 1.553 | 0.121 | −0.567 | 0.287 |
| Business, Management and Accounting | 0.000 | 34.390 | 0.034 | −9.60 | 0.013 | −9.24 | 0.013 | −7.279 | 0.065 | −1.39 | 0.000 | −2.072 | 0.038 | −9.98 | 0.000 |
| Chemical Engineering | 0.099 | 69.480 | 0.001 | −0.021 | 0.994 | −0.972 | 0.480 | 0.025 | 0.990 | −0.358 | 0.187 | 0.710 | 0.477 | −0.78 | 0.211 |
| Chemistry | 0.110 | 736.0 | 0.000 | 7.48 | 0.262 | −2.67 | 0.204 | 1.290 | 0.687 | −0.999 | 0.060 | 3.956 | 0.000 | −3.31 | 0.010 |
| Computer Science | 0.000 | 609.4 | 0.000 | −7.05 | 0.248 | −8.80 | 0.035 | −6.719 | 0.132 | −2.00 | 0.000 | 0.664 | 0.507 | −5.23 | 0.001 |
| Decision Sciences | 0.000 | 77.730 | 0.001 | −6.71 | 0.046 | −6.81 | 0.048 | −0.0275 | 0.956 | −2.66 | 0.000 | −1.176 | 0.240 | −5.91 | 0.001 |
| Earth and Planetary Sciences | 0.000 | 459.7 | 0.000 | −4.69 | 0.451 | −7.52 | 0.045 | −4.928 | 0.264 | −1.95 | 0.000 | 1.164 | 0.244 | −5.69 | 0.001 |
| Economics, Econometrics and Finance | 0.000 | 45.080 | 0.021 | −21.6 | 0.000 | −20.4 | 0.000 | −17.027 | 0.002 | −2.68 | 0.000 | −3.408 | 0.001 | −22.9 | 0.000 |
| Energy | 0.825 | 20.630 | 0.065 | 0.357 | 0.789 | −0.072 | 0.838 | 0.347 | 0.690 | −0.023 | 0.884 | 0.813 | 0.416 | −0.119 | 0.625 |
| Engineering | 0.000 | 825.5 | 0.000 | – | – | −7.98 | 0.032 | −0.763 | 0.877 | −2.71 | 0.000 | 2.498 | 0.013 | −7.52 | 0.000 |
| Environmental Science | 0.506 | 26.730 | 0.104 | 0.003 | 0.999 | −0.422 | 0.685 | −0.333 | 0.793 | −0.114 | 0.334 | 0.303 | 0.762 | −0.18 | 0.547 |
| Immunology and Microbiology | 0.049 | 170.3 | 0.000 | −1.85 | 0.539 | −2.48 | 0.176 | −1.111 | 0.496 | −0.268 | 0.076 | 1.643 | 0.100 | −3.98 | 0.005 |
| Materials Science | 0.193 | 233.4 | 0.000 | 2.02 | 0.610 | −1.02 | 0.460 | −0.034 | 0.987 | −0.412 | 0.178 | 1.852 | 0.064 | −0.850 | 0.192 |
| Mathematics | 0.012 | 414.8 | 0.000 | −1.54 | 0.784 | −4.97 | 0.083 | −0.264 | 0.943 | −1.56 | 0.007 | 1.694 | 0.090 | −5.19 | 0.001 |
| Medicine | 0.000 | 2740.0 | 0.000 | – | – | −7.78 | 0.043 | −4.566 | 0.309 | −2.03 | 0.000 | 6.142 | 0.000 | −5.62 | 0.001 |
| Neuroscience | 0.896 | 11.920 | 0.072 | −0.018 | 0.987 | −0.178 | 0.726 | −0.066 | 0.888 | −0.020 | 0.637 | 0.549 | 0.583 | −0.285 | 0.451 |
| Nursing | 0.256 | 21.520 | 0.012 | −0.284 | 0.803 | −0.372 | 0.580 | −0.048 | 0.936 | −0.045 | 0.565 | 0.716 | 0.474 | −0.733 | 0.226 |
| Pharmacology, Toxicology and Pharm. | 0.865 | 47.520 | 0.000 | −0.361 | 0.844 | −0.747 | 0.449 | −0.002 | 0.999 | −0.148 | 0.337 | 1.016 | 0.309 | −1.24 | 0.115 |
| Physics and Astronomy | 0.797 | 706.2 | 0.000 | 19.5 | 0.006 | 0.048 | 0.646 | 0.954 | 0.495 | 0.091 | 0.771 | 4.514 | 0.000 | 0.000 | 1.000 |
| Psychology | 0.812 | 53.220 | 0.000 | 0.186 | 0.920 | −0.460 | 0.562 | 0.129 | 0.904 | −0.112 | 0.475 | 1.201 | 0.230 | −0.791 | 0.208 |
| Social Sciences | 0.007 | 173.3 | 0.000 | −3.56 | 0.366 | −4.27 | 0.114 | 0.0774 | 0.983 | −1.43 | 0.007 | 0.692 | 0.489 | −4.21 | 0.004 |
| Veterinary | 0.017 | 38.090 | 0.000 | 0.841 | 0.598 | −0.183 | 0.677 | 1.953 | 0.330 | −0.047 | 0.874 | 1.520 | 0.128 | −0.542 | 0.298 |
| Dentistry | 0.011 | 11.830 | 0.200 | −6.60 | 0.025 | −6.26 | 0.028 | −3.714 | 0.257 | −1.28 | 0.000 | −1.958 | 0.050 | −7.14 | 0.000 |
| Health Professions | 0.352 | 38.620 | 0.001 | −0.944 | 0.599 | −1.10 | 0.352 | −0.395 | 0.760 | −0.192 | 0.189 | 0.569 | 0.569 | −1.63 | 0.071 |
| Multidisciplinary | 0.100 | 98.560 | 0.001 | −1.37 | 0.595 | −1.67 | 0.339 | −1.497 | 0.377 | −0.067 | 0.069 | 0.549 | 0.583 | −1.44 | 0.090 |
| All Sciences | 0.076 | 672.3 | 0.000 | 18.30 | 0.009 | −0.125 | 0.797 | −0.007 | 0.992 | −0.054 | 0.625 | 4.578 | 0.000 | −0.240 | 0.488 |
The second column gives the p value for the hypothesis that the data follow a power-law model. "–" means that the maximum likelihood estimator did not converge. Positive values of the log-likelihood ratio (LR) or the normalized log-likelihood ratio (NLR) indicate that the power-law model is favored over the alternative; negative values indicate that the alternative is favored.
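The record does not spell out how the ratios and their p values are computed. The sketch below assumes the Vuong-style formulation that is commonly used for comparing a power law with alternative distributions: a normalized ratio referred to the standard normal distribution for non-nested models, and a chi-square approximation with one degree of freedom for nested models. The function names and the inputs (per-observation log-likelihoods under each model) are hypothetical.

```python
import math

def vuong_test(loglik_a, loglik_b):
    """Compare two non-nested models from their pointwise log-likelihoods.

    Returns (LR, NLR, two-sided p value). A positive LR favors model A.
    """
    n = len(loglik_a)
    if n != len(loglik_b) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    lr = sum(diffs)
    mean = lr / n
    var = sum((d - mean) ** 2 for d in diffs) / n
    if var == 0.0:
        raise ValueError("pointwise log-likelihood differences have zero variance")
    # Normalized log-likelihood ratio, approximately N(0, 1) under the null.
    nlr = lr / math.sqrt(n * var)
    p_value = math.erfc(abs(nlr) / math.sqrt(2.0))
    return lr, nlr, p_value

def nested_lr_p_value(lr):
    """p value for nested models, treating 2|LR| as chi-square with 1 d.o.f."""
    # The survival function of chi-square(1) at t equals erfc(sqrt(t / 2)).
    return math.erfc(math.sqrt(abs(lr)))
```

Under these assumptions, a small p value means that the sign of the ratio is statistically meaningful, while a large p value means that the data cannot discriminate between the two models.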