Joseph Staudt, Huifeng Yu, Robert P Light, Gerald Marschke, Katy Börner, Bruce A Weinberg.
Abstract
Countries, research institutions, and scholars are interested in identifying and promoting high-impact and transformative scientific research. This paper presents a novel set of text- and citation-based metrics that can be used to identify high-impact and transformative works. The 11 metrics can be grouped into seven types: Radical-Generative, Radical-Destructive, Risky, Multidisciplinary, Wide Impact, Growing Impact, and Impact (overall). The metrics are exemplified, validated, and compared using a set of 10,778,696 MEDLINE articles matched to the Science Citation Index Expanded™. Articles are grouped into six 5-year periods (spanning 1983–2012) using publication year and into 6,159 fields constructed using comparable MeSH terms, with which each article is tagged. The analysis is conducted at the level of a field-period pair, of which 15,051 have articles and are used in this study. A factor analysis shows that transformativeness and impact are positively related (ρ = .402), but represent distinct phenomena. Looking at the subcomponents of transformativeness, there is no evidence that transformative work is adopted slowly or that the generation of important new concepts coincides with the obsolescence of existing concepts. We also find that the generation of important new concepts and highly cited work is more risky. Finally, supporting the validity of our metrics, we show that work that draws on a wider range of research fields is used more widely.
Year: 2018 PMID: 30024893 PMCID: PMC6053144 DOI: 10.1371/journal.pone.0200597
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Classification of scientific work by radicalness and impact, with examples.
Article counts.
| Data Source | Articles | With Restrictions |
|---|---|---|
| MEDLINE 2014 Baseline | 22,376,811 | 20,667,693 |
| SCIE | 15,085,762 | 15,080,131 |
| Intersection | 13,737,835 | |
| Published 1983–2012 | 10,778,696 | |
*There are three restrictions on articles in the MEDLINE data: 1) the article must be the first version of an article, 2) the article must have “MEDLINE” status, and 3) the article must be tagged with at least one 4-digit MeSH term. For details on the version and status of MEDLINE articles see NLM, 2016. For details on 4-digit MeSH terms see the description below and Appendix C.
**There is one restriction on articles in the SCIE data: a small number of our SCIE records map to a PMID to which other SCIE records also map. We retain the earliest SCIE ID that maps to each PMID, reducing our SCIE articles by 5,631, or 0.037% of our 15,085,762 SCIE records.
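The SCIE restriction described above can be illustrated with a minimal sketch. It assumes that "earliest" can be approximated by the smallest ID under plain sorting, which the source does not confirm; the function name, record layout, and toy IDs are hypothetical.

```python
def keep_earliest_scie_id(records):
    """Map each PMID to one SCIE ID, keeping the smallest ID per PMID.

    `records` is an iterable of (scie_id, pmid) pairs. Treating the
    smallest ID as the "earliest" one is an assumption of this sketch.
    """
    earliest = {}
    for scie_id, pmid in records:
        if pmid not in earliest or scie_id < earliest[pmid]:
            earliest[pmid] = scie_id
    return earliest


# Toy records (hypothetical IDs): two SCIE records map to PMID 101.
records = [("S003", 101), ("S001", 101), ("S002", 202)]
kept = keep_earliest_scie_id(records)
dropped = len(records) - len(kept)

# Arithmetic from the table and footnote: records removed and their share.
remaining = 15_085_762 - 5_631
share_dropped_pct = 5_631 / 15_085_762 * 100
```

On the toy input, one duplicate record is dropped. The last two lines reproduce the reported counts: 15,085,762 minus 5,631 gives the 15,080,131 restricted SCIE records, and the dropped share rounds to 0.037%.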
Exemplary depiction of field-period pairs.
| Field | 1983–1987 | 1988–1992 | 1993–1997 | 1998–2002 | 2003–2007 | 2008–2012 |
|---|---|---|---|---|---|---|
| Top .01% concepts used | | | | | | |
| DNA Methylation | 0.00 | 0.01 | 10.13 | 92.28 | 276.07 | 564.49 |
| Embryonic Stem Cells | 0.36 | 2.77 | 3.90 | 3.16 | 450.15 | 2641.27 |
| Human Genome Project | 4.85 | 6.66 | 17.78 | 11.17 | 11.79 | |
| Nuclear Reprogramming | 0.00 | 9.88 | 313.47 | | | |
| Pluripotent Stem Cells | 7.71 | 185.06 | 1301.76 | | | |
| Mean citations received | | | | | | |
| DNA Methylation | 32.60 | 4.00 | 57.64 | 73.94 | 46.11 | 18.72 |
| Embryonic Stem Cells | 34.62 | 37.47 | 36.30 | 25.27 | 32.45 | 19.45 |
| Human Genome Project | 7.03 | 18.37 | 26.63 | 14.21 | 10.43 | |
| Nuclear Reprogramming | 96.28 | 34.52 | | | | |
| Pluripotent Stem Cells | 3.00 | 58.16 | 67.37 | 26.14 | | |
All six time periods are shown, but only five of the 6,159 fields and two of 11 metrics. Take the case of DNA Methylation: the numbers for the 2008–2012 period indicate that the (prorated) articles on DNA Methylation in this period used 564.49 top .01% concepts and were cited 18.72 times on average in the subsequent years.
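One way to read the prorating behind these field-period values is sketched below. It assumes each article's weight is split equally across the fields it is tagged with; the source only says articles are prorated according to MeSH terms, so the equal split, the function names, and the toy articles are all assumptions.

```python
from collections import defaultdict


def period_of(year, first_year=1983, width=5):
    """Return the (start, end) years of the 5-year period containing `year`."""
    start = first_year + width * ((year - first_year) // width)
    return (start, start + width - 1)


def prorated_mean_citations(articles):
    """Weighted mean citations per (field, period) pair.

    Each article contributes a weight of 1/k to each of its k fields
    (equal split, an assumption of this sketch).
    """
    weight = defaultdict(float)
    weighted_cites = defaultdict(float)
    for art in articles:
        share = 1.0 / len(art["fields"])
        for field in art["fields"]:
            key = (field, period_of(art["year"]))
            weight[key] += share
            weighted_cites[key] += share * art["citations"]
    return {key: weighted_cites[key] / weight[key] for key in weight}


# Toy data (hypothetical): one article tagged with two fields, one with a single field.
toy_articles = [
    {"year": 2009, "fields": ["DNA Methylation", "Embryonic Stem Cells"], "citations": 10},
    {"year": 2011, "fields": ["DNA Methylation"], "citations": 40},
]
means = prorated_mean_citations(toy_articles)
```

Here the two-field article counts as half an article in each field, so the DNA Methylation mean for 2008–2012 is (0.5 × 10 + 1 × 40) / 1.5 = 30.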
Summary statistics for all metrics for time periods 1983–1987, …, 2008–2012 and all MESH4 fields in MEDLINE.
| Metric | Mean | S.D. | Metric Description, Measurement Period and Fields |
|---|---|---|---|
| FCiteMean | 22.311 | 12.658 | Mean citations received across articles indexed in both MEDLINE and SCIE in a field-period pair during all subsequent years (including later years in the target period) through 2014. |
| FCite25 | 3.489 | 2.654 | Quantiles of the distribution of citations received across all articles indexed in both MEDLINE and SCIE in a field-period pair during all subsequent years (including later years in the target period) through 2014. |
| FCite50 | 9.838 | 6.075 | |
| FCite75 | 23.768 | 13.331 | |
| FCite90 | 50.266 | 27.708 | |
| FCite95 | 78.930 | 44.035 | |
| FCite99 | 192.539 | 111.834 | |
| FCite99.9 | 586.448 | 393.070 | |
| FCite99.99 | 1626.674 | 1623.510 | |
| FHerfCite | 0.979 | 0.005 | A Herfindahl index of the disciplinary diversity of the citations that the articles indexed in both MEDLINE and SCIE in a field-period pair received during all subsequent years (including later years in the target period) through 2014. |
| FCiteAge | 5.229 | 2.555 | The mean time to citation across all of the articles indexed in both MEDLINE and SCIE in a field-period pair. |
| BHerfCite | 0.979 | 0.006 | A Herfindahl index of the disciplinary diversity of the articles referenced by the articles indexed in both MEDLINE and SCIE in a field-period pair. Data on references cover all previous years (including earlier years in the target period). |
| BCiteAge | 9.642 | 2.603 | The mean of the mean age of the works referenced across all articles in a field-period pair. The mean reference age of an article is calculated over all references by the article to all articles that are published in all previous years (including earlier years in the target period) without limitations, which include all MEDLINE and non-MEDLINE indexed articles in the SCIE. |
| FCiteVar | 4802.822 | 14513.130 | The variance in citations received across all articles in a field-period pair. Citations to an article are the sum of citations received from all articles published in all subsequent years (including later years in the target period) through 2014 that are indexed in both MEDLINE and SCIE. |
| Concepts | 32.281 | 72.141 | The number of top .01% n-grams introduced by a field-period pair. These are measured by identifying the year and field(s) in which each n-gram is born (i.e., the n-gram’s vintage year and field(s)). Only includes articles indexed in both MEDLINE and SCIE. |
| BMent0 | 0.003 | 0.006 | The number of articles belonging to a field-period pair that mention a top .01% n-gram within the first T (T=0, 1, 3, 5, 10, all) years since the n-gram was first used. Only includes articles indexed in both MEDLINE and SCIE. |
| BMent3 | 0.037 | 0.033 | |
| BMent5 | 0.085 | 0.078 | |
| BMent10 | 0.294 | 0.258 | |
| BMentAll | 1.028 | 0.930 | |
| FHerfMentions | 0.996 | 0.008 | A Herfindahl index of the disciplinary diversity of the use of the n-grams introduced by the articles in a field-period pair. The metric is constructed from all mentions in all articles published across all years subsequent to the vintage year (including later years in the target period) through 2012. Only includes articles indexed in both MEDLINE and SCIE. |
| BHerfMent0 | 0.911 | 0.183 | A Herfindahl index of the diversity of n-grams used by the articles in a field-period pair in the first T (T=0, 1, 3, 5, 10, all) years since the n-gram was first used. Only includes articles indexed in both MEDLINE and SCIE. |
| BHerfMent3 | 0.969 | 0.066 | |
| BHerfMent5 | 0.976 | 0.053 | |
| BHerfMent10 | 0.981 | 0.465 | |
| BHerfMentAll | 0.983 | 0.046 | |
Note: There are 15,051 field-period pairs. In all cases, articles are prorated across fields according to MeSH terms.
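Several metrics in the table are Herfindahl indices of disciplinary diversity. The sketch below assumes the diversity form, one minus the sum of squared shares, which would be consistent with reported means close to 1 but is not stated explicitly here; the function name and the toy counts are hypothetical.

```python
def herfindahl_diversity(counts):
    """Return 1 - sum(share_i ** 2) over categories with the given counts.

    0 means all citations (or mentions) fall in a single category; values
    near 1 mean they are spread thinly over many categories. The
    "one minus" form is an assumption of this sketch.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Toy citation counts by citing discipline (hypothetical).
concentrated = herfindahl_diversity([10])
two_way = herfindahl_diversity([5, 5])
four_way = herfindahl_diversity([5, 5, 5, 5])
```

With counts spread evenly over n categories the index equals 1 - 1/n, so it approaches 1 as citations are spread across more disciplines.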
Fig 2. Factor loadings from a factor analysis for three of the seven aspects.
Interrelations between the metrics for aspects of impact and transformativeness.
| | Radical-Generative | Radical-Destructive | Risky | Multidisciplinary | Wide Impact | Growing Impact | Impact |
|---|---|---|---|---|---|---|---|
| Radical-Generative | 1 | | | | | | |
| Radical-Destructive | 0.1045 | 1 | | | | | |
| Risky | 0.2718 | 0.0501 | 1 | | | | |
| Multidisciplinary | -0.0752 | -0.0963 | 0.0762 | 1 | | | |
| Wide Impact | 0.0676 | -0.0167 | 0.0841 | 0.2472 | 1 | | |
| Growing Impact | -0.2948 | -0.3529 | -0.2322 | -0.0344 | -0.2428 | 1 | |
| Impact | 0.3835 | 0.0272 | 0.5558 | 0.1002 | 0.0343 | -0.2000 | 1 |
Note: The table reports partial correlations between aspects of Impact and Transformativeness (the other six metrics) across field-period pairs after eliminating variation across field and time (that is, time and field fixed effects).
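A partial correlation net of field and time fixed effects can be sketched as follows. This is one standard approach, not necessarily the authors' procedure: each metric is regressed on field and period dummies by least squares, and the residuals are then correlated. The function names and the simulated data are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def residualize(values, field_ids, period_ids):
    """Remove field and period fixed effects from `values` by least squares."""
    y = np.asarray(values, dtype=float)
    f = np.unique(np.asarray(field_ids), return_inverse=True)[1]
    p = np.unique(np.asarray(period_ids), return_inverse=True)[1]
    n = y.size
    field_dummies = np.zeros((n, f.max() + 1))
    field_dummies[np.arange(n), f] = 1.0
    period_dummies = np.zeros((n, p.max() + 1))
    period_dummies[np.arange(n), p] = 1.0
    # Drop one period dummy so the design matrix is not collinear.
    design = np.hstack([field_dummies, period_dummies[:, 1:]])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta


def partial_corr_fixed_effects(x, y, field_ids, period_ids):
    """Correlation between x and y after removing field and period effects."""
    rx = residualize(x, field_ids, period_ids)
    ry = residualize(y, field_ids, period_ids)
    return float(np.corrcoef(rx, ry)[0, 1])


# Simulated field-period pairs: 4 fields x 3 periods (hypothetical data).
rng = np.random.default_rng(0)
fields = np.repeat(np.arange(4), 3)
periods = np.tile(np.arange(3), 4)
noise = rng.normal(size=12)
x = 5.0 * fields + 2.0 * periods + noise
y = -3.0 * fields + 7.0 * periods + 2.0 * noise
r = partial_corr_fixed_effects(x, y, fields, periods)
```

In the simulated data, x and y have different field and period effects but share the same idiosyncratic noise, so the partial correlation is close to 1 once the fixed effects are removed. Dummy-variable regression is used rather than simple two-way demeaning because it also works when not every field appears in every period.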
Fig 3. Results from a factor analysis of six aspects of HITS.
The figure reports factor loadings on each aspect of transformative research from a factor analysis. The factor loadings indicate the extent to which the transformativeness metric loads on the (first) factor for each aspect of transformative research (excluding impact, which is treated separately).
Fig 4. All seven aspects of HITS related to impact and transformativeness.
The figure shows the partial correlations between the metrics for the aspects of transformative research and the overall metrics for transformativeness and impact across field-period pairs after eliminating variation across field and time (that is, time and field fixed effects).
Fig 5. FCiteN related to impact and transformativeness.
The figure shows the partial correlation between the individual metrics for impact and the overall metrics for transformativeness and impact across field-period pairs after eliminating variation across field and time (that is, time and field fixed effects).
Fig 6. Ranking of fields in terms of impact and transformativeness across all periods (1983–2012).
Field size determined by the number of (weighted) articles across all periods. Research on stem cells is shown in red.