Qi Li1, Xinbing Wang2, Luoyi Fu2, Jianghao Wang3, Ling Yao3, Xiaoying Gan1, Chenghu Zhou3.
Abstract
The rapid development of modern science makes it challenging to pick out valuable ideas from the massive scientific literature. Widely adopted citation-based metrics are not adequate for measuring how well the idea presented in a single publication has been developed and whether it is worth following. Here, inspired by traditional X-ray imaging, which images the internal structure of real objects and supports analysis of that structure, we propose Scientific X-ray, a framework that quantifies the development degree and development potential of any scientific idea through a combination of 'X-ray' scanning, visualization and parsing applied to the citation network associated with a target publication. We select all 71,431 scientific articles with more than 1,000 citations as high-impact target publications from a total of 204,664,199 publications that cover 16 disciplines and span 1800 to 2021. Scientific X-ray reproduces how an idea evolves from the original target publication to its current state via an extracted 'idea tree' that attempts to preserve the most representative idea flow structure underlying each citation network. Interestingly, we observe that while the citation counts of publications may grow without limit, the maximum valid idea inheritance of those target publications, i.e., the valid depth of the idea tree, cannot exceed a limit of six hops, and the idea evolution structure of any publication falls, without exception, into six fixed patterns. Combined with a development potential index that we further design based on the extracted idea tree, Scientific X-ray can indicate how much further the idea presented by a given publication can still go from any well-established starting point. Scientific X-ray identifies 40 of 49 Nobel Prize topics as high-potential topics from their prize-winning papers, on average nine years before the prizes are awarded. Various trials on articles covering diverse topics also confirm the power of Scientific X-ray in uncovering influential and promising ideas. Scientific X-ray is accessible to researchers with any level of expertise, thus providing an important basis for grasping research trends, informing science policy-making and even promoting social development.
Year: 2022 PMID: 36170296 PMCID: PMC9518912 DOI: 10.1371/journal.pone.0275192
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. The framework of Scientific X-ray.
We illustrate the pipeline utilizing two publications with different citation structures. (a) Retrieve the citing papers of the target publication, all links among the target publication and citing articles and all links among citing papers in the database. (b) Construct the citation network of the target publication based on the retrieval results. (c) Extract the idea tree from the citation network to reveal the flow of the target publication’s idea. (d) Utilize Knowledge Entropy (KE) to quantify the knowledge quality of nodes in the idea tree to highlight the powerful inheritor of the target idea. (e) Reproduce the evolution of the idea tree and quantify the degree of development of the target publication’s idea utilizing the Valid Depth (VD). (f) Utilize the Development Potential Index (DPI) to quantify the potential of the target publication and assess whether it is worth continuing to follow.
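Steps (a) to (c) can be sketched roughly as follows. The caption does not say how the idea tree is extracted, so this sketch substitutes a simple longest-path tree: each citing paper keeps, as its single parent, the cited paper that lies deepest below the target. That rule, the `(citing_id, cited_id)` input format, and the function names are all assumptions made for illustration, and the plain tree depth computed here is not the paper's Valid Depth.

```python
from collections import defaultdict, deque


def build_citation_network(target, citations):
    """Steps (a)-(b): keep the target, its citing papers, and the links among them.

    `citations` is an iterable of (citing_id, cited_id) pairs (hypothetical format).
    Returns a dict mapping each paper to the set of in-network papers citing it.
    """
    citations = list(citations)
    citing_papers = {src for src, dst in citations if dst == target}
    nodes = citing_papers | {target}
    cited_by = defaultdict(set)
    for src, dst in citations:
        # Ignore self-citations and any link going out of the target itself.
        if src in nodes and dst in nodes and src != dst and src != target:
            cited_by[dst].add(src)
    return cited_by


def longest_path_tree(target, cited_by):
    """Stand-in for step (c): assumes the network is acyclic.

    Walks the network in topological order and gives every paper the parent
    that maximizes its depth below the target.
    """
    # Number of in-network papers that each paper cites.
    remaining = defaultdict(int)
    for citers in cited_by.values():
        for citer in citers:
            remaining[citer] += 1

    parent = {target: None}
    depth = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for child in sorted(cited_by.get(node, ())):
            if depth[node] + 1 > depth.get(child, 0):
                depth[child] = depth[node] + 1
                parent[child] = node
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    return parent, depth


# Toy data: A, B and C cite the target T; B also cites A, and C also cites B.
toy_citations = [("A", "T"), ("B", "T"), ("B", "A"), ("C", "T"), ("C", "B"), ("D", "X")]
toy_network = build_citation_network("T", toy_citations)
toy_parent, toy_depth = longest_path_tree("T", toy_network)
```

In the toy data every citing paper links directly to the target, so a breadth-first tree would collapse into a star; the longest-path rule is what lets the chain T, A, B, C show up as successive layers.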
Fig 2. The distribution of high-impact publications’ VD and the distribution of the VD inspired by a single article within idea trees.
(a) For 99% of high-impact publications, the VD does not exceed six hops. (b) 99% of the articles in an idea tree contribute no more than three hops to the VD.
Fig 3. Six fixed evolution patterns of a publication’s idea.
(a,c,e,h,k,n) The evolution of VD of corresponding publications. (b,d,g,j,m,o) The evolution of idea trees of corresponding publications. All idea trees are pruned to ensure the visibility of the skeleton structure. All idea trees are visualized by the DOT algorithm [50]. Node size is rescaled in every idea tree and is positively related to the node’s KE. (f,i,l) The evolution of nodes’ KE in corresponding idea trees. These six patterns exist in a wide range of scientific fields and are not limited to geographic information systems, ecology and climate change, computer vision, natural language processing, deep learning and geology.
The rules and results of the classification of idea trees’ evolution patterns.
| Patterns | Rules | Number of patterns |
|---|---|---|
| Pattern 1 | Keywords (survey, review, summary, introduction, book and software) can be matched in the title of the target publication to indicate that it is a summative article, and the target publication’s VD ≤ 1. | 197 |
| Pattern 2 | There are high KE child nodes in the subtree led by the high KE node. | 1947 |
| Pattern 3 | More than three high KE nodes appear on a connected path of the idea tree. | 369 |
| Pattern 4 | The KE of the child node exceeds that of the root node, and the VD of the subtree led by it is ≤ 2. | 52 |
| Pattern 5 | The subtrees led by the two child nodes with the top two KE values do not contain each other, and of these two nodes, the KE of the first-published node is smaller than that of the later-published node. | 7 |
| Pattern 6 | The VD of the target publication reaches five or six, and the number of high KE child nodes exceeds seven. | 193 |
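The two rules in the table that depend only on scalar quantities, Pattern 1 and Pattern 6, can be written as simple predicates. This is a minimal sketch: it assumes the VD and the number of high KE child nodes have already been computed elsewhere, and the whole-word, case-insensitive keyword match is an assumption, since the table does not say how titles are matched. The function names are hypothetical.

```python
import re

# Keywords listed in the Pattern 1 rule.
SUMMARY_KEYWORDS = ("survey", "review", "summary", "introduction", "book", "software")

# Whole-word, case-insensitive matching is an assumption of this sketch.
_KEYWORD_RE = re.compile(r"\b(?:" + "|".join(SUMMARY_KEYWORDS) + r")\b", re.IGNORECASE)


def is_pattern_1(title, valid_depth):
    """Summative article: a keyword appears in the title and VD <= 1."""
    return valid_depth <= 1 and _KEYWORD_RE.search(title) is not None


def is_pattern_6(valid_depth, num_high_ke_children):
    """VD reaches five or six and more than seven high KE child nodes exist."""
    return valid_depth in (5, 6) and num_high_ke_children > 7
```

The remaining patterns depend on the tree structure and on what counts as a "high KE" node, which the table does not define, so they are left out here.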
Mean of VDs driven by high KE nodes in different valid layers.
| Valid layer | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Mean VD driven by high KE nodes | 0.501 | 0.214 | 0.128 | 0.158 | 0 | 0 |
Fig 4. The verification of Scientific X-ray’s VD and DPI by prize data.
(a) The structure in 2021 of the idea tree led by ‘A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity’ (the champion of Science’s 2015 Breakthrough of the Year). (b-j) The idea trees in 2021 of the runner-up papers of Science’s 2015 Breakthrough of the Year. The idea tree of the champion, CRISPR, has a larger VD than those of the other publications, and more high KE nodes appear in it. (k) Identification of the development potential of Nobel topics before they are awarded. Each point represents an award-winning article under the corresponding Nobel Prize topic. The ordinate is the maximum DPI within 1–8 years after the corresponding article was published. Points above the red dotted line are papers with DPI ≥ 1 in the time window. Because the average interval between publication and award for prize-winning papers is 17 years, Scientific X-ray is considered to have successfully identified the development potential of a Nobel Prize-winning topic if the maximum DPI of one of its prize-winning articles is over one within 1–8 years after publication.
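The identification rule in panel (k) can be sketched as follows. The sketch assumes each paper's DPI is available as a mapping from years since publication to DPI value, which is a hypothetical input format, and it does not compute DPI itself. The caption states the cutoff both as "DPI ≥ 1" and as "over one"; the code follows the red-line definition (≥ 1) and exposes the threshold as a parameter. All numbers in the example are invented for illustration.

```python
def max_dpi_in_window(dpi_by_year, first_year=1, last_year=8):
    """Largest DPI observed between first_year and last_year after publication.

    `dpi_by_year` maps years since publication to a DPI value.
    Returns None when no value falls inside the window.
    """
    values = [
        dpi
        for years_after, dpi in dpi_by_year.items()
        if first_year <= years_after <= last_year
    ]
    return max(values) if values else None


def topic_identified(papers, threshold=1.0):
    """A topic counts as identified if any of its prize-winning papers
    reaches the threshold inside the 1-8 year window."""
    for dpi_by_year in papers:
        peak = max_dpi_in_window(dpi_by_year)
        if peak is not None and peak >= threshold:
            return True
    return False


# Invented DPI series for two hypothetical prize-winning papers.
paper_a = {1: 0.4, 3: 1.2, 10: 2.0}
paper_b = {2: 0.5, 9: 1.5}
```

Here `paper_b` only crosses the threshold in year 9, outside the window, so a topic represented by `paper_b` alone would not count as identified, while one that also includes `paper_a` would.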
Top ten development potential publications in the fields of deep learning, geoscience and Covid-19.
| Publication | DPI |
|---|---|
| Deep learning | |
| Attention is All you Need | 2.935 |
| Prototypical Networks for Few-shot Learning | 2.410 |
| Matching networks for one shot learning | 2.229 |
| Semi-Supervised Classification with Graph Convolutional Networks | 1.843 |
| Understanding deep learning requires rethinking generalization | 1.730 |
| A Style-Based Generator Architecture for Generative Adversarial Networks | 1.723 |
| Universal Adversarial Perturbations | 1.705 |
| Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks | 1.567 |
| DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks | 1.427 |
| Overcoming catastrophic forgetting in neural networks | 1.376 |
| Geoscience | |
| The effect of human mobility and control measures on the COVID-19 epidemic in China | 1.874 |
| Mangroves among the most carbon-rich forests in the tropics | 1.746 |
| Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks in 188 countries, 1990–2013: a systematic analysis for the Global Burden of Disease Study 2013 | 1.445 |
| Global land use change, economic globalization, and the looming land scarcity | 1.281 |
| ‘Structure-from-Motion’ photogrammetry: A low-cost, effective tool for geoscience applications | 1.254 |
| Hemispheric and large-scale land-surface air temperature variations: An extensive revision and an update to 2010 | 1.250 |
| Linear Mixed-Effects Models using ‘Eigen’ and S4 | 1.221 |
| The Transiting Exoplanet Survey Satellite | 1.216 |
| Object-based cloud and cloud shadow detection in Landsat imagery | 1.170 |
| Bedmap2: improved ice bed, surface and thickness datasets for Antarctica | 1.097 |
| Covid-19 | |
| Breakthrough: Chloroquine phosphate has shown apparent efficacy in treatment of COVID-19 associated pneumonia in clinical studies | 2.741 |
| Compassionate Use of Remdesivir for Patients with Severe Covid-19 | 2.601 |
| Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine | 2.475 |
| Endothelial cell infection and endotheliitis in COVID-19 | 2.445 |
| Neurologic Manifestations of Hospitalized Patients With Coronavirus Disease 2019 in Wuhan, China | 2.137 |
| SARS-CoV-2 Infection in Children | 2.110 |
| Characteristics and Outcomes of 21 Critically Ill Patients With COVID-19 in Washington State | 2.105 |
| The neuroinvasive potential of SARS-CoV2 may be at least partially responsible for the respiratory failure of COVID-19 patients | 2.104 |
| Coronavirus Infections—More Than Just the Common Cold | 2.103 |
| The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health—The latest 2019 novel coronavirus outbreak in Wuhan, China | 2.046 |