Masanao Ochi, Masanori Shiro, Jun'ichiro Mori, Ichiro Sakata.
Abstract
Identifying promising research as early as possible is vital for deciding which research deserves investment. Automatic prediction of future research trends is also becoming necessary as digital publications multiply and research fragments. Previous studies have typically predicted each scientific index with features designed specifically for that index, an approach that does not capture real research trends. A more integrated method is needed to capture actual research trends from various directions. Recent deep learning technology integrates different individual models and makes it easier to construct more general-purpose models. The purpose of this paper is to show that prediction models for multiple scientific indices can be integrated through network-based representation learning. We conduct a predictive analysis of multiple future scientific impacts by embedding a heterogeneous network, and we show that network embedding is a promising tool for capturing and expressing scientific trends. In the experiments, the multiple heterogeneous network embedding improved on a single citation network embedding by 1.6 points. The method also outperformed the baselines on a number of indices three years after publication, including the author h-index, the journal impact factor (JIF), and the Nature Index. These results suggest that distributed representations of a heterogeneous network for scientific papers are a basis for the automatic prediction of scientific trends.
Year: 2022 PMID: 36103497 PMCID: PMC9473411 DOI: 10.1371/journal.pone.0274253
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. Outline of the method.
Fig 2. Outline of a heterogeneous network to be created.
Learning algorithm.
Dataset overview.
| Term | 2006–2016 |
|---|---|
| Number of Papers | 342,785 |
| Number of Citations | 5,249,635 |
| Number of Authors | 743,140 |
| Number of Institutes | 63,485 |
| Number of Journals | 30,764 |
| Number of Keywords | 529,979 |
Fig 3. Relation between training period and test period in each index prediction.
Ranking of citations for papers published in 2013 in the dataset.
| Ranking @2016 | Authors | Title | Journal | No. Cited (Dec. 2014) | No. Cited (Dec. 2016) | Ref. |
|---|---|---|---|---|---|---|
| 1 | J. Burschka, | Sequential deposition as a route to high-performance perovskite-sensitized solar cells. | Nature | 573 | 2,320 | [ |
| 2 | M. Liu, | Efficient planar heterojunction perovskite solar cells by vapour deposition. | Nature | 401 | 1,860 | [ |
| 3 | S. D. Stranks, | Electron-hole diffusion lengths exceeding 1 micrometer in an organometal trihalide perovskite absorber. | Science | 248 | 1,423 | [ |
| 4 | J. You, | A polymer tandem solar cell with 10.6% power conversion efficiency. | Nature Communications | 619 | 1,415 | [ |
| 5 | G. Xing, | Long-range balanced electron-and hole-transport lengths in organic-inorganic CH3NH3PbI3. | Science | 219 | 1,161 | [ |
| 6 | J. H. Noh, | Chemical management for colorful, efficient, and stable inorganic-organic hybrid nanostructured solar cells. | Nano Letters | 212 | 848 | [ |
| 7 | J. H. Heo, | Efficient inorganic-organic hybrid heterojunction solar cells containing perovskite compound and polymeric hole conductors. | Nature Photonics | 215 | 768 | [ |
| 8 | J. M. Ball, | Low-temperature processed meso-superstructured to thin-film perovskite solar cells. | Energy & Environmental Science | 186 | 616 | [ |
| 9 | H. J. Snaith, | Perovskites: The emergence of a new era for low-cost, high-efficiency solar cells. | J. Phys. Chem. Lett. | 151 | 596 | [ |
| 10 | P. Docampo, | Efficient organometal trihalide perovskite planar-heterojunction solar cells on flexible polymer substrates. | Nature Communications | 96 | 518 | [ |
h-index ranking in the dataset.
| Ranking @2016 | Author | h-index 2009 | h-index 2013 | h-index 2016 |
|---|---|---|---|---|
| 1 | Michael Grätzel | 23 | 72 | 116 |
| 2 | Mohammad Khaja Nazeeruddin | 13 | 45 | 80 |
| 3 | Anders Hagfeldt | 13 | 50 | 74 |
| 4 | Shaik Mohammed Zakeeruddin | 15 | 47 | 64 |
| 5 | Henry J. Snaith | 7 | 29 | 63 |
| 6 | Li Cheng Sun | 13 | 44 | 60 |
| 7 | Yong-fang Li | 7 | 34 | 57 |
| 8 | Christoph J. Brabec | 12 | 33 | 57 |
| 9 | Alan J. Heeger | 8 | 32 | 55 |
| 10 | Frederik Christian Krebs | 12 | 41 | 55 |
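The h-index values in this table can be read against the standard definition: the largest h such that an author has at least h papers with at least h citations each. The following is a minimal sketch of that definition only; the citation counts are made up for illustration, and the source does not state how its in-dataset h-index was computed.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for one author (not taken from the dataset).
print(h_index([10, 8, 5, 4, 3]))  # 4
```

Sorting once and scanning keeps the computation at O(n log n) per author, which matters when it is repeated for each snapshot year.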
Journal Impact Factor (JIF) ranking in the dataset.
| Ranking @2016 | Journal | Impact Factor 2009 | Impact Factor 2013 | Impact Factor 2016 |
|---|---|---|---|---|
| 1 | Nature Photonics | 3.4 | 65.15 | 98.0 |
| 2 | Science | 8.7 | 33.6 | 65.0 |
| 3 | Nature Chemistry | – | 10.5 | 51.6 |
| 4 | Nature Materials | 19.4 | 19.6 | 49.7 |
| 5 | Chemical Reviews | 23.5 | 3.4 | 41.8 |
| 6 | Nature Nanotechnology | 1.0 | 29.1 | 41.5 |
| 7 | Physics Reports | – | 8.0 | 30.7 |
| 8 | Energy and Environmental Science | 3.0 | 11.14 | 27.3 |
| 9 | Nature | 4.5 | 8.4 | 24.1 |
| 10 | Journal of the American Chemical Society | 5.8 | 14.1 | 23.1 |
Nature Index AC/FC ranking in the dataset.
| Ranking @2016 | Institute | Nature Index AC/FC 2009 | Nature Index AC/FC 2013 | Nature Index AC/FC 2016 |
|---|---|---|---|---|
| 1 | Department of Mathematics and Statistics, University of Massachusetts at Amherst | 1.09 | 1.95 | 4.83 |
| 2 | Catalan Institution for Research and Advanced Studies (ICREA) | 4.27 | 4.34 | 4.62 |
| 3 | Divisions of Human Biology and Public Health Sciences, Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center | 1.70 | 2.57 | 4.23 |
| 4 | National Research University of Information Technologies, Mechanics, and Optics (ITMO), International Laboratory of Metamaterials | – | – | 4.14 |
| 5 | Joint Center for Artificial Photosynthesis and Materials Sciences Division, Lawrence Berkeley National Laboratory | 2.75 | 3.25 | 4.13 |
| 6 | Condensed Matter Physics and Materials Sciences Department, Brookhaven National Laboratory | 2.04 | 3.09 | 4.05 |
| 7 | Key Laboratory of Biomedical Information Engineering of the Ministry of Education, Frontier Institute of Science and Technology, Xi’an Jiaotong University | 1.59 | 1.76 | 3.85 |
| 8 | CAS Key Laboratory of Nanosystem and Hierarchical Fabrication CAS Center for Excellence in Nanoscience National Center for Nanoscience and Technology | 1.71 | 2.37 | 3.82 |
| 9 | Department of Chemistry and Nano Science, Ewha Womans University | 1.14 | 2.55 | 3.82 |
| 10 | Laboratory of Resources Environment and Geographic Information System, Capital Normal University | – | 1.67 | 3.80 |
Summary of the created heterogeneous network.
| Bipartite Network | nodes | edges | largest connected component | average node degree |
|---|---|---|---|---|
| Citation Network | 2,884,616 | 5,777,364 | 2,828,458 | 4.048 |
| Paper–Journal Network | 204,264 | 183,363 | 3,225 | 1.999 |
| Paper–Keyword Network | 496,034 | 1,830,474 | 495,891 | 7.382 |
| Paper–Author Network | 583,225 | 802,275 | 318,386 | 3.527 |
| Author–Institute Network | 428,488 | 475,363 | 375,400 | 2.335 |
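The columns of this summary (nodes, edges, largest connected component, average node degree) can be computed from a plain edge list. The sketch below is illustrative only: it treats each network as undirected with no self-loops and takes the average degree as 2E/N, which is an assumption, since the source does not say how its "average node degree" column was derived. The node names are hypothetical.

```python
from collections import defaultdict, deque


def summarize(edges):
    """Return (nodes, edges, largest component size, average degree).

    Assumes an undirected graph without self-loops; duplicate edges are merged.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n_nodes = len(adj)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2

    # Breadth-first search over each unvisited node to find component sizes.
    seen = set()
    largest = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        largest = max(largest, size)

    avg_degree = 2 * n_edges / n_nodes if n_nodes else 0.0
    return n_nodes, n_edges, largest, avg_degree


# Toy paper-author bipartite network (hypothetical identifiers).
toy_edges = [
    ("paper:1", "author:a"),
    ("paper:1", "author:b"),
    ("paper:2", "author:b"),
    ("paper:3", "author:c"),
]
print(summarize(toy_edges))
```

Prefixing node names with their type keeps papers, authors, journals, keywords, and institutes distinct if several bipartite edge lists are later concatenated into one heterogeneous graph.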
Fig 4. UMAP visualization of acquired distributed representation (color-coded results obtained using K-means method).
Comparison results of link prediction by distributed representation obtained from first-order connection.
| Method | Precision | ± | Recall | ± | F value | ± |
|---|---|---|---|---|---|---|
| Random Network | 0.540 | ±0.005 | 1.000 | ±0.000 | 0.702 | ±0.004 |
| Citation Network | 0.973 | ±0.012 | 0.898 | ±0.007 | 0.934 | ±0.004 |
| Hetero Network | | ±0.006 | | ±0.006 | | ±0.003 |

\* means t-test (P < 0.01) compared to "Random Network".
† means t-test (P < 0.01) compared to "Citation Network".
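The third value column of the link-prediction table is consistent with the harmonic mean of precision and recall, the "F value" reported in the later prediction tables. As a quick check, and assuming that standard F1 definition, the sketch below recomputes it from the rounded precision and recall of the two complete rows.

```python
def f_value(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded inputs copied from the table rows.
print(round(f_value(0.540, 1.000), 3))  # Random Network
print(round(f_value(0.973, 0.898), 3))  # Citation Network
```

With these rounded inputs the Citation Network row reproduces 0.934, while the Random Network row gives 0.701 against the reported 0.702, a gap small enough to be explained by rounding of the published precision and recall.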
Results of emerging research area identification (AUC).
| Method | Number of Citations | JIF | NI(ACFC) | |
|---|---|---|---|---|
| Proposed | | | | |
| baseline | 0.497 | 0.852 | 0.735 | 0.599 |
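For reading the AUC values in this table: AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, so 0.5 corresponds to random ranking. The following is a minimal sketch of that pairwise definition, with ties counted as one half; the scores and labels are invented for illustration and the function is not the authors' evaluation code.

```python
def auc(scores, labels):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical model scores and ground-truth labels (1 = emerging, 0 = not).
scores = [0.9, 0.8, 0.35, 0.3, 0.1]
labels = [1, 0, 1, 0, 0]
print(auc(scores, labels))  # 5 of 6 pairs ranked correctly
```

The double loop is quadratic in the number of examples; for large evaluation sets a rank-based formulation or a library routine would be the practical choice.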
Results of future citation prediction.
| Method | | Precision | Recall | F value |
|---|---|---|---|---|
| proposed | 0.42 | 0.434 | | |
| proposed | 0.80 | | 0.034 | 0.066 |
| random | 0.50 | 0.533 | 0.358 | 0.428 |
Results of future h-index prediction.
| Method | | Precision | Recall | F value |
|---|---|---|---|---|
| Proposed | 0.50 | 0.757 | 0.739 | 0.748 |
| | 0.50 | 0.757 | 0.739 | 0.748 |
Results of future JIF prediction.
| Method | | Precision | Recall | F value |
|---|---|---|---|---|
| proposed | 0.17 | 0.586 | 0.189 | 0.286 |
| proposed | 0.80 | | 0.060 | 0.110 |
| JIF@2013 | | 0.486 | | |
Results of future NI AC/FC prediction.
| Method | | Precision | Recall | F value |
|---|---|---|---|---|
| proposed | 0.13 | 0.314 | | |
| proposed | 0.80 | | 0.059 | 0.111 |
| NI AC/FC@2013 | | 0.410 | 0.443 | 0.426 |