Article-level classification of scientific publications: A comparison of deep learning, direct citation and bibliographic coupling.

Maxime Rivest, Etienne Vignola-Gagné, Éric Archambault.

Abstract

Classification schemes for scientific activity and publications underpin a large swath of research evaluation practices at the organizational, governmental, and national levels. Several research classifications are currently in use, and they require continuous work as new classification techniques become available and as new research topics emerge. Convolutional neural networks, a subset of "deep learning" approaches, have recently offered novel and highly performant methods for classifying voluminous corpora of text. This article benchmarks a deep learning classification technique on more than 40 million scientific articles and on tens of thousands of scholarly journals. The comparison is performed against bibliographic coupling-, direct citation-, and manual-based classifications, the established and most widely used approaches in the field of bibliometrics and, by extension, in many science and innovation policy activities such as grant competition management. The results reveal that the performance of this first iteration of a deep learning approach is equivalent to that of the graph-based bibliometric approaches. All methods presented are also on par with manual classification. Somewhat surprisingly, no machine learning approach was found to clearly outperform direct citation, which is a simple label propagation approach. In conclusion, deep learning is promising because it performed just as well as the other approaches but offers more flexibility for further improvement. For example, a deep neural network incorporating information from the citation network is likely to hold the key to an even better classification algorithm.
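
The abstract characterizes direct citation as a simple label propagation approach. The following is a minimal sketch of that general idea, not the authors' implementation: a paper receives the most frequent field label among the already-classified papers it cites or is cited by. The function name, the input shapes, and the alphabetical tie-break are all assumptions made for illustration.

```python
from collections import Counter


def classify_by_direct_citation(paper_id, citations, labels):
    """Assign the most common label among a paper's citation neighbours.

    citations: iterable of (citing_id, cited_id) pairs (hypothetical toy input).
    labels: dict mapping already-classified paper ids to a field label.
    Returns None when the paper has no labelled neighbour.
    """
    votes = Counter()
    for citing, cited in citations:
        # A direct citation link counts in both directions.
        if citing == paper_id and cited in labels:
            votes[labels[cited]] += 1
        elif cited == paper_id and citing in labels:
            votes[labels[citing]] += 1
    if not votes:
        return None
    # Assumed tie-break: pick the alphabetically first of the top labels
    # so the result is deterministic.
    top = max(votes.values())
    return min(label for label, count in votes.items() if count == top)
```

In this sketch a paper with two labelled neighbours in one field and one in another is assigned the majority field; a real pipeline would also need to decide how to handle unlabelled neighbours and multi-label journals, which the abstract does not specify.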

Year:  2021        PMID: 33974653      PMCID: PMC8112690          DOI: 10.1371/journal.pone.0251493

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


  8 in total

1.  Clustering more than two million biomedical publications: comparing the accuracies of nine text-based similarity approaches.

Authors:  Kevin W Boyack; David Newman; Russell J Duhon; Richard Klavans; Michael Patek; Joseph R Biberstine; Bob Schijvenaars; André Skupin; Nianli Ma; Katy Börner
Journal:  PLoS One       Date:  2011-03-17       Impact factor: 3.240

2.  Design and update of a classification system: the UCSD map of science.

Authors:  Katy Börner; Richard Klavans; Michael Patek; Angela M Zoss; Joseph R Biberstine; Robert P Light; Vincent Larivière; Kevin W Boyack
Journal:  PLoS One       Date:  2012-07-12       Impact factor: 3.240

3.  P-values as percentiles. Commentary on: "Null hypothesis significance tests. A mix-up of two different theories: the basis for widespread confusion and numerous misinterpretations".

Authors:  Jose D Perezgonzalez
Journal:  Front Psychol       Date:  2015-04-01

4.  An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition.

Authors:  George Tsatsaronis; Georgios Balikas; Prodromos Malakasiotis; Ioannis Partalas; Matthias Zschunke; Michael R Alvers; Dirk Weissenborn; Anastasia Krithara; Sergios Petridis; Dimitris Polychronopoulos; Yannis Almirantis; John Pavlopoulos; Nicolas Baskiotis; Patrick Gallinari; Thierry Artiéres; Axel-Cyrille Ngonga Ngomo; Norman Heino; Eric Gaussier; Liliana Barrio-Alvers; Michael Schroeder; Ion Androutsopoulos; Georgios Paliouras
Journal:  BMC Bioinformatics       Date:  2015-04-30       Impact factor: 3.169

5.  Hybrid self-optimized clustering model based on citation links and textual features to detect research topics.

Authors:  Dejian Yu; Wanru Wang; Shuai Zhang; Wenyu Zhang; Rongyu Liu
Journal:  PLoS One       Date:  2017-10-27       Impact factor: 3.240

6.  The emergent integrated network structure of scientific research.

Authors:  Jordan D Dworkin; Russell T Shinohara; Danielle S Bassett
Journal:  PLoS One       Date:  2019-04-30       Impact factor: 3.240

7.  A standardized citation metrics author database annotated for scientific field.

Authors:  John P A Ioannidis; Jeroen Baas; Richard Klavans; Kevin W Boyack
Journal:  PLoS Biol       Date:  2019-08-12       Impact factor: 8.029

8.  Clustering Scientific Publications Based on Citation Relations: A Systematic Comparison of Different Methods.

Authors:  Lovro Šubelj; Nees Jan van Eck; Ludo Waltman
Journal:  PLoS One       Date:  2016-04-28       Impact factor: 3.240

  1 in total

1.  TeamTree analysis: A new approach to evaluate scientific production.

Authors:  Frank W Pfrieger
Journal:  PLoS One       Date:  2021-07-21       Impact factor: 3.240

