
Gaps within the Biomedical Literature: Initial Characterization and Assessment of Strategies for Discovery.

Yufang Peng, Gary Bonifield, Neil R Smalheiser.

Abstract

Within well-established fields of biomedical science, we identify "gaps": topical areas of investigation that might be expected to occur but are missing. We define a field by carrying out a topical PubMed query and analyze the Medical Subject Headings (MeSH terms) by which the set of retrieved articles is indexed. MeSH terms that occur in >1% of the articles are examined pairwise to see how often they are predicted to co-occur within individual articles (assuming that they are independent of each other). A pair of MeSH terms that is predicted to co-occur in at least 10 articles, yet is not observed to co-occur in any article, is a "gap"; such gaps were studied further in a corpus of 10 disease-related article sets and 10 article sets related to biological processes. Overall, articles that filled gaps were cited more heavily than non-gap-filling articles and were 61% more likely to be published in multidisciplinary high-impact journals. Nine different features of these "gaps" were characterized and tested to learn which, if any, correlate with the appearance of one or more articles containing both MeSH terms within the next five years. Several different types of gaps were identified, each having distinct combinations of predictive features: a) those arising as a byproduct of MeSH indexing rules; b) those having little biological meaning; c) those representing "low hanging fruit" for immediate exploitation; and d) those representing gaps across disciplines or sub-disciplines that do not talk to each other or work together. We have built a free, open tool called "Mine the Gap!" that identifies and characterizes the "gaps" for any PubMed query, which can be accessed via the Anne O'Tate value-added PubMed search interface (http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/AnneOTate.cgi).
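The gap criterion described in the abstract can be illustrated with a small sketch. This is not the authors' code or the "Mine the Gap!" implementation; it only assumes that each retrieved article is available as a set of MeSH terms, and it reads "predicted to co-occur" as the usual independence estimate: for terms indexed on n_a and n_b of N articles, the expected number of co-occurrences is n_a * n_b / N. The function name `find_gaps` and its parameters are hypothetical, and how the MeSH terms are fetched from PubMed is left out.

```python
from collections import Counter
from itertools import combinations


def find_gaps(articles, min_term_fraction=0.01, min_expected=10.0):
    """Return (term_a, term_b, expected) for frequent term pairs that never co-occur.

    articles: list of sets of MeSH terms, one set per retrieved article.
    The thresholds default to the values stated in the abstract (>1%, >=10).
    """
    n_articles = len(articles)
    if n_articles == 0:
        return []

    # Number of articles on which each term appears.
    term_counts = Counter(term for terms in articles for term in set(terms))

    # Keep only terms indexed on more than 1% of the retrieved articles.
    frequent = sorted(
        t for t, c in term_counts.items() if c / n_articles > min_term_fraction
    )
    frequent_set = set(frequent)

    # Observed co-occurrence counts; pairs are stored in sorted order.
    observed = Counter()
    for terms in articles:
        present = sorted(frequent_set & set(terms))
        observed.update(combinations(present, 2))

    gaps = []
    for a, b in combinations(frequent, 2):
        # Expected co-occurrences if the two terms were independent.
        expected = term_counts[a] * term_counts[b] / n_articles
        if expected >= min_expected and observed[(a, b)] == 0:
            gaps.append((a, b, expected))
    return gaps
```

For example, in a toy set of 100 articles where term "A" appears on 40 articles, term "B" on 40 different articles, and term "C" on all of them, the pair ("A", "B") has an expected co-occurrence of 16 but is never observed together, so it is the only pair reported.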

Keywords:  Medical Subject Headings; Scientific discovery; link prediction; literature based discovery; text mining

Year:  2017        PMID: 29271976      PMCID: PMC5736374          DOI: 10.3389/frma.2017.00003

Source DB:  PubMed          Journal:  Front Res Metr Anal        ISSN: 2504-0537


  11 in total

1.  MeSHSim: An R/Bioconductor package for measuring semantic similarity over MeSH headings and MEDLINE documents.

Authors:  Jing Zhou; Yuxuan Shui; Shengwen Peng; Xuhui Li; Hiroshi Mamitsuka; Shanfeng Zhu
Journal:  J Bioinform Comput Biol       Date:  2015-09-09       Impact factor: 1.122

2.  A quantitative model for linking two disparate sets of articles in MEDLINE.

Authors:  Vetle I Torvik; Neil R Smalheiser
Journal:  Bioinformatics       Date:  2007-04-26       Impact factor: 6.937

3.  Atypical combinations and scientific impact.

Authors:  Brian Uzzi; Satyam Mukherjee; Michael Stringer; Ben Jones
Journal:  Science       Date:  2013-10-25       Impact factor: 47.728

4.  MeSHy: Mining unanticipated PubMed information using frequencies of occurrences and concurrences of MeSH terms.

Authors:  T Theodosiou; I S Vizirianakis; L Angelis; A Tsaftaris; N Darzentas
Journal:  J Biomed Inform       Date:  2011-06-13       Impact factor: 6.317

5.  Link Prediction on a Network of Co-occurring MeSH Terms: Towards Literature-based Discovery.

Authors:  Andrej Kastrin; Thomas C Rindflesch; Dimitar Hristovski
Journal:  Methods Inf Med       Date:  2016-07-20       Impact factor: 2.176

6.  Quantifying Conceptual Novelty in the Biomedical Literature.

Authors:  Shubhanshu Mishra; Vetle I Torvik
Journal:  Dlib Mag       Date:  2016 Sep-Oct

7.  Arrowsmith two-node search interface: a tutorial on finding meaningful links between two disparate sets of articles in MEDLINE.

Authors:  Neil R Smalheiser; Vetle I Torvik; Wei Zhou
Journal:  Comput Methods Programs Biomed       Date:  2009-01-30       Impact factor: 5.428

8.  Long-distance interdisciplinarity leads to higher scientific impact.

Authors:  Vincent Larivière; Stefanie Haustein; Katy Börner
Journal:  PLoS One       Date:  2015-03-30       Impact factor: 3.240

9.  Anne O'Tate: A tool to support user-driven summarization, drill-down and browsing of PubMed search results.

Authors:  Neil R Smalheiser; Wei Zhou; Vetle I Torvik
Journal:  J Biomed Discov Collab       Date:  2008-02-15

10.  Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation.

Authors:  Neil R Smalheiser; Gary Bonifield
Journal:  J Biomed Discov Collab       Date:  2016-04-06
  3 in total

1.  Towards a characterization of apparent contradictions in the biomedical literature using context analysis.

Authors:  Graciela Rosemblat; Marcelo Fiszman; Dongwook Shin; Halil Kilicoglu
Journal:  J Biomed Inform       Date:  2019-08-29       Impact factor: 6.317

2.  Rediscovering Don Swanson: the Past, Present and Future of Literature-Based Discovery.

Authors:  Neil R Smalheiser
Journal:  J Data Inf Sci       Date:  2017-12

3.  A systematic review on literature-based discovery workflow.

Authors:  Menasha Thilakaratne; Katrina Falkner; Thushari Atapattu
Journal:  PeerJ Comput Sci       Date:  2019-11-18
