
Machine Learning Approach to Facilitate Knowledge Synthesis at the Intersection of Liver Cancer, Epidemiology, and Health Disparities Research.

Travis C Hyams, Ling Luo, Brionna Hair, Kyubum Lee, Zhiyong Lu, Daniela Seminara.

Abstract

PURPOSE: Liver cancer is a global challenge, and disparities exist across multiple domains and throughout the disease continuum. However, liver cancer's global epidemiology and etiology are shifting, and the literature is rapidly evolving, presenting a challenge to the synthesis of knowledge needed to identify areas of research needs and to develop research agendas focusing on disparities. Machine learning (ML) techniques can be used to semiautomate the literature review process and improve efficiency. In this study, we detail our approach and provide practical benchmarks for the development of an ML approach to classify literature and extract data at the intersection of three fields: liver cancer, health disparities, and epidemiology.
METHODS: We performed a six-phase process: training (I), validating (II), confirming (III), and performing error analysis (IV) for an ML classifier. We then developed an extraction model (V) and applied it (VI) to the liver cancer literature identified through PubMed. We present precision, recall, F1, and accuracy metrics for the classifier and extraction models as appropriate for each phase of the process. We also provide the results for the application of our extraction model.
RESULTS: With limited training data, we achieved a high degree of accuracy for both our classifier and our extraction model on liver cancer disparities research literature that used epidemiologic methods. The disparities concept was the most challenging to accurately classify, and concepts that appeared infrequently in our data set were the most difficult to extract.
CONCLUSION: We provide a roadmap for using ML to classify and extract comprehensive information on multidisciplinary literature. Our technique can be adapted and modified for other cancers or diseases where disparities persist.
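
The precision, recall, F1, and accuracy metrics named in METHODS can be illustrated with a minimal sketch. This is not the authors' code: the function name `binary_metrics`, the `Metrics` container, and the example labels are hypothetical, and the sketch assumes a simple binary setting (1 = relevant article, 0 = not relevant), which may differ from how the study scored its classifier and extraction models.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    precision: float
    recall: float
    f1: float
    accuracy: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Metrics:
    """Compute standard binary classification metrics from 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return Metrics(precision, recall, f1, accuracy)


if __name__ == "__main__":
    # Hypothetical labels for ten abstracts (1 = in scope, 0 = out of scope).
    gold = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
    print(binary_metrics(gold, predicted))
```

Reporting precision and recall alongside accuracy matters when the relevant class is rare, since a classifier that labels almost everything as out of scope can still score a high accuracy.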

Year:  2022        PMID: 35623021      PMCID: PMC9225668          DOI: 10.1200/CCI.21.00129

Source DB:  PubMed          Journal:  JCO Clin Cancer Inform        ISSN: 2473-4276


