
Automated Extraction of Information From Texts of Scientific Publications: Insights Into HIV Treatment Strategies.

Nadezhda Biziukova, Olga Tarasova, Sergey Ivanov, Vladimir Poroikov.

Abstract

Text analysis can help to identify named entities (NEs) of small molecules, proteins, and genes. Such data are important for analyzing the molecular mechanisms of disease progression and for developing new strategies for treating various diseases and pathological conditions. The texts of publications are a primary source of information, which makes them especially valuable for collecting high-quality data: new findings appear in publications sooner than in databases. In our study, we aimed to develop and test an approach to named entity recognition in the abstracts of publications. More specifically, we developed and tested an algorithm based on conditional random fields that recognizes NEs of (i) genes and proteins and (ii) chemicals. Careful selection of abstracts strictly related to the subject of interest makes it possible to extract NEs strongly associated with that subject. To test the applicability of our approach, we applied it to the extraction of (i) potential HIV inhibitors and (ii) a set of proteins and genes potentially responsible for viremic control in HIV-positive patients. The computational experiments provide estimates of the accuracy of recognition of chemical NEs and of protein (gene) NEs. For chemical NEs, precision is over 0.91, recall is 0.86, and the F1-score (harmonic mean of precision and recall) is 0.89; for protein and gene names, precision is over 0.86, recall is 0.83, and the F1-score is above 0.85. Evaluation of the algorithm on two case studies related to HIV treatment confirms that it is possible to extract NEs strongly relevant to (i) HIV inhibitors and (ii) a specific group of patients, namely HIV-positive individuals able to maintain an undetectable HIV-1 viral load over time in the absence of antiretroviral therapy. Analysis of the results provides insights into the functions of proteins that may be responsible for viremic control. Our study demonstrates the applicability of the developed approach to extracting useful data on HIV treatment.
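The relationship between the reported precision, recall, and F1 figures can be checked with a few lines of Python. This is a minimal sketch, not the authors' evaluation code: the function names and the toy entity sets are hypothetical, and exact-match scoring over sets of entity strings is an assumption about how such metrics are typically computed. Because the abstract gives precision only as a lower bound ("over 0.91", "over 0.86"), the values computed from those bounds (about 0.884 and 0.845) are themselves lower bounds and sit slightly below the reported F1-scores.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall(predicted: set, gold: set) -> tuple:
    """Exact-match precision and recall of predicted entities against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Lower-bound figures quoted in the abstract ("over 0.91", "over 0.86").
chemical_f1 = f1_score(0.91, 0.86)
protein_f1 = f1_score(0.86, 0.83)

# Hypothetical toy example, not data from the study.
gold = {"ritonavir", "lopinavir", "tenofovir", "efavirenz"}
predicted = {"ritonavir", "lopinavir", "tenofovir", "HIV"}
p, r = precision_recall(predicted, gold)
toy_f1 = f1_score(p, r)
```

In the toy example three of the four predicted names match the gold set, so precision, recall, and F1 all equal 0.75.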
Copyright © 2020 Biziukova, Tarasova, Ivanov and Poroikov.


Keywords:  HIV; NER; data mining; named entity recognition; text mining; viremic control; virus-host interactions

Year: 2020  PMID: 33414815  PMCID: PMC7783389  DOI: 10.3389/fgene.2020.618862

Source DB: PubMed  Journal: Front Genet  ISSN: 1664-8021  Impact factor: 4.599


