
Mapping chemical structure-activity information of HAART-drug cocktails over complex networks of AIDS epidemiology and socioeconomic data of U.S. counties.

Diana María Herrera-Ibatá, Alejandro Pazos, Ricardo Alfredo Orbegozo-Medina, Francisco Javier Romero-Durán, Humberto González-Díaz.

Abstract

Using computational algorithms to design tailored drug cocktails for highly active antiretroviral therapy (HAART) in specific populations is a goal of major importance for both the pharmaceutical industry and public health policy institutions. Designing HAART cocktails requires predicting new combinations of compounds. On the one hand, there are the biomolecular factors related to the drugs in the cocktail (experimental measure, chemical structure, drug target, assay organisms, etc.); on the other hand, there are the socioeconomic factors of the specific population (income inequalities, employment levels, fiscal pressure, education, migration, population structure, etc.), which are needed to study the relationship between socioeconomic status and the disease. In this context, machine learning algorithms able to seek models for problems with multi-source data have to be used. In this work, the first artificial neural network (ANN) model is proposed for the prediction of HAART cocktails to halt AIDS on epidemic networks of U.S. counties, using information indices that codify both biomolecular and several socioeconomic factors. The data were obtained from at least three major sources. The first dataset included assays of anti-HIV chemical compounds released to ChEMBL. The second dataset was the AIDSVu database of Emory University, which compiled AIDS prevalence for >2300 U.S. counties. The third dataset included socioeconomic data from the U.S. Census Bureau. Three scales or levels were employed to group the counties according to location or population structure codes: state, rural-urban continuum code (RUCC), and urban influence code (UIC). An analysis of >130,000 pairs (network links) was performed, corresponding to AIDS prevalence in 2310 U.S. counties vs. drug cocktails made up of combinations of ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4856 protocols, and 10 possible experimental measures.
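The abstract does not give the exact definition of its information indices, so the following is only a minimal sketch of the general idea suggested by the keywords (Shannon entropy) and by the grouping of counties by state, RUCC, or UIC. It computes a Shannon entropy for each county from a discrete distribution and then expresses it as a deviation from the mean of the county's group. The county names, income-bracket counts, and RUCC codes are hypothetical values chosen for illustration, and the deviation-from-group-mean step is an assumption, not the authors' stated operator.

```python
import math
from collections import defaultdict


def shannon_entropy(weights):
    """Shannon entropy (in bits) of a discrete distribution given as non-negative weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    probs = [w / total for w in weights if w > 0]  # zero-weight classes contribute nothing
    return -sum(p * math.log2(p) for p in probs)


def deviation_from_group_mean(values, groups):
    """Subtract from each value the mean of its group (e.g. same state, RUCC, or UIC)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    return [value - sums[group] / counts[group] for value, group in zip(values, groups)]


# Hypothetical data: population counts per income bracket and an RUCC code per county.
income_brackets = {
    "county_a": [50, 50],
    "county_b": [25, 25, 25, 25],
    "county_c": [100, 0],
}
rucc = {"county_a": 1, "county_b": 1, "county_c": 2}

names = list(income_brackets)
entropies = [shannon_entropy(income_brackets[name]) for name in names]
deviations = deviation_from_group_mean(entropies, [rucc[name] for name in names])

for name, entropy, deviation in zip(names, entropies, deviations):
    print(f"{name}: entropy={entropy:.3f} bits, deviation from RUCC mean={deviation:+.3f}")
```

Swapping the `rucc` mapping for state or UIC codes changes only the grouping key, which mirrors the three scales described in the abstract.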
The best model found with the original data was a linear neural network (LNN) with AUROC > 0.80 and accuracy, specificity, and sensitivity ≈ 77% in the training and external validation series. Changing the spatial and population structure scale (State, UIC, or RUCC codes) does not affect the quality of the model. Imbalance was detected in all the models found when comparing positive/negative cases and linear/non-linear model accuracy ratios. Using the synthetic minority over-sampling technique (SMOTE), data pre-processing, and machine-learning algorithms implemented in the WEKA software, more balanced models were found. In particular, a multilayer perceptron (MLP) with AUROC = 97.4% and precision, recall, and F-measure > 90% was found.
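As a rough illustration of the two ideas in this paragraph, the sketch below shows the core SMOTE step (interpolating between a minority-class point and one of its nearest minority-class neighbours) and the class-wise metrics quoted above. It is a plain-Python approximation written for this note, not the WEKA implementation used by the authors. The function names, toy points, and labels are all hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating towards nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding the base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def binary_metrics(y_true, y_pred):
    """Metrics for 0/1 labels; assumes both classes and at least one positive prediction."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # recall is the same quantity as sensitivity
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Hypothetical minority-class points (corners of the unit square).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote(minority, n_synthetic=6, k=2, seed=42)

# Hypothetical labels: 2 positives and 4 negatives, with one error in each class.
metrics = binary_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
print(len(synthetic), metrics)
```

On the toy labels, accuracy (4/6) looks better than sensitivity (1/2) because negatives outnumber positives. Comparing per-class metrics in this way is how such an imbalance shows up, and over-sampling the minority class is one way to reduce it.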
Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

Keywords:  AIDS epidemiology; Box–Jenkins operators; Information theory; Shannon entropy; Urban influence code

Year:  2015        PMID: 25916548     DOI: 10.1016/j.biosystems.2015.04.007

Source DB:  PubMed          Journal:  Biosystems        ISSN: 0303-2647            Impact factor:   1.973


  5 in total

Review 1.  The unequivocal preponderance of biocomputation in clinical virology.

Authors:  Sechul Chun; Manikandan Muthu; Judy Gopal; Diby Paul; Doo Hwan Kim; Enkhtaivan Gansukh; Vimala Anthonydhason
Journal:  RSC Adv       Date:  2018-05-18       Impact factor: 4.036

2.  Identification of DEP domain-containing proteins by a machine learning method and experimental analysis of their expression in human HCC tissues.

Authors:  Zhijun Liao; Xinrui Wang; Yeting Zeng; Quan Zou
Journal:  Sci Rep       Date:  2016-12-21       Impact factor: 4.379

3.  PTML Modeling for Pancreatic Cancer Research: In Silico Design of Simultaneous Multi-Protein and Multi-Cell Inhibitors.

Authors:  Valeria V Kleandrova; Alejandro Speck-Planche
Journal:  Biomedicines       Date:  2022-02-18

4.  In Silico Drug Repurposing for Anti-Inflammatory Therapy: Virtual Search for Dual Inhibitors of Caspase-1 and TNF-Alpha.

Authors:  Alejandro Speck-Planche; Valeria V Kleandrova; Marcus T Scotti
Journal:  Biomolecules       Date:  2021-12-04

Review 5.  A scoping review on the use of machine learning in research on social determinants of health: Trends and research prospects.

Authors:  Shiho Kino; Yu-Tien Hsu; Koichiro Shiba; Yung-Shin Chien; Carol Mita; Ichiro Kawachi; Adel Daoud
Journal:  SSM Popul Health       Date:  2021-06-05