BACKGROUND: The extreme flexibility of the HIV type-1 (HIV-1) genome makes it challenging to build the ideal antiretroviral treatment regimen. Interpretation of HIV-1 genotypic drug resistance is evolving from rule-based systems guided by expert opinion to data-driven engines developed through machine learning methods. METHODS: The aim of the study was to investigate linear and non-linear statistical learning models for classifying short-term virological outcome of antiretroviral treatment. To optimize the model, different feature selection methods were considered. Robust extra-sample error estimation and different loss functions were used to assess model performance. The results were compared with widely used rule-based genotypic interpretation systems (Stanford HIVdb, Rega and ANRS). RESULTS: A set of 3,143 treatment change episodes was extracted from the EuResist database. The dataset included patient demographics, treatment history and viral genotypes. A logistic regression model using high-order interaction variables performed better than rule-based genotypic interpretation systems (accuracy 75.63% versus 71.74-73.89%, area under the receiver operating characteristic curve [AUC] 0.76 versus 0.68-0.70) and was equivalent to a random forest model (accuracy 76.16%, AUC 0.77). However, when rule-based genotypic interpretation systems were coupled with additional patient attributes, and the combination was provided as input to the logistic regression model, the performance increased significantly, becoming comparable to that of the fully data-driven methods. CONCLUSIONS: Patient-derived supplementary features significantly improved the accuracy of the prediction of response to treatment, both with rule-based and data-driven interpretation systems. Fully data-driven models derived from large-scale data sources show promise as antiretroviral treatment decision support tools.
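The abstract reports two performance measures for each classifier: accuracy and area under the ROC curve (AUC). The following is a minimal sketch of how these two measures can be computed from predicted scores, not the study's own code. The function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; no EuResist data is used. AUC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded score matches the true label.

    The 0.5 default threshold is an illustrative assumption.
    """
    if len(y_true) != len(scores) or not y_true:
        raise ValueError("y_true and scores must be non-empty and equal length")
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(y_true, predictions) if y == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    This pairwise form is O(n_pos * n_neg), fine for small examples.
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have equal length")
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = treatment success, 0 = failure.
    labels = [0, 0, 1, 1]
    predicted = [0.1, 0.4, 0.35, 0.8]
    print("accuracy:", accuracy(labels, predicted))
    print("AUC:", roc_auc(labels, predicted))
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two measures can rank models differently.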
Authors: Marcin Kierczak; Krzysztof Ginalski; Michał Dramiński; Jacek Koronacki; Witold Rudnicki; Jan Komorowski Journal: Bioinform Biol Insights Date: 2009-10-05
Authors: Dineke Frentz; Charles A B Boucher; Matthias Assel; Andrea De Luca; Massimiliano Fabbiani; Francesca Incardona; Pieter Libin; Nino Manca; Viktor Müller; Breanndán O Nualláin; Roger Paredes; Mattia Prosperi; Eugenia Quiros-Roldan; Lidia Ruiz; Peter M A Sloot; Carlo Torti; Anne-Mieke Vandamme; Kristel Van Laethem; Maurizio Zazzi; David A M C van de Vijver Journal: PLoS One Date: 2010-07-09 Impact factor: 3.240
Authors: Mattia C F Prosperi; Michal Rosen-Zvi; André Altmann; Maurizio Zazzi; Simona Di Giambenedetto; Rolf Kaiser; Eugen Schülter; Daniel Struck; Peter Sloot; David A van de Vijver; Anne-Mieke Vandamme; Anders Sönnerborg Journal: PLoS One Date: 2010-10-29 Impact factor: 3.240
Authors: Allal Houssaïni; Lambert Assoumou; Anne Geneviève Marcelin; Jean Michel Molina; Vincent Calvez; Philippe Flandre Journal: AIDS Res Treat Date: 2012-04-03
Authors: Mattia C F Prosperi; Simona Di Giambenedetto; Iuri Fanti; Genny Meini; Bianca Bruzzone; Annapaola Callegaro; Giovanni Penco; Patrizia Bagnarelli; Valeria Micheli; Elisabetta Paolini; Antonio Di Biagio; Valeria Ghisetti; Massimo Di Pietro; Maurizio Zazzi; Andrea De Luca Journal: BMC Med Inform Decis Mak Date: 2011-06-14 Impact factor: 2.796