Antonio Rivero-Juárez, David Guijo-Rubio, Francisco Tellez, Rosario Palacios, Dolores Merino, Juan Macías, Juan Carlos Fernández, Pedro Antonio Gutiérrez, Antonio Rivero, César Hervás-Martínez.
Abstract
Several European countries have established criteria for prioritising initiation of treatment in patients infected with the hepatitis C virus (HCV) by grouping patients according to clinical characteristics. Based on neural network techniques, our objective was to identify those factors for HIV/HCV co-infected patients (to which clinicians have given careful consideration before treatment uptake) that have not been included among the prioritisation criteria. This study was based on the Spanish HERACLES cohort (NCT02511496) (April-September 2015, 2940 patients) and involved the application of neural network models with different basis functions (product-unit, sigmoid-unit and radial basis function neural networks) for automatic classification of patients for treatment. An evolutionary algorithm was used to determine the architecture and estimate the coefficients of the model. This machine learning methodology found that radial basis function neural networks provided a very simple model in terms of the number of patient characteristics to be considered by the classifier (in this case, six), returning a good overall classification accuracy of 0.767 and a minimum sensitivity (for the classification of the minority class, untreated patients) of 0.550. Finally, the area under the ROC curve was 0.802, which proved to be exceptional. The parsimony of the model makes it especially attractive, using just eight connections. The independent variable "recent PWID" is compulsory due to its importance. The simplicity of the model means that it is possible to analyse the relationship between patient characteristics and the probability of belonging to the treated group.
Year: 2020 PMID: 31923277 PMCID: PMC6953863 DOI: 10.1371/journal.pone.0227188
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
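The two headline metrics quoted in the abstract can be made concrete with a short sketch. It assumes the usual definitions: overall classification accuracy (the correct classification rate, CCR) is the fraction of patients on the diagonal of the confusion matrix, and minimum sensitivity (MS) is the lowest per-class recall. The function name and the confusion matrix below are hypothetical illustrations, not data from the HERACLES cohort.

```python
def ccr_and_min_sensitivity(confusion):
    """Return (CCR, MS) for a square confusion matrix.

    confusion[i][j] = number of patients of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # CCR: share of all patients that lie on the diagonal.
    correct = sum(confusion[i][i] for i in range(n_classes))
    ccr = correct / total

    # Per-class sensitivity: correctly classified patients / patients in that class.
    sensitivities = []
    for i in range(n_classes):
        class_size = sum(confusion[i])
        if class_size == 0:
            raise ValueError(f"class {i} has no patients")
        sensitivities.append(confusion[i][i] / class_size)

    # MS: the sensitivity of the worst-classified class.
    return ccr, min(sensitivities)


# Hypothetical counts: row 0 = untreated (minority), row 1 = treated.
example_confusion = [
    [55, 45],
    [20, 180],
]
example_ccr, example_ms = ccr_and_min_sensitivity(example_confusion)
```

Reporting MS alongside CCR matters with imbalanced classes, because a high CCR can hide a poorly classified minority class.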
Characteristics of the Spanish HERACLES cohort (NCT02511496).
| Description | Values | Occurrences | Percentage |
|---|---|---|---|
| Met the Spanish criteria plan | No | 1287 | 43.77% |
| | Yes | 1653 | 56.22% |
| PWID Category | Lifetime PWID | 2169 | 73.78% |
| | OST PWID | 339 | 11.53% |
| | Recent PWID | 47 | 1.60% |
| | Never PWID | 385 | 13.10% |
| Presented major psychiatric disorders | No | 2886 | 98.16% |
| | Yes | 54 | 1.84% |
| Been in jail | No | 2823 | 96.02% |
| | Yes | 117 | 3.98% |
| Previous treatment experience | Naïve to therapy | 2053 | 69.83% |
| | With Peg-IFN/RBV | 725 | 24.66% |
| | With DAAs/Peg-IFN/RBV | 162 | 5.51% |
| Liver fibrosis | Stage F0-F1 | 898 | 30.54% |
| | Stage F2 | 475 | 16.16% |
| | Stage F3 | 787 | 26.77% |
| | Stage F4 | 780 | 26.53% |
| Gender | Female | 491 | 16.70% |
| | Male | 2449 | 83.30% |
| Age (continuous variable) | min | 18 | |
| | max | 76 | |
| | mean | 48.95 | |
| HCV genotype | Genotype 1 | 1741 | 59.22% |
| | Genotype 2 | 27 | 0.92% |
| | Genotype 3 | 484 | 16.46% |
| | Genotype 4 | 688 | 23.40% |
| Received therapy for HCV infection | No | 988 | 33.60% |
| | Yes | 1952 | 66.40% |
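The class counts in the last rows of the table give a useful reference point for the reported accuracy. The sketch below computes what a trivial classifier that assigns every patient to the treated class would score on the full cohort. It is an illustration only: the variable names are hypothetical, and the class proportions in the study's test partitions may differ from those of the whole cohort.

```python
# Class counts taken from the cohort table.
treated = 1952
untreated = 988
total = treated + untreated  # 2940 patients

# A classifier that always predicts "treated" is right for every treated
# patient and wrong for every untreated one.
baseline_ccr = treated / total

# Its sensitivity is 1.0 for the treated class and 0.0 for the untreated
# class, so its minimum sensitivity is 0.0.
baseline_ms = min(treated / treated, 0 / untreated)

print(f"baseline CCR = {baseline_ccr:.3f}, baseline MS = {baseline_ms:.3f}")
```

Against this baseline of roughly 0.664, the reported accuracy of 0.767 is a real gain, and the minimum sensitivity of 0.550 shows that the gain does not come from ignoring the untreated minority.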
Values of test CCR, MS and AUC, together with #conn, using PUNN, SUNN and RBFNN.
The independent variables selected for the best model are also shown. The best result is highlighted in bold face; the second best result is shown in italics.
| Model | CCR | MS | AUC | #conn |
|---|---|---|---|---|
| Mean ± SD | | | | |
| PUNN (17-1,2,4-1) | | | | |
| SUNN (17-1,2,4-1) | 0.762 ± 0.006 | | 0.793 ± 0.002 | 14.10 ± 1.63 |
| RBFNN (17-1,2,4-1) | | 0.550 ± 0.008 | | |
| RBFNN2 (16-1,2,4-1) | 0.456 ± 0.127 | 0.014 ± 0.019 | 0.483 ± 0.028 | 7.33 ± 2.17 |
| Best model | | | | |
| PUNN (17-1,2,4-1) | | | | |
| SUNN (17-1,2,4-1) | | 0.545 | 0.799 | 19 |
| RBFNN (17-1,2,4-1) | 0.767 | 0.550 | 0.802 | 8 |
| RBFNN2 (16-1,2,4-1) | 0.514 | 0.000 | 0.573 | 9 |
| Independent variables considered: | | | | |
| PUNN (17-1,2,4-1) | | | | |
| SUNN (17-1,2,4-1) | | | | |
| RBFNN (17-1,2,4-1) | | | | |
Fig 1. ROC curve for the three models proposed.
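The area under a ROC curve has a simple rank interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way. The function name and the scores are hypothetical and are not taken from the study.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")

    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0   # pair ranked correctly
            elif pos == neg:
                wins += 0.5   # tie counts as half
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted probabilities of belonging to the treated group.
treated_scores = [0.9, 0.8, 0.4]
untreated_scores = [0.7, 0.3]
example_auc = auc_from_scores(treated_scores, untreated_scores)
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; a sort-based rank computation would be the usual choice for larger sets.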
P-values of the Kolmogorov-Smirnov test applied to the generalisation set for CCR, MS, AUC and #conn.
| Variable | PUNN | SUNN | RBFNN |
|---|---|---|---|
| CCR | 0.138 | 0.200 | 0.003 |
| MS | 0.034 | 0.002 | < 0.001 |
| AUC | 0.200 | < 0.001 | 0.028 |
| #conn | 0.029 | 0.007 | 0.023 |
P-values of Snedecor's F test (one-way ANOVA) and ordered means from the Tukey multiple-comparison test for CCR, MS, AUC and #conn of the models obtained.
| | CCR | MS |
|---|---|---|
| F test p-value | < 0.001 | < 0.001 |
| Ranking of averages | | |
| | AUC | #conn |
| F test p-value | 0.221 | < 0.001 |
| Ranking of averages | No significant differences | |
(*) Significant differences were found for α = 0.05.
Main characteristics of the RBFNN model.
| | Training | Generalisation |
|---|---|---|
| CCR | 0.757 | 0.767 |
| MS | 0.522 | 0.550 |
| AUC | 0.801 | 0.802 |
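To illustrate how a small radial basis function network maps patient characteristics to a probability of belonging to the treated group, here is a minimal sketch. The model's equations are not reproduced in this record, so the Gaussian basis function, the logistic output, the function names and every parameter value below are assumptions chosen for illustration, not the fitted model.

```python
import math


def rbf_unit(x, centre, radius):
    """Gaussian basis function (assumed form): exp(-||x - c||^2 / (2 r^2))."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-sq_dist / (2.0 * radius ** 2))


def rbfnn_probability(x, centres, radii, weights, bias):
    """Probability of the positive class from a one-output RBF network."""
    activation = bias + sum(
        w * rbf_unit(x, c, r) for w, c, r in zip(weights, centres, radii)
    )
    # Logistic link (assumed) turns the activation into a probability.
    return 1.0 / (1.0 + math.exp(-activation))


# Hypothetical network with a single basis function over two inputs.
centres = [[0.0, 1.0]]
radii = [1.0]
weights = [2.0]
bias = -1.0

p_at_centre = rbfnn_probability([0.0, 1.0], centres, radii, weights, bias)
p_far_away = rbfnn_probability([10.0, 10.0], centres, radii, weights, bias)
```

With so few units, each input's effect on the output can be read directly from the centres, radii and weights, which is the kind of interpretability the abstract attributes to the model's simplicity.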
Fig 2. Precision-Recall curve for the best model obtained.