Víctor Sebastián-Pérez, María Jimena Martínez, Carmen Gil, Nuria Eugenia Campillo, Ana Martínez, Ignacio Ponzoni.
Abstract
Parkinson's disease is one of the most common neurodegenerative illnesses in older persons, and the leucine-rich repeat kinase 2 (LRRK2) is a promising target for its pharmacological treatment. In this work, quantitative structure-activity relationship (QSAR) models for identifying putative inhibitors of the LRRK2 protein are developed using an in-house chemical library and several machine learning techniques. The methodology applied in this paper has two steps: first, alternative subsets of molecular descriptors useful for characterizing LRRK2 inhibitors are chosen by a multi-objective feature selection method; second, QSAR models are learned using these subsets and three different supervised learning strategies. The quality of all these QSAR models is compared using classical metrics, and the best models are discussed in statistical and physicochemical terms.
Keywords: Cheminformatics; LRRK2; Machine Learning; Parkinson’s disease; QSAR
Year: 2019 PMID: 30763264 PMCID: PMC6798859 DOI: 10.1515/jib-2018-0063
Source DB: PubMed Journal: J Integr Bioinform ISSN: 1613-4516
Figure 1: Methodology carried out to generate the different QSAR models.
Structure of the chemical compounds in the database and their percentages of enzymatic inhibition.
| Structure | LRRK2 % inh. @ 10 μM / IC50 (μM) | Structure | LRRK2 % inh. @ 10 μM / IC50 (μM) | Structure | LRRK2 % inh. @ 10 μM / IC50 (μM) | Structure | LRRK2 % inh. @ 10 μM / IC50 (μM) |
|---|---|---|---|---|---|---|---|
|  | 14% |  | 2% |  | 39% |  | 11% |
|  | 19% |  | 6% |  | 15% |  | 77% |
|  | 0% |  | 9% |  | 3% |  | 78% |
|  | 63% |  | 13% |  | 96% |  | 7% |
|  | 11% |  | 96% |  | 7% |  | 66% |
|  | 71% |  | 97% |  | 23% |  | 91% |
|  | 30% |  | 17% |  | 0% |  | 61% |
|  | 60% |  | 91% |  | 0% |  | 100% |
|  | 14% |  | 92% |  | 23% |  | 91% |
|  | 15% |  | 72% |  | 79% |  | 45% |
|  | 12% |  | 94% |  | 74% |  | 65% |
|  | 54% |  | 71% |  | 49% |  | 11% |
|  | 71% |  | 99% |  | 40% |  | 36% |
|  | 45% |  | 54% |  | 0% |  | 35% |
|  | 66% |  | 95% |  | 0% |  | 62% |
|  | 76% |  | 39% |  | 1% |  | 2% |
|  | 36% |  | 51% |  | 1% |  |  |
Figure 2: Dispersion of the database molecules according to some of the calculated QP properties, such as logP values and the number of H-bond acceptors. The color scale is defined by the stars values.
Best molecular descriptors (MDs) subsets obtained by DELPHOS in the feature selection process.
| Subset | MDs | Descriptor Type | Size |
|---|---|---|---|
| M2 | MW | Constitutional indices | 4 |
|  | MWC08 | Walk and path counts |  |
|  | BEHp2 | Burden eigenvalues |  |
|  | RDF105p | RDF descriptors |  |
| M3 | MW | Constitutional indices | 4 |
|  | JGI2 | 2D autocorrelations |  |
|  | HATs6m, R2e | GETAWAY descriptors |  |
| M5 | MW | Constitutional indices | 5 |
|  | IC0 | Information indices |  |
|  | ESpm09x | Edge adjacency indices |  |
|  | JGI3 | 2D autocorrelations |  |
|  | L3s | WHIM descriptors |  |
| M25 | MW | Constitutional indices | 13 |
|  | HNar, ECC | Topological indices |  |
|  | GATs7e | 2D autocorrelations |  |
|  | VEZ1, VEp2 | 2D matrix-based descriptors |  |
|  | DISPm | Geometrical descriptors |  |
|  | RDF105p | RDF descriptors |  |
|  | R8e | GETAWAY descriptors |  |
|  | B06[N-Br], B07[C-Cl] | 2D atom pairs |  |
|  | F04[C-C], F05[O-Cl] | 2D atom pairs |  |
Predictive accuracy of the best classification and regression QSAR models evaluated over the testing set.
| Model | Size | Classification method | ACC (%) | ROC | MCC | Regression method | CC | RRSE (%) |
|---|---|---|---|---|---|---|---|---|
| M2 | 4 | RF | 68.8 | 0.69 | 0.40 | RF | 0.55 | 87.50 |
| M3 | 4 |  |  |  |  | RC | 0.68 | 74.69 |
| M5 | 5 | RC | 75.0 | 0.95 | 0.52 |  |  |  |
| M25 | 13 | RC | 75.0 | 0.73 | 0.53 | RC | 0.44 | 92.34 |
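The metrics reported in the table above (ACC, ROC, MCC for classification; CC and RRSE for regression) can be illustrated with a minimal, standard-library sketch. This is not the authors' code: the function names are hypothetical, the textbook definitions are assumed (in particular, RRSE is taken relative to always predicting the mean of the observed values and expressed as a percentage), and the toy data are invented purely for illustration.

```python
import math

def accuracy(y_true, y_pred):
    """Percentage of samples whose predicted class matches the true class."""
    return 100.0 * sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = active, 0 = inactive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_auc(y_true, scores):
    """Area under the ROC curve as the fraction of (active, inactive) pairs
    in which the active compound receives the higher score (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pearson_cc(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def rrse(y_true, y_pred):
    """Root relative squared error in percent (assumed baseline: mean of observed values)."""
    mt = sum(y_true) / len(y_true)
    num = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    den = sum((t - mt) ** 2 for t in y_true)
    return 100.0 * math.sqrt(num / den)

# Invented toy data, not taken from the paper.
labels = [1, 1, 0, 0, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.3, 0.2, 0.8, 0.6, 0.7, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]
observed_inh = [10.0, 20.0, 30.0, 40.0]
predicted_inh = [12.0, 18.0, 33.0, 37.0]

print(accuracy(labels, predicted), roc_auc(labels, scores), mcc(labels, predicted))
print(pearson_cc(observed_inh, predicted_inh), rrse(observed_inh, predicted_inh))
```

Under these definitions an RRSE below 100% means the model does better than the mean-value baseline, which is one way to read the regression columns of the table.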
Figure 3: Kendall correlation between the descriptors of the M3 and M5 subsets.
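A pairwise Kendall correlation between descriptor columns, of the kind summarized in Figure 3, can be sketched as follows. This is a minimal illustration rather than the procedure used in the paper: it assumes the tau-b variant (which corrects for ties), the function names `kendall_tau_b` and `kendall_matrix` are hypothetical, and the descriptor names and values are placeholders, not data from the study.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall tau-b between two equally long numeric sequences."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue            # tied in both variables: ignored
        elif dx == 0:
            ties_x += 1         # tied only in x
        elif dy == 0:
            ties_y += 1         # tied only in y
        elif dx * dy > 0:
            concordant += 1     # pair ordered the same way in x and y
        else:
            discordant += 1     # pair ordered in opposite ways
    untied = concordant + discordant
    denom = math.sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else 0.0

def kendall_matrix(columns):
    """Map every ordered pair of descriptor names to its tau-b value."""
    names = list(columns)
    return {(a, b): kendall_tau_b(columns[a], columns[b])
            for a in names for b in names}

# Placeholder descriptor columns with invented values (one value per compound).
descriptors = {
    "desc_a": [1, 2, 3, 4, 5],
    "desc_b": [1, 3, 2, 5, 4],
    "desc_c": [5, 4, 3, 2, 1],
}

matrix = kendall_matrix(descriptors)
for (a, b), tau in matrix.items():
    print(f"{a} vs {b}: {tau:+.2f}")
```

Values near +1 or -1 indicate that two descriptors rank the compounds almost identically (or in reverse), which suggests they carry largely redundant information.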