Mayumi Suzuki1, Takuma Shibahara1, Yoshihiro Muragaki2.
Abstract
BACKGROUND: Although new machine learning methods such as support vector machines and deep neural networks have improved prediction accuracy, they produce nonlinear models and thus lack the ability to explain the basis of their predictions. Improving their explanatory capabilities would increase the reliability of their predictions.
Year: 2020 PMID: 32380557 PMCID: PMC7446112 DOI: 10.1055/s-0040-1701615
Source DB: PubMed Journal: Methods Inf Med ISSN: 0026-1270 Impact factor: 2.176
Fig. 1 Block diagram of analysis procedure. BAHSIC, backward elimination using Hilbert–Schmidt independence criteria; JS divergence, Jensen–Shannon divergence.
Hyperparameter search range
| Hyperparameter | Optimization range |
|---|---|
| Number of layers | 4, 5 |
| Activation function | Sigmoid, tanh, or ReLU |
| Number of hidden units | 100 to 500 |
| Dropout rate | 0.1 to 0.5 |
| Regularization function | L1, L2, or elastic-net |
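The search range above can be written down as a small configuration from which candidate settings are drawn. The following is a minimal sketch only: the source does not state which search strategy was used, so uniform random sampling, the key names, and the helper `sample_config` are all assumptions made for illustration.

```python
import random

# Search range transcribed from the table above.
# Key names are hypothetical; the source gives only the ranges.
SEARCH_SPACE = {
    "num_layers": [4, 5],
    "activation": ["sigmoid", "tanh", "relu"],
    "hidden_units": (100, 500),      # inclusive integer range
    "dropout_rate": (0.1, 0.5),      # continuous range
    "regularization": ["l1", "l2", "elastic_net"],
}


def sample_config(rng):
    """Draw one candidate configuration uniformly at random (assumed strategy)."""
    low_units, high_units = SEARCH_SPACE["hidden_units"]
    low_drop, high_drop = SEARCH_SPACE["dropout_rate"]
    return {
        "num_layers": rng.choice(SEARCH_SPACE["num_layers"]),
        "activation": rng.choice(SEARCH_SPACE["activation"]),
        "hidden_units": rng.randint(low_units, high_units),
        "dropout_rate": rng.uniform(low_drop, high_drop),
        "regularization": rng.choice(SEARCH_SPACE["regularization"]),
    }


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draws are repeatable
    for _ in range(3):
        print(sample_config(rng))
```

Each sampled dictionary would then be passed to whatever training routine is in use; that routine is outside the scope of this sketch.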
Hyperparameter search results
| Hyperparameter | Optimization results |
|---|---|
| Number of layers | 4 |
| Activation function | ReLU |
| Number of hidden units | 143 |
| Dropout rate | 0.3 |
| Regularization function | L1 |
Prediction accuracy
| Model | Accuracy | Precision | Recall | F-measure |
|---|---|---|---|---|
| DNNs | 0.977 | 0.966 | 0.983 | 0.960 |
| SVM | 0.924 | 0.947 | 0.915 | 0.925 |
Abbreviations: DNNs, deep neural networks; SVM, support vector machine.
Top five feature variables used for prediction using BAHSIC
| Rank | Gene symbol | Gene description |
|---|---|---|
| 1 | VDAC1 | Voltage-dependent anion channel 1 |
| 2 | CYP2C8 | Cytochrome P450 family 2 subfamily C member 8 |
| 3 | PHIP | Pleckstrin homology domain interacting protein |
| 4 | TRPM2 | Transient receptor potential cation channel subfamily M member 2 |
| 5 | ANGEL2 | Angel homolog 2 |
Abbreviation: BAHSIC, backward elimination using Hilbert–Schmidt independence criteria.
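BAHSIC scores features with the Hilbert–Schmidt independence criterion (HSIC), a kernel-based measure of statistical dependence. The sketch below shows the standard biased empirical estimator, trace(KHLH) / (n − 1)², for two one-dimensional samples. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions for illustration; the source does not specify its kernel settings, and the backward elimination loop itself is not shown.

```python
import math


def gaussian_kernel_matrix(values, sigma=1.0):
    """Pairwise Gaussian kernel matrix for a list of scalars (assumed kernel)."""
    n = len(values)
    return [
        [math.exp(-((values[i] - values[j]) ** 2) / (2.0 * sigma ** 2))
         for j in range(n)]
        for i in range(n)
    ]


def center(matrix):
    """Double-center a square matrix, i.e. compute H M H with H = I - (1/n) 11^T."""
    n = len(matrix)
    row_means = [sum(row) / n for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [
        [matrix[i][j] - row_means[i] - col_means[j] + grand_mean
         for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC between two equally long samples of scalars."""
    n = len(xs)
    kc = center(gaussian_kernel_matrix(xs, sigma))
    lc = center(gaussian_kernel_matrix(ys, sigma))
    # trace(Kc @ Lc) equals trace(K H L H) because H is idempotent.
    trace = sum(kc[i][j] * lc[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


if __name__ == "__main__":
    feature = [0.0, 1.0, 2.0, 3.0]
    print(hsic(feature, feature))          # dependent: strictly positive
    print(hsic(feature, [1.0] * 4))        # constant target: zero
```

A feature that carries no information about the labels gives a value near zero, which is why HSIC can serve as a ranking score.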
Top five feature variables used for prediction using JS divergence
| Rank | Gene symbol | Gene description |
|---|---|---|
| 1 | TIAL1 | TIA1 cytotoxic granule-associated RNA binding protein-like 1 |
| 2 | PHIP | Pleckstrin homology domain interacting protein |
| 3 | PCNA | Proliferating cell nuclear antigen |
| 4 | VDAC1 | Voltage-dependent anion channel 1 |
| 5 | NEBL | Nebulette |
Abbreviation: JS, Jensen–Shannon.
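For reference, the Jensen–Shannon divergence between two discrete distributions P and Q is the average of the Kullback–Leibler divergences of each from their mixture M = (P + Q) / 2. The sketch below is a generic textbook implementation using base-2 logarithms, so the result lies between 0 and 1. How the source builds the distributions it compares is not stated here, so the inputs are placeholders.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions of equal length."""
    mixture = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    # mixture_i > 0 whenever p_i > 0 or q_i > 0, so no division by zero occurs.
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


if __name__ == "__main__":
    print(js_divergence([0.5, 0.5], [0.5, 0.5]))  # identical distributions
    print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # disjoint support
```

Unlike the Kullback–Leibler divergence, this quantity is symmetric in its arguments and stays finite when the two distributions have different support.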
Parameters of DAVID analysis
| Parameter group | Parameter | Setting |
|---|---|---|
| DAVID version | | 6.8 |
| Classification stringency | | Medium |
| Kappa similarity | Similarity term overlap | 3 |
| Kappa similarity | Similarity threshold | 0.50 |
| Classification | Initial group membership | 3 |
| Classification | Final group membership | 3 |
| Classification | Multiple linkage threshold | 0.50 |
| Enrichment thresholds | EASE | 1.0 |
| Display | Benjamini | |
Abbreviation: DAVID, Database for Annotation, Visualization, and Integrated Discovery.
Functional annotation clustering results for clusters with an enrichment score over 1.3 or among the top three enrichment scores
| (A) For the extraction results using JS divergence | | |
|---|---|---|
| Cluster 1 | | p-value |
| UP_KEYWORDS | DNA repair | 1.9E-3 |
| UP_KEYWORDS | DNA damage | 4.7E-3 |
| GOTERM_BP_DIRECT | DNA repair | 3.4E-2 |
| UP_KEYWORDS | DNA replication | 6.5E-2 |
| Cluster 2 | Enrichment score: 1.07 | |
| INTERPRO | Myb domain | 8.4E-4 |
| UP_SEQ_FEATURE | DNA-binding region: H-T-H motif | 6.0E-3 |
| INTERPRO | SANT/Myb domain | 1.9E-2 |
| SMART | SANT | 2.1E-2 |
| INTERPRO | Homeodomain-like | 2.3E-1 |
| Cluster 3 | Enrichment score: 0.95 | |
| GOTERM_CC_DIRECT | Nucleoplasm | 8.8E-3 |
| UP_KEYWORDS | Nucleus | 1.5E-2 |
| GOTERM_CC_DIRECT | Nucleus | 4.5E-2 |
| UP_KEYWORDS | Transcription | 1.7E-1 |
| (B) For the extraction results using BAHSIC | ||
| Cluster 1 | Enrichment score: 1.08 | p-value |
| KEGG_PATHWAY | Natural killer cell mediated cytotoxicity | 1.0E-2 |
| GOTERM_CC_DIRECT | Cell surface | 5.9E-2 |
| GOTERM_BP_DIRECT | Regulation of immune response | 2.5E-1 |
| Cluster 2 | Enrichment score: 1.07 | |
| GOTERM_MF_DIRECT | Potassium channel activity | 1.6E-2 |
| UP_KEYWORDS | Ion channel | 8.2E-2 |
| GOTERM_BP_DIRECT | Potassium ion transmembrane transport | 1.4E-1 |
| GOTERM_BP_DIRECT | Chemical synaptic transmission | 1.4E-1 |
| UP_KEYWORDS | Ion transport | 1.7E-1 |
| Cluster 3 | Enrichment score: 0.95 | |
| KEGG_PATHWAY | GABAergic synapse | 9.8E-2 |
| KEGG_PATHWAY | Morphine addiction | 1.1E-1 |
| KEGG_PATHWAY | Retrograde endocannabinoid signaling | 1.3E-1 |
Abbreviations: BAHSIC, backward elimination using Hilbert–Schmidt independence criteria; JS, Jensen–Shannon.