Enrique Barreiro, Cristian R Munteanu, Maykel Cruz-Monteagudo, Alejandro Pazos, Humbert González-Díaz.
Abstract
Biological Ecosystem Networks (BENs) are webs of biological species (nodes) connected by trophic relationships (links). Experimental confirmation of all possible links is difficult and generates a huge volume of information, so computational prediction becomes an important goal. Artificial Neural Networks (ANNs) are Machine Learning (ML) algorithms that can be trained to predict BENs, using Shannon entropy information measures (Shk) of known ecosystems as input. However, it is difficult to select a priori which ANN topology will give the highest accuracy. Auto Machine Learning (AutoML) methods focus on automatically selecting the most efficient ML algorithms for specific problems. In this work, a preliminary study of a new approach to AutoML selection of ANNs is proposed for the prediction of BENs. We call it the Net-Net AutoML approach because it uses, for the first time, Shk values of both kinds of networks involved: the BENs (networks to be predicted) and the ANN topologies (networks to be tested). Twelve types of classifiers were tested for the Net-Net model, including linear, Bayesian, tree-based methods, multilayer perceptrons and deep neural networks. The best Net-Net AutoML model for 338,050 outputs of 10 ANN topologies for links of 69 BENs was obtained with a deep fully connected neural network, with a test accuracy of 0.866 and a test AUROC of 0.935. This work paves the way for the application of Net-Net AutoML to other systems or ML algorithms.
Year: 2018 PMID: 30120369 PMCID: PMC6098100 DOI: 10.1038/s41598-018-30637-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. General workflow of the Net-Net AutoML methodology.
Figure 2. Distribution of Shk values (k = 2, 3, 5) for ANNs vs. BENs nodes.
Information indices ANNShk of the ANNs used as inputs to train the AutoML model.
| ANN No. | ANN Topology | ANN AUROC | Shk(ANNj), k = 0 | k = 1 | k = 2 | k = 3 | k = 4 | k = 5 |
|---|---|---|---|---|---|---|---|---|
| 1 | MLP14:14-10-1:1 | 0.6 | 1.367 | 1.561 | 1.428 | 1.428 | 1.428 | 1.428 |
| 2 | MLP15:15-12-1:1 | 0.7 | 1.424 | 1.571 | 1.492 | 1.492 | 1.492 | 1.492 |
| 3 | MLP18:18-8-13-1:1 | 0.8 | 1.518 | 2.074 | 1.509 | 1.704 | 1.704 | 1.704 |
| 4 | MLP16:16-8-10-1:1 | 0.7 | 1.486 | 1.881 | 1.415 | 1.532 | 1.532 | 1.532 |
| 5 | MLP18:18-8-1:1 | 0.8 | 1.564 | 1.759 | 1.423 | 1.481 | 1.481 | 1.481 |
| 6 | MLP16:16-12-13-1:1 | 0.8 | 1.358 | 1.481 | 1.481 | 1.481 | 1.481 | 1.481 |
| 7 | MLP11:11-10-1:1 | 0.7 | 1.394 | 1.806 | 1.395 | 1.395 | 1.395 | 1.395 |
| 8 | LNN16:16-1:1 | 0.6 | 0.881 | 2.637 | 2.637 | 2.637 | 2.637 | 2.637 |
| 9 | LNN17:17-1:1 | 0.6 | 0.895 | 2.788 | 2.788 | 2.788 | 2.788 | 2.788 |
| 10 | LNN18:18-1:1 | 0.6 | 0.908 | 2.938 | 2.938 | 2.938 | 2.938 | 2.938 |
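This excerpt does not give the formula behind the Shk values, so the following is only a minimal sketch of one possible construction. It assumes that Shk is the Shannon entropy of the node probabilities after k steps of a Markov random walk on the network. The function names, the uniform starting distribution, and the self-loop rule for nodes without outgoing links are all assumptions made for illustration, and the sketch is not expected to reproduce the values in the table above.

```python
import math


def transition_matrix(adjacency):
    """Row-normalize an adjacency matrix into Markov transition probabilities."""
    matrix = []
    for i, row in enumerate(adjacency):
        total = sum(row)
        if total == 0:
            # Assumption: a node with no outgoing links keeps its probability.
            matrix.append([1.0 if j == i else 0.0 for j in range(len(row))])
        else:
            matrix.append([value / total for value in row])
    return matrix


def step(distribution, matrix):
    """Advance the node probability distribution by one random-walk step."""
    n = len(distribution)
    return [sum(distribution[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def shannon_entropy(distribution):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in distribution if p > 0)


def entropy_profile(adjacency, max_k):
    """Return entropies for k = 0..max_k, starting from a uniform distribution."""
    matrix = transition_matrix(adjacency)
    n = len(adjacency)
    distribution = [1.0 / n] * n  # assumption: uniform start at k = 0
    profile = []
    for _ in range(max_k + 1):
        profile.append(shannon_entropy(distribution))
        distribution = step(distribution, matrix)
    return profile


# Hypothetical toy network (not from the paper): a 3-node chain 0 -> 1 -> 2.
toy_chain = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
for k, value in enumerate(entropy_profile(toy_chain, 2)):
    print(f"k = {k}: entropy = {value:.3f}")
```

One entropy value per order k yields a fixed-length numeric descriptor for any network, whether an ecosystem or an ANN topology, which is what allows both kinds of network to be fed to the same classifier.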
Statistics for the baseline LDA Net-Net AutoML model.
| Param. | % | Class | Aij = 0 | Aij = 1 |
|---|---|---|---|---|
| Training Series | | | | |
| Sp | 74.2 | Aij = 0 | | 32745 |
| Sn | 70.5 | Aij = 1 | 37438 | |
| Ac | 72.3 | Total | | |
| | | | | |
| Sp | 76.0 | Aij = 0 | | 10179 |
| Sn | 70.4 | Aij = 1 | 12465 | |
| Ac | 73.2 | Total | | |
Note: rows: Observed classifications; columns: Predicted classifications; Aij = 1, calculation with high priority; Aij = 0 otherwise.
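As a reading aid for the table above, here is a minimal sketch of how the three percentages can be derived from a 2 × 2 confusion matrix. It assumes that Sp, Sn and Ac denote specificity, sensitivity and overall accuracy, and that rows are observed classes and columns are predicted classes, as the note states. The function name and the counts in the example are hypothetical and are not taken from the paper.

```python
def confusion_metrics(matrix):
    """Compute Sp, Sn and Ac (in %) from a 2 x 2 confusion matrix.

    matrix[i][j] is the number of cases observed in class i (rows)
    and predicted as class j (columns), with class 0 meaning Aij = 0
    and class 1 meaning Aij = 1.
    """
    (true_neg, false_pos), (false_neg, true_pos) = matrix
    total = true_neg + false_pos + false_neg + true_pos
    return {
        # Specificity: share of observed Aij = 0 cases predicted as 0.
        "Sp": 100.0 * true_neg / (true_neg + false_pos),
        # Sensitivity: share of observed Aij = 1 cases predicted as 1.
        "Sn": 100.0 * true_pos / (false_neg + true_pos),
        # Accuracy: share of all cases on the diagonal.
        "Ac": 100.0 * (true_neg + true_pos) / total,
    }


# Hypothetical counts chosen for illustration only.
example_matrix = [
    [80, 20],  # observed Aij = 0
    [30, 70],  # observed Aij = 1
]
metrics = confusion_metrics(example_matrix)
print(metrics)
```

Under this reading, the off-diagonal cells hold the misclassified cases, so a higher Sp or Sn corresponds to a smaller off-diagonal count in that row.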
Accuracies of non-linear Net-Net classifiers.
| ML Classifier | Test Accuracy | Test AUROC |
|---|---|---|
| Bayesian Nets | 0.681 | 0.737 |
| Naive Bayes Nets | 0.586 | 0.636 |
| Logistic Regression | 0.618 | 0.668 |
| Decision Table | 0.516 | 0.552 |
| MLP 1H | 0.809 | 0.878 |
| MLP 2H | 0.827 | 0.902 |
| Random Forest | 0.832 | 0.914 |
| Bagging REP | 0.804 | 0.884 |
| Bagging MLP | 0.819 | 0.896 |
| AdaBoostM1 MLP | 0.821 | 0.884 |
| Deep FC Nets | 0.866 | 0.935 |
Note: please see Methods section for details on the classifier.
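The two columns in the table above measure different things: accuracy depends on a decision threshold, while AUROC depends only on how well the scores rank positive cases above negative ones. The sketch below illustrates both with plain Python. The 0.5 threshold, the function names and the example labels and scores are assumptions for illustration, not values or procedures from the paper.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the label."""
    predictions = [1 if score >= threshold else 0 for score in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and classifier scores, for illustration only.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
print("accuracy:", accuracy(example_labels, example_scores))
print("AUROC:", auroc(example_labels, example_scores))
```

The pairwise loop is quadratic in the number of cases, which is fine for a small illustration; for a dataset of the size reported in the abstract a rank-based computation would be the more practical choice.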