Michal Ziemski, Treepop Wisanwanichthan, Nicholas A Bokulich, Benjamin D Kaehler.
Abstract
Naive Bayes classifiers (NBC) have dominated the field of taxonomic classification of amplicon sequences for over a decade. Apart from having runtime requirements that allow them to be trained and used on modest laptops, they have persistently provided class-topping classification accuracy. In this work we compare NBC with random forest classifiers, neural network classifiers, and a perfect classifier that can only fail when different species have identical sequences, and find that in some practical scenarios there is little scope for improving on NBC for taxonomic classification of 16S rRNA gene sequences. Further improvements in taxonomy classification are unlikely to come from novel algorithms alone, and will need to leverage other technological innovations, such as ecological frequency information.
Keywords: machine learning; marker-gene sequencing; metagenomics; microbiome; neural networks; taxonomic classification
Year: 2021 PMID: 34220738 PMCID: PMC8249850 DOI: 10.3389/fmicb.2021.644487
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Parameter values used for computationally intensive grid search on animal-distal-gut samples.
| Parameter | Values | | |
| n_estimators | 100 | 1,000 | – |
| max_depth | 16 | 64 | None |
| max_features | sqrt | None | – |
| Confidence | 0.6 | 0.7 | 0.8 |
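To make the size of this search concrete, the table can be expanded into its individual configurations. The following is a minimal sketch using only the Python standard library; the helper name `expand_grid` and the dictionary layout are illustrative assumptions, not the authors' code, and the sketch assumes that every combination of the listed values was tested.

```python
from itertools import product

# Values copied from the table above; "None" is read as Python's None
# and "1,000" as the integer 1000.
rf_grid = {
    "n_estimators": [100, 1000],
    "max_depth": [16, 64, None],
    "max_features": ["sqrt", None],
    "confidence": [0.6, 0.7, 0.8],
}


def expand_grid(grid):
    """Return one dict per combination of parameter values (hypothetical helper)."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


configs = expand_grid(rf_grid)
print(len(configs))  # 2 * 3 * 2 * 3 = 36 combinations
print(configs[0])
```

Under that assumption the grid contains 36 combinations, each of which would then be evaluated in every cross-validation fold.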
FIGURE 1. F-measure accuracy performance of RF and NB classifiers. Box-and-whisker plots indicate the median and quartile distributions of F-measures for each classifier and configuration across 5-fold cross-validation (CV). RF classifier configurations were tested via a grid search across the hyperparameters listed in the subset table. NB classifiers do not have equivalent parameters, and hence only NB is listed in the table beneath bars representing NB classifiers. Both RF and NB classifiers were tested at multiple confidence levels. None of the tested parameter sets outperformed the NBC at any of the confidence levels (Wilcoxon signed-rank test p < 0.05).
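For readers unfamiliar with the metric, the F-measure is the harmonic mean of precision and recall. The sketch below shows that standard definition only; how true positives, false positives, and false negatives are counted for taxonomic assignments in this benchmark is not stated in this record, so the counts in the example are made-up illustrative inputs.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall computed from raw counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (not data from the study).
score = f_measure(true_positives=80, false_positives=10, false_negatives=20)
print(round(score, 4))
```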
FIGURE 2. F-measure accuracy performance of CNN and NB classifiers. (A) CNN architectures implemented in this benchmark. (B) Box-and-whisker plots indicate the median and quartile distributions of F-measures for each classifier and configuration across 5-fold cross-validation (CV). CNN classifier configurations were tested via a grid search across the hyperparameters listed in the subset table. NB classifiers do not have equivalent parameters, and hence only NB is listed in the table beneath bars representing NB classifiers. Both CNN and NB classifiers were tested at multiple confidence levels. None of the tested networks outperformed the NBC at a given confidence level (Wilcoxon test p < 0.05).
Parameter values used for grid search using the convolutional neural network.
| Parameter | Values | | | |
| Filters | 64 | 128 | 256 | 512 |
| Kernel size | 3 | 5 | 7 | – |
| Confidence | 0.5 | 0.7 | 0.95 | – |
FIGURE 3. “Perfect” classifiers demonstrate the upper bound of classifier performance for V4 and full-length 16S rRNA gene sequences. These classifiers only fail when two species share an identical sequence, assuming uniform weights. Taxonomic weighting slightly increases classification accuracy for both V4 and full-length sequences. Box-and-whisker plots indicate the median and quartile distributions of F-measures for each classifier across 5-fold cross-validation (CV). The top-performing NB, RF, and CNN classifiers (trained and tested on V4 sequences) are compared to the “perfect” classifiers to demonstrate that the upper bound of performance is already being approached. All differences were statistically significant when comparing the top-performing classifiers to “perfect” classifiers evaluated with similar parameters (Wilcoxon rank sum test, p < 0.05). (Weights: w, weighted; u, uniform. 16S: V4, 150–150 nt fragment of the 16S rRNA V4 region; FL, full-length 16S rRNA gene sequences.)
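The idea of a classifier that can only fail on identical sequences can be illustrated with an exact-match lookup table. This is a sketch under stated assumptions, not the authors' implementation: the function names, the toy sequences, the alphabetical tie-breaking rule, and the weight values are all hypothetical. With uniform weights, a sequence shared by two species is a tie and can be right for at most one of them; with taxonomic weights, the more heavily weighted species is chosen.

```python
from collections import Counter, defaultdict


def train_perfect(sequences, taxa, weights=None):
    """Map each exact sequence to the (weighted) taxa observed for it."""
    table = defaultdict(Counter)
    for seq, taxon in zip(sequences, taxa):
        # Uniform weighting counts every taxon equally (assumption).
        w = 1.0 if weights is None else weights.get(taxon, 0.0)
        table[seq][taxon] += w
    return table


def classify_perfect(table, seq):
    """Return the highest-weighted taxon for an exact sequence match."""
    candidates = table.get(seq)
    if not candidates:
        return None  # sequence never seen during training
    # Sorting first makes ties deterministic (alphabetically first taxon wins).
    return max(sorted(candidates), key=lambda taxon: candidates[taxon])


# Toy data: sp_A and sp_B share an identical sequence.
seqs = ["ACGT", "ACGT", "GGCC"]
taxa = ["sp_A", "sp_B", "sp_C"]

uniform = train_perfect(seqs, taxa)
weighted = train_perfect(seqs, taxa, weights={"sp_A": 0.1, "sp_B": 0.8, "sp_C": 0.1})

print(classify_perfect(uniform, "ACGT"))   # tie, resolved alphabetically
print(classify_perfect(weighted, "ACGT"))  # the more heavily weighted species
print(classify_perfect(uniform, "GGCC"))   # unambiguous sequence
```

In this toy example the unambiguous sequence is always classified correctly, and the only possible errors come from the shared sequence, which is the failure mode the caption describes.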