Yin Wang, Yuhua Zhou, Yixue Li, Zongxin Ling, Yan Zhu, Xiaokui Guo, Hong Sun.
Abstract
BACKGROUND: Bacterial 16S ribosomal RNA (rRNA) profiling has been widely used in the classification of microbiota-associated diseases. Dimensionality reduction is a key step in mining high-dimensional 16S rRNA expression data. High levels of sparsity and redundancy are common in 16S rRNA gene microbial surveys. Traditional feature selection methods are generally restricted to measuring correlated abundances, and their discriminative power is limited when so few microbes are actually shared across communities.
Year: 2012 PMID: 23281712 PMCID: PMC3524076 DOI: 10.1186/1752-0509-6-S3-S12
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. FMS algorithm flowchart.
Figure 2. Learning curves of the FMS algorithm for feature merging in the 3-class problem (a), feature deletion in the 3-class problem (b), feature merging in the 2-class problem (c), and feature deletion in the 2-class problem (d).
Classification ability on pneumonia data in 3-class problem.
| Method | Error rate on training data | Error rate on test data | Dimension | Feature number | Note |
|---|---|---|---|---|---|
| svm/FMS | 0.1895 | 0.2637 | 29 | 129 | |
| svm/mRMR | 0.2267 | 0.3103 | 38 | 38 | |
| svm/KruskalWallis | 0.1984 | 0.3816 | 107 | 107 | |
| svm/InformationGain | 0.2425 | 0.3684 | 28 | 28 | |
| svm/χ2 statistic | 0.2127 | 0.4308 | 125 | 125 | |
| svm | 0.2841 | 0.4017 | 137 | 137 | |
| kNN/FMS | 0.2013 | 0.3406 | 112 | 133 | k = 1 |
| kNN/mRMR | 0.2635 | 0.3774 | 130 | 130 | k = 1 |
| kNN/KruskalWallis | 0.2492 | 0.3795 | 134 | 134 | k = 1 |
| kNN/InformationGain | 0.2635 | 0.3774 | 130 | 130 | k = 1 |
| kNN/χ2 statistic | 0.2537 | 0.4128 | 124 | 124 | k = 1 |
| kNN | 0.2635 | 0.3774 | 137 | 137 | k = 1 |
Classification ability on pneumonia data in 2-class problem.
| Method | Error rate on training data | Error rate on test data | Dimension | Feature number | Note |
|---|---|---|---|---|---|
| svm/FMS | 0.0922 | 0.1279 | 42 | 123 | |
| svm/mRMR | 0.1313 | 0.1977 | 36 | 36 | |
| svm/KruskalWallis | 0.1081 | 0.1628 | 62 | 62 | |
| svm/InformationGain | 0.1456 | 0.186 | 54 | 54 | |
| svm/χ2 statistic | 0.1561 | 0.186 | 127 | 127 | |
| svm | 0.1611 | 0.1977 | 137 | 137 | |
| kNN/FMS | 0.1279 | 0.2393 | 20 | 130 | k = 1 |
| kNN/mRMR | 0.2532 | 0.3372 | 54 | 54 | k = 1 |
| kNN/KruskalWallis | 0.1861 | 0.3343 | 25 | 25 | k = 4 |
| kNN/InformationGain | 0.2248 | 0.3256 | 107 | 107 | k = 1 |
| kNN/χ2 statistic | 0.336 | 0.4535 | 107 | 107 | k = 1 |
| kNN | 0.346 | 0.4419 | 137 | 137 | k = 1 |
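In both tables, Dimension and Feature number differ only for the FMS rows, which is consistent with FMS merging several features into one dimension (see the feature merging and feature deletion curves in Figure 2). The following is a minimal sketch of that bookkeeping, not the authors' FMS algorithm: it assumes that merging means summing the abundances of the features in a group, and the function name `merge_features`, the toy abundance values, and the grouping are all hypothetical.

```python
from typing import List, Sequence


def merge_features(
    samples: Sequence[Sequence[float]],
    groups: Sequence[Sequence[int]],
) -> List[List[float]]:
    """Collapse each group of feature columns into one summed column.

    Summation is an illustrative assumption; the source does not state
    how merged features are combined.
    """
    return [[sum(row[j] for j in group) for group in groups] for row in samples]


# Hypothetical toy abundance table: 3 samples x 5 taxa (not data from the paper).
abundance = [
    [4.0, 0.0, 1.0, 0.0, 2.0],
    [0.0, 3.0, 0.0, 5.0, 1.0],
    [2.0, 2.0, 0.0, 0.0, 0.0],
]

# Hypothetical grouping: taxa 0 and 1 are merged, taxa 2 and 3 are merged,
# and taxon 4 is deleted because it appears in no group.
groups = [[0, 1], [2, 3]]

merged = merge_features(abundance, groups)
dimension = len(groups)                       # columns the classifier sees
feature_number = sum(len(g) for g in groups)  # original taxa still represented

print(merged, dimension, feature_number)
```

Under these assumptions the classifier is trained on fewer columns (`dimension`) than the number of original taxa retained (`feature_number`), which is the pattern the FMS rows show; for a pure selection method every group has size one and the two counts coincide.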
Figure 3. Expression profiles of the original pneumonia data for the 3-class problem (a), the data after treatment by FMS for the 3-class problem (b), the original pneumonia data for the 2-class problem (c), and the data after treatment by FMS for the 2-class problem (d). Rows are microbiota and columns are samples ordered by disease class. From left to right: 30 normal, 32 CAP, and 71 HAP samples for the 3-class problem, and 30 normal and 103 pneumonia samples for the 2-class problem.
Figure 4. Phylogenetic relationship of microbiota signatures in the 3-class problem. The microbiota signatures with the best discrimination ability are labeled with a green star.
Figure 5. Phylogenetic relationship of microbiota signatures in the 2-class problem. The microbiota signatures with the best discrimination ability are labeled with a green star.