Alexander Statnikov, Alexander V Alekseyenko, Zhiguo Li, Mikael Henaff, Guillermo I Perez-Perez, Martin J Blaser, Constantin F Aliferis.
Abstract
Psoriasis is a common chronic inflammatory disease of the skin. We sought to use bacterial community abundance data to assess the feasibility of developing multivariate molecular signatures for differentiation of cutaneous psoriatic lesions, clinically unaffected contralateral skin from psoriatic patients, and similar cutaneous loci in matched healthy control subjects. Using 16S rRNA high-throughput DNA sequencing, we assayed the cutaneous microbiome for 51 such matched specimen triplets including subjects of both genders, different age groups, ethnicities and multiple body sites. None of the subjects had recently received relevant treatments or antibiotics. We found that molecular signatures for the diagnosis of psoriasis result in significant accuracy ranging from 0.75 to 0.89 AUC, depending on the classification task. We also found a significant effect of DNA sequencing and downstream analysis protocols on the accuracy of molecular signatures. Our results demonstrate that it is feasible to develop accurate molecular signatures for the diagnosis of psoriasis from microbiomic data.
Year: 2013 PMID: 24018484 PMCID: PMC3965359 DOI: 10.1038/srep02620
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Classification accuracy (AUC) and the number of taxa in the molecular signatures of psoriasis constructed by the GLL and SVM algorithms using the V3–V5 16S rRNA locus data (Panel A) and using the V1–V3 16S rRNA locus data (Panel B)*
| Classification task | Classification accuracy (AUC), column 1 | Classification accuracy (AUC), column 2 | Number of selected taxa, column 3 | Number of selected taxa, column 4 |
|---|---|---|---|---|
*Results for molecular signatures with statistically significant classification accuracy (at the 5% alpha level, adjusted for multiple comparisons) are shown in bold font. Column 3 reports the average number of taxa selected across all training sets of five-fold cross-validation (repeated 100 times with different splits into five folds), that is, across the 5 × 100 = 500 training sets to which GLL was applied, each containing 80% of the samples. Column 4 reports the number of taxa selected when the GLL method was applied to the entire dataset (all samples).
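The repeated cross-validation scheme described in the footnote can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `repeated_kfold_splits`, the seed, and the sample count of 100 are assumptions chosen for the example, and the split is a plain (non-stratified) shuffle, which may differ from the protocol used in the study.

```python
import random

def repeated_kfold_splits(n_samples, n_folds=5, n_repeats=100, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)  # a different split into folds on every repeat
        # Deal the shuffled indices round-robin into n_folds nearly equal folds.
        folds = [shuffled[k::n_folds] for k in range(n_folds)]
        for k in range(n_folds):
            test_idx = sorted(folds[k])
            train_idx = sorted(
                i for j, fold in enumerate(folds) if j != k for i in fold
            )
            yield train_idx, test_idx

# Hypothetical dataset size, used only to show the counts.
splits = list(repeated_kfold_splits(n_samples=100))
print(len(splits))               # 500 training sets (5 folds x 100 repeats)
print(len(splits[0][0]) / 100)   # 0.8, i.e. 80% of samples per training set
```

Feature selection would then be run once per training set, and the number of selected taxa averaged over all 500 runs to obtain a figure comparable to column 3.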
Figure 1. Classification accuracy (AUC) versus number of selected features/taxa for 37 feature selection methods averaged over 4 classification tasks (PN vs. CC, PL vs. CC, PL vs. PN, and CC vs. PL and PN) in data based on the V3–V5 16S rRNA locus. Methods from the same algorithmic family are shown with the same markers in the figure.
The pink area contains methods that have nominally higher classification accuracy (AUC) than GLL. The green area contains methods that have selected fewer taxa than GLL. The red dash-dotted line indicates a Pareto frontier constructed over non-GLL methods. Methods on the Pareto frontier are such that no other non-GLL method has both higher AUC and a smaller number of selected features averaged over the four classification tasks.
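The Pareto frontier criterion stated in the caption can be written down directly. The sketch below is an illustration under stated assumptions: the function name `pareto_frontier`, the method names, and all AUC and feature-count values are made up for the example and are not results from the study.

```python
def pareto_frontier(methods):
    """Return the names of methods that no other method dominates.

    `methods` maps a method name to a tuple (auc, n_features). Following the
    caption, a method is dominated when some other method has both a higher
    AUC and a smaller number of selected features.
    """
    frontier = []
    for name, (auc, n_feat) in methods.items():
        dominated = any(
            other_auc > auc and other_n < n_feat
            for other, (other_auc, other_n) in methods.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return frontier

# Made-up values for illustration only; not taken from the paper.
example = {
    "method_a": (0.80, 10),
    "method_b": (0.85, 40),
    "method_c": (0.78, 25),
    "method_d": (0.70, 5),
}
print(pareto_frontier(example))  # ['method_a', 'method_b', 'method_d']
```

Here `method_c` drops out because `method_a` has both a higher AUC and fewer selected features; the other three each win on at least one of the two axes against every competitor.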
Figure 2. Classification accuracy (AUC) versus number of selected features/taxa for 37 feature selection methods for each of the four classification tasks (PN vs. CC, PL vs. CC, PL vs. PN, and CC vs. PL and PN) in data based on the V3–V5 16S rRNA gene locus.
Methods from the same algorithmic family are shown with the same markers in the figure. The pink area contains methods that have nominally higher classification accuracy (AUC) than GLL. The green area contains methods that have selected fewer taxa than GLL. The red dash-dotted line indicates a Pareto frontier constructed over non-GLL methods. Methods on the Pareto frontier are such that no other non-GLL method has both higher AUC and a smaller number of selected features for each classification task.
Number of samples for each class in 16S rRNA gene data from the V1–V3 and the V3–V5 loci
| 16S rRNA locus | Psoriasis, lesion (PL) | Psoriasis, normal (PN) | Healthy controls (CC) |
|---|---|---|---|
| V1–V3 locus | 51 | 51 | 49 |
| V3–V5 locus | 22 | 22 | 22 |