Francesco Sambo, Emanuele Trifoglio, Barbara Di Camillo, Gianna M Toffolo, Claudio Cobelli.
Abstract
BACKGROUND: Multifactorial diseases arise from complex patterns of interaction between a set of genetic traits and the environment. Thus, to fully capture the genetic biomarkers that jointly explain the heritability component of a disease, all SNPs from a genome-wide association study should be analyzed simultaneously.
Year: 2012 PMID: 23095127 PMCID: PMC3439675 DOI: 10.1186/1471-2105-13-S14-S2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Schematics of the BoNB algorithm: B Bootstrap samples {X(1), ..., X(B)} are drawn from a GWAS training dataset X; B Naïve Bayes Classifiers (NBC) are trained on the Bootstrap samples, with the novel procedure for attribute ranking and selection; predictions of unseen subjects from a GWAS test dataset are carried out independently by each NBC and class probabilities are then averaged; biomarker selection is carried out with the novel permutation-based procedure, exploiting Out-of-Bag (OOB) samples.
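The bootstrap, train and average steps in the Figure 1 caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: it omits the attribute ranking and selection procedure and the permutation-based biomarker selection, which the caption names but does not specify. The function names (`train_nbc`, `predict_proba`, `train_bagged_nbc`, `ensemble_proba`), the Laplace smoothing constant `alpha` and the default `B=10` are assumptions made for illustration only.

```python
import math
import random
from collections import Counter

GENOTYPES = (0, 1, 2)  # genotype codes, as in the contingency-table caption


def train_nbc(X, y, alpha=1.0):
    """Fit a categorical Naive Bayes model; alpha (Laplace smoothing) is an assumption."""
    classes = sorted(set(y))
    class_counts = Counter(y)
    log_prior = {c: math.log(class_counts[c] / len(y)) for c in classes}
    log_lik = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        denom = len(rows) + alpha * len(GENOTYPES)
        per_snp = []
        for j in range(len(X[0])):
            counts = Counter(row[j] for row in rows)
            per_snp.append({g: math.log((counts[g] + alpha) / denom) for g in GENOTYPES})
        log_lik[c] = per_snp
    return classes, log_prior, log_lik


def predict_proba(model, x):
    """Return class probabilities for one subject (a list of genotype codes)."""
    classes, log_prior, log_lik = model
    scores = [log_prior[c] + sum(log_lik[c][j][g] for j, g in enumerate(x)) for c in classes]
    top = max(scores)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return {c: w / total for c, w in zip(classes, weights)}


def train_bagged_nbc(X, y, B=10, seed=0):
    """Train B classifiers on bootstrap samples; keep each sample's Out-of-Bag indices."""
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n subjects with replacement
        oob = sorted(set(range(n)) - set(idx))      # subjects never drawn in this sample
        model = train_nbc([X[i] for i in idx], [y[i] for i in idx])
        ensemble.append((model, oob))
    return ensemble


def ensemble_proba(ensemble, x, classes):
    """Average the class probabilities predicted independently by each classifier."""
    probs = {c: 0.0 for c in classes}
    for model, _ in ensemble:
        p = predict_proba(model, x)
        for c in classes:
            probs[c] += p.get(c, 0.0) / len(ensemble)
    return probs
```

The Out-of-Bag indices are stored next to each model because the caption states that biomarker selection exploits OOB samples; how they are used there is not shown in this sketch.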
Figure 2. Box plots of MCC (left panel) and classification accuracy (right panel) of the standard Naïve Bayes classifier, HyperLASSO and BoNB on ten random subsamplings of the WTCCC T1D dataset. The dashed lines represent the classification performance of a majority classifier.
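For readers unfamiliar with the two metrics in Figure 2, the following Python sketch computes classification accuracy and the Matthews correlation coefficient (MCC) from binary labels, together with a majority-classifier baseline. The function names are hypothetical, and returning an MCC of 0 when the denominator vanishes is a common convention assumed here, not something stated in the source.

```python
import math
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of subjects whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def mcc(y_true, y_pred, positive=1):
    """Matthews correlation coefficient; defined as 0.0 here when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def majority_predictions(y_train, n_test):
    """Baseline: predict the most frequent training label for every test subject."""
    label = Counter(y_train).most_common(1)[0][0]
    return [label] * n_test
```

A majority classifier can reach an accuracy well above 0.5 on an unbalanced dataset while its MCC stays at 0 under this convention, which is why the two panels are best read together.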
Figure 3. Precision .
SNPs selected as attributes for at least 5% of the Naïve Bayes Classifiers by BoNB on the WTCCC T1D dataset, with B = 200 Bootstrap samples and classifiers.
| SNP | Chr | Gene | Relation | %NBCs | MU (median) |
|---|---|---|---|---|---|
| rs9266774 | 6 | MICA | upstream | 5.5 | 0.011 |
| rs9784858 | 6 | TAP2 | intron | 5 | 0.008 |
First column: dbSNP RS ID. Second column: SNP chromosome. Third and fourth column: annotated gene and relation with the SNP. Fifth column: percentage of Naïve Bayes Classifiers that included the SNP as attribute. Sixth column: median of the marginal utility of the SNP. SNPs selected as genetic biomarkers by the permutation procedure are marked in bold.
Figure 4. Naïve Bayes attribute score .
Contingency table of a SNP, with the genotype codes 0 for the homozygous pair of minor alleles, 1 for the heterozygous pair and 2 for the homozygous pair of major alleles.
| genotype | 0 | 1 | 2 | |
|---|---|---|---|---|
| cases | | | | |
| controls | | | | |
Each element in the contingency table reports the number of subjects with the corresponding genotype and phenotype. n0, n1 and n2 are the column sums, the two row sums give the number of cases and of controls, and n is the total subject count for the SNP.
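As a small illustration of the table layout described above, the Python sketch below tallies a 2 x 3 contingency table for one SNP from per-subject genotype codes and case/control status, and derives the column sums, row sums and total count. The function name `contingency_table` and the dictionary layout are assumptions chosen for readability, not notation taken from the source.

```python
def contingency_table(genotypes, is_case):
    """Tally subjects by phenotype (cases/controls) and genotype code (0, 1, 2)."""
    table = {"cases": [0, 0, 0], "controls": [0, 0, 0]}
    for g, case in zip(genotypes, is_case):
        if g not in (0, 1, 2):
            raise ValueError(f"unexpected genotype code: {g!r}")
        table["cases" if case else "controls"][g] += 1
    # Column sums correspond to n0, n1 and n2 in the caption.
    col_sums = [table["cases"][g] + table["controls"][g] for g in range(3)]
    # Row sums are the number of cases and the number of controls.
    row_sums = {phenotype: sum(counts) for phenotype, counts in table.items()}
    n = sum(col_sums)  # total subject count for the SNP
    return table, col_sums, row_sums, n


# Example with six hypothetical subjects.
table, col_sums, row_sums, n = contingency_table(
    [0, 1, 2, 2, 1, 2],
    [True, True, False, False, True, False],
)
```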
Figure 5. Box plots of the MCC obtained by BoNB on ten random subsamplings of the WTCCC T1D dataset, for .