Argiris Sakellariou, George Spyrou.
Abstract
BACKGROUND: Many algorithms have been proposed for detecting significant genes in microarray analysis problems. Several of these approaches are freely available as R packages, but using them for gene expression analysis is often frustrating for non-bioinformaticians. Moreover, only some of these packages offer a complete suite of tools, from initial data import through to the analysis report. Here we present an R/Bioconductor package that implements a hybrid gene selection method together with a set of functions that facilitate thorough and convenient gene expression profiling analysis.
Year: 2015 PMID: 26374744 PMCID: PMC4572678 DOI: 10.1186/s12859-015-0719-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
The available options in the preprocessing functional unit
| Abbreviation | Description |
|---|---|
| rawdata | The initial gene expression values |
| mc | The values after ‘mean-centering’ normalization |
| z | The values after ‘z-score’ normalization |
| q | The values after ‘quantile’ normalization |
| cl | The values after ‘cyclic loess’ normalization |
| mcL2 | The values after log2 transformation and ‘mean-centering’ normalization |
| zL2 | The values after log2 transformation and ‘z-score’ normalization |
| qL2 | The values after log2 transformation and ‘quantile’ normalization |
| clL2 | The values after log2 transformation and ‘cyclic loess’ normalization |
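The options in the table above can be illustrated with a minimal Python sketch. It is not the package's own R implementation: it assumes genes in rows and samples in columns, applies each normalization per sample (column), and uses the sample standard deviation for the z-score, none of which the source specifies. The function names and the `preprocess` dispatcher are hypothetical, and cyclic loess (`cl`, `clL2`) is deliberately left out.

```python
import numpy as np


def mean_center(x):
    """Subtract each column's mean so every sample is centered at 0."""
    return x - x.mean(axis=0, keepdims=True)


def z_score(x):
    """Center each column and scale it to unit sample standard deviation."""
    centered = x - x.mean(axis=0, keepdims=True)
    return centered / x.std(axis=0, ddof=1, keepdims=True)


def quantile_normalize(x):
    """Give every column the same distribution: the mean of the sorted columns.

    Ties are broken by sort order rather than averaged (a simplification).
    """
    order = np.argsort(x, axis=0)                      # per-column sort order
    sorted_x = np.take_along_axis(x, order, axis=0)    # each column sorted
    rank_means = sorted_x.mean(axis=1)                 # reference distribution
    out = np.empty_like(x)
    np.put_along_axis(out, order, rank_means[:, None], axis=0)
    return out


# Hypothetical mapping from the table's abbreviations to the sketches above.
NORMALIZERS = {"mc": mean_center, "z": z_score, "q": quantile_normalize}


def preprocess(x, option):
    """Apply one of the table's options; an 'L2' suffix means log2 first."""
    x = np.asarray(x, dtype=float)
    if option == "rawdata":
        return x.copy()
    use_log2 = option.endswith("L2")
    base = option[:-2] if use_log2 else option
    if base == "cl":
        raise NotImplementedError("cyclic loess is not sketched here")
    if base not in NORMALIZERS:
        raise ValueError(f"unknown option: {option}")
    if use_log2:
        x = np.log2(x)  # assumes strictly positive intensities
    return NORMALIZERS[base](x)
```

For example, `preprocess(intensities, "qL2")` would log2-transform a positive intensity matrix and then quantile-normalize its columns.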
Fig. 1 Density plots of normalized intensity values
Classification performance of gene exemplars per preprocessing method
| Method | Exemplars | AUC | MCC | ACC (%) | TNR | TPR |
|---|---|---|---|---|---|---|
| clL2 | 15 | 0.94 | 0.84 | 92.0 | 0.88 | 1.00 |
| mcL2 | 40 | 0.88 | 0.82 | 92.0 | 1.00 | 0.75 |
| qL2 | 40 | 0.88 | 0.82 | 92.0 | 1.00 | 0.75 |
| z | 17 | 0.81 | 0.62 | 83.0 | 0.88 | 0.75 |
| mc | 28 | 0.81 | 0.62 | 83.0 | 0.88 | 0.75 |
| cl | 17 | 0.75 | 0.63 | 83.0 | 1.00 | 0.50 |
| q | 14 | 0.69 | 0.41 | 75.0 | 0.88 | 0.50 |
| zL2 | 39 | 0.62 | 0.43 | 75.0 | 1.00 | 0.25 |
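As a reading aid for the metric columns, the sketch below computes MCC, ACC, TNR and TPR from a binary confusion matrix using their standard definitions. The example counts (4 true positives, 7 true negatives, 1 false positive, 0 false negatives) are an inference, not data from the source: they are one 12-sample split that happens to reproduce the clL2 row after rounding. The function name `binary_metrics` is hypothetical.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    tpr = tp / (tp + fn)                          # sensitivity (true positive rate)
    tnr = tn / (tn + fp)                          # specificity (true negative rate)
    acc = 100.0 * (tp + tn) / (tp + tn + fp + fn)  # accuracy in percent
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal total is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "MCC": mcc,
        "ACC": acc,
        "TNR": tnr,
        "TPR": tpr,
        "balanced_accuracy": (tpr + tnr) / 2,
    }


# Assumed counts (not given in the source) that are consistent with the clL2 row.
clL2_like = binary_metrics(tp=4, tn=7, fp=1, fn=0)
```

With these assumed counts the metrics round to the clL2 values in the table. The mean of TPR and TNR also rounds to the reported AUC in the rows of the table, although the source does not say how AUC was computed.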
Fig. 2 From annotation to pathways. The 15 exemplars (probe ids) map to ten different ‘entrez ids’, which in turn are related to four pathways
Fig. 3 A network graph of the weighted local degree of centrality
Comparison of features between mAPKL and other related R-packages
| Package | Data: Import | Data: Use eSet | Data: Norm. | Data: Sampling | Significance analysis | Classification: Pred. | Classification: Perf. metrics | Annot. analysis | Network chars | Reporting |
|---|---|---|---|---|---|---|---|---|---|---|
| mAPKL | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| edge | No | Yes | Yes | No | Yes | No | No | No | No | No |
| limma | Yes | No | Yes | No | Yes | No | No | Yes | No | No |
| multtest | No | Yes | No | No | Yes | No | No | No | No | No |
| plsgenomics | No | No | Yes | No | Yes | Yes | No | No | No | No |
| randomForest | No | No | No | No | Yes | Yes | No | No | No | No |
| samr | No | No | Yes | No | Yes | No | No | No | No | No |
| st | No | No | No | No | Yes | No | No | No | No | No |
| caret | No | No | No | Yes | Yes | Yes | Yes | No | No | No |
| ClassifyR | No | Yes | No | No | Yes | Yes | Yes | No | No | No |
| CMA | No | Yes | No | Yes | Yes | Yes | Yes | No | No | No |
| MCRestimate | No | Yes | No | No | Yes | Yes | Yes | No | No | No |
| MLInterfaces | No | Yes | No | Yes | Yes | Yes | Yes | No | No | No |
Classification performance among feature selection methods for a subset of 15 top-ranked genes (probe ids)
| Method | AUC | MCC | ACC (%) | TNR | TPR |
|---|---|---|---|---|---|
| mAPKL | 0.94 | 0.84 | 92.0 | 0.88 | 1.00 |
| cat | 0.88 | 0.82 | 92.0 | 1.00 | 0.75 |
| ODP | 0.88 | 0.82 | 92.0 | 1.00 | 0.75 |
| RF | 0.81 | 0.62 | 83.0 | 0.88 | 0.75 |
| maxT | 0.75 | 0.63 | 83.0 | 1.00 | 0.50 |
| PLS | 0.69 | 0.41 | 75.0 | 0.88 | 0.50 |
| eBayes | 0.62 | 0.43 | 75.0 | 1.00 | 0.25 |
| SAM | 0.62 | 0.43 | 75.0 | 1.00 | 0.25 |