Andrey Alexeyenko, Woojoo Lee, Maria Pernemalm, Justin Guegan, Philippe Dessen, Vladimir Lazar, Janne Lehtiö, Yudi Pawitan.
Abstract
BACKGROUND: Gene-set enrichment analyses (GEA or GSEA) are commonly used for biological characterization of an experimental gene-set. This is done by finding known functional categories, such as pathways or Gene Ontology terms, that are over-represented in the experimental set; the assessment is based on an overlap statistic. Rich biological information in the form of gene interaction networks is now widely available, but this topological information is not used by GEA, so there is a need for methods that exploit this type of information in high-throughput data analysis.
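To make the overlap-based assessment concrete, the following is a minimal sketch of one common way to score an overlap: an upper-tail hypergeometric test. The record does not state which overlap statistic the authors use, so the choice of test, the function name `overlap_pvalue`, and the `universe_size` parameter are all assumptions made for illustration.

```python
from math import comb


def overlap_pvalue(ags, fgs, universe_size):
    """Return (overlap, p-value) for an experimental set vs. a functional set.

    Assumption: significance is judged with an upper-tail hypergeometric
    test; this is a common choice, not necessarily the one used in the paper.
    `universe_size` is the number of genes that could have been selected.
    """
    ags, fgs = set(ags), set(fgs)
    k = len(ags & fgs)           # observed overlap
    n, m = len(ags), len(fgs)    # sizes of the two gene sets
    total = comb(universe_size, n)
    # P(X >= k), where X counts functional-set members in a random draw of n genes
    tail = sum(
        comb(m, i) * comb(universe_size - m, n - i)
        for i in range(k, min(n, m) + 1)
    )
    return k, tail / total


# Toy example with hypothetical gene names
k, p = overlap_pvalue({"a", "b", "c"}, {"b", "c", "d"}, universe_size=10)
```

Note that a test of this kind only looks at shared members of the two sets; genes that are linked in a network but not shared contribute nothing, which is the gap the abstract points to.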
Year: 2012 PMID: 22966941 PMCID: PMC3505158 DOI: 10.1186/1471-2105-13-226
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
List of acronyms and their meanings
| Acronym | Full name | Meaning |
| AGS | Altered gene set | A list of differentially expressed genes in an experiment |
| FGS | Functional gene set | A list of genes that were previously known to be related |
| GEA | Gene enrichment analysis | A method of counting overlaps between AGS and FGS |
| GSEA | Gene set enrichment analysis | A method using the maximum overlap between AGS and FGS |
| NEA | Network enrichment analysis | A method of measuring network connectivity between AGS and FGS |
| FNEA | Fixed network enrichment analysis | A NEA method when AGS is a list of |
| MNEA | Maximum network enrichment analysis | A NEA method to avoid dependence on |
Figure 1. The distribution of the p-values of the statistics computed from the randomized network. If the randomization works well, then the distribution should be uniform.
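One way to check the uniformity described in this caption numerically is to compare the empirical distribution of the p-values with Uniform(0, 1). The sketch below computes a one-sample Kolmogorov-Smirnov distance in plain Python. The function name `ks_uniform_statistic` is hypothetical, and the record does not say whether the authors used this or only a visual check.

```python
def ks_uniform_statistic(pvalues):
    """Largest gap between the empirical CDF of `pvalues` and the Uniform(0, 1) CDF.

    Assumption: the p-values come from repeated network randomizations and
    all lie in [0, 1]. A small value is consistent with a uniform distribution.
    """
    xs = sorted(pvalues)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # Compare the uniform CDF (which equals x) with the empirical CDF
        # just after and just before the i-th sorted point.
        d = max(d, (i + 1) / n - x, x - i / n)
    return d


# Evenly spread p-values give a small distance; piled-up ones give a large one.
even = ks_uniform_statistic([(i + 0.5) / 10 for i in range(10)])
skewed = ks_uniform_statistic([0.01] * 10)
```

A distance close to zero suggests the randomization behaves as expected; a large distance indicates the null p-values are skewed and the resulting significance calls may be miscalibrated.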
Figure 2. A schematic diagram of counting network links and a real example of network links. (A) A simple example of how links are counted between genes in AGS and FGS. The ’x’ symbol indicates a fixed number k of genes; (B) A realistic example of network links between 257 deregulated proteins in a tumor (diamonds) and 10 genes known to be involved in the epithelial-mesenchymal transition (circles). Network nodes without links to the AGS or FGS genes are not shown. The graph is generated using a graphics tool in the FunCoup web site (http://funcoup.sbc.su.se).
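The link counting illustrated in panel (A) can be sketched as follows. This is a simplified illustration under stated assumptions: the network is an undirected edge list, each gene pair is counted once, and self-links are ignored. The function name `count_ags_fgs_links` and the toy gene names are hypothetical, and the paper's exact counting rules (for example, how genes belonging to both sets are treated) are not given in this record.

```python
def count_ags_fgs_links(edges, ags, fgs):
    """Count distinct network links joining an AGS gene to an FGS gene.

    Assumptions: `edges` is an iterable of (gene, gene) pairs from an
    undirected network; duplicate and reversed pairs count once;
    self-links are skipped.
    """
    ags, fgs = set(ags), set(fgs)
    counted = set()
    for u, v in edges:
        if u == v:
            continue  # ignore self-links
        if (u in ags and v in fgs) or (v in ags and u in fgs):
            counted.add(frozenset((u, v)))  # unordered pair, stored once
    return len(counted)


# Toy network with hypothetical gene names
edges = [("a", "x"), ("x", "a"), ("b", "c"), ("a", "b"), ("y", "y")]
n_links = count_ags_fgs_links(edges, ags={"a", "b"}, fgs={"x", "c"})
```

In an enrichment setting, a count like this would then be compared with the counts obtained from randomized networks to judge whether the two sets are more connected than expected.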
Network enrichment analysis of lung cancer-related gene sets to patient-specific altered gene sets
| Ding et al., 2008 | | | | |
| All (623 genes) | 66 | 114 | 1 | 102 |
| Drivers (26) | 74 | 95 | 21 | 16 |
| Mutated ≤1 (153) | 39 | 102 | 2 | 75 |
| COSMIC | | | | |
| Detected >1 (110) | 34 | 33 | 22 | 30 |
| Detected once (median)∗ (110) | 18 | 52 | 9 | 28.5 |
| KEGG05223 non-small cell lung cancer (56) | 48 | 99 | 2 | 40 |
| Largest contrast GEA > FNEA | | | | |
| KEGG00362 Benzoate degradation via hydroxylation (3) | 4 | 27 | 123 | 31 |
| KEGG00513 High-mannose type N-glycan biosynthesis (4) | 2 | 5 | 123 | 24 |
| KEGG00780 Biotin metabolism (4) | 1 | 13 | 123 | 11 |
| KEGG00300 Lysine biosynthesis (5) | 0 | 6 | 123 | 34 |
| KEGG00785 Lipoic acid metabolism (4) | 0 | 2 | 123 | 18 |
| Largest contrast FNEA > GEA | | | | |
| GO0001666 Response to hypoxia (163) | 119 | 114 | 11 | 105 |
| KEGG05200 Pathways in cancer (254) | 112 | 118 | 6 | 99 |
| GO0001525 Angiogenesis (129) | 115 | 113 | 10 | 118 |
| KEGG04010 MAPK signaling pathway (286) | 107 | 114 | 4 | 82 |
| KEGG04020 Calcium signaling pathway (183) | 99 | 108 | 5 | 60 |
KEGG05223 is the non-small cell lung cancer pathway. ‘COSMIC’ genes include the lung-cancer somatic mutations in the COSMIC database, but exclude those in [26]. ∗ indicates the median from 50 randomly sampled sets.
Figure 3. Average of estimated FDRs versus the number of AGS-FGS pairs that are declared significant. (a) FNEA (solid) versus GEA (dashed); (b) MNEA (solid) versus GSEA (dashed). The average values were calculated over 123 individuals.