Katleen De Preter, Roland Barriot, Frank Speleman, Jo Vandesompele, Yves Moreau.
Abstract
The search for feature enrichment is a widely used method to characterize a set of genes. While several tools have been designed for nominal features such as Gene Ontology annotations or KEGG pathways, very little has been proposed to tackle numerical features such as the chromosomal positions of genes. For instance, microarray studies typically generate lists of genes that are differentially expressed between the sample subgroups under investigation, and when studying diseases caused by genome alterations, it is of great interest to delineate the chromosomal regions that are significantly enriched in these lists. In this article, we present a positional gene enrichment analysis method (PGE) for the identification of chromosomal regions that are significantly enriched in a given set of genes. The strength of our method lies in an original query optimization approach that makes it possible to consider virtually all possible chromosomal regions for enrichment, and in the multiple testing correction, which distinguishes truly enriched regions from those that can occur by chance. We have developed a Web tool implementing this method applied to the human genome (http://www.esat.kuleuven.be/~bioiuser/pge). We validated PGE on published lists of differentially expressed genes. These analyses showed significant overrepresentation of known aberrant chromosomal regions.
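To make the notion of a "significantly enriched" region concrete, the sketch below scores a single region with a hypergeometric tail probability and applies a Bonferroni adjustment. Both choices are assumptions made for illustration: the abstract does not name the statistical test or the multiple testing correction that PGE uses, and the function names and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k genes of interest in a region
    of n genes, when K of the N genes on the chromosome are of interest.
    Assumed test for illustration; not stated in the abstract."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def bonferroni(p_value, n_tests):
    """Bonferroni adjustment, used here only as a stand-in correction."""
    return min(1.0, p_value * n_tests)


# Hypothetical numbers: 20 genes on the chromosome, 5 of interest,
# and a candidate region of 4 genes containing 3 of them.
p = hypergeom_tail(k=3, n=4, K=5, N=20)
p_adjusted = bonferroni(p, n_tests=190)  # 190 = 20 * 19 / 2 gene pairs
print(p, p_adjusted)
```

Under these assumptions a region is reported only if its adjusted P-value stays below the chosen threshold, which is one simple way to separate enriched regions from chance clusters.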
Year: 2008 PMID: 18346969 PMCID: PMC2367735 DOI: 10.1093/nar/gkn114
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Sets of genes that are adjacent on the chromosome. Genes are ordered on the chromosome by their start position (in base pairs). Each pair of genes defines an interval, i.e. a set of adjacent genes.
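The interval definition in Figure 1 can be sketched as a brute-force enumeration: sort the genes by start position, then let every pair of genes delimit one interval containing all genes between them. This is a naive illustration only, not the query optimization approach mentioned in the abstract; the function name, the tuple layout and the gene names are hypothetical.

```python
from itertools import combinations


def gene_intervals(genes):
    """genes: list of (name, start_bp) tuples on one chromosome.
    Returns one (first_gene, last_gene, members) tuple per pair of genes,
    where members are all genes lying between the pair, inclusive."""
    ordered = sorted(genes, key=lambda gene: gene[1])
    intervals = []
    for i, j in combinations(range(len(ordered)), 2):
        members = [name for name, _ in ordered[i:j + 1]]
        intervals.append((ordered[i][0], ordered[j][0], members))
    return intervals


# Hypothetical genes, deliberately given out of chromosomal order.
genes = [("geneC", 300), ("geneA", 100), ("geneB", 200), ("geneD", 400)]
intervals = gene_intervals(genes)
for first, last, members in intervals:
    print(first, last, members)
```

For n genes this yields n(n-1)/2 intervals, so the number of candidate regions grows quadratically with chromosome size.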
Figure 2. Filtering redundant chromosome regions significantly enriched in genes differentially expressed in tissues of Down syndrome patients on chromosome 21: (A) regions are displayed from left to right in order of increasing significance (decreasing P-value) and are plotted as –10 log P-value; (B) the same regions plotted by the percentage of genes of interest; (C) enriched regions are filtered for redundancy and plotted by P-value (see rules 4 and 5 in text); (D) the same regions plotted by the percentage of enrichment.
Table 1. Chromosome regions significantly enriched in genes differentially expressed in subtypes of B-CLL
| Chr. | Band(s) | Coordinates (start; end, bp) | Genes of interest / genes in the region |
|---|---|---|---|
| 6 | p21.32–22.2 | 26,163,912; 33,851,518 | 13/58 |
| 6 | p22.1 | 27,208,799; 27,941,634 | 7/15 |
| 6 | p22.1 | 27,883,200; 27,941,634 | 5/7 |
| 6 | q14.3–23.2 | 86,216,527; 132,690,949 | 12/77 |
| 11 | q14.1–24.3 | 77,603,590; 127,897,218 | 18/124 |
| 11 | q23.1–23.3 | 111,117,019; 117,775,136 | 8/28 |
| 12 | p13.31–q24.33 | 7,233,850; 131,915,071 | 64/408 |
| 12 | p11.21–q13.11 | 31,117,786; 44,641,909 | 5/10 |
| 12 | q23.3 | 104,025,639; 106,630,469 | 4/5 |
| 12 | q24.31 | 120,230,432; 121,194,727 | 4/6 |
| 17 | p13.1–13.3 | 594,403; 8,006,662 | 19/90 |
| 17 | p13.1 | 7,084,456; 8,006,662 | 11/25 |
| 17 | p13.2 | 3,746,634; 4,742,127 | 5/11 |
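
Several rows of the table describe regions nested inside a larger one, with the smaller regions holding a higher share of genes of interest. The sketch below recomputes that share for the three chromosome 6p rows and checks the nesting from their coordinates. The row values are copied from the table; the tuple layout and helper names are hypothetical.

```python
# (chromosome, band, start_bp, end_bp, genes_of_interest, genes_in_region)
rows = [
    ("6", "p21.32–22.2", 26_163_912, 33_851_518, 13, 58),
    ("6", "p22.1", 27_208_799, 27_941_634, 7, 15),
    ("6", "p22.1", 27_883_200, 27_941_634, 5, 7),
]


def percent_of_interest(row):
    """Share of the region's genes that belong to the submitted gene list."""
    return 100.0 * row[4] / row[5]


def is_nested(inner, outer):
    """True if inner lies on the same chromosome and within outer's coordinates."""
    return inner[0] == outer[0] and outer[2] <= inner[2] and inner[3] <= outer[3]


for row in rows:
    print(row[0], row[1], f"{percent_of_interest(row):.1f}%")

print(is_nested(rows[1], rows[0]), is_nested(rows[2], rows[1]))
```

The share rises from roughly 22% in the widest 6p region to roughly 71% in the narrowest, which shows why a percentage alone does not rank regions the same way as a P-value that also accounts for region size.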
Figure 3. PGE Chromosome view of regions significantly enriched in genes differentially expressed in subtypes of B-CLL (Table 1).
Figure 4. Regions significantly enriched in genes differentially expressed in neuroblastoma tumors.