Frances S Turner, Daniel R Clutterbuck, Colin A M Semple.
Abstract
Here we present POCUS (prioritization of candidate genes using statistics), a novel computational approach to prioritize candidate disease genes that is based on over-representation of functional annotation between loci for the same disease. We show that POCUS can provide high (up to 81-fold) enrichment of real disease genes in the candidate-gene shortlists it produces compared with the original large sets of positional candidates. In contrast to existing methods, POCUS can also suggest counterintuitive candidates.
Year: 2003 PMID: 14611661 PMCID: PMC329128 DOI: 10.1186/gb-2003-4-11-r75
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
The full results of POCUS analysis for 29 OMIM diseases, over locus sizes of 100, 500 and 1,000 IDs at a threshold of 0.8
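| Disease (representative OMIM number) | Number of genes* | Genes sharing† | Correctly identified‡ (100) | Correctly identified‡ (500) | Correctly identified‡ (1,000) | Non-disease genes§ (100) | Non-disease genes§ (500) | Non-disease genes§ (1,000) | Total number of genes¶ (100) | Total number of genes¶ (500) | Total number of genes¶ (1,000) | Enrichment¥ (100) | Enrichment¥ (500) | Enrichment¥ (1,000) |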
| Parkinson's disease (168600) | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 62 | 264 | 571 | 1 | 1 | 0 | |||
| Lupus erythematosus, systemic (152700) | 3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 43 | 262 | 547 | 0 | 1 | 1 | |||
| Glaucoma, primary open angle, juvenile-onset (137750) | 3 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 59 | 279 | 527 | 1 | 1 | 1 | |||
| Bardet Biedl (209900) | 4 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 75 | 390 | 776 | 0 | 1 | 1 | |||
| Meningioma, familial (607174) | 4 | 2 | 1 | 0 | 0 | 6 | 17 | 10 | 69 | 363 | 722 | 2.46 | 0 | 0 | |||
| Acute myelogenous leukemia, familial (601626) | 4 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 84 | 385 | 771 | 1 | 1 | 1 | |||
| Basal cell carcinoma (605462) | 4 | 3 | 2 | 0 | 0 | 0 | 0 | 0 | 83 | 418 | 810 | 20.8 | 1 | 1 | |||
| Adrenoleukodystrophy, autosomal neonatal form (202370) | 4 | 4 | 2 | 0 | 0 | 0 | 3 | 3 | 76 | 369 | 733 | 19 | 0 | 0 | |||
| Epidermolysis bullosa letalis (226700) | 4 | 4 | 3 | 2 | 2 | 2 | 1 | 4 | 73 | 380 | 728 | 11 | 63.3 | 60.67 | |||
| Familial adenomatous polyposis (175100) | 4 | 4 | 3 | 2 | 0 | 8 | 12 | 4 | 64 | 371 | 727 | 4.36 | 13.3 | 0 | |||
| Ovarian carcinoma (167000) | 4 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 70 | 360 | 733 | 1 | 1 | 1 | |||
| Hypertension (145500) | 5 | 2 | 0 | 0 | 0 | 3 | 0 | 0 | 95 | 472 | 920 | 0 | 1 | 1 | |||
| Alzheimer's disease (104300) | 5 | 4 | 3 | 0 | 0 | 0 | 0 | 0 | 135 | 460 | 875 | 27 | 1 | 1 | |||
| Charcot-Marie-Tooth disease, types 1A-1F (118200) | 5 | 4 | 3 | 0 | 0 | 0 | 0 | 0 | 98 | 449 | 937 | 19.6 | 1 | 1 | |||
| Gastric cancer (137215) | 5 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 87 | 483 | 932 | 1 | 1 | 1 | |||
| Cystic fibrosis (219700) | 5 | 5 | 1 | 0 | 0 | 2 | 0 | 0 | 99 | 458 | 900 | 6.6 | 1 | 1 | |||
| Inflammatory bowel disease (266600) | 5 | 5 | 2 | 0 | 2 | 3 | 0 | 3 | 99 | 506 | 1,013 | 7.92 | 1 | 81.04 | |||
| Long-segment Hirschsprung disease (142623) | 5 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 102 | 468 | 972 | 1 | 1 | 1 | |||
| Leber congenital amaurosis (204000) | 6 | 5 | 5 | 0 | 0 | 4 | 0 | 0 | 125 | 508 | 1,120 | 11.6 | 1 | 1 | |||
| Maturity onset diabetes of the young (606391) | 6 | 5 | 2 | 0 | 0 | 0 | 0 | 0 | 111 | 551 | 1,078 | 18.5 | 1 | 1 | |||
| Prostate cancer (176807) | 6 | 5 | 0 | 0 | 0 | 2 | 0 | 0 | 128 | 550 | 1,157 | 0 | 1 | 1 | |||
| Colorectal cancer, hereditary nonpolyposis (114500) | 6 | 6 | 5 | 6 | 6 | 4 | 14 | 17 | 115 | 560 | 1,095 | 10.6 | 28 | 47.61 | |||
| Epiphyseal dysplasia, multiple types 1-5 (132400) | 7 | 6 | 6 | 3 | 0 | 4 | 0 | 0 | 135 | 596 | 1,232 | 11.6 | 85.1 | 1 | |||
| Muscular dystrophy, limb-girdle, autosomal recessive (601173) | 7 | 5 | 2 | 0 | 0 | 1 | 0 | 0 | 124 | 634 | 1,282 | 11.8 | 1 | 1 | |||
| Diabetes mellitus, non-insulin dependent (125853) | 8 | 6 | 2 | 0 | 0 | 0 | 0 | 0 | 155 | 719 | 1,420 | 19.4 | 1 | 1 | |||
| Breast cancer (114480) | 9 | 7 | 3 | 0 | 0 | 2 | 0 | 0 | 170 | 819 | 1,592 | 11.3 | 1 | 1 | |||
| Retinitis pigmentosa (268000) | 10 | 8 | 6 | 0 | 0 | 3 | 7 | 6 | 197 | 977 | 1,897 | 13.1 | 0 | 0 | |||
| Cardiomyopathy, familial hypertrophic (192600) | 11 | 11 | 9 | 7 | 4 | 7 | 7 | 15 | 194 | 1,011 | 2,029 | 9.92 | 46 | 38.83 | |||
| Thyroid carcinoma, papillary (188550) | 11 | 11 | 0 | 0 | 0 | 1 | 0 | 0 | 217 | 1,059 | 2,142 | 0 | 1 | 1 | |||
* The total number of genes for a disease. † The number of disease genes that share IDs. ‡ The number of disease genes above the threshold at the three locus sizes. § The number of non-disease genes above the threshold at the three locus sizes. ¶ The total number of genes present at the loci considered. ¥ The enrichment of disease genes among the genes above the threshold, compared with the initial loci; zeros denote diseases where only non-disease genes were above the threshold.
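The enrichment column can be reproduced with a short calculation. The sketch below is an assumption inferred from the footnote and checked against a few table rows, not the authors' published code: it divides the fraction of shortlisted genes that are disease genes by the fraction of disease genes in the initial loci. The function name `enrichment` and its parameters are hypothetical. Where no gene exceeds the threshold the ratio is undefined, so the sketch returns `None`, whereas the table lists 1 in those rows.

```python
from typing import Optional


def enrichment(disease_hits: int, non_disease_hits: int,
               disease_genes: int, total_genes: int) -> Optional[float]:
    """Fold enrichment of disease genes in the shortlist vs. the initial loci.

    Assumed formula (inferred from the table footnote):
        (disease_hits / shortlisted) / (disease_genes / total_genes)
    """
    shortlisted = disease_hits + non_disease_hits
    if shortlisted == 0:
        # No gene exceeded the threshold, so the ratio is undefined.
        return None
    shortlist_fraction = disease_hits / shortlisted
    background_fraction = disease_genes / total_genes
    return shortlist_fraction / background_fraction


# Rows taken from the table above.
# Meningioma, locus size 100: 1 disease gene and 6 non-disease genes
# shortlisted, 4 disease genes among 69 genes in total.
print(round(enrichment(1, 6, 4, 69), 2))    # 2.46
# Inflammatory bowel disease, locus size 1,000: 2 disease genes and
# 3 non-disease genes shortlisted, 5 disease genes among 1,013 genes.
print(round(enrichment(2, 3, 5, 1013), 2))  # 81.04
```

Under this reading, the 81.04 for inflammatory bowel disease at a locus size of 1,000 corresponds to the "up to 81-fold" figure quoted in the abstract.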
Figure 1. POCUS results per locus for positive control sets of disease genes: the percentage of loci for each of three outcomes is plotted against locus size (100, 500, and 1,000 IDs) at two threshold scores (0.5 and 0.8). The outcome 'No genes exceed threshold' corresponds to the rate of false negatives, 'Only non-disease genes exceed threshold' corresponds to the rate of false positives, and 'Disease gene exceeds threshold' corresponds to the rate of true positives.