Zané Lombard, Chungoo Park, Kateryna D Makova, Michèle Ramsay.
Abstract
BACKGROUND: Several computational candidate gene selection and prioritization methods have recently been developed. These in silico techniques are usually based on two central approaches: the examination of similarities to known disease genes and/or the evaluation of the functional annotation of genes. Each of these approaches has its own caveats. Here we employ a previously described method of candidate gene prioritization based mainly on gene annotation, together with a technique based on the evaluation of pertinent sequence motifs or signatures, in an attempt to refine the gene prioritization approach. We apply this approach to X-linked mental retardation (XLMR), a group of heterogeneous disorders for which some of the underlying genetics is known.
Year: 2011 PMID: 21668950 PMCID: PMC3142252 DOI: 10.1186/1745-6150-6-30
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Annotation terms identified to be pertinent to XLMR using a literature- and data-mining approach
| ANNOTATION TERM CATEGORIES¹ | |||
|---|---|---|---|
| Developmental | Development | Seizures | |
| Liver | Transcription | Epilepsy | Behaviour/Neurological |
| Central nervous system | Metabolism | Acidosis | Nervous system related |
| Respiratory | Phosphorylation | Microcephaly | Embryogenesis |
| Cerebellum | Brain development | Tremor | |
| Kidney | |||
| Hippocampus | Pre-Embryonic | ||
| Spinal cord | Embryonic | ||
| Cerebral cortex | Fetal | ||
| Testis | |||
| Brain stem | |||
| Peripheral nerve | TS8-9 Ectoderm² | ||
| Cerebrum | TS10-13 Neural Ectoderm | ||
| Substantia nigra | TS14-26 CNS | ||
| Cardiovascular | |||
| Adrenal gland | |||
| Thyroid | |||
| Ovary | |||
| Amygdala | |||
| Musculoskeletal | |||
| Ganglion | |||
| Hypothalamus | |||
¹ These terms were used to extract gene lists that were compared to all X chromosome genes in a binary filtering process. Annotation terms are divided into four categories based on their ontological classification.
² TS, Theiler stage: a term denoting the developmental stage of a mouse, as described by Theiler in "The House Mouse: Atlas of Mouse Development" (Springer-Verlag, New York, 1989).
Figure 1. Prioritization of genes on the X chromosome as XLMR candidates using a binary filtering process. The effective cumulative coverage of XLMR genes from lowest to highest ranked categories is depicted (left Y-axis), as well as the percentage of genes that are XLMR-linked within each of the categories (right Y-axis).
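The binary filtering and the two quantities plotted in Figure 1 can be illustrated with a minimal sketch. It assumes that each annotation term yields a gene list, that a gene scores one point per list it appears in, and that coverage is accumulated from the lowest to the highest match count. The gene names (`G1` to `G5`, `G9`), the term names, and the function names are hypothetical and are not taken from the study.

```python
def count_term_matches(x_genes, term_gene_lists):
    """Binary filter: for each X chromosome gene, count the term lists that contain it."""
    counts = {gene: 0 for gene in x_genes}
    for genes in term_gene_lists.values():
        for gene in set(genes):
            if gene in counts:  # genes outside the X chromosome set are ignored
                counts[gene] += 1
    return counts


def coverage_by_category(match_counts, known_genes):
    """For each match-count category, lowest first, return
    (category, cumulative % of known genes covered, % of category members that are known)."""
    total_known = sum(1 for gene in match_counts if gene in known_genes)
    rows, covered = [], 0
    for k in sorted(set(match_counts.values())):
        members = [g for g, c in match_counts.items() if c == k]
        hits = sum(1 for g in members if g in known_genes)
        covered += hits
        cumulative = 100.0 * covered / total_known if total_known else 0.0
        rows.append((k, cumulative, 100.0 * hits / len(members)))
    return rows


# Hypothetical toy input: three annotation terms, five X chromosome genes.
x_genes = ["G1", "G2", "G3", "G4", "G5"]
term_gene_lists = {
    "Brain development": ["G1", "G2", "G3"],
    "Seizures": ["G1", "G2"],
    "Fetal": ["G1", "G9"],  # G9 is not on the X chromosome in this toy example
}
known_xlmr = {"G1", "G3"}

match_counts = count_term_matches(x_genes, term_gene_lists)
rows = coverage_by_category(match_counts, known_xlmr)
for category, cumulative, within in rows:
    print(category, cumulative, within)
```

In this toy input the highest category contains only a known gene, which is the pattern the ranking is meant to expose; real data would use the 40 annotation terms and the full X chromosome gene set.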
Number of genes considered for sequence-based prioritization
| | XAR | XCR | TOTAL |
|---|---|---|---|
| XLMR | 25 | 56 | |
| Non-XLMR | 110 | 376 | |
Number of genes for training and success rates of LDA
| Set Analyzed | Parameter | 10 kb (Genes) | 50 kb (Genes) | 100 kb (Genes) |
|---|---|---|---|---|
| Training and test set of genes in XAR | τ | 0.96 | 0.43 | 0.4 |
| | Success in XLMR | 100% (19) | 100% (15) | 100% (9) |
| | Success in non-XLMR | 45% (84) | 91% (40) | 96% (26) |
| Training and test set of genes in XCR | τ | 0.87 | 0.75 | 0.62 |
| | Success in XLMR | 87% (38) | 100% (16) | 100% (7) |
| | Success in non-XLMR | 52% (257) | 82% (141) | 96% (74) |
τ is a tuning parameter, selected to maximize the sum of the correct classification rates for the XLMR and non-XLMR sets.
Figure 2. LDA classification success rates for different values of the tuning parameter. (A) All XAR genes were used for training and test sets. (B) All XCR genes were used for training and test sets. Leave-one-out cross-validation was utilized to calculate correct classification rates. Dots indicate optimal values of τ. More detailed information is given in Table 3.
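The selection rule for τ (maximize the sum of the two class-wise correct classification rates) can be sketched as follows. The exact role of τ in the LDA is not stated here, so this example assumes, purely for illustration, that τ acts as a cutoff on a per-gene score in [0, 1] obtained from held-out (for example leave-one-out) predictions. The scores, labels, grid, and function names are hypothetical.

```python
def class_rates(scores, labels, tau):
    """Return (correct rate among XLMR genes, correct rate among non-XLMR genes).

    Assumption: a gene is called XLMR when its held-out score is >= tau.
    Labels use 1 for known XLMR genes and 0 for non-XLMR genes.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    rate_pos = sum(s >= tau for s in pos) / len(pos)
    rate_neg = sum(s < tau for s in neg) / len(neg)
    return rate_pos, rate_neg


def select_tau(scores, labels, grid):
    """Pick the grid value that maximizes the sum of the two class-wise correct rates.

    Ties are resolved in favour of the first (smallest) grid value.
    """
    return max(grid, key=lambda t: sum(class_rates(scores, labels, t)))


# Hypothetical held-out scores for three XLMR and four non-XLMR genes.
scores = [0.9, 0.8, 0.6, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0]
grid = [i / 100 for i in range(0, 101, 5)]

best_tau = select_tau(scores, labels, grid)
print(best_tau, class_rates(scores, labels, best_tau))
```

Summing the two rates, rather than using overall accuracy, keeps the much larger non-XLMR set from dominating the choice of τ.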
Number of genes with ten or more matched categories using the annotation approach that were classified correctly by the sequence-based LDA method
| Length of contigs | Number of genes tested | Classified successfully, 10 kb (Genes) | Classified successfully, 50 kb (Genes) | Classified successfully, 100 kb (Genes) |
|---|---|---|---|---|
| | 101 | 54.5% (55) | 52.5% (53) | 52.5% (53) |
| | 42 | 59.5% (25) | 78.6% (33) | 88.1% (37) |
| | 22 | 59.1% (13) | 81.8% (18) | 86.4% (19) |
Classification of genes with > 50 kb contigs that were present in at least ten annotation categories
| No. of annotation terms matched (/40) | HGNC symbol | XLMR gene¹ | 50 kb² | 100 kb² | Dist (kb)³ | Strata |
|---|---|---|---|---|---|---|
| 24 | | 1 | NX | NX | 71.7 | XAR |
| 23 | | 0 | NX | NX | 68 | XAR |
| 19 | | 1 | X | X | 61.4 | XAR |
| 19 | | 0 | NX | NX | 157.8 | XCR |
| 19 | | 1 | X | NX | 60.7 | XCR |
| 18 | | 0 | NX | NX | 133 | XCR |
| 18 | | 0 | X | NX | 90.8 | XCR |
| 18 | | 0 | X | NX | 90.8 | XCR |
| 17 | | 1 | NX | X | 259.5 | XCR |
| 17 | | 0 | NX | NX | 285.5 | XAR |
| 16 | | 0 | NX | NX | 182.7 | XCR |
| 16 | | 0 | NX | NX | 80.3 | XCR |
| 16 | | 0 | NX | NX | 56.7 | XCR |
| 16 | | 0 | X | NX | 64.4 | XCR |
| 15 | | 1 | X | X | 253.8 | XAR |
| 15 | | 0 | NX | NX | 155 | XCR |
| 15 | | 1 | X | NX | 224.7 | XCR |
| 15 | | 0 | NX | NX | 132.8 | XCR |
| 15 | | 0 | NX | NX | 124 | XCR |
| 14 | | 0 | NX | NX | 323.4 | XCR |
| 14 | | 0 | X | NX | 190.3 | XAR |
| 14 | | 0 | NX | NX | 154.8 | XCR |
| 13 | | 0 | NX | NX | 102.9 | XCR |
| 13 | | 0 | NX | NX | 58.6 | XAR |
| 13 | | 0 | NX | NX | 237.4 | XAR |
| 13 | | 0 | NX | NX | 65.4 | XCR |
| 13 | | 0 | NX | NX | 77 | XCR |
| 12 | | 0 | NX | NX | 103 | XCR |
| 12 | | 0 | X | NX | 431.8 | XCR |
| 12 | | 0 | NX | X | 115.6 | XCR |
| 12 | | 0 | NX | X | 115.6 | XCR |
| 12 | | 0 | NX | NX | 64.2 | XCR |
| 12 | | 0 | NX | NX | 302.1 | XAR |
| 11 | | 0 | X | NX | 84.5 | XCR |
| 11 | | 0 | NX | NX | 83.1 | XCR |
| 11 | | 0 | NX | NX | 93.4 | XCR |
| 11 | | 0 | X | NX | 160.1 | XCR |
| 11 | | 0 | NX | NX | 72.2 | XCR |
| 11 | | 0 | NX | NX | 88.2 | XCR |
| 10 | | 0 | NX | NX | 300.5 | XCR |
| 10 | | 0 | NX | NX | 72.4 | XAR |
| 10 | | 0 | NX | NX | 60.9 | XCR |
¹ Indicates whether the gene is a known XLMR gene (= 1) or not (= 0).
² X indicates that a gene was classified as an XLMR gene; NX signifies classification as a non-XLMR gene.
³ Distance from the TSS of the gene to its closest neighboring gene.
Nine genes highlighted as XLMR candidates by both the annotation and sequence motif methods
| HGNC symbol | Description | Location | Function |
|---|---|---|---|
| | Family with sequence similarity 156, member A | Xp11.23 | Function unknown |
| | Family with sequence similarity 156, member B | Xp11.22 | Function unknown |
| | Ubiquitously-expressed transcript | Xp11.23-p11.22 | Plays a role in facilitating receptor-induced transcriptional activation |
| | Transducin (beta)-like 1X-linked | Xp22.3 | Plays an essential role in transcription activation mediated by nuclear receptors |
| | Melanoma antigen family D, 4 | Xp11 | Mainly tumour cell proliferation |
| | Melanoma antigen family D, 4B | Xp11 | Mainly tumour cell proliferation |
| | Zinc finger, C4H2 domain containing | Xq11.1 | Hepatocellular carcinoma-associated antigen |
| | Apelin | Xq25 | Neuropeptide involved in the regulation of body fluid homeostasis and cardiovascular functions |
| | Member of RAS oncogene family | Xq25 | Involved in serum response element mediated gene transcription |