Kai Willadsen, Minh Duc Cao, Janet Wiles, Sureshkumar Balasubramanian, Mikael Bodén.
Abstract
BACKGROUND: Among repetitive genomic sequences, the class of tri-nucleotide repeats has received much attention due to its association with human diseases. Tri-nucleotide repeat diseases are caused by excessive sequence length variability; diseases such as Huntington's disease and Fragile X syndrome are tied to an increase in the number of repeat units in a tract. Motivated by the recent discovery of a tri-nucleotide repeat-associated genetic defect in Arabidopsis thaliana, this study takes a cross-species approach to investigating these repeat tracts, with the goal of using commonalities between species to identify potential disease-related properties.
Year: 2013 PMID: 23374135 PMCID: PMC3617014 DOI: 10.1186/1471-2164-14-76
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Ratio of TNR sequence frequencies to genomic background. Differences shown are the log-ratio of the frequency of TNRs with the specific sequences identified vs. whole-genome order-two Markov backgrounds. TNR sequence frequencies vary markedly across different organisms. In all organisms, TNR sequence distribution was very different from the background, and organisms also have very different distributions from one another.
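The log-ratio against an order-two Markov background described in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the maximum-likelihood estimates without pseudocounts, and the choice of base-2 logarithms are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_order2(genome):
    """Count dinucleotides and trinucleotides for an order-two Markov model."""
    di = Counter(genome[i:i + 2] for i in range(len(genome) - 1))
    tri = Counter(genome[i:i + 3] for i in range(len(genome) - 2))
    return di, tri


def background_prob(seq, di, tri):
    """P(seq) = P(first two bases) * product of P(base | previous two bases)."""
    p = di[seq[:2]] / sum(di.values())
    for i in range(2, len(seq)):
        context = seq[i - 2:i]
        # Number of times this two-base context is followed by any base.
        context_total = sum(tri[context + b] for b in "ACGT")
        if context_total == 0:
            return 0.0  # unseen context; a real analysis would use pseudocounts
        p *= tri[seq[i - 2:i + 1]] / context_total
    return p


def log_ratio(observed_freq, expected_freq):
    """Log2 ratio of observed frequency to background expectation (base is an assumption)."""
    return math.log2(observed_freq / expected_freq)
```

A positive value indicates that a repeat sequence is more frequent among TNRs than the background model predicts; a negative value indicates the opposite.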
Figure 2. Identification and division of tri-nucleotide repeats in coding regions. Tri-nucleotide repeats in coding regions were identified from genomic scans using Tandem Repeats Finder (see Methods for details). The TNRs were then localised to coding regions according to genomic features from RefSeq annotations.
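To make the idea of a genomic scan for tri-nucleotide repeats concrete, the sketch below finds perfect repeats with a regular expression. It is a deliberately simplified stand-in, not the tool named in the caption, which also tolerates mismatches and indels. The function name `find_perfect_tnrs` and the `min_units` threshold of 5 are hypothetical choices.

```python
import re


def find_perfect_tnrs(seq, min_units=5):
    """Return (start, end, unit, n_units) for perfect tri-nucleotide repeat tracts.

    Only exact repeats are found, and the unit is reported in the frame of the
    leftmost match (so a CAG tract may be reported as GCA or AGC depending on
    the flanking bases).
    """
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_units - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # a run such as AAAAAA is a mononucleotide repeat, not a TNR
        n_units = (match.end() - match.start()) // 3
        hits.append((match.start(), match.end(), unit, n_units))
    return hits
```

Localising the hits to coding regions would then require intersecting the reported coordinates with annotated features, which is outside the scope of this sketch.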
Top-5 over-represented GO/Biological process terms in exonic repeat-associated genes by species
| GO term | E-value | Term description |
|---|---|---|
| GO:0050789 | 2.35E-06 | regulation of biological process |
| GO:0060255 | 8.91E-06 | regulation of macromolecule metabolic process |
| GO:0050794 | 9.14E-06 | regulation of cellular process |
| GO:0019222 | 9.40E-06 | regulation of metabolic process |
| GO:0048522 | 1.49E-05 | positive regulation of cellular process |

| GO term | E-value | Term description |
|---|---|---|
| GO:0016070 | 7.32E-10 | RNA metabolic process |
| GO:0090304 | 1.59E-09 | nucleic acid metabolic process |
| GO:0044260 | 1.78E-09 | cellular macromolecule metabolic process |
| GO:0009889 | 2.61E-09 | regulation of biosynthetic process |
| GO:0043170 | 2.74E-09 | macromolecule metabolic process |

| GO term | E-value | Term description |
|---|---|---|
| GO:0007265 | 2.22E-08 | Ras protein signal transduction |
| GO:0046578 | 7.32E-08 | regulation of Ras protein signal transduction |
| GO:0050794 | 1.29E-07 | regulation of cellular process |
| GO:0009966 | 5.92E-07 | regulation of signal transduction |
| GO:0051056 | 1.11E-06 | regulation of small GTPase mediated signal transduction |

| GO term | E-value | Term description |
|---|---|---|
| GO:0048856 | 5.57E-106 | anatomical structure development |
| GO:0048731 | 3.19E-100 | system development |
| GO:0007275 | 8.55E-95 | multicellular organismal development |
| GO:0032502 | 9.65E-95 | developmental process |
| GO:0048513 | 1.10E-90 | organ development |

| GO term | E-value | Term description |
|---|---|---|
| GO:0032502 | 3.76E-45 | developmental process |
| GO:0007399 | 4.92E-42 | nervous system development |
| GO:0007275 | 2.98E-41 | multicellular organismal development |
| GO:0048856 | 2.61E-39 | anatomical structure development |
| GO:0048869 | 3.83E-39 | cellular developmental process |

| GO term | E-value | Term description |
|---|---|---|
| GO:0007399 | 1.15E-20 | nervous system development |
| GO:0030030 | 5.48E-16 | cell projection organization |
| GO:0032989 | 6.39E-16 | cellular component morphogenesis |
| GO:0048666 | 2.81E-15 | neuron development |
| GO:0000902 | 3.76E-15 | cell morphogenesis |

Note that for Drosophila melanogaster, Mus musculus and Homo sapiens, the top-5 terms are development-related, yet many regulation-related terms appear at statistically significant levels (not shown).
Division of TNR- and variant-encoded homo-AA proteins
| TNR-encoded | Variant-encoded | Total |
|---|---|---|
| 96 | 224 | 299 |
| 337 | 985 | 1285 |
| 67 | 834 | 892 |
| 404 | 2083 | 2252 |
| 253 | 1369 | 1530 |
| 342 | 1416 | 1661 |

Note that a protein may contain both TNR- and variant-encoded homo-AA tracts. The number of TNR-encoded proteins may be lower than the number of TNR tracts in exonic regions because a stricter criterion, which did not allow for interruptions, was applied to determine TNR-encoded homo-AA tracts.
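The distinction between TNR-encoded and variant-encoded homo-AA tracts can be sketched as follows: a run of identical amino acids counts as TNR-encoded only if every codon in the run is the same, and as variant-encoded if synonymous codons are mixed. This is an illustrative reading of the criterion, not the authors' code; the function name `homo_aa_tracts`, the minimum tract length of 5 residues, and the assumption that the input is an in-frame coding sequence are all hypothetical.

```python
from itertools import groupby, product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def homo_aa_tracts(cds, min_len=5):
    """Return (codon_index, amino_acid, length, kind) for each homo-AA tract.

    kind is "TNR" when all codons in the tract are identical (no interruptions)
    and "variant" when the tract mixes synonymous codons.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    tracts = []
    pos = 0
    # Group consecutive codons by the amino acid they encode.
    for aa, run in groupby(codons, key=CODON_TABLE.get):
        run = list(run)
        if aa not in (None, "*") and len(run) >= min_len:
            kind = "TNR" if len(set(run)) == 1 else "variant"
            tracts.append((pos, aa, len(run), kind))
        pos += len(run)
    return tracts
```

For example, six consecutive CAG codons form a TNR-encoded poly-Q tract, whereas three CAG codons followed by three CAA codons form a variant-encoded poly-Q tract of the same length.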
Over-represented GO terms in TNR encoded homo-AA tract containing proteins
| GO term | p-value | Term description |
|---|---|---|
| GO:0006996 | 2.90E-02 | organelle organization |

| GO term | p-value | Term description |
|---|---|---|
| GO:0005917 | 1.45E-02 | nephrocyte diaphragm |
| GO:0034333 | 1.45E-02 | adherens junction assembly |
| GO:0036058 | 1.45E-02 | filtration diaphragm assembly |
| GO:0036059 | 1.45E-02 | nephrocyte diaphragm assembly |
| GO:0036056 | 1.45E-02 | filtration diaphragm |

| GO term | p-value | Term description |
|---|---|---|
| GO:0051276 | 3.09E-02 | chromosome organization |

The table lists all over-represented GO terms found in TNR-encoded homo-AA tract-containing proteins, across all species, when using all homo-AA proteins as the statistical background. All p-values given are Bonferroni-corrected.
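As a rough illustration of how an over-representation p-value with Bonferroni correction can be computed, the sketch below uses a one-sided hypergeometric test. The source does not state which test was used, so the test itself, the function names, and the parameter names are assumptions.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from a population of N, of which K carry the term.

    Here N would be the number of background proteins (all homo-AA proteins),
    K the background proteins annotated with the GO term, n the TNR-encoded
    proteins, and k the TNR-encoded proteins annotated with the term.
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]
```

Bonferroni correction is conservative: with many GO terms tested, only strongly enriched terms remain below a 0.05 threshold, which is consistent with the short list of terms reported.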
Figure 3. TNR- vs. variant-encoded amino acid repeats in multiple organisms. Top: The log-ratio of the proportion of amino acid repeats for TNR- vs. variant-encoded tracts. Bottom: The log-ratio of the length of amino acid repeats for TNR- vs. variant-encoded tracts. Significant (p < 0.05) differences are identified by bars with a black outline.