Francesco Catania, Michael Lynch.
Abstract
BACKGROUND: In protozoa, the identification of preserved motifs by comparative genomics is often impeded by the difficulty of generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (in both protozoa and metazoa) remain virtually unexplored.
Year: 2010 PMID: 20441586 PMCID: PMC2874801 DOI: 10.1186/1471-2148-10-129
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
The number of hit 3' UTRs and the most frequent common function of the host genes are shown for each of the most recurrent sequence motifs.
| Sequence motif | Observed hits | Expected hits (SD) | Protein function* |
|---|---|---|---|
| GUACAUUA | 127 | 13 (3.67) | Ribosomal (88.5%) |
| U | 47 | 11 (3.12) | Ribosomal (61.5%) |
| ACAAUCAU | 36 | 15 (3.52) | - |
| UAUGCAAA | 35 | 12 (3.82) | - |
| UUAUGCAA | 34 | 12 (4.25) | - |
| AU | 33 | 12 (3.30) | Ribosomal (72.4%) |
| UUUAUGCA | 32 | 12 (3.65) | - |
| AUAUGCAA | 30 | 14 (4.21) | - |
| U | 30 | 12 (3.16) | Ribosomal (65.2%) |
| | 29 | 12 (3.90) | Ribosomal (77.8%) |
| UUGCAAUA | 29 | 13 (3.37) | - |
| UAUGCAAU | 25 | 13 (3.25) | - |
| UAU | 25 | 13 (3.49) | Ribosomal (61.1%) |
Sequences that overlap the most abundant motif (GUACAUUA) are underlined. The expected average number of hits (standard deviation in parentheses) is obtained by screening 25 sets of randomly generated DNA sequences whose length and nucleotide composition are identical to those of the 3' UTR sequences in the original dataset. * = estimate based on genes with annotated function.
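The randomization procedure described in the note above can be illustrated with a minimal Python sketch. It assumes that each random sequence is produced by shuffling the nucleotides of one original 3' UTR (which preserves length and composition exactly) and that a "hit" is a 3' UTR containing the motif at least once. The function names `count_hits`, `shuffled_set`, and `expected_hits` are hypothetical, and the authors' actual generator may differ.

```python
import random
import statistics


def count_hits(utrs, motif):
    """Number of sequences that contain the motif at least once."""
    return sum(1 for seq in utrs if motif in seq)


def shuffled_set(utrs, rng):
    """Build one randomized set (assumed approach: per-sequence shuffle).

    Each output sequence is a shuffle of one input sequence, so its length
    and nucleotide composition are identical to the original.
    """
    randomized = []
    for seq in utrs:
        letters = list(seq)
        rng.shuffle(letters)
        randomized.append("".join(letters))
    return randomized


def expected_hits(utrs, motif, n_sets=25, seed=0):
    """Mean and standard deviation of hit counts over n_sets random sets."""
    rng = random.Random(seed)
    counts = [count_hits(shuffled_set(utrs, rng), motif) for _ in range(n_sets)]
    return statistics.mean(counts), statistics.stdev(counts)
```

Calling `expected_hits(utrs, "GUACAUUA")` on a list of 3' UTR strings would return a (mean, SD) pair comparable in form to the "Expected hits" column; the observed count for the same motif is `count_hits(utrs, "GUACAUUA")`.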
List of P. tetraurelia non-ribosomal protein-coding genes that contain the 3' UTR motif GUACAUUA.
| GENE MODEL | MOLECULAR FUNCTION | BLAST score | BLAST E-value |
|---|---|---|---|
| GSPATP00003019001 | Hypothetical protein | 60 | 7e-009 |
| GSPATP00005027001 | Eukaryotic translation initiation factor | 123 | 7e-028 |
| GSPATP00016207001 | Guanine nucleotide-binding protein | 229 | 8e-060 |
| GSPATP00039830001 | Asparagine synthetase | 112 | 7e-025 |
| GSPATP00000284001 | Asparagine synthetase | 263 | 1e-069 |
| GSPATP00018093001 | AMP-binding enzyme | 337 | 8e-092 |
| GSPATP00015762001 | Membrane transporter | 97 | 1e-019 |
| GSPATP00033344001 | DNA-directed RNA polymerase I | 980 | 0.0 |
| GSPATP00026659001 | RNA-binding (PUF) protein | 105 | 1e-022 |
| GSPATP00031576001 | Phosphatidylserine decarboxylase | 162 | 1e-039 |
| GSPATP00005327001 | Nucleolar protein NOP58 | 337 | 4e-092 |
Figure 1. Nucleotide composition of the common . The core region (positions 2:9) reflects the highly preserved 3' UTR mammalian motif.
The number of occurrences of GUACAUUA and its single-nucleotide degenerate variants is examined in Paramecium and four additional species (T. thermophila, D. melanogaster, A. thaliana and H. sapiens).
| Species | Number of ribosomal 3' UTRs | GUACAUUA (count) | GUACAUUA single-nucleotide degenerate variants (count)† |
|---|---|---|---|
| | 472 | 100 (1) | 246 (38) |
| | 81 | 1 (1) | 16 (35) |
| | 186 | 1 (2) | 19 (37) |
| | 432 | 2 (3) | 28 (77) |
| | 283 | 1 (3) | 43 (83) |
A motif is considered over-represented if its frequency in the set of ribosomal 3' UTRs is significantly higher (P < 0.01) than its frequency in the set of non-ribosomal 3' UTRs. Average numbers of hits expected by chance are in parentheses. † = multiple hits are included.
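As a rough illustration of the comparison described in this note, the sketch below enumerates the single-nucleotide variants of a motif and computes a one-sided Fisher's exact test on hit counts in two sets of 3' UTRs. The statistical test actually used is not stated here, so Fisher's exact test is an assumption, and the names `single_nt_variants` and `fisher_one_sided` are hypothetical.

```python
from math import comb

BASES = "ACGU"


def single_nt_variants(motif):
    """All sequences that differ from the motif at exactly one position."""
    variants = set()
    for i, original in enumerate(motif):
        for base in BASES:
            if base != original:
                variants.add(motif[:i] + base + motif[i + 1:])
    return variants


def fisher_one_sided(hits_a, total_a, hits_b, total_b):
    """P-value for set A having at least hits_a hits (assumed test).

    Under the null hypothesis, the hits_a + hits_b motif-containing
    sequences are distributed between the two sets at random, so the
    number falling in set A follows a hypergeometric distribution.
    """
    total = total_a + total_b
    hits = hits_a + hits_b
    denom = comb(total, total_a)
    upper = min(hits, total_a)
    tail = sum(
        comb(hits, k) * comb(total - hits, total_a - k)
        for k in range(hits_a, upper + 1)
    )
    return tail / denom


# Made-up counts for illustration only (not taken from the paper):
# set A = ribosomal 3' UTRs, set B = non-ribosomal 3' UTRs.
p_value = fisher_one_sided(hits_a=12, total_a=50, hits_b=20, total_b=1000)
over_represented = p_value < 0.01
```

An 8-mer such as GUACAUUA has 8 × 3 = 24 single-nucleotide variants, which is the set a variant count would be screened against under this reading.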
Figure 2. Average . Average diversity values are examined between gene pairs where both copies contain the DUAYAWUW motif and gene pairs where at least one of the copies does not contain this motif.
Figure 3. Structure of the two candidate precursor miRNAs in . a) candidate pre-miRNA (location in macronuclear genome: scaffold_567:869-915); b) candidate pre-miRNA (this is a portion of an EST [cDNA clone LK0ADA28YP05; collected at conjugation (beginning of meiosis)] that can be only partially mapped to the 3' end region of three 60S ribosomal protein L1 coding genes located in scaffold_161: 34738-35490; scaffold_253: 730-1589; and scaffold_151: 51790-52544). The color scale displays base-pair probability.
Highly conserved motifs in worms and/or flies [6] detected in P. tetraurelia 3' UTRs.
| Motif | Observed hits | Expected hits | Distance (bp) from translation termination codon |
|---|---|---|---|
| UAAAUAAAU | 165 | 121 (0.97) | 26.46 (37.08) |
| UAUAUAUA | 689 | 246 (3.56) | 23.52 (25.23) |
| UGCAUUU | 146 | 64 (1.76) | 35.46 (44.36) |
| UGUGUAU | 106 | 53 (0.99) | 26.98 (32.07) |
| UUUUUAUA | 175 | 284 (2.14) | 28.77 (40.40) |
| UGUACAUU | 47 | 11 (1) | 21.83 (20.20) |
| GUACAAU | 109 | 33 (2.46) | 21.55 (20.10) |
| UCAAUAAA | 107 | 68 (1.27) | 29.94 (28.01) |
| UACUAAC | 12 | 35 (0.81) | 32.33 (19.81) |
| UUGCAUA | 130 | 61 (2.59) | 28.75 (41.93) |
The list includes the top 10 k-mers whose frequency of occurrence in P. tetraurelia 3' UTRs deviates most from the frequency expected by chance. Values in parentheses are standard deviations.
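The distance column in the table above could be computed along the following lines. This is only a sketch under stated assumptions: each 3' UTR string is taken to begin immediately after the translation termination codon, distance is measured to the start of the first motif occurrence, and the helper names `motif_distances` and `summarize` are hypothetical.

```python
import statistics


def motif_distances(utrs, motif):
    """Offset (in nt) of the first motif occurrence in each UTR that has one.

    Assumes position 0 of each string is the first nucleotide after the
    stop codon; UTRs without the motif are skipped.
    """
    distances = []
    for seq in utrs:
        pos = seq.find(motif)
        if pos != -1:
            distances.append(pos)
    return distances


def summarize(distances):
    """Mean and sample standard deviation of the distances."""
    return statistics.mean(distances), statistics.stdev(distances)
```

`summarize` needs at least two distances, because the sample standard deviation is undefined for a single value.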