Noa Sela, Britta Mersch, Agnes Hotz-Wagenblatt, Gil Ast.
Abstract
Insertion of transposed elements within mammalian genes is thought to be an important contributor to mammalian evolution and speciation. Insertion of transposed elements into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization. Elucidation of the evolutionary constraints that have shaped fixation of transposed elements within human and mouse protein coding genes and subsequent exonization is important for understanding how the exonization process has affected transcriptome and proteome complexities. Here we show that exonization of transposed elements is biased towards the beginning of the coding sequence in both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of transposed elements can be population-specific, implying that exonizations may enhance divergence and lead to speciation. SNP density analysis revealed differences between Alu and other transposed elements. Finally, we identified cases of primate-specific Alu elements that depend on RNA editing for their exonization. These results shed light on TE fixation and the exonization process within human and mouse genes.
Year: 2010 PMID: 20532223 PMCID: PMC2879366 DOI: 10.1371/journal.pone.0010907
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Bias toward exonization at the 5′ end of the CDS.
TE-derived exons (left panels) and alternatively spliced cassette exons that did not originate from TEs (right panels) are shown in normalized locations along the CDS in increments of 0.1 (exon locations were normalized between 0 and 1, see Materials and Methods) for (A) human and (B) mouse. The x-axis is the normalized CDS location and the y-axis is the number of alternative exons.
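The normalization and binning described in this caption can be sketched as follows. This is only an illustration under stated assumptions: the helper names `normalized_location` and `bin_counts`, the use of the exon midpoint, and the example coordinates are hypothetical and are not taken from the authors' Materials and Methods.

```python
from collections import Counter


def normalized_location(exon_start, exon_end, cds_length):
    """Return the exon midpoint as a fraction of the CDS length (0 to 1).

    Coordinates are assumed to be offsets along the CDS; using the
    midpoint is an assumption made for this sketch.
    """
    if cds_length <= 0:
        raise ValueError("cds_length must be positive")
    midpoint = (exon_start + exon_end) / 2
    # Clamp to [0, 1] so that rounding at the CDS ends cannot leave the range.
    return min(max(midpoint / cds_length, 0.0), 1.0)


def bin_counts(locations, n_bins=10):
    """Count locations per bin of width 1/n_bins; 1.0 falls in the last bin."""
    counts = Counter(min(int(loc * n_bins), n_bins - 1) for loc in locations)
    return [counts.get(i, 0) for i in range(n_bins)]


# Hypothetical exons given as (start, end, CDS length).
exons = [(30, 90, 1200), (100, 220, 900), (700, 800, 1000)]
locations = [normalized_location(s, e, n) for s, e, n in exons]
histogram = bin_counts(locations)
```

With ten bins, each entry of `histogram` corresponds to one 0.1-wide increment on the x-axis of the figure.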
Figure 2. Density of SNPs within all transposed elements in the human genome.
The average SNP frequency in the TE-body and the flanking sequences is shown in a sliding window of 50 bp. All frequencies are normalized to a frequency per 100 bp. The center of the TE is located at position 0.
Figure 3. Density of SNPs within all transposed elements in the mouse genome.
The average SNP frequency in the TE and the flanking sequence is shown in a sliding window of 50 bp. All frequencies are normalized to a frequency per 100 bp. The center of the TE is located at position 0.
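The sliding-window calculation named in the captions of Figures 2 and 3 could look roughly like the sketch below. It handles a single element, whereas the figures report an average over all TEs; the function name `snp_density_per_100bp`, the step size, and the example positions are assumptions made for illustration.

```python
def snp_density_per_100bp(snp_positions, start, end, window=50, step=1):
    """Slide a window over [start, end) and report SNPs per 100 bp.

    snp_positions: SNP positions relative to the TE center (position 0).
    Returns a list of (window_center, density_per_100bp) pairs.
    """
    snps = sorted(snp_positions)
    profile = []
    for left in range(start, end - window + 1, step):
        right = left + window
        count = sum(1 for p in snps if left <= p < right)
        # Scale the count in a `window`-bp span to a frequency per 100 bp.
        profile.append((left + window / 2, count * 100 / window))
    return profile


# Hypothetical SNP positions around one element, centered at 0.
profile = snp_density_per_100bp([-40, -10, 0, 5, 30], start=-50, end=50, step=25)
```

Averaging such profiles over many elements, position by position, would give a curve of the kind plotted in the figures.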
Densities of SNPs in exonized TEs and all TEs in the human and in the mouse genomes.
| Human TE family | Human SNP density exonized | Human SNP density all | Mouse TE family | Mouse SNP density exonized | Mouse SNP density all |
| Alu | 0.45 | 0.53 | B1 | 0.29 | 0.12 |
| L1 | 0.37 | 0.42 | B2 | 0.27 | 0.12 |
| L2 | 0.33 | 0.34 | B4 | 0.16 | 0.14 |
| MIR | 0.28 | 0.33 | L1 | 0.15 | 0.10 |
| CR1 | 0.51 | 0.33 | L2 | 0.21 | 0.16 |
| LTR | 0.31 | 0.37 | MIR | 0.30 | 0.17 |
| DNA | 0.23 | 0.35 | LTR | 0.25 | 0.11 |
| | | | DNA | 0.0 | 0.14 |
SNPs in splice sites of exonized TEs in the human genome.
| Gene id | Chr./strand | Start–end | TE family | SNP info | Position | Sequence in other species |
| RCSD1 | chr1/+ | 164341465–607 | Alu | rs1890128 (A/G) | 1st pos. donor | Chimp: G; Rhesus: G |
| FAM35A | chr10/+ | 88900743–863 | Alu | rs3129523 (A/T) | 2nd pos. donor | Chimp: T; Rhesus: A |
| TSFM | chr12/+ | 56463664–702 | Alu | rs2014886 (A/G) | 2nd pos. donor | Chimp: G; Rhesus: G |
| ETFA | chr15/− | 74389327–460 | Alu | rs2469213 (C/T) | 1st pos. acc. | Chimp: C; Rhesus: C |
| DPP9 | chr19/− | 4670214–336 | Alu | rs3059236 (-/TTTA) | 2nd pos. acc. | new insertion; no chimp/rhesus info. |
| ZNF544 | chr19/+ | 63440426–512 | Alu | rs12979599 (A/G) | 1st pos. acc. | Chimp: G; Rhesus: no |
| LOC63929 | chr22/+ | 39581181–297 | Alu | rs5758111 (A/G) | 2nd pos. acc. | Chimp: A; Rhesus: A |
| ACTG2 | chr2/+ | 74041405–549 | L2 | | 1st pos. donor | Chimp: G; Rhesus: G |
| CANT1 | chr17/− | 74505824–963 | L2 | | 1st pos. acc. | Chimp: C; Rhesus: C; Mouse: C |
| AK129982 | chr8/+ | 12346635–774 | LTR | rs1988623 (A/G) | 1st pos. donor | Chimp: T; Rhesus: no |
SNPs with population specific data are in bold.
SNPs in splice sites of exonized TEs in the mouse genome.
| Gene id | Chr./strand | Start–end | TE family | SNP info | Position |
| Csrp2bp | chr2/+ | 143828541–730 | B2 | rs29540199 (C/T) | 2nd pos. donor |
| Zfp644 | chr5/− | 105752526–733 | B1 | rs33626312 (C/T) | 1st pos. donor |
| Rbm6 | chr9/− | 107929610–717 | LTR | rs33287617 (C/T) | 2nd pos. donor |
Population frequency data for the human SNPs which occurred in the splice sites of the exonized TEs.
| SNP | Population | Genotype detail | | | Alleles | |
| rs1721244 (A/G, donor 1st position) | CEU | 0.27 | 0.57 | 0.17 | 0.55 | 0.45 |
| | HCB | 0.31 | 0.5 | 0.19 | 0.56 | 0.44 |
| | JPT | 0.1 | 0.46 | 0.44 | 0.33 | 0.67 |
| | YRI | 0.21 | 0.56 | 0.23 | 0.49 | 0.51 |
CEU–European, HCB–Asian, JPT–Asian, YRI–Sub-Saharan African.
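The allele columns in this table are consistent with the genotype columns if the three genotype values are read as homozygote, heterozygote, homozygote; that column order is an assumption, since the sub-headers are not preserved here. Under that reading, a minimal sketch of the relationship is shown below (the function name `allele_frequencies` is hypothetical).

```python
def allele_frequencies(hom1, het, hom2):
    """Derive the two allele frequencies from three genotype frequencies.

    hom1, hom2: frequencies of the two homozygous genotypes.
    het: frequency of the heterozygous genotype.
    The values are renormalized because rounded table entries
    may not sum exactly to 1.
    """
    total = hom1 + het + hom2
    if total <= 0:
        raise ValueError("genotype frequencies must sum to a positive value")
    p = (hom1 + het / 2) / total
    return p, 1 - p


# Genotype values for rs1721244 as listed in the table (assumed column order).
ceu = allele_frequencies(0.27, 0.57, 0.17)
jpt = allele_frequencies(0.10, 0.46, 0.44)
```

Rounded to two decimals, the results match the allele columns for the CEU and JPT rows, which makes the contrast between JPT and the other three populations easy to recompute.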
Alu exons edited at 3′ss.
| # | Exon coordinates | Gene | ESTs/cDNA accessions confirming the editing | Location of the closest intronic Alu | Other editing sites within the exon |
| 1 | chr1:52,768,028–52,768,145 | ZCCHC11-zinc finger, CCHC domain containing 11 isoform | BU178489 (retinoblastoma) | Upstream | No other editing sites |
| 2 | chr5:61,653,166–61,653,305 | KIF2A-Homo sapiens kinesin heavy chain member 2A | AA834569 (germinal center B cell tissue) | Downstream | One other editing site within the exon |
| 3 | chr6:24,489,146–24,489,281 | DCDC2-doublecortin domain containing 2 | BP332729 (renal proximal tubule) | Upstream | One other editing site within the exon |
| 4 | -chr16:36,579–36,721 | POLR3K-DNA directed RNA polymerase III polypeptide K | CR994793 (T lymphocytes) | Upstream | No other editing sites |
| 5 | chr17:37,652,211–37,652,327 | STAT5B-Homo sapiens signal transducer and activator of transcription 5B | DA223574 (brain) | Downstream | Two more editing sites within the exon |
| 6 | chr19:40,262,281–40,262,395 | LOC100128675-Homo sapiens hypothetical LOC100128675 non-coding RNA | NR_024561, AK124779, DA216531, DA216526 (all from the brain) | Downstream | Three more editing sites within the exon |
Based on version hg18 of the human genome.
Figure 4. Exonization of Alu element in NR_024561 dependent on RNA editing.
Editing was inferred from alignment of cDNAs to human genomic DNA. (A) Schematic illustration of exons 2 to 4 of the non-coding gene NR_024561. Exons are depicted as blue boxes. The Alu-exon, derived from AluJo (marked AEx; shown by purple box), is in an antisense orientation and is shown in the middle. The intronic, sense-orientation Alu sequence (AluS) is 731 base pairs downstream of the exonized Alu. Sense and antisense Alus are expected to form double-stranded RNA, thus allowing RNA editing. RNA editing changes an AA dinucleotide into a functional AG 3′ splice site (lower panel). RNA editing also occurs in three positions in the Alu-derived exon (E1, E2, and E3). (B) Predicted folding of the sense and antisense Alu sequences (upper and lower lines, respectively). Adenosines that undergo editing are marked in red. Splice sites utilized for Alu exonization are marked as 5′ss and 3′ss on the alignment. (C) Alignment of this region from four species: human, gorilla, orangutan, and rhesus. The 5′ splice site, 3′ splice site, and the three editing positions are marked in yellow.
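The AA-to-AG change described in panel (A) can be illustrated with a toy example. Edited adenosines appear as G when cDNAs are aligned to genomic DNA, so the edit is modeled here as A to G. The sequence, the coordinates, and the function names below are hypothetical and serve only to show the logic.

```python
def apply_a_to_g_editing(seq, edited_positions):
    """Return seq with A replaced by G at each 0-based edited position.

    Models how an edited adenosine appears in cDNA (read as G).
    """
    bases = list(seq.upper())
    for pos in edited_positions:
        if bases[pos] != "A":
            raise ValueError(f"position {pos} is not an adenosine")
        bases[pos] = "G"
    return "".join(bases)


def has_ag_acceptor(seq, exon_start):
    """True if the two bases immediately upstream of exon_start are AG."""
    return exon_start >= 2 and seq[exon_start - 2:exon_start].upper() == "AG"


# Hypothetical pre-mRNA fragment (DNA alphabet); the exon starts at index 10.
genomic = "TTTCTTTCAAGTGCA"
exon_start = 10

before = has_ag_acceptor(genomic, exon_start)
edited = apply_a_to_g_editing(genomic, [9])
after = has_ag_acceptor(edited, exon_start)
```

In this toy case the unedited fragment carries AA upstream of the exon and lacks an AG acceptor, while the edited version carries AG at the same position.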