Shengjun Tan1, Margarida Cardoso-Moreira2, Wenwen Shi1, Dan Zhang1,3, Jiawei Huang1,3, Yanan Mao1, Hangxing Jia1,3, Yaqiong Zhang1, Chunyan Chen1,3, Yi Shao1,3, Liang Leng1, Zhonghua Liu4, Xun Huang4, Manyuan Long5, Yong E Zhang1,3.
Abstract
In a broad range of taxa, genes can duplicate through an RNA intermediate in a process mediated by retrotransposons (retroposition). In mammals, L1 retrotransposons drive retroposition, but the elements responsible for retroposition in other animals have yet to be identified. Here, we examined young retrocopies from various animals that still retain the sequence features indicative of the underlying retroposition mechanism. In Drosophila melanogaster, we identified and de novo assembled 15 polymorphic retrocopies and found that all retroposed loci are chimeras of internal retrocopies flanked by discontinuous LTR retrotransposons. At the fusion points between the mRNAs and the LTR retrotransposons, we identified shared short similar sequences that suggest the involvement of microsimilarity-dependent template switches. By expanding our approach to mosquito, zebrafish, chicken, and mammals, we identified in all these species recently originated retrocopies with a similar chimeric structure and shared microsimilarities at the fusion points. We also identified several retrocopies that combine the sequences of two or more parental genes, demonstrating LTR-retroposition as a novel mechanism of exon shuffling. Finally, we found that LTR-mediated retrocopies are immediately cotranscribed with their flanking LTR retrotransposons. Transcriptional profiling coupled with sequence analyses revealed that the sense-strand transcription of the retrocopies often leads to the origination of in-frame proteins relative to the parental genes. Overall, our data show that LTR-mediated retroposition is highly conserved across a wide range of animal taxa; combined with previous work from plants and yeast, it represents an ancient and ongoing mechanism continuously shaping gene content evolution in eukaryotes.
Year: 2016 PMID: 27934698 PMCID: PMC5131818 DOI: 10.1101/gr.204925.116
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
The 15 D. melanogaster polymorphic retrocopies
Figure 1. Schematic representations of a subset of the retrocopies identified. (A) CG17604_r in D. melanogaster. (B) CG4799_CG11924_r in D. melanogaster. (C) ENSANGG00000011308 in mosquito. (D) ENSMUSG00000083549 in mouse. LTR retrotransposons and retrocopies are marked in gray and blue, respectively. Microsimilarity stretches are marked in yellow, and de novo sequence insertions in green. The red line in A marks the “TAAAAAACAAGC-GGTTG” motif. (CDS) coding sequences; (UTR) untranslated regions.
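The microsimilarity stretches marked in Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes a toy setting in which the chimeric sequence matches one template exactly up to the fusion point and the other template exactly from the fusion point onward, so the stretch that both templates could have supplied is the overlap between the two matches. The function names and the toy sequences are hypothetical.

```python
def common_prefix_len(x: str, y: str) -> int:
    """Length of the longest shared prefix of x and y."""
    n = 0
    for cx, cy in zip(x, y):
        if cx != cy:
            break
        n += 1
    return n


def microsimilarity(chimera: str, left_template: str, right_template: str) -> str:
    """Return the stretch of `chimera` attributable to either template.

    Assumption (illustrative only): `chimera` starts aligned with the start of
    `left_template` and ends aligned with the end of `right_template`,
    with no mismatches or indels outside the fusion point.
    """
    # How far the chimera can be read from the left template.
    left_end = common_prefix_len(chimera, left_template)
    # How far back the chimera can be read from the right template.
    suffix_len = common_prefix_len(chimera[::-1], right_template[::-1])
    right_start = len(chimera) - suffix_len
    # An overlap of the two matches is the microsimilarity; no overlap means
    # a clean junction or an inserted sequence between the two parts.
    return chimera[right_start:left_end] if right_start < left_end else ""


# Toy example: both templates contain "GGGT" at the fusion point.
print(microsimilarity("AAACCCGGGTACGTAC", "AAACCCGGGTTTT", "TTGGGTACGTAC"))  # GGGT
```

Real retrocopies would need alignment that tolerates mismatches and indels, so this exact-match version only conveys the idea of a junction segment shared by both templates.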
Insertion sites of nine retrocopies
The five newly originated Drosophila retrocopies
Figure 2. The expression of retrocopies in fruit fly and mouse. (A) Whole-body RT-PCR results for 10 retrocopies validated in Supplemental Figure 2, which are expressed in both males (M) and females (F). The primer sequences and expected product sizes are listed in Supplemental Table 9. For the chimeric genes, CG5119_CR42443_r and CG4799_CG11924_r, we also designed primers flanking the fusion point between the two parental genes, the product of which is marked with “C.” (B) Tissue profiling via qRT-PCR. CG2662_r has no detectable expression in ovary. The Pearson correlation (r) between the retrocopies and corresponding retrotransposons across the three tissues is displayed above. (C) Tissue-level RT-PCR results for four of 10 retrocopies encoded by line 208 across Testis (T), Ovary (O), and Head (H). (D) Expression profile of retrocopies encoded by the mouse genome. Tissues are hierarchically clustered on the basis of expression similarity across genes. Retrocopies are not clustered but instead are sorted by age as approximated by nucleotide divergence. To reveal relatively moderate expression, the color-code scheme is truncated at the FPKM cutoff of 1 (i.e., all expression higher than 1 is shown in dark blue).
Figure 3. Strand-specific RNA-seq-based quantification. On the basis of newly generated and public strand-specific RNA-seq data, we quantified the expression of LTR-mediated retrocopies in fruit fly and mosquito (A), zebrafish (B), chicken (C), and mouse (D). For fruit fly, given the high similarity between polymorphic retrocopies and parental genes, only reads spanning the switch points were counted, and FPM was calculated. For all other cases, FPKM was calculated. For mosquito, embryos, third-instar larvae, fourth-instar larvae, and pupae were profiled. For zebrafish, the 4-, 6-, and 8-h post-fertilization (pf) stages were profiled.
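For readers unfamiliar with the two units in this caption, the following Python sketch shows the standard definitions of FPM (fragments per million mapped fragments) and FPKM (fragments per kilobase of feature per million mapped fragments). It is not the authors' pipeline; the function names and the toy numbers are hypothetical. FPM applies no length normalization, which suits counts restricted to reads spanning a single switch point.

```python
def fpm(fragments: int, total_mapped_fragments: int) -> float:
    """Fragments per million mapped fragments (no length normalization)."""
    if total_mapped_fragments <= 0:
        raise ValueError("total_mapped_fragments must be positive")
    return fragments * 1e6 / total_mapped_fragments


def fpkm(fragments: int, feature_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of feature per million mapped fragments."""
    if feature_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (feature_length_bp * total_mapped_fragments)


# Toy numbers (hypothetical): 30 junction-spanning fragments in a library of
# 10 million mapped fragments, and 50 fragments on a 2,000-bp retrocopy.
print(fpm(30, 10_000_000))         # 3.0
print(fpkm(50, 2000, 10_000_000))  # 2.5
```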
Figure 4. A schematic representation of the template switch model for LTR-mediated retroposition. (VLP) virus-like particle; (RT) reverse transcriptase; (PR) protease; (IN) integrase. The black boxes in LTR retrotransposons correspond to the LTRs and the light green box to the ORF. The blue boxes correspond to exons encoded by the host genome and the white boxes to introns. The double template switch occurs in the VLP, generating a chimeric cDNA (multiple switches can occur) (Derr et al. 1991; Schacherer et al. 2004; Maxwell and Curcio 2007). After integration, the chimeric sequences can be transcribed as pseudo-LTR retrotransposons, which could be subject to further cycles of retroposition (Elrouby and Bureau 2010).