Parallel relaxation of stringent RNA recognition in plant and mammalian L1 retrotransposons.

Kazuhiko Ohshima.   

Abstract

L1 elements are mammalian non-long terminal repeat retrotransposons, or long interspersed elements (LINEs), that significantly influence the dynamics and fluidity of the genome. A series of observations suggest that plant L1-clade LINEs, just as mammalian L1s, mobilize both short interspersed elements (SINEs) and certain messenger RNA by recognizing the 3'-poly(A) tail of RNA. However, one L1 lineage in monocots was shown to possess a conserved 3'-end sequence with a solid RNA structure also observed in maize and sorghum SINEs. This strongly suggests that plant LINEs require a particular 3'-end sequence during initiation of reverse transcription. As one L1-clade LINE was also found to share the 3'-end sequence with a SINE in a green algal genome, I propose that the ancestral L1-clade LINE in the common ancestor of green plants may have recognized the specific RNA template, with stringent recognition then becoming relaxed during the course of plant evolution.

Year:  2012        PMID: 22675029      PMCID: PMC3472496          DOI: 10.1093/molbev/mss147

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


L1 elements are mammalian non–long terminal repeat (LTR) retrotransposons, or long interspersed elements (LINEs), that drive genome evolution in diverse ways. They constitute a large proportion of the genome, shaping both individual genes and the genome as a whole (Weiner et al. 1986; Brosius 1991). L1s mobilize nonautonomous sequences such as short interspersed element (SINE) RNA and cytosolic messenger RNA (mRNA) by recognizing the 3′-poly(A) tail of the template RNA, resulting in enormous SINE amplification (Dewannieux et al. 2003) and processed pseudogene formation (Esnault et al. 2000; Ohshima et al. 2003; Babushok et al. 2007; Ohshima and Igarashi 2010). In other words, L1s seem to initiate reverse transcription in a “relaxed” manner (Okada et al. 1997). The 3′-end sequences of various SINEs originated from corresponding LINEs other than L1 (Ohshima et al. 1996), however, and to date, ∼20 of these SINE/LINE pairs have been identified (Ohshima and Okada 2005). As the 3′-untranslated regions (UTRs) of several LINEs have been shown to be essential for retroposition, these LINEs presumably require “stringent” recognition of the 3′-end sequence of the RNA template (Okada et al. 1997; Kajikawa and Okada 2002). A systematic database and literature survey identified 58 SINEs, more than twice the number already identified, each sharing a common 3′-end sequence with the partner LINE (supplementary table S1, supplementary fig. S1, Supplementary Material online). Although more than 800 L1-clade LINEs appeared in the database, only three SINEs with L1 tails were found in this study. This observation suggests that, in general, L1-clade LINEs differ from other LINEs with respect to 3′-end recognition (supplementary fig. S2, Supplementary Material online). Figure 1 shows the number of LINEs belonging to each LINE clade according to biological taxa (supplementary table S2, Supplementary Material online). 
The genomes of land plants (mainly flowering plants) harbor almost exclusively L1-clade LINEs, although RTE-clade LINEs are also found in several species. Moreover, although a significant number of SINEs, more than half of which end in poly(A) repeats, have been identified in the genomes of flowering plants (supplementary table S3, Supplementary Material online), only three SINE/LINE pairs have been discovered: maize ZmSINE2 and ZmSINE3 (LINE1-1_ZM; Baucom et al. 2009) and tobacco TS SINE (RTE-1_STu; this study; supplementary fig. S3, Supplementary Material online). Interestingly, many processed pseudogenes have been reported in flowering plants (Faris et al. 2001; Zhang et al. 2005; Benovoy and Drouin 2006; Nurhayati et al. 2009). As mammalian L1s are thought to recognize the 3′-poly(A) tail of RNA when forming processed pseudogenes (Esnault et al. 2000), it is possible that the plant LINE machinery is similar to that of mammalian L1s (Lenoir et al. 2001). That is, plant L1-clade LINEs presumably recognize the 3′-poly(A) tail of RNA and thereby mobilize both mRNA and SINEs with a poly(A) tail. In accordance with this hypothesis, almost all L1-clade LINE families in flowering plants (224 of 233) end in poly(A) repeats, and all seven RTE-clade families end in (TTG)n or (TTGATG)n (table 1). Poly(T)-ending SINEs, namely p-SINEs and Au-like SINEs (supplementary table S3, Supplementary Material online), would have to be mobilized by LINE machinery that recognizes a poly(U) repeat at the 3′-terminus of the RNA, although no such LINE has been reported in plants.
Fig. 1.

The number of LINE families belonging to each LINE clade according to biological taxa. LINE clades in which the partner LINE of a SINE was identified are shown. Remaining clades are grouped as “Others” (Repbase 16.10). “other vertebrates”: nonmammalian vertebrates; “land plants”: mostly flowering plants.

Table 1.

3′-Repeats of Plant LINE Families.

                                            3′-repeat
Species            LINE clade   Families    (A)n     Other repeats   None
Flowering plants   L1           233         224      0               9
                   RTE          7           0        7 [a]           0
Green algae        L1           15          2 [b]    8 [c]           5
                   RandI        8           0        8 [d]           0
                   RTEX         6           0        6 [e]           0

[a] (TTG)n and (TTGATG)n.

[b] L1-1_CR (Chlamydomonas) and Zepp (Chlorella).

[c] (CATA)n, (CA)n, (CAA)n, and (TAA)n.

[d] (ATT)n and (CTATTT)n.

[e] (CA)n, (CAA)n, (CCAT)n, (ACAATG)n, and (CTTGTAA)n.
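The three 3′-repeat categories tallied in table 1 can be illustrated with a minimal Python sketch. This is not the procedure used in the study: the function names, the thresholds (`max_unit`, `min_copies`, `min_span`), and the toy sequences are all assumptions chosen for illustration only.

```python
from collections import Counter


def terminal_repeat(seq, max_unit=7, min_copies=3, min_span=6):
    """Return (unit, copies) for the shortest tandem repeat that ends the
    sequence, or None if no repeat meets the (assumed) thresholds."""
    seq = seq.upper()
    for unit_len in range(1, max_unit + 1):
        unit = seq[-unit_len:]
        copies = 0
        pos = len(seq)
        # Walk backwards from the 3' end while whole copies of the unit match.
        while pos >= unit_len and seq[pos - unit_len:pos] == unit:
            copies += 1
            pos -= unit_len
        if copies >= min_copies and copies * unit_len >= min_span:
            return unit, copies
    return None


def classify_tail(seq):
    """Map a 3' terminus onto the three categories used in table 1."""
    hit = terminal_repeat(seq)
    if hit is None:
        return "none"
    unit, _ = hit
    return "(A)n" if unit == "A" else "other"


# Toy 3' termini (hypothetical sequences, not taken from Repbase).
tails = {
    "toy_L1": "ACGTGCAAAAAAAA",
    "toy_RTE": "GGCTTGTTGTTG",
    "toy_no_repeat": "ACGTACGGCTAGC",
}
counts = Counter(classify_tail(s) for s in tails.values())
```

Searching unit lengths from shortest to longest means a poly(A) tail is reported as the unit `A` rather than as `AA` or `AAA`; real consensus sequences would also need tolerance for mismatches, which this sketch omits.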

Figure 2 shows the results of a comprehensive phylogenetic analysis of L1-clade LINEs (supplementary fig. S4 and supplementary table S4, Supplementary Material online). Three important points were revealed. First, L1-clade LINEs from distinct taxa, namely, land plants, green algae, and vertebrates, formed monophyletic groups. Statistical support for the monophyly of land plants and green algae was high, with bootstrap values of 100 and 97, respectively (82 and 83; maximum likelihood [ML] method; supplementary fig. S5, Supplementary Material online). Monophyly of the vertebrate F and M lineages (Ichiyanagi et al. 2007), however, was not supported by the ML method (supplementary fig. S5, Supplementary Material online). Second, the L1 lineages from these three taxa formed a monophyletic group (55/45; neighbor-joining [NJ]/ML methods) among diverged LINE clades such as RTE and CR1. The Tx1 LINE, with target-specific insertion, was also found in this clade, as observed in previous studies (Kojima and Fujiwara 2004; Ichiyanagi et al. 2007). The Tx1 and vertebrate F lineages formed a monophyletic group with high confidence (94/85). Third, comparison with the species phylogeny revealed that plant L1-clade LINEs consist of at least three deeply branching lineages that have descended from the common ancestor of monocots and eudicots (ME1-3; supplementary fig. S6, Supplementary Material online).
These three lineages must have arisen more than 130 million years ago, around the approximate divergence of monocots and eudicots (Moore et al. 2007).
Fig. 2.

Phylogenetic relationships among the L1-clade LINEs. LINE-clades are shown in bold italics. Several lineages in which a stringent or relaxed L1 was found are indicated by asterisks: (*1) LINE1-1_ZM (stringent), (*2) L1-1_CR (stringent), and (*3) L1HS (relaxed). The phylogenetic relationships among 146 LINEs were inferred using the amino acid sequences of ORF2 proteins from plant L1 entries in the database (Repbase 15.08; Viridiplantae) and from other LINEs (Ohshima and Okada 2005). A total of 404 positions made up the final data set. The linearized NJ consensus tree obtained from bootstrap analysis with 1,000 replications is shown (an ML consensus tree formed with the same data set is available as supplementary fig. S5, Supplementary Material online). The evolutionary distances were computed using the Jones-Taylor-Thornton (JTT) matrix-based method. For clarity, some clades were collapsed with filled triangles, the widths of which were in proportion to the number of LINEs. The full expanded tree is shown in supplementary figure S4, Supplementary Material online. Bootstrap values are only shown for nodes with scores > 45.

One group of LINEs in a monocot L1 lineage (monocot 1a in fig. 2) retained a conserved 3′-end sequence (supplementary fig. S7, Supplementary Material online). Average pairwise divergence of this region (the last 45 nucleotides) among the LINEs was only 0.144 (standard error [SE], 0.043), whereas that for the entire sequence was 0.570 (SE, 0.012). Interestingly, maize SINEs (ZmSINE2 and ZmSINE3) with 3′-end sequences very similar to that of the above LINE, LINE1-1_ZM, were recently reported (Baucom et al. 2009). The present study also found similar 3′-end sequences in several sorghum SINEs (supplementary fig. S8, Supplementary Material online).
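The contrast between the divergence of the last 45 nucleotides and that of the entire sequence can be sketched in Python as follows. The distance model used in the study is not stated in this text, so the sketch assumes an uncorrected p-distance (proportion of differing sites, gaps skipped); the function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences,
    ignoring columns where either sequence has a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def mean_pairwise_divergence(alignment, start=None, end=None):
    """Average p-distance over all sequence pairs in an alignment slice."""
    region = [s[start:end] for s in alignment]
    dists = [p_distance(a, b) for a, b in combinations(region, 2)]
    return sum(dists) / len(dists)


# Toy alignment (invented): variable body, identical last four columns.
toy_alignment = [
    "ACGTACGTCGAG",
    "TCGAACTTCGAG",
    "AGGTTCGACGAG",
]
whole = mean_pairwise_divergence(toy_alignment)           # entire sequence
tail = mean_pairwise_divergence(toy_alignment, start=-4)  # 3'-end region only
```

For the real data the slice would be `start=-45` on an alignment of the monocot 1a LINEs; a model-corrected distance and a bootstrap standard error would need a dedicated phylogenetics package.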
Comparison of the 3′-end sequences from these SINEs and LINEs revealed that part of the sequence (ca. 50 nucleotides) is apparently related, presumably having been derived from a common ancestral L1 sequence (supplementary fig. S9, Supplementary Material online). The putative transcript from this region was also predicted to form a hairpin structure (supplementary fig. S10, Supplementary Material online). Compensatory mutations were observed in the stem-forming sequences, supporting the secondary structure (supplementary figs. S7 and S10, Supplementary Material online). Several nucleotides were strongly conserved in the 3′-flanking region of the stem (5′-CGAG-3′) and in the loop (5′-UCU-3′), whereas the stem-forming nucleotides were variable. This stem-loop structure is commonly observed in the 3′-end sequences of stringent-type LINEs and SINEs (Osanai et al. 2004; Nomura et al. 2006). These results strongly suggest that, at least in this lineage, plant LINEs require a particular 3′-end sequence of the stringent type.

The last example of a SINE/LINE pair in the L1-clade was found in a green alga. The 3′-end sequence (ca. 80 nucleotides) of Chlamydomonas SINEX-3_CR (Cognat et al. 2008) was very similar to that of L1-1_CR, with both ending in poly(A) repeats (supplementary fig. S11, Supplementary Material online). As land plants emerged from green algae (Karol et al. 2001), the following is proposed for 3′-end recognition of plant L1-clade LINEs (fig. 3). It is possible that the ancestral L1-clade LINE in the genome of the common ancestor of green plants possessed stringent, nonmammalian-type RNA recognition properties. During the course of plant evolution, one or more L1 lineages then lost the ability to specifically recognize the RNA template for reverse transcription, thereby introducing relaxed 3′-end recognition in land (flowering) plants as in mammals. As horizontal transfer of LINEs between eukaryotes is rare (Kordiš and Gubenšek 1998; Malik et al. 1999), the discontinuous distribution of L1-clade LINEs with low specificity (i.e., mammalian L1s and plant ME2/ME3) suggests a type of parallel evolution.
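The compensatory-mutation argument for the stem-loop described above can be made concrete with a small Python sketch. It assumes that the paired positions of the stem are already known (for example, from a predicted hairpin); the pair list, the function names, and the toy RNA sequences below are invented for illustration and are not taken from supplementary figs. S7 or S10.

```python
# Watson-Crick pairs plus G-U wobble pairs.
PAIRING = {("A", "U"), ("U", "A"), ("G", "C"),
           ("C", "G"), ("G", "U"), ("U", "G")}


def classify_pair(ref, other, i, j):
    """Compare one stem pair (positions i, j) between two aligned RNAs."""
    ref_pair = (ref[i], ref[j])
    new_pair = (other[i], other[j])
    if new_pair == ref_pair:
        return "conserved"
    if new_pair not in PAIRING:
        return "disrupted"
    # Pairing is kept although the nucleotides differ from the reference.
    changed = (ref[i] != other[i]) + (ref[j] != other[j])
    return "compensatory" if changed == 2 else "consistent"


def classify_stem(ref, other, pairs):
    """Classify every listed stem pair of `other` relative to `ref`."""
    return [classify_pair(ref, other, i, j) for i, j in pairs]


# Toy hairpin: 4-bp stem around a UCU loop (0-based paired positions).
STEM_PAIRS = [(0, 10), (1, 9), (2, 8), (3, 7)]
reference = "GGCAUCUUGCC"
variant_a = "GACAUCUUGUC"  # G-C replaced by A-U at pair (1, 9)
variant_b = "GGCAUCUUGAC"  # C replaced by A at position 9 only

result_a = classify_stem(reference, variant_a, STEM_PAIRS)
result_b = classify_stem(reference, variant_b, STEM_PAIRS)
```

A double substitution that restores pairing, as in `variant_a`, is the pattern expected when selection acts on the structure rather than on the stem sequence itself, whereas a single unpaired change, as in `variant_b`, would disrupt the stem.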
Fig. 3.

Proposed model for the 3′-end recognition of L1-clade LINEs. The ancestral L1-clade LINE in the ancestral green plant possessed a stringent, nonmammalian-type RNA recognition property. During the course of plant evolution, a L1 lineage(s) lost the ability to specifically recognize the RNA template for reverse transcription, introducing relaxed 3′-end recognition in land plants. Processed pseudogenes have been reported in eudicots, monocots, and mammals. ME1-3: plant L1 lineages; (e): eudicots; (m): monocots; M, F: vertebrate L1 lineages; (m): mammals; (f): fish.

The ancestral L1-clade LINE might have required both the 3′-end sequence and the terminal poly(A) repeats. A few L1 lineages might then have lost specific interaction with the 3′-UTR of the template RNA, retaining some role for the 3′-repeats. As listed in table 1, most plant L1-clade LINEs have poly(A) repeats at their 3′-termini, as do mammalian L1s. However, 3′-poly(A) repeats are not necessarily a hallmark of relaxed 3′-end recognition. For example, although silkworm SART1, an R1-clade LINE, uses stringent-type recognition (its 3′-UTR is essential for retroposition), it ends in poly(A) repeats (Takahashi and Fujiwara 2002; Osanai et al. 2004), which are necessary for efficient and accurate retroposition (Osanai et al. 2004).

L1 LINEs have contributed significantly to the architecture and evolution of mammalian genomes, whereas LTR retrotransposons are overwhelmingly found in certain flowering plants. Understanding the independent origins of flexible 3′-end recognition may help us determine what distinguishes the fate of retroposons in eukaryotic genomes and why they have succeeded so well in certain genomes (Zhang and Wessler 2004; Heitkam and Schmidt 2009; Hollister et al. 2011).

Supplementary Material

Supplementary figures S1–S11 and tables S1–S4 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
References (30 in total; 10 listed)

1.  Chromosome mapping and phylogenetic analysis of the cytosolic acetyl-CoA carboxylase loci in wheat.

Authors:  J Faris; A Sirikhachornkit; R Haselkorn; B Gill; P Gornicki
Journal:  Mol Biol Evol       Date:  2001-09       Impact factor: 16.240

2.  Transplantation of target site specificity by swapping the endonuclease domains of two LINEs.

Authors:  Hidekazu Takahashi; Haruhiko Fujiwara
Journal:  EMBO J       Date:  2002-02-01       Impact factor: 11.598

3.  LINEs mobilize SINEs in the eel through a shared 3' sequence.

Authors:  Masaki Kajikawa; Norihiro Okada
Journal:  Cell       Date:  2002-11-01       Impact factor: 41.582

4.  Essential motifs in the 3' untranslated region required for retrotransposition and the precise start of reverse transcription in non-long-terminal-repeat retrotransposon SART1.

Authors:  Mizuko Osanai; Hidekazu Takahashi; Kenji K Kojima; Mitsuhiro Hamada; Haruhiko Fujiwara
Journal:  Mol Cell Biol       Date:  2004-09       Impact factor: 4.272

5.  Cross-genome screening of novel sequence-specific non-LTR retrotransposons: various multicopy RNA genes and microsatellites are selected as targets.

Authors:  Kenji K Kojima; Haruhiko Fujiwara
Journal:  Mol Biol Evol       Date:  2003-08-29       Impact factor: 16.240

Review 6.  Nonviral retroposons: genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information.

Authors:  A M Weiner; P L Deininger; A Efstratiadis
Journal:  Annu Rev Biochem       Date:  1986       Impact factor: 23.643

7.  The evolutionary origin and genomic organization of SINEs in Arabidopsis thaliana.

Authors:  A Lenoir; L Lavie; J L Prieto; C Goubely; J C Coté; T Pélissier; J M Deragon
Journal:  Mol Biol Evol       Date:  2001-12       Impact factor: 16.240

8.  LINE-mediated retrotransposition of marked Alu sequences.

Authors:  Marie Dewannieux; Cécile Esnault; Thierry Heidmann
Journal:  Nat Genet       Date:  2003-08-03       Impact factor: 38.330

9.  Genome-wide comparative analysis of the transposable elements in the related species Arabidopsis thaliana and Brassica oleracea.

Authors:  Xiaoyu Zhang; Susan R Wessler
Journal:  Proc Natl Acad Sci U S A       Date:  2004-04-02       Impact factor: 11.205

10.  Whole-genome screening indicates a possible burst of formation of processed pseudogenes and Alu repeats by particular L1 subfamilies in ancestral primates.

Authors:  Kazuhiko Ohshima; Masahira Hattori; Tetsusi Yada; Takashi Gojobori; Yoshiyuki Sakaki; Norihiro Okada
Journal:  Genome Biol       Date:  2003-10-28       Impact factor: 13.583

