Abstract
A substantial number of "retrogenes" that are derived from the mRNA of various intron-containing genes have been reported. A class of mammalian retroposons, long interspersed element-1 (LINE1, L1), has been shown to be involved in the reverse transcription of retrogenes (or processed pseudogenes) and non-autonomous short interspersed elements (SINEs). The 3'-end sequences of various SINEs originated from a corresponding LINE. As the 3'-untranslated regions of several LINEs are essential for retroposition, these LINEs presumably require "stringent" recognition of the 3'-end sequence of the RNA template. However, the 3'-ends of mammalian L1s do not exhibit any similarity to SINEs, except for the presence of 3'-poly(A) repeats. Since the 3'-poly(A) repeats of L1 and Alu SINE are critical for their retroposition, L1 probably recognizes the poly(A) repeats, thereby mobilizing not only Alu SINE but also cytosolic mRNA. Many flowering plants harbor only L1-clade LINEs, together with a significant number of SINEs that carry poly(A) repeats but show no homology to the LINEs. Moreover, processed pseudogenes have also been found in flowering plants. I propose that the ancestral L1-clade LINE in the common ancestor of green plants may have recognized a specific RNA template, with stringent recognition then becoming relaxed during the course of plant evolution.
Year: 2013 PMID: 23984183 PMCID: PMC3747384 DOI: 10.1155/2013/424726
Source DB: PubMed Journal: Int J Evol Biol ISSN: 2090-052X
Figure 1. Schematic representation of the formation of a processed pseudogene.
Figure 2. Schematic representation of a SINE and a LINE that have the same 3′-end sequence. Three-dimensional protein structures are taken from the L1-encoded ORF1 protein [94] and the reverse transcriptase of human immunodeficiency virus type 1 [95].
Identification of SINE/LINE pairs [34].
| SINE | SINE ref. | Species | Promoter | LINE tail | LINE tail ref. | Description of SINE/LINE pair (ref.) |
|---|---|---|---|---|---|---|
| MIR (CORE-SINEs: | [ | All mammals | tRNA | L2 | [ | [ |
| CORE-SINEs (MIR3/Ther-2) | [ | Mammals | tRNA | L3 | [ | [ |
| CORE-SINEs (Mar-1/MAR1_MD) | [ | Marsupials | tRNA | RTE-3_MD | [ | [ |
| MAR4 | [ | Opossum and wallaby, | (5′-end of RTE) | RTE-2 (MD, ME) | [ | [ |
| RTESINE1 | [ | Opossum, | (5′-end of RTE) | RTE-1_MD | [ | [ |
| Ped-1 | [ | Springhare, | 5S rRNA | BovB_Pca | [ | [ |
| Ped-2 | [ | Springhare, | tRNA (ID SINE) | BovB_Pca | [ | [ |
| Bov-tA | [ | Ruminants | tRNAGlu | Bov-B | [ | [ |
| Bov-A2 | [ | Ruminants | (5′-end of BovB) | Bov-B | [ | [ |
| SINE2-1_EC | [ | Horse, | tRNA | RTE-1_EC | [ | [ |
| Afro SINEs | [ | All Afrotherians | tRNA | RTE1 (LA, Pca) | [ | [ |
| RTE1-N1_LA | [ | Elephant, | (5′-end of RTE) | RTE1_LA | [ | [ |
| SINE2-1_Pca | [ | Hyrax, | tRNA | RTE1_Pca | [ | [ |
| TguSINE1 | [ | Zebra finch, | tRNAIle | CR1-X | [ | [ |
| Tortoise Pol III/SINE | [ | Tortoises and turtles, | tRNALys | PsCR1 | [ | [ |
| Sauria SINE | [ | Lizard, | tRNA | Anolis Bov-B | [ | [ |
| Anolis SINE 2 | [ | Lizard, | (Box A & B) | Anolis LINE 2 | [ | [ |
| SINE2-1B_Acar/ | [ | Lizard, | tRNA | Vingi-2_Acar | [ | [ |
| V-SINEs (SINE2-1_XT) | [ | Frog, | tRNA | L2-4_XT | [ | [ |
| CORE-SINEs (MIR_Xt) | [ | Frog, | tRNA | L2-5_XT | [ | [ |
| Sma I | [ | Chum and pink salmon, | tRNALys | SalL2 | [ | [ |
| Fok I | [ | Charr, | tRNALys | SalL2 | [ | [ |
| SlmI | [ | All salmonids, | tRNALeu | RSg-1 | [ | [ |
| CORE-SINEs (Hpa I) | [ | All salmonids, | tRNA | RSg-1 | [ | [ |
| CORE-SINEs | [ | Cichlid fish, | tRNA | CiLINE2 | [ | [ |
| CORE-SINEs (UnaSINE1, UnaSINE2) | [ | Eel, | tRNA | UnaL2 | [ | [ |
| HAmo SINE | [ | Carp, | tRNA | HAmoL2 | [ | [ |
| DeuSINEs | [ | Mammals, chicken, | 5S rRNA | CR1-4_DR (CR1-7, CR1-9, CR1-13) | [ | [ |
| DeuSINEs | [ | Coelacanth and dogfish shark, | tRNA | CR1-4_DR-like | [ | [ |
| DeuSINEs (OS-SINE1) | [ | Salmon and trout, | 5S rRNA | RSg-1 | [ | [ |
| V-SINEs (HE1) | [ | Sharks and rays, | tRNA | HER1 | [ | [ |
| V-SINEs (DANA) | [ | Zebrafish, | tRNA | CR1-3DR/ZfL3 | [ | [ |
| V-SINEs (Lun1) | [ | Lungfish, | tRNA | LfR1 | [ | [ |
| SINEX-1_CM/SINE2-1_CM | [ | Elephant shark, | tRNA | CR1-2_CM | DQ524334 | [ |
| DeuSINEs (BflSINE1) | [ | Amphioxus, | tRNA | Crack-16_BF | [ | [ |
| SURF1/SINE2-4c_SP | [ | Sea urchin, | tRNA | CR1-4_SP | [ | [ |
| DeuSINEs (SINE2-3_SP) | [ | Sea urchin, | tRNA | CR1Y_SP (CR1X_SP) | [ | [ |
| SINE2-8_SP | [ | Sea urchin, | tRNA | L2-1_SP/CR1-3_SP | [ | [ |
| Gecko | [ | Mosquito, | tRNA | I-74_AAe (MosquI, I-58, I-59, I-62, I-64, | [ | [ |
| Nve-Nin-DC-SINE-1 | [ | Sea anemone, | tRNA | L2-22_NV | [ | [ |
| Nve-Nin-DC-SINE-2 | [ | Sea anemone, | tRNA | CR1-5_NV | [ | [ |
| Nve-Nin-DC-SINE-3 | [ | Sea anemone, | tRNA | CR1-15_NV | [ | [ |
| SINE2-1_NV | [ | Sea anemone, | tRNA | CR1-16_NV | [ | [ |
| SINE2-5_NV | [ | Sea anemone, | tRNA | Rex1-24_NV | [ | [ |
| Mg-SINE | [ | Rice blast fungus, | tRNA | MgL/MGR583 | AF018033 | [ |
| SINE2-1_BG | [ | Powdery mildew fungus, | tRNA | Tad1-24_BG (HaTad1-3, 1-5) | [ | [ |
| EdSINE1 (SINE-like) | [ | Amoeba, | Unknown | R4-1_ED | [ | [ |
| R4-N1_ED (SINE-like) | [ | Amoeba, | Unknown | R4-1_ED | [ | [ |
| EhLSINE1/ehapt2 (SINE-like) | [ | Amoeba, | Unknown | EhLINE1/EhRLE1 | [ | [ |
| EhLSINE2 (SINE-like) | [ | Amoeba, | Unknown | EhLINE2/EhRLE3 | [ | [ |
| TS | [ | Tobacco, | tRNA | RTE-1_Stu | [ | [ |
| ZmSINE2/SINE2_SBi | [ | Maize, | tRNA | LINE1-1_ZM | [ | [ |
| ZmSINE3 | [ | Maize, | tRNA | LINE1-1_ZM | [ | [ |
| SINEX-1_CR | [ | | Unknown | RandI-2/ | [ | [ |
| SINEX-2_CR | [ | | Unknown | RandI-2 (RandI-3) | [ | [ |
| SINEX-3_CR | [ | | tRNA | L1-1_CR | [ | [ |
| SINEX-4_CR | [ | | Unknown | RandI-2 (RandI-3) | [ | [ |
| SINEX-5_CR/SINEX-6_CR | [ | | tRNA | RandI-5 | [ | [ |
(*1) Subfamilies.
Figure 3. Sequence comparison of tobacco TS SINE with its partner LINE. The entire sequence of the TS SINE was aligned with the 3′-end sequence (~200 nucleotides) of a potato RTE-clade LINE. Dots and hyphens represent identical nucleotides and gaps, respectively. The tRNA-related region of the SINE is underlined, with the promoter sequences for RNA pol III (A & B boxes) highlighted in red. Nucleotide positions are shown on the right.
Figure 4. Relationship between the number of SINE/LINE pairs and the number of LINEs in each clade. The vertical axis shows the number of SINEs with a LINE tail [34]. The horizontal axis shows the number of LINEs belonging to each clade. The linear regression line, determined by the least squares approach, is shown, except for L1. R² indicates the coefficient of determination. CR1-clade LINEs (580 families) and L2-clade LINEs (10 families) were summed due to their confusing nomenclature.
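The fit described in the Figure 4 caption can be outlined with an ordinary least-squares calculation. The sketch below is illustrative only: the clade names and counts are hypothetical placeholders, not the values plotted in the figure, and `fit_line` is a helper name introduced here. It fits a line to (LINE families, SINE/LINE pairs) points, leaves L1 out of the fit as the caption does, and reports the coefficient of determination.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# HYPOTHETICAL data: clade -> (number of LINE families, number of SINE/LINE pairs).
counts = {
    "clade_A": (10, 1),
    "clade_B": (40, 3),
    "clade_C": (120, 8),
    "clade_D": (300, 21),
    "L1": (500, 2),
}

# Exclude L1 from the regression, mirroring the caption.
fitted = {clade: xy for clade, xy in counts.items() if clade != "L1"}
xs = [x for x, _ in fitted.values()]
ys = [y for _, y in fitted.values()]

slope, intercept, r_squared = fit_line(xs, ys)
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={r_squared:.4f}")
```

Dropping a clade before fitting, rather than down-weighting it, keeps the calculation transparent: the excluded point can still be plotted against the fitted line to show how far it departs from the trend.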
Figure 5. Molecular phylogeny and pattern of nucleotide substitutions of the S5a-derived region of PIPSL from all hominoid lineages [61]. The branches are drawn in proportion to the number of substitutions, with nonsynonymous (n) and synonymous (s) substitutions shown above each branch (n : s). An ancestral PIPSL lineage (indicated by bold lines) gradually accumulated 19 nonsynonymous and 2 synonymous substitutions. Since the split from the ancestral lineage, all the respective lineages have accumulated synonymous substitutions, except for gibbons, which still have a high n : s ratio (17 : 2).
Figure 6. The number of LINE families belonging to each LINE clade according to biological taxa [34]. LINE clades in which the partner LINE of a SINE was identified are shown. The remaining clades are grouped as “Others” (Repbase 16.10); land plants: mostly flowering plants.
3′-Repeats of plant SINE families [34].
| SINE | Species | 3′-Repeat | LINE tail | Reference for SINEs |
|---|---|---|---|---|
| SINEX-1_CR | | (ATT) | RandI-2/DualenCr3 | [ |
| SINEX-2_CR | | (CTTT) | RandI-2 (RandI-3) | [ |
| SINEX-3_CR | | (A) | L1-1_CR | [ |
| SINEX-4_CR | | (ATT) | RandI-2 (RandI-3) | [ |
| SINEX-5_CR/SINEX-6_CR | | (ATT) | RandI-5 | [ |
| Au | Angiosperms and a gymnosperm | (T)2–5 | Nd | [ |
| ZmSINE1 (Au-like) | | (T) | Nd | [ |
| SINE2-1_ZM (Au-like) | | (T)3 | Nd | [ |
| SINE-5_Mad (Au-like) | | (T)3 | Nd | [ |
| p-SINE1 | | (T) | Nd | [ |
| p-SINE2 | | (T) | Nd | [ |
| p-SINE3 | | (T) | Nd | [ |
| ZmSINE2.1*/SINE2-1a_SBi | | (T) | LINE1-1_ZM | [ |
| ZmSINE2.2* | | (T) | LINE1-1_ZM | [ |
| ZmSINE2.3* | | (T) | LINE1-1_ZM | [ |
| SINE2-1_SBi (ZmSINE2-like) | | (T) | LINE1-1_ZM | [ |
| SINE2-1c_SBi (ZmSINE2-like) | | (T) | LINE1-1_ZM | [ |
| ZmSINE3 | | (A) | LINE1-1_ZM | [ |
| OsSN1/F524 | | (A) | Nd | [ |
| OsSN2/SINE2-12_SBi | | (A) | Nd | [ |
| OsSN3 | | (A) | Nd | [ |
| SINE9_OS/SINE2-11_SBi (OsSN-like) | | (A) | Nd | [ |
| TS | | (TTG) | RTE-1_STu | [ |
| SB1-15 (S1/AtSN/RAthE/BoS) | | (A) | Nd | [ |
| LJ_SINE-1 | | (A) | Nd | [ |
| LJ_SINE-2 | | (A) | Nd | [ |
| LJ_SINE-3 | | (A) | Nd | [ |
| MT_SINE-1 | | (A) | Nd | [ |
| MT_SINE-2 | | (A) | Nd | [ |
| MT_SINE-3 | | (A) | Nd | [ |
| SINE-1_Mad | | (A) | Nd | [ |
| SINE-2_Mad | | (A) | Nd | [ |
| SINE-4_Mad | | (A) | Nd | [ |
| SINE2-1_PTr | | (A) | Nd | [ |
| SINE2-2_PTr | | (A) | Nd | [ |
*subfamilies. Nd: no data.
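The 3′-repeat column can be read as the repeat unit found at the extreme 3′ end of each element. As a rough illustration, assuming plain uppercase DNA strings and an exact, uninterrupted tandem repeat, the following sketch counts how many copies of a candidate unit end a sequence. The function names and the example sequences are hypothetical and are not taken from the source.

```python
def terminal_repeat_count(seq, unit):
    """Count exact tandem copies of `unit` at the 3' end of `seq`."""
    count = 0
    end = len(seq)
    # Walk backwards from the 3' end one unit at a time.
    while end >= len(unit) and seq[end - len(unit):end] == unit:
        count += 1
        end -= len(unit)
    return count


def best_terminal_unit(seq, units):
    """Return (unit, copies) for the candidate covering the most 3'-terminal bases."""
    scored = [(u, terminal_repeat_count(seq, u)) for u in units]
    return max(scored, key=lambda item: item[1] * len(item[0]))


# HYPOTHETICAL sequences, made up for illustration only.
examples = {
    "polyA_like": "GGCTAGCTAAAAAA",
    "TTG_like": "GGCCTTGTTGTTG",
}
candidate_units = ["A", "T", "TTG", "ATT", "CTTT"]

for name, seq in examples.items():
    unit, copies = best_terminal_unit(seq, candidate_units)
    print(name, unit, copies)
```

Real 3′ tails often end in a partial unit or contain mismatches, so a practical classifier would need some tolerance; this version deliberately ignores both cases to keep the idea visible.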
3′-Repeats of plant LINE families [34].
| Species | LINE clade | Families | 3′-Repeat: (A) | 3′-Repeat: other repeats | 3′-Repeat: none |
|---|---|---|---|---|---|
| Flowering plants | L1 | 233 | 224 | 0 | 9 |
| Flowering plants | RTE | 7 | 0 | 7∗b | 0 |
| Green algae | L1 | 15 | 2∗a | 8∗c | 5 |
| Green algae | RandI | 8 | 0 | 8∗d | 0 |
| Green algae | RTEX | 6 | 0 | 6∗e | 0 |
∗a L1-1_CR (Chlamydomonas), Zepp (Chlorella).
∗b (TTG)n, (TTGATG)n.
∗c (CATA)n, (CA)n, (CAA)n, (TAA)n.
∗d (ATT)n, (CTATTT)n.
∗e (CA)n, (CAA)n, (CCAT)n, (ACAATG)n, (CTTGTAA)n.
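The contrast in the table above is easier to see as proportions. The short sketch below recomputes, for each clade, the share of LINE families ending in (A) repeats, in other repeats, or in no repeat. The counts are copied from the table; assigning the 15-family L1 row to green algae is an assumption based on footnote ∗a, which names Chlamydomonas and Chlorella elements. The variable and function names are introduced here for illustration.

```python
# (species group, LINE clade, families, poly(A), other repeats, no repeat)
# Counts are taken from the table; the group of the 15-family L1 row is assumed.
ROWS = [
    ("Flowering plants", "L1", 233, 224, 0, 9),
    ("Flowering plants", "RTE", 7, 0, 7, 0),
    ("Green algae", "L1", 15, 2, 8, 5),
    ("Green algae", "RandI", 8, 0, 8, 0),
    ("Green algae", "RTEX", 6, 0, 6, 0),
]


def repeat_fractions(row):
    """Return the share of families in each 3'-repeat category for one row."""
    _, _, families, poly_a, other, none = row
    return {
        "poly_a": poly_a / families,
        "other": other / families,
        "none": none / families,
    }


for row in ROWS:
    shares = repeat_fractions(row)
    summary = "  ".join(f"{key}={value:.3f}" for key, value in shares.items())
    print(f"{row[0]:<17}{row[1]:<6}{summary}")
```

On these counts, roughly 96% of flowering-plant L1 families carry (A) repeats (224 of 233), against about 13% of the green-algal L1 families (2 of 15).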
Figure 7. Sequence comparisons of the 3′-end sequences of L1-clade LINEs and monocot SINE families. The 3′-end sequences of the monocot 1a (consensus), LINE1-1_ZM, and SINE2 (consensus) were aligned [34]. Vertical lines and hyphens represent identical nucleotides and gaps, respectively. A conserved region between the LINEs and SINEs is boxed. R: A/G, Y: C/T, S: C/G, W: A/T, N: any nucleotide.
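The ambiguity symbols listed in the Figure 7 caption (R, Y, S, W, N) can be applied mechanically when checking whether a sequence is compatible with a consensus. The following is a minimal sketch under simplifying assumptions: ungapped, equal-length, uppercase DNA strings, and only the symbols named in the caption plus A, C, G, and T. The consensus and test sequences are hypothetical and are not the aligned sequences from the figure.

```python
# Ambiguity codes from the caption, plus the four unambiguous bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "S": "CG",
    "W": "AT",
    "N": "ACGT",  # any nucleotide
}


def matches_consensus(consensus, sequence):
    """True if every base in `sequence` is allowed by the code at that position."""
    if len(consensus) != len(sequence):
        return False
    return all(base in IUPAC[code] for code, base in zip(consensus, sequence))


def mismatch_positions(consensus, sequence):
    """0-based positions where `sequence` is not allowed by `consensus`."""
    return [
        index
        for index, (code, base) in enumerate(zip(consensus, sequence))
        if base not in IUPAC[code]
    ]


# HYPOTHETICAL consensus and sequences, for illustration only.
consensus = "RYSWN"
print(matches_consensus(consensus, "ACGTA"))
print(mismatch_positions(consensus, "CCGTA"))
```

Handling alignment gaps or the remaining IUPAC symbols would only require extending the lookup table.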
Figure 8. Secondary structure models for the 3′-end sequences of L1s and monocot SINEs. The putative transcripts are predicted to form hairpin structures. Compensatory mutations (1–5) are shown by red rectangles. Conserved nucleotides are indicated by blue circles. The minimum free energy levels were −10.8 or −12.6 kcal/mol for L1s (monocot 1a and LINE1-1, respectively) and (−12.5)–(−13.7) kcal/mol for SINEs (ZmSINE2.3: −15.4 and SINE2-1c: −17.7). The structures were deduced using mfold [96].
Figure 9. Proposed model for the 3′-end recognition of L1-clade LINEs. The ancestral L1-clade LINE in the ancestral green plant possessed a stringent, nonmammalian-type RNA recognition property. During the course of plant evolution, an L1 lineage lost the ability to recognize the RNA template specifically for reverse transcription, thereby introducing relaxed 3′-end recognition in land plants. ME1–3: plant L1 lineages; M, F: vertebrate L1 lineages.