
Transposable elements in Drosophila.

Abstract

Transposable elements (TEs) are mobile genetic elements that can mobilize within host genomes. As TEs comprise more than 40% of the human genome and are linked to numerous diseases, understanding their mechanisms of mobilization and regulation is important. Drosophila melanogaster is an ideal model organism for the study of eukaryotic TEs as its genome contains a diverse array of active TEs. TEs universally impact host genome size via transposition and deletion events, but may also adopt unique functional roles in host organisms. There are 2 main classes of TEs: DNA transposons and retrotransposons. These classes are further divided into subgroups of TEs with unique structural and functional characteristics, demonstrating the significant variability among these elements. Despite this variability, D. melanogaster and other eukaryotic organisms utilize conserved mechanisms to regulate TEs. This review focuses on the transposition mechanisms and regulatory pathways of TEs, and their functional roles in D. melanogaster.


Keywords: LTR retrotransposons; P elements; TEs; TIR transposons; helitrons; non-LTR retrotransposons; retrovirus; transposons

Year: 2017 PMID: 28580197 PMCID: PMC5443660 DOI: 10.1080/2159256X.2017.1318201

Source DB: PubMed Journal: Mob Genet Elements ISSN: 2159-2543

Introduction

Transposable elements (TEs) exist in the genomes of organisms across all 3 domains of life. Also referred to as “jumping genes,” TEs move, or transpose, to different locations throughout the genomes in which they reside. As mobile genetic elements, TEs are both drivers of evolution and potentially harmful mutagens that may insert within gene-encoding sequences. Interestingly, the C-value paradox, or the lack of correlation between genome size and organism complexity, may be addressed by the presence of TEs, as genome size appears to correlate with TE abundance. One study found significant alteration in genome size among several species of Drosophila, which correlated with variation in the amount of repeating sequences, such as TEs. Differences in genome size as a result of TE content may have significant functional effects on Drosophila and other eukaryotic organisms. Data indicate that genome size correlates with body size, sperm length and duration of development in Drosophila species. This suggests that TEs may indirectly impose highly variable effects on their hosts through genome expansion or contraction, potentially influencing traits of evolutionary significance. TEs comprise about 45% of the human genome and at least 50% of the maize genome. Transposition of these elements has been linked to over 75 human diseases including hemophilia A, breast cancer, colorectal cancer, amyotrophic lateral sclerosis, and frontotemporal lobar degeneration. Furthermore, TEs also potentially contribute to neurologic development as well as neurologic diseases and disorders. Because of their prevalence and disease-causing potential, it is important to understand how TEs transpose and how their mobilization is regulated in eukaryotic organisms. Most TEs in the human genome, however, are completely inactive, indicating the need for a model organism in which to study these elements. 
The Drosophila melanogaster genome is one of the best studied eukaryotic genomes and while only about 20% of the genome consists of TEs, at least 30% of these elements are full length and believed to be active. As such, D. melanogaster is a promising model organism for the study of eukaryotic TEs. Since the discovery of TEs in maize by Barbara McClintock in the 1940s, it was proposed that these elements be classified into 2 major groups (Fig. 1): DNA transposons (class II elements) and retrotransposons (class I elements). Within these groups are numerous families of TEs, defined primarily by sequence similarity, and still many unclassified TEs. Some TEs are separated into unique subclasses due to structural elements or transposition mechanisms that are uncharacteristic of other TEs. TEs may also be classified as autonomous or non-autonomous, depending on whether they transpose independently or require the machinery of autonomous TEs for mobilization. Generally, non-autonomous DNA transposons are regarded as inactive, though this is not always the case, while non-autonomous retrotransposons often utilize the machinery of autonomous retrotransposons for mobilization.

Figure 1.

Classes of transposons. Shows classes, subclasses and groups of TEs described in this review. (A) Classes, subclasses and groups of class I (RNA) transposons are shown in light blue. LTR is Long Terminal Repeat, SINE is Short Interspersed Nuclear Elements, LINE is Long Interspersed Nuclear Elements. (B) Classes, subclasses and families of class II (DNA) transposons are shown in light blue. Only the Tc1/mariner and P families are shown for simplicity. TIR is Terminal Inverted Repeats.

DNA transposons, or terminal inverted repeat (TIR) transposons, consist of a transposase gene flanked by TIRs, and move via a cut-and-paste mechanism. TIRs are repeating sequences found at both ends of these elements, and are inverted with respect to each other. The transposase is responsible for excising the transposon and inserting it into a new location. No active DNA transposons have been identified in humans due to lack of functional transposases, but at least 16% of the DNA transposons in D. melanogaster are full length and potentially active, including 1360, hobo, Bari1, pogo, and P elements. Helitrons, a subclass of DNA transposons also present in D. melanogaster and other eukaryotic genomes, mobilize by a different mechanism than TIR transposons, using rolling-circle replication with a single-stranded DNA intermediate. The regulation of DNA transposons in somatic cells is poorly understood, though some regulatory mechanisms have been identified for P elements in D. melanogaster. Furthermore, regulatory mechanisms have been identified in Drosophila germline cells to prevent mobilization of all transposable elements, as harmful transposition events in these cell lines are likely to negatively impact the viability of progeny.
Retrotransposons, or RNA transposons, are classified as either long-terminal repeat (LTR) retrotransposons or non-LTR retrotransposons, depending on the presence or absence of LTRs flanking genes required for element mobilization. The only active TEs identified in humans are non-LTR retrotransposons: autonomous LINE-1 (long interspersed nuclear element-1) elements, which comprise 17% of the human genome, and non-autonomous Alu and SVA elements (Fig. 1). In Drosophila, however, at least 21% of non-LTR retrotransposons and 45% of LTR retrotransposons are full-length and potentially active, such as the LINE-like elements TART, jockey and Juan and the LTR retrotransposons roo, copia, blood, gypsy, and mdg1. Retrotransposons use a copy-and-paste mechanism by first generating an RNA intermediate that is then reverse transcribed by an element-encoded reverse transcriptase (RT) into a new DNA copy that is inserted elsewhere in the genome. Retrotransposons are regulated in Drosophila somatic cells by heterochromatin formation that is mediated by endogenous small interfering RNAs (esiRNAs) generated from retrotransposon-derived double-stranded (ds)RNA precursors by Dicer2 (Dcr2). Due to the relative inactivity of TEs in the human genome, this regulatory pathway has not been fully elucidated in human cell lines. This review will explore the mechanisms of transposition and regulation of these TEs in D. melanogaster.

Germline regulation of TEs

In D. melanogaster, movement of all TEs is tightly regulated in germline cells, where uncontrolled transposition events may impose significant genomic defects that would be inherited by successive generations. The primary germline regulatory pathway is mediated by PIWI (P-element induced wimpy testis), Aubergine (AUB), Argonaute 3 (AGO3) and small PIWI-interacting RNAs (piRNAs) that regulate TEs via RNAi and epigenetic mechanisms, including heterochromatin formation. The piRNAs derived from TEs, formerly referred to as repeat-associated small interfering RNAs (rasiRNAs), are generated by processing of sense and antisense transcripts from TEs into small RNAs by PIWI and AUB. Both the proteins and piRNAs required for this pathway are almost exclusively produced in germ cells. The generation of piRNAs is distinct from that of esiRNAs, as this pathway is independent of Dcr2 and relies solely on PIWI proteins not involved in esiRNA biogenesis. Most piRNAs are generated from transposon transcripts and form PIWI-piRNA complexes that function to silence transposon transcripts via RNAi. For a review of piRNA-mediated TE silencing in Drosophila, see ref. 31.

DNA transposons

TIR transposons

DNA transposons are often less than 5 kb in length and typically encode a single transposase gene (Fig. 2A). DNA transposons are divided into 2 sub-classes based on their transposition mechanisms. Sub-class I elements utilize the canonical cut-and-paste mechanism of TIR transposon transposition, and are divided into several superfamilies: Tc1/mariner, PIF/Harbinger, hAT, Mutator, Merlin, Transib, P, piggyBac, and CACTA. Sub-class II DNA transposons include Helitron and Maverick elements that utilize unique transposition mechanisms (see Helitrons section). DNA transposons are generally regarded as “extinct” in humans and other mammals as most are non-autonomous. D. melanogaster, however, has numerous active DNA transposons with full-length TIRs and functional transposase genes. For example, the Bari elements of the Tc1/mariner superfamily of DNA transposons have been identified in D. melanogaster and many other Drosophila genomes. D. melanogaster Bari1 elements are autonomous DNA transposons with short TIRs, usually less than 40 nucleotides in length, although non-autonomous Bari1 elements with long TIRs have been identified in other Drosophila species. Transposition of Bari1 elements and other TEs in the Tc1/mariner superfamily is initiated by interactions between one or more direct repeats in the TIRs and the element-encoded transposase (Fig. 2B). Dimerization of TIR-bound transposases induces cleavage of the element from surrounding sequences (Fig. 2B). Like Tc1 elements, Bari1 elements then target TA sites and integration results in the duplication of these nucleotides at both ends (Fig. 2B). Target site duplications are characteristic of TIR transposon insertions and may be used to identify transposition events and distinguish between different families of TIR transposons. This transposition mechanism is used by most DNA transposons, including 1360, hobo, pogo, and P elements in D. melanogaster. However, the regulation of these transposition events in somatic cells is still poorly understood.
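The two structural signatures described above, terminal inverted repeats and the duplicated TA target site, can be illustrated with a short Python sketch. The sequences and function names below are hypothetical toy examples, not real Bari1 or Tc1 sequences, and the model ignores the transposase itself; it only shows what the DNA looks like before and after integration.

```python
# Toy model of TIR structure and TA target site duplication.
# All sequences are invented for illustration.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def has_tirs(element, tir_len):
    """True if the last tir_len bases are the reverse complement of the first tir_len."""
    return element[-tir_len:] == reverse_complement(element[:tir_len])


def insert_at_ta(genome, element, pos):
    """Insert element at the TA dinucleotide starting at index pos.

    The TA is kept on the left of the element and repeated on its right,
    which mimics the target site duplication seen after integration.
    """
    if genome[pos:pos + 2] != "TA":
        raise ValueError("target is not a TA dinucleotide")
    return genome[:pos + 2] + element + genome[pos:]


toy_element = "CAGTG" + "AAAA" + "CACTG"   # 5 bp TIRs around a 4 bp "body"
toy_genome = "GGTACC"
after_insertion = insert_at_ta(toy_genome, toy_element, 2)
print(has_tirs(toy_element, 5), after_insertion)
```

In this sketch the inserted element ends up flanked by TA on both sides, which is the pattern one would look for when scanning a genome for candidate Tc1/mariner-type insertions.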

Figure 2.

TIR transposase and transposition mechanism. (A) TIR transposases have an N-terminal DNA binding domain with HTH motifs and a C-terminal DDE or DDD catalytic domain. (B) For transposition, TIR transposases (purple circles) first bind to inverted repeats (red triangles, IR) flanking the element. Bound transposases then dimerize followed by cleavage of the element from surrounding sequences (black lines) and integration into a new target site (AT) resulting in target site duplication.

The Tc1 and Bari1 transposase proteins consist of 2 domains: an N-terminal DNA binding domain containing helix-turn-helix motifs and a highly conserved nuclear localization signal, and a C-terminal catalytic domain with a DDE motif (Fig. 2A). The catalytic DDE motif, or DDD motif in some families of transposases, is required for the transposition of DNA transposons in sub-class I. These conserved motifs also allow the import of the transposases into the nucleus to bind TIRs, forming a complex that promotes cleavage of the entire double-stranded element.
Non-autonomous DNA transposons, such as miniature inverted repeat transposable elements (MITEs), can also be mobilized in eukaryotic genomes. MITEs are short TIR transposons, generally less than 500 bp in length, that do not encode a functional transposase and often lack coding regions entirely. Numerous Bari-like MITEs have been detected in 9 Drosophila species, and are suspected to have originated from internal deletions in Bari elements. Many of these Bari-like MITEs in Drosophila have been amplified in their respective genomes. Furthermore, MITEs derived from mariner elements have been identified in 20 Drosophila species, and make up 23% of all mariner elements, supporting the hypothesis that these elements may be mobilized in trans by other autonomous TEs in the genome. Other MITEs may also arise from internal deletions in full-length TIR transposons or shorter non-autonomous TIR transposons and may continue to be amplified in the genome by the machinery of other autonomous TEs. Currently, little is known about the exact mechanisms by which MITEs are mobilized in host genomes, although trans-mobilization is the best supported hypothesis.

P elements

P elements are the best-studied DNA transposons in the D. melanogaster genome. Full-length autonomous P elements are 2.9 kb in length with 31 bp TIRs and 4 exons that encode a transposase when spliced. A similar element in the human genome, THAP9, is a confirmed DNA transposon with the ability to mobilize P elements in both Drosophila and human cell lines. Like other TIR transposons, P elements utilize a cut-and-paste mechanism of transposition and create target site duplications upon insertion. P elements are unique, however, in their abilities to amplify themselves in Drosophila germline cells due to preferential insertion at regions of the genome that bind the origin recognition complex and function as replication origins. By transposing during S phase from replicated genomic regions to un-replicated regions, P elements are copied, amplifying their presence in the genome with the assistance of the host DNA repair machinery. These elements belong to the same class as pogo and hobo elements, and play a significant role in hybrid dysgenesis syndrome, a phenomenon observed in the progeny of hybrid crosses of certain Drosophila strains (Fig. 3A). Drosophila strains are defined as either P type or M type, depending on whether hybrid dysgenesis results from crosses with the paternal or maternal parent. The phenomena observed in this syndrome include high rates of mutation, recombination, and sterility in the F1 hybrids of only P type male crosses with M type females (Fig. 3A). Alternatively, P type female crosses with M type males do not result in hybrid dysgenesis. In P strains, autonomous P elements are abundant and tightly regulated in germline cells, a condition referred to as the P cytotype. In M strains, however, there are no autonomous P elements nor regulatory pathways in germline cells to prevent transposition of P elements (Fig. 3A). 
Because the cytoplasmic conditions of the P cytotype are exclusively transmitted maternally, crosses between M type females, which lack the P cytotype, and P type males, which cannot pass on their P cytotype, result in unregulated P element mobilization in the germlines of the hybrid progeny (Fig. 3A). In addition to P elements, hobo elements and I elements (a family of non-LTR retrotransposons) play a similar role in hybrid dysgenesis when males and females of certain Drosophila strains are crossed.
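The parent-of-origin logic of P-M hybrid dysgenesis described above can be written as a small truth table. The following Python sketch is only a restatement of that logic under the simplifying assumption that each strain is purely "P" or "M"; the function name is hypothetical.

```python
# Toy truth table for P-M hybrid dysgenesis.
# Assumption: strains are strictly "P" (autonomous P elements plus P cytotype)
# or "M" (neither); real strains can be intermediate.

def is_dysgenic_cross(mother, father):
    """Return True when the F1 progeny are expected to show hybrid dysgenesis.

    Dysgenesis requires P elements from the father and the absence of the
    maternally transmitted P cytotype, that is, an M mother and a P father.
    """
    for strain in (mother, father):
        if strain not in ("P", "M"):
            raise ValueError("strain must be 'P' or 'M'")
    return mother == "M" and father == "P"


for mother in ("P", "M"):
    for father in ("P", "M"):
        print(f"{mother} female x {father} male:",
              "dysgenic" if is_dysgenic_cross(mother, father) else "normal")
```

Only one of the four possible crosses is flagged, which reflects the asymmetry between the reciprocal P x M crosses.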

Figure 3.

P element splicing and hybrid dysgenesis. (A) Hybrid dysgenesis results when M strain females are crossed with P strain males. Because the P element repressor (pink circles) is only transmitted by P cytotype females, progeny of the P strain male-M strain female cross have many mutations caused by germline P element transposition. These mutations often result in sterility (red X). (B) Exons 1-4 of P element transcripts are spliced to form a functional 87 kDa transposase (black lines). When intron 3 is not properly spliced, a stop codon (red star) generates a 66 kDa truncated repressor of P element transposition (pink lines).

The regulation of P elements in D. melanogaster is better understood than that of other DNA transposons. P element transposition is regulated primarily by alternative splicing of the P element transposase mRNA (Fig. 3B). In germline cells, all 3 transposase introns are spliced out, producing a functional 87 kDa P element transposase. Alternatively, in somatic cells, splicing of the third intron is skipped, generating mRNA that encodes a non-functional protein due to an early stop codon in the third intron. The resulting 66 kDa truncated transposase is not only inactive, but represses the transposition of P elements in somatic cells. Some evidence suggests that this transposition repressor may also be generated in the germline during oogenesis, potentially contributing to the exclusively maternal transmission of the P cytotype in P strains of D. melanogaster, as this is one mechanism by which female germline cells may repress P element transposition. Other DNA transposons may also be regulated by a similar splicing mechanism, but this has not been confirmed and is not possible for some elements. The most abundant TIR transposon in D. melanogaster euchromatin, 1360 (also called Hoppel), is structurally and functionally similar to P elements, but lacks introns entirely, preventing regulation by any splicing mechanisms. However, unlike retrotransposons, DNA transposons likely do not utilize esiRNA mechanisms of repression in somatic cells as very few small RNAs are generated from these elements. Instead, DNA transposon regulation appears to occur by generation of a non-functional transposase, which may include alternative splicing mechanisms or result from mutations in the nuclear localization signal or the catalytic domain of the protein.
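The effect of retaining the third intron, as described above for P elements, can be illustrated with a toy translation model. The exon and intron strings below are invented, in-frame placeholders and not real P element sequence; the sketch only shows how an in-frame stop codon in a retained intron shortens the encoded protein.

```python
# Toy model of P element intron 3 retention.
# Sequences are hypothetical and chosen only so that the reading frame works.

STOP_CODONS = {"TAA", "TAG", "TGA"}

EXONS = ["ATGGCT", "GCTGCT", "GCTGCA", "GCTGCTTAA"]  # exon 4 ends in a stop
INTRON3 = "GTATAAGCA"                                # in-frame TAA inside intron 3


def build_mrna(retain_intron3):
    """Join exons 1-4, optionally keeping intron 3 between exons 3 and 4."""
    middle = INTRON3 if retain_intron3 else ""
    return "".join(EXONS[:3]) + middle + EXONS[3]


def coding_codons(mrna):
    """Count codons translated from position 0 until the first stop codon."""
    for i in range(0, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i // 3
    return len(mrna) // 3


germline = coding_codons(build_mrna(retain_intron3=False))  # fully spliced
somatic = coding_codons(build_mrna(retain_intron3=True))    # intron 3 retained
print("germline codons:", germline, "somatic codons:", somatic)
```

In this toy case the intron-retaining transcript is longer but encodes a shorter product, which parallels the 87 kDa transposase versus the 66 kDa repressor.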

Helitrons

Helitrons belong to a unique subclass of DNA transposons with a distinct mechanism of transposition. Unlike other DNA transposons, Helitrons lack TIRs and encode a DNA helicase and replication initiator (Rep) protein with nuclease and ligase functions, resembling the machinery of rolling-circle replicons (Fig. 4A). A subclass of Helitrons, called Helentrons, encode an additional apurinic-apyrimidinic endonuclease and may also mobilize non-autonomous Helentron-associated interspersed elements (HINEs). Helitrons are abundant in plant genomes and have been identified in many other eukaryotic genomes, including D. melanogaster, in which 1% of the genome consists of non-autonomous Helitrons. Drosophila interspersed nuclear element-1 (DINE-1), the most abundant TE in the D. melanogaster genome, is a non-autonomous Helentron, distinct from HINEs due to the presence of unique structural features such as inverted repeats.

Figure 4.

Helitron enzymes and transposition mechanism. (A) Helitron transposons encode a protein with both DNA helicase and Replicator functions. (B) The Helitron is represented with purple and pink lines. The Replicator domain (pink circle) first binds to both donor (TC) and target (AT) creating nicks in both. The DNA helicase domain (purple circle) then displaces the donor strand. The Replicator domain cleaves the 3′ end of the element, promoting formation of a circular single-stranded DNA intermediate. Rep cleaves the circular single-stranded intermediate and promotes covalent bond formation between the 5′ and 3′ ends of the donor strand and target site. The Replicator nicks the other end of the donor and facilitates attachment to the target site. Host DNA replication generates the second strand of the element at both the donor and target sites.

Helitrons utilize a rolling-circle replication mechanism of transposition, which has recently been validated by experiments conducted with the Helraiser Helitron in bats. This model suggests that tyrosine residues of the Rep protein simultaneously nick the 5′ end of one Helitron strand at a conserved TC sequence and the AT sequence on the target site. The Helitron donor strand is displaced by the encoded helicase (Fig. 4B). Rep facilitates cleavage of the donor strand at a conserved hairpin signal in the 3′ end, which then attacks the 5′ end of the element, generating a circular, single-stranded DNA (ssDNA) intermediate (Fig. 4B). To complete transposition of the element, Rep cleaves the circular ssDNA intermediate to promote covalent bond formation between the 5′ and 3′ ends of the Helitron donor strand and the nicked target site. Helitrons preferentially insert at AT target sites, while Helentrons preferentially use TT target sites, and neither element creates target site duplications upon insertion. Host DNA replication is responsible for generating the second strand at both the donor and target sites, permitting amplification of these elements.
Helitrons and Helentrons may also capture host genes during transposition, which complicates their classification due to poor sequence similarity. This may occur when the 3′ end hairpin signal in the Helitron is bypassed, and strand displacement continues through nearby gene regions until a new termination signal is reached. These events are prominent in maize and have contributed to the evolution of the maize genome, but are not well characterized in D. melanogaster.
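The end result of Helitron insertion described in this section, integration between the A and T of an AT dinucleotide without any target site duplication, can be shown with a toy Python sketch. The sequences are invented (the toy element merely begins with TC, as the text describes for the 5′ end), the function name is hypothetical, and the rolling-circle intermediate itself is not modeled.

```python
# Toy model of the outcome of a Helitron insertion at an AT target site.
# Only the final DNA arrangement is shown; no replication steps are simulated.

def insert_helitron(genome, helitron, pos):
    """Insert helitron between the A and T of an AT dinucleotide.

    pos is the index of the T, so genome[pos - 1:pos + 1] must equal "AT".
    No bases of the target site are duplicated.
    """
    if genome[pos - 1:pos + 1] != "AT":
        raise ValueError("target is not an AT dinucleotide")
    return genome[:pos] + helitron + genome[pos:]


toy_genome = "GGATCC"
toy_helitron = "TCGGGGCTAG"   # hypothetical element starting with TC
result = insert_helitron(toy_genome, toy_helitron, 3)
print(result)
```

The host sequence grows by exactly the length of the element, which is the practical difference from TIR transposon and LTR retrotransposon insertions, where a few host bases are repeated on either side.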

Retrotransposons

Retrotransposons, or RNA transposons, comprise more than 30% of the human genome and are the most abundant class of TEs in the D. melanogaster genome. Retrotransposons include LTR retrotransposons, non-LTR retrotransposons (LINEs and LINE-like elements), short interspersed nuclear elements (SINEs), and other similar TEs. Both LTR and non-LTR retrotransposons use similar mechanisms of transposition and regulation. Retroviruses may also be classified as retrotransposons as they mobilize via similar mechanisms, but are additionally able to infect other cells and organisms by horizontal gene transfer. Retrotransposons are primarily characterized by the presence of gag and pol genes that may be overlapping and require frameshifting to be translated, but may also be encoded in a single fused ORF (Fig. 5A). Retrotransposon genes resemble those of retroviral genomes in both structure and function, and some retrotransposons contain a third gene encoding the retroviral envelope (env) protein necessary for mobilization of retroelements outside of their host cells (Fig. 5A). Many of these retrotransposons are classified as endogenous retroviruses, or errantiviruses in Drosophila and other insects, as they either arose from retroviruses that lost infectivity or LTR retrotransposons that acquired env genes from exogenous sources.

Figure 5.

Retrotransposons and the LTR retrotransposition mechanism. (A) Non-LTR retrotransposons encode both Gag and Pol, but are not flanked by LTRs. LTR retrotransposons contain gag and pol genes surrounded by LTRs. In addition to gag and pol genes, retroviruses encode an env gene. (B) Gag and pol of retrotransposon mRNAs are first translated into a polyprotein. The protease (PR) of Pol cleaves the peptide into integrase (IN) and reverse transcriptase (RT) enzymes. The RT, retrotransposon and IN are then packaged into virus-like particles (VLPs) for import into the nucleus where retrotransposon cDNA is integrated into the genome (red X). The mechanisms by which VLP contents are localized to the nucleus and retrotransposon cDNA is integrated into the target site are unknown (?).

The retrotransposon pol gene encodes a polyprotein, typically consisting of a protease, an integrase, and a reverse transcriptase (RT) with an RNase H domain and DNA polymerase activity (Fig. 5A). The RT is common to all autonomous retrotransposons, as it is required for reverse transcription of the RNA intermediate to generate DNA copies of these TEs. The protease is involved in processing of precursor proteins, such as the Pol polyprotein. The integrase is required for insertion of cDNA into the host genome. Gag is the primary component of virus-like nucleocapsid particles, formed by polymerization of Gag monomers, which provide a structural coat for components involved in the reverse transcription event of retrotransposon mobilization (Fig. 5B).

Retrotransposon regulation

In D. melanogaster, the regulation of retrotransposons in somatic cells is mediated by esiRNAs, which are generated by Dcr2 cleavage of long dsRNA precursors derived from convergent sense and antisense transcription of retrotransposons in the genome. Data support a model in which esiRNAs regulate retrotransposons in the nucleus via heterochromatin formation in D. melanogaster and other eukaryotic organisms, such as S. pombe. The mechanism by which this occurs has not been fully elucidated, but likely involves recruitment of heterochromatin-inducing factors by esiRNA complexes that may recognize their target RNA during active transcription of the TE. The use of small RNAs to induce heterochromatin formation is a common motif in transposon regulation, as both DNA transposons and retrotransposons are regulated in the Drosophila germline by the piRNA pathway. A similar siRNA-mediated pathway has been reported in humans to regulate LINE-1 retrotransposons via RNAi, but the dsRNA precursors are generated by a different mechanism than in D. melanogaster.

LTR retrotransposons

LTR retrotransposons are abundant in Drosophila melanogaster, as well as in humans. In D. melanogaster, there are 3 recognized groups of LTR retrotransposons (Gypsy, Copia, and BEL/Pao), consisting of 8 clades and at least 35 families. The Gypsy group is the largest, consisting of 27 families, separated into 5 subgroups: gypsy, ZAM, Idefix, 412, and blastopia. The Copia and BEL/Pao groups consist of just 4 and 5 families, respectively. The Gypsy and BEL/Pao groups may be distinguished from the Copia group by the arrangement of their pol ORFs. The protease and RT are followed by the integrase in Gypsy and BEL/Pao while the protease and integrase are followed by the RT in Copia. Mechanisms of transposition may vary slightly between these groups, but all contain LTRs, a feature also common to retroviruses. LTRs play a significant functional role in the mobilization of these elements. For both retrotransposons and retroviruses, LTRs interact directly with specific integrase domains for insertion into target regions of the genome. Additionally, LTRs are processed by the integrase before insertion. Joining of the LTR ends to the chromosomal DNA generates target site duplications much like those of DNA transposon insertions. Due to their structure, LTRs also permit recombination events in regions of the genome with high recombination rates and between similar elements in close proximity, often resulting in spontaneous mutations and the remainder of a single LTR (solo-LTR) at the recombination site. As a result, greater numbers of retrotransposon copies are detected in genomic regions with low recombination rates due to selection against these mutagenic events. Notably, several precise excisions of Gypsy elements from the D. melanogaster genome have been detected in strains known for mutations and spontaneous reversions, indicating a mechanism of retrotransposon excision other than recombination as no trace of the original element is left behind. 
One study concluded that these excisions are the result of the element's integrase directly removing the element from the genome and restoring the initial target site, potentially illuminating a new mechanism of retrotransposon mobilization. Because a large majority of LTR retrotransposons accumulate within the inaccessible heterochromatin of Drosophila chromosomes, their LTR sequences may be analyzed to approximate when these elements were inserted. Analyses of the D. melanogaster genome indicate that most heterochromatic copies of LTR retrotransposons integrated after the divergence of this species from D. simulans, approximately 5 million years ago. More recent analyses indicate that heterochromatic copies of these elements integrated in the D. melanogaster genome within the last 100,000 years, while non-LTR retrotransposons are estimated to have integrated much earlier, some even millions of years before the divergence of D. melanogaster from D. simulans. LTR retrotransposon insertions are not limited to heterochromatic regions, and may even occur within protein-coding regions of the genome. One study found that one-third of the LTR retrotransposons in D. melanogaster euchromatin are integrated either directly in a gene or within 1000 bp of a gene. Most of these gene-associated insertions have occurred relatively recently, and because they tend to occur in highly conserved genes essential to cell survival, are selected against over time. Interestingly, more of these LTR retrotransposon insertions are associated with genes involved in signal transduction, morphogenesis, behavior and responses to external stimuli than genes involved in cell differentiation and metabolism. In addition to the functional effects of gene-associated LTR retrotransposon insertion, gene expression can be affected if these elements are inserted near promoter elements. 
A recent study found that several solo-LTR elements of the roo family are inserted near the transcription start site (TSS) of a candidate cold resistance gene (CG18446) in several strains of D. melanogaster, contributing cis-regulatory elements to the promoter of this gene and affecting transcription factor binding sites. One of these inserted solo-LTRs, FBti0019985, generated a new TSS for CG18446. Strains carrying this insert demonstrated upregulated expression of the gene in embryos as well as increased viability under both cold-stress and non-stress conditions relative to control strains. These observations demonstrate the potential of retrotransposon insertions to not only cause detrimental mutations in the genome, but to also contribute adaptive functions to their hosts. Several LTR retrotransposons contain a third gene downstream of gag and pol, the env gene of retroviral genomes, potentially permitting horizontal transmission to other cells and organisms. The env gene is often non-functional in LTR retrotransposons, although this is not always the case. Because these elements strongly resemble proviruses (retroviruses that have integrated in host genomic DNA) they can be difficult to classify. For example, the gypsy retrotransposon of D. melanogaster was the first identified endogenous retrovirus in invertebrates, as it can be horizontally transmitted. However, despite characterization of this element as an errantivirus, phylogenetic analyses based on RT sequences still group gypsy with other LTR retrotransposons. Several other LTR retrotransposons in Drosophila are similarly characterized due to the presence of the env gene, including 297, 17.6, tom, Idefix, ZAM, and tirant. One study concluded that the Gypsy group of errantiviruses in Drosophila obtained their env genes from insect baculoviruses, DNA viruses that exclusively infect insects and arthropods. 
This conclusion is supported by the observation that TED, a member of the gypsy family of LTR retrotransposons, can integrate into the genome of the baculovirus Autographa californica, permitting capture of baculoviral elements. Events like these may be responsible for the unusually high occurrence of horizontal transfer of TEs in Drosophila.
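The statement in this section that LTR sequences may be analyzed to approximate insertion time can be made concrete with a commonly used estimate, which is an assumption here and not a method taken from this review: the two LTRs of an element are taken to be identical at the moment of insertion and to diverge independently afterwards, so the age is roughly T = K / (2r), where K is the fraction of differing sites between the 5′ and 3′ LTRs and r is a substitution rate per site per year. The sequences and the rate in the sketch below are hypothetical placeholders.

```python
# Sketch of LTR-divergence dating, T = K / (2 * r).
# Assumptions: the LTRs are already aligned without gaps, K is a simple
# proportion of mismatches, and the substitution rate is supplied by the user.

def p_distance(ltr5, ltr3):
    """Proportion of sites that differ between two aligned LTR sequences."""
    if len(ltr5) != len(ltr3) or not ltr5:
        raise ValueError("LTRs must be non-empty and of equal length")
    mismatches = sum(a != b for a, b in zip(ltr5, ltr3))
    return mismatches / len(ltr5)


def insertion_age_years(ltr5, ltr3, rate_per_site_per_year):
    """Estimate time since insertion from divergence between the two LTRs."""
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return p_distance(ltr5, ltr3) / (2 * rate_per_site_per_year)


toy_ltr5 = "ACGTACGTACGTACGTACGT"
toy_ltr3 = "ACGTACGTACCTACGTACGT"   # one substitution relative to toy_ltr5
toy_rate = 1e-8                      # placeholder rate, not a measured value
print(p_distance(toy_ltr5, toy_ltr3),
      insertion_age_years(toy_ltr5, toy_ltr3, toy_rate))
```

The factor of 2 reflects that both LTRs accumulate substitutions after insertion. Published analyses typically apply a corrected distance and a lineage-specific rate, so the raw proportion used here should be read as a simplification.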

LTR retrotransposon transposition

Retroelements are first transcribed into gag-pol fusion transcripts followed by translation into Gag-Pol fusion protein products, sometimes by programmed translational frameshift. Gag-Pol peptides are then rapidly cleaved into individual protein products by the retroelement encoded protease (Fig. 5B). Programmed translational frameshift occurs in many retrotransposon transcripts near the end of the gag ORF due to a rare codon awaiting the arrival of its corresponding tRNA. Resulting translational pausing permits recognition of a more common, but frameshifted, codon, allowing efficient translation of the pol ORF in frame without interruption by the gag ORF stop codon. Following processing of the fusion protein, post-translationally modified Gag proteins polymerize to generate virus-like particles (VLPs) in the cytoplasm, capturing the retrotransposon transcript, RT, and integrase (Fig. 5B). A smaller, likely unmodified Gag of the LTR retrotransposon 1731 in D. melanogaster localizes to the nucleus, potentially contributing to the transfer of reverse transcription products from VLPs in the cytoplasm to the nucleus for insertion (Fig. 5B). However, nuclear localization signals have also been identified in integrases of several eukaryotic retrotransposons, much like the transposases of DNA transposons, and may play a role in the delivery of VLP contents to the nucleus. The same mRNA molecule can be both translated in the cytoplasm and captured in a VLP for use as the RT template. Some retrotransposons in D. melanogaster contain extended 5′ untranslated regions (UTRs) that regulate the transition of mRNA molecules from translating to packaging states. For example, LTR retrotransposon Idefix has a long 5′ UTR that generates an internal ribosome entry site in its mRNA to promote translation, but its translation can also be downregulated by Gag binding to the 5′ UTR to promote capture within VLPs. 
Within VLPs, single-stranded mRNAs are reverse transcribed to generate double-stranded DNA copies (Fig. 5B). The RT cleaves the RNA in RNA-DNA hybrids and also has DNA-dependent DNA polymerase activity (Fig. 5B). In yeast, the reverse transcription reaction of the LTR retrotransposon Ty1 is primed by the initiator methionine tRNA. However, other priming mechanisms have been observed, such as the self-priming mechanism of the Tf1 LTR retrotransposon. Much of this process is unclear and probably varies for different types and families of retrotransposons. Several mechanisms have been proposed for integration of retrotransposon cDNA into the host genome (Fig. 5B). While many retrotransposons demonstrate no specificity for target insertion sites, elements of the gypsy family of LTR retrotransposons in D. melanogaster show some target site preference. Integrase sequences are generally highly conserved among copies of the same retrotransposon, despite the atypically high variability of retrotransposon sequences, demonstrating their significant role in mobilization of these elements. Mechanisms by which integrases interact with genomic DNA for retrotransposon insertion are unclear, although chromatin accessibility and other structural features appear to play a role. A significant amount of research regarding integrase functions has been performed with the retroviral integrase of Human Immunodeficiency Virus type 1 (HIV-1) and the integrases of Ty1 and Ty3 LTR retrotransposons in yeast.

Non-LTR retrotransposons

Non-LTR retrotransposons, or LINE-like elements, have been classified into over 100 families, separated into 28 clades and 6 groups: R2, L1, RTE, I, Jockey and RandI. Non-LTR retrotransposons are structurally similar to LTR retrotransposons, but often lack some of the ORFs and protein domains encoded by LTR retrotransposons and do not contain LTRs at their 3′ and 5′ ends (Fig. 6A). The absence of LTRs suggests that these elements interact with their encoded proteins differently than LTR retrotransposons and may utilize different mechanisms of transposition (Fig. 6B). While non-LTR and LTR retrotransposons encode similar proteins and often generate target site duplications, the reverse transcription and integration events of non-LTR retrotransposon mobilization are unique, at least for the R2 group of elements that often lack promoters and only encode a single ORF with RT and endonuclease activities (Fig. 6A).

Figure 6.

Non-LTR retrotransposons utilize target primed reverse transcription (TPRT) for integration. (A) R2 non-LTR retrotransposons (blue) are flanked by 28S rRNA genomic sequences (pink). The single R2 ORF encodes an enzyme with reverse transcriptase (RT) and endonuclease (EN) activities (blue circle). Other non-LTR retrotransposons may encode these enzymes as 2 separate proteins (RT and integrase with EN activity). The 3′ UTR is important for integration of R2 retrotransposons into 28S rRNA genes. (B and C) Proposed models of non-LTR retrotransposon (B) and R2 (C) insertion. DNA is shown in black (including reverse transcribed flanking sequences), 28S rRNA sequences in pink and retrotransposon sequences (mRNA and DNA) in blue, following the color scheme in (A). (B1) Non-LTR retrotransposon transcripts first hybridize to 28S rRNA sequences (vertical pink lines) followed by initiation of TPRT by single-stranded nicking of the target DNA (yellow star) by the element's encoded endonuclease. (B2) Following reverse transcription of the element, element mRNA is degraded by R2 RT/EN (//). Integration of the 5′ end of the element is not well understood (yellow ?). (B3) Cleavage of the second strand (yellow star) may occur at the same location as the first strand, or 2 base pairs upstream or downstream of this site. (B4) R2 RT/EN also generates the complementary R2 strand at the target site to fully transpose the element. (C1, C2, C3) The initial steps of this alternative mechanism are identical to those described in B1 and B2 except they take place on 2 homologous targets simultaneously resulting in a Holliday junction intermediate (C4). The Holliday junction intermediate is resolved by R2 RT/EN (C4) followed by second strand synthesis resulting in fully-integrated R2 non-LTR retrotransposons in 2 new locations.

Studies in both D. melanogaster and Bombyx mori (silkworm) have demonstrated that non-LTR retrotransposons utilize target primed reverse transcription (TPRT) for integration of new retrotransposon copies in the genome (Fig. 6B). Because R2 non-LTR retrotransposons are co-transcribed with their flanking 28S rRNA sequences (Fig. 6A), these elements target 28S rRNA genes for insertion. TPRT is initiated by single-stranded nicking of the target DNA by the element's encoded endonuclease (Fig. 6B). The generated 3′ hydroxyl group then primes reverse transcription of the RNA intermediate before cleavage of the second target DNA strand (Fig. 6B). Cleavage of the second strand may occur at the same location as the first strand, or 2 base pairs upstream or downstream of this site. Second strand cleavage location determines whether the target site is unchanged, deleted, or duplicated. The RT/EN encoded by R2 is responsible for both cleavage of the target DNA and reverse transcription of the element. R2 RT/EN also generates the complementary R2 strand at the target site as this enzyme demonstrates DNA-dependent DNA polymerase activity and can displace the RNA template (Fig. 6B). The 3′ UTR (Fig. 6A) of these elements is required for TPRT and is inserted during reverse transcription of the element, while integration of the 5′ end is highly variable and thought to involve DNA repair and homologous recombination with regions of the 28S rRNA gene. In the absence of upstream homologous 28S rRNA sequences, integration of these elements is either prevented or results in truncations of the 5′ UTR. Because the endonuclease interacts with both ends of the integrating RNA, protein dimerization may be required for R2 transposition. A recent study in D. melanogaster found that R2 endonuclease domains are homologous to those of FokI restriction enzymes and Holliday-junction resolvases. These associations led the authors to propose a new model of TPRT transposition for these elements (Fig. 6C). 
Two R2 elements may utilize their flanking 28S rRNA sequences to bind regions of homologous chromosomes, generating a Holliday junction structure that is resolved by a dimer of the elements' endonucleases (Fig. 6C). While target site preferences are variable among non-LTR retrotransposons, the TPRT and Holliday junction mechanisms may also be used by other non-LTR retrotransposons. A few closely related non-LTR retrotransposons present in all Drosophila genomes, HeT-A, TART, and TAHRE, play a significant role in telomere maintenance and use a unique mechanism to localize during transposition. These elements are targeted to telomeres by their encoded Gag proteins, permitting generation of telomeric tandem repeats and serving functions similar to that of telomerase. Unlike most non-LTR retrotransposons, HeT-A elements encode 2 overlapping, frameshifted gag ORFs and no pol ORF and can therefore only be mobilized in trans by the enzymes of other retroelements. Alternatively, TART and TAHRE elements contain 2 non-overlapping ORFs encoding both gag and pol genes, but are less abundant in Drosophila telomeres than HeT-A. The gag ORFs of HeT-A elements are generally highly variable in both length and sequence, but contain some conserved motifs present in the other telomeric retrotransposons, such as the zinc knuckle (CCHC) in the gag ORF, which may be repeated several times within the element. The Gag proteins of HeT-A and TART elements localize to the nucleus, where HeT-A Gags aggregate in telomeric regions during interphase, forming Het dots. TART Gags, though localized to the nucleus, only associate with telomeres in the presence of HeT-A Gags, demonstrating the significance of HeT-A Gags in telomere-targeting and potentially signifying TART as the RT donor for HeT-A transposition. 
The Gag protein of TAHRE shares sequence similarity with the Gag of HeT-A, and generally localizes to the nucleus, but only localizes to the Het dots around telomeres in the presence of HeT-A Gags. Researchers hypothesize that the telomere-targeting specificity of the Gag proteins contributes to the abundance of telomeric HeT-A, relative to TART and TAHRE, despite the absence of pol genes in HeT-A elements. Once targeted to telomeres, telomeric retrotransposons presumably integrate via TPRT, like other non-LTR retrotransposons, and may also be regulated by small RNA pathways to induce heterochromatin formation in these regions. Observations of telomeric retrotransposon regulation in the germline indicate that regulation of these elements is significant to development as sequences targeted by piRNAs in the HeT-A 3′ UTR are highly conserved in D. melanogaster and closely related species.
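The TPRT description above states that the position of second-strand cleavage relative to the first nick (same site, or 2 base pairs upstream or downstream) determines whether the target site is unchanged, deleted, or duplicated. The following toy sketch makes that relationship concrete. It rests on assumptions that are not taken from the text: the function name `tprt_insert`, the placeholder element `[R2]`, the example target sequence, and in particular the sign convention (a positive offset is treated as producing a duplication and a negative offset a deletion) are hypothetical choices made only for illustration.

```python
def tprt_insert(target, nick, offset, element="[R2]"):
    """Toy model of target-site outcomes after TPRT.

    target  : top-strand sequence of the (hypothetical) target site
    nick    : index of the first-strand nick
    offset  : second-strand cleavage position relative to the nick
              (-2, 0, or +2; the sign convention is an assumption)
    element : placeholder for the inserted retrotransposon copy
    """
    if offset not in (-2, 0, 2):
        raise ValueError("offset must be -2, 0, or 2 in this toy model")
    second_cut = nick + offset
    if not (0 <= nick <= len(target) and 0 <= second_cut <= len(target)):
        raise ValueError("cleavage positions must lie within the target")

    # Bases between the two cuts end up on both sides of the insertion
    # (offset > 0) or on neither side (offset < 0).
    left_flank = target[:second_cut]
    right_flank = target[nick:]

    if offset == 0:
        outcome = "unchanged"
    elif offset > 0:
        outcome = "duplicated"
    else:
        outcome = "deleted"
    return left_flank + element + right_flank, outcome


for off in (0, 2, -2):
    print(off, tprt_insert("AACCGGTT", nick=4, offset=off))
```

In this model a blunt cut leaves the flanking sequence intact, while a staggered cut either repeats the 2 bases between the cuts on both sides of the element or removes them, which mirrors the three outcomes named in the text.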

Discussion

The mechanisms by which TEs mobilize in Drosophila and other eukaryotic genomes reveal several common features. For example, the primary integrating enzymes encoded by these elements, transposases and integrases/endonucleases, permit cleavage of target sites in the host genome and promote insertion of TEs into these genomic locations. These enzymes also require interactions with specific structural elements flanking the ORFs of their respective TEs, such as TIRs, LTRs, and 3′ or 5′ UTRs. Furthermore, many TEs, including DNA transposons, have demonstrated an ability to amplify upon mobilization, either through an RNA intermediate in the case of retrotransposons or through timing transposition events with events of the host cell cycle in the case of some DNA transposons. The similarities between LTR retrotransposon and retrovirus mobilization are also apparent, such as the formation of VLPs in the cytosol via polymerization of encoded Gag proteins, an event resembling a stage of the retroviral life cycle. The connection between LTR retrotransposons and retroviruses is further characterized by the presence of LTRs in all of these elements and the presence of the retroviral env gene in many D. melanogaster LTR retrotransposons. However, the relationship between retrotransposons and retroviruses remains unclear, as studies have demonstrated the propensity of retrotransposons to acquire env genes and function as retroviruses, yet the presence of these elements in eukaryotic genomes in the first place may be the result of horizontal transfer from ancient viruses or retroviruses. In addition to similar mechanisms of transposition, TEs are regulated by a common mechanism in D. melanogaster: small RNA biogenesis. While DNA transposons do not appear to be regulated by the esiRNA pathway, which regulates retrotransposons in somatic cells, the piRNA regulatory pathway of germline cells regulates all TEs in D. melanogaster. 
Both regulatory pathways rely on convergent sense and antisense transcription of TEs to generate the precursors to the small RNAs of these pathways. While these pathways utilize distinct proteins for the processing of small RNA precursors, the generated small RNAs regulate TEs via similar mechanisms, such as heterochromatin formation. The factors that regulate mobilization of TEs in host genomes may significantly influence genome size evolution, as TE abundance correlates with genome size and depends directly on the efficiency of transposition regulation and on selection against deleterious transposition events over evolutionary time. Furthermore, the interplay between host silencing of TEs and transposition demonstrates a host-parasite co-evolution in which familiar TEs are better regulated by their hosts than newly introduced TEs. This relationship is exemplified by hybrid dysgenesis in Drosophila and dysregulated transposition of TEs resulting from other hybrid crosses in eukaryotes, as hybrid hosts are maladapted to the newly introduced TEs in their genomes. Functionally, TEs may have a broad range of impacts on their hosts. Most deleterious integrations of TEs into host genomes are selected against over time, while some TE insertions may provide adaptive functions to their hosts, such as the insertion of the solo-LTR FBti0019985 in a candidate cold stress response gene of D. melanogaster. Furthermore, the role of TEs in Drosophila telomere maintenance demonstrates the ability of these elements to develop significant functional roles in their hosts to positively influence genome stability. This has strong implications for the role of TEs in the evolutionary development of host genomes, as selective forces act on these transpositional events, influencing the coevolution of the genome and its TEs. As more is learned about the origin of TEs and their regulation by host genomes, the evolution and roles of TEs in eukaryotic genomes will become better defined.
