Evolutionary history and classification of Micropia retroelements in Drosophilidae species.

Juliana Cordeiro1, Tuane Letícia Carvalho2, Vera Lúcia da Silva Valente3, Lizandra Jaqueline Robe2,4.   

Abstract

Transposable elements (TEs) play a major role in shaping the evolution of genomes and host species, contributing to the creation of new genes and promoting rearrangements frequently associated with new regulatory networks. Support for these hypotheses frequently comes from studies with model species, and Drosophila provides a great model organism for the study of TEs. Micropia belongs to the Ty3/Gypsy group of long terminal repeat (LTR) retroelements and is one of the least studied Drosophila transposable elements. In this study, we assessed the evolutionary history of Micropia within Drosophilidae, while trying to assist in the classification of this TE. First, we searched for Micropia in the genomes of natural populations of several species. Then, based on searches within online genomic databases, we retrieved Micropia-like sequences from the genomes of distinct Drosophilidae species. We expanded the knowledge of Micropia distribution within Drosophila species. The Micropia retroelements we detected consist of an array of divergent sequences, which we subdivided into 20 subfamilies. Even so, a patchy distribution of Micropia sequences within the Drosophilidae phylogeny could be identified, with incongruences between the species phylogeny and the Micropia phylogeny. Comparing the pairwise synonymous distance (dS) values between Micropia and three host nuclear sequences, we found several cases of unexpectedly high levels of similarity between Micropia sequences in divergent species. All these findings support a hypothesis for the evolution of Micropia within Drosophilidae, which includes several events of vertical and horizontal transposon transmission, associated with ancestral polymorphisms and recurrent diversification of Micropia sequences.

Year:  2019        PMID: 31622354      PMCID: PMC6797199          DOI: 10.1371/journal.pone.0220539

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Since Barbara McClintock’s first publication on maize transposable elements (TEs), these sequences have gone from junk to pivotal characters in the control and evolution of genomes. The discovery of unexpectedly high amounts of TEs in the genomes of distinct species has pointed toward functional roles of TEs in these genomes [1, 2, 3]. In fact, current knowledge indicates that TEs have been shaping the evolution of genomes and host species [4], contributing to the creation of new genes [5, 6] and promoting rearrangements frequently associated with new regulatory networks [7, 8, 9]. Moreover, there is evidence that TEs may assist in the control of embryonic development [9, 10] and genomic plasticity [11]. A large fraction of most eukaryotic genomes is composed of TEs known as retroelements [12, 13, 14], some of which belong to the long terminal repeat (LTR) order. Phylogenetic analyses of such retroelements reveal an evolutionary history consisting mainly of vertical transposon transmissions (VTT) and intraspecific diversification [15]. However, autonomous TEs are able to invade naïve genomes through horizontal transposon transfers (HTT), in which they make copies of themselves and evade host defense systems before becoming fully silenced by genomic anti-TE mechanisms [16, 17]. Although HTTs are still considered rare events, mainly because we can only detect the successful ones, it seems that such events represent an important step in the TE life cycle. This step enables them to escape the natural progression of their birth-and-death process [16, 17, 18, 19]. After an HTT event, TEs can have a wide range of positive and/or negative consequences in the host genome [20]; but mainly, they become a new set of sequences on which evolution can act, revealing their relevance to host genome evolution [21, 22]. A growing number of studies have identified HTTs using distinct analysis strategies [15, 16, 23, 24, 25]. 
For instance, a patchy taxonomic distribution among monophyletic clusters of species is expected if TEs are moving horizontally rather than being vertically inherited. This patchy distribution, together with incongruences between species and TE phylogenies and an unexpectedly high nucleotide identity between TEs found in the genomes of divergent species, greatly strengthens the evidence for HTT [17, 25, 26, 27, 28]. According to these criteria, LTR retrotransposons account for approximately 20% of HTT events across insect genomes [16]. This value increases when only Drosophila genomes are analyzed, e.g. LTR retroelements account for 90% of the HTT events detected across the genomes of D. melanogaster, D. simulans and D. yakuba [29]. Micropia is a retrotransposon that belongs to the Ty3/Gypsy group of LTR retroelements [30], which is closely related to retroviruses [31, 32]. Micropia was first discovered in the lampbrush loops of the Drosophila hydei Y chromosome. Until recently, there were only four well-characterized Micropia elements, and these were found in the genomes of D. hydei (named dhMiF2 and dhMiF8) and D. melanogaster (named Dm11 and Dm2) [33, 34, 35]. Recently, complete and probably active Micropia reference sequences were found in the genomes of D. simulans and D. sechellia [15]. Nevertheless, Micropia-related sequences are also present in the genomes of several Drosophila and Zaprionus species, showing an irregular distribution pattern [36, 37, 38, 39, 40, 41]. In some species (like D. hydei), Micropia shows an effective transcription-based repression mechanism associated with antisense RNAs [37, 41, 42]. But there is no evidence of autonomous Micropia sequences in other species, for example D. melanogaster [41]. Here, our goals were to provide a hypothesis for the evolutionary history of Micropia retroelement sequences within the genomes of Drosophilidae species, while trying to assist in the classification of this TE. 
First, we analyzed its presence in the genomes of natural populations of several species and sequenced the detected elements. Then, we expanded our data set based on searches for Micropia-like sequences within genomic databases. All these sequences were used to propose a hypothesis for the evolution of Micropia within Drosophilidae while assessing its subdivision and identifying several cases of HTT.

Materials and methods

Species analyzed

For this study, we analyzed the presence/absence of Micropia sequences in the genomes of natural populations of 24 Drosophila species. These species were field-collected during 2000–2009 or obtained from the Tucson Drosophila Stock Center (currently the National Drosophila Species Stock Center at Cornell University) (Table 1). To this end, PCR-blot and Dot-blot searches (hereafter “in vitro searches”) were performed following the methodology described in In vitro searches: DNA manipulation, PCR-blot, Dot-blot, and sequencing (see below). In vitro searches had also been performed previously for the other three species of the cardini group [39], and for D. melanogaster [34, 35] and D. hydei [33]; the sequences thus obtained were downloaded from GenBank. We also analyzed the presence/absence of Micropia sequences in 26 species whose genomes are available at the NCBI (blast.ncbi.nlm.nih.gov/Blast.cgi) or Flybase (flybase.org/blast/) websites (hereafter “in silico searches”), plus two species, D. suzukii and D. buzzatii, whose genomes are available at dedicated project websites (http://spottedwingflybase.org/ and https://dbuz.uab.cat/welcome.php, respectively) (Table 1). These searches followed the criteria described in In silico searches: Genomic analysis (see below). Thus, D. buzzatii and D. melanogaster were the only species for which both search strategies were applied. The classification scheme adopted for each of these species across this study follows the proposal of [43].
Table 1

Presence/absence of Micropia sequences in the genomes of Drosophilidae species.

Methodology employed and GenBank accession numbers are also shown.

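| Genus | Subgenus | Species group | Species | Supplier | Presence/absence | Methodology | GenBank acc. nos. |
|---|---|---|---|---|---|---|---|
| Drosophila | Dorsilopha | busckii | D. busckii | | + | in silico | see S1 Table |
| Drosophila | Drosophila | cardini | D. acutilabella | A | + | in vitro | FJ748684*, FJ748685*, FJ748686*, FJ748687*, FJ748688* |
| Drosophila | Drosophila | cardini | D. arawakana | E | - | in vitro | - |
| Drosophila | Drosophila | cardini | D. cardini | E | + | in vitro | FJ748690*, FJ748691*, FJ748692* |
| Drosophila | Drosophila | cardini | D. cardinoides | E | + | in vitro | EF090263*, EU149929*, EU149930* |
| Drosophila | Drosophila | cardini | D. dunni | A | - | in vitro | - |
| Drosophila | Drosophila | cardini | D. neocardini | E | + | in vitro | EF090264*, EU149931*, EU149932*, EU149933* |
| Drosophila | Drosophila | cardini | D. neomorpha | A | + | in vitro | FJ748695*, FJ748696*, FJ748697* |
| Drosophila | Drosophila | cardini | D. nigrodunni | A | - | in vitro | - |
| Drosophila | Drosophila | cardini | D. parthenogenetica | A | + | in vitro | FJ748698*, FJ748699*, GQ339587*, GQ339588*, GQ339589*, GQ339590* |
| Drosophila | Drosophila | cardini | D. polymorpha | E | + | in vitro | EF090265*, EF149934*, EF149935*, EF149936*, EF149937* |
| Drosophila | Drosophila | cardini | D. procardinoides | A | + | in vitro | FJ748700*, FJ748701*, FJ748702* |
| Drosophila | Drosophila | cardini | D. similis | A | - | in vitro | - |
| Drosophila | Drosophila | funebris | D. funebris | A | - | in vitro | - |
| Drosophila | Drosophila | guaramunu | D. griseolineata | D | - | in vitro | - |
| Drosophila | Drosophila | guaramunu | D. maculifrons | D | - | in vitro | - |
| Drosophila | Drosophila | guarani | D. guaru | D | - | in vitro | - |
| Drosophila | Drosophila | guarani | D. ornatifrons | D | - | in vitro | - |
| Drosophila | Drosophila | immigrans | D. albomicans | | + | in silico | see S1 Table |
| Drosophila | Drosophila | immigrans | D. immigrans | D | - | in vitro | - |
| Drosophila | Drosophila | tripunctata | D. bandeirantorum | B | - | in vitro | - |
| Drosophila | Drosophila | tripunctata | D. mediodiffusa | B | - | in vitro | - |
| Drosophila | Drosophila | tripunctata | D. mediopictoides | B | - | in vitro | - |
| Drosophila | Drosophila | tripunctata | D. mediopunctata | B | - | in vitro | - |
| Drosophila | Drosophila | tripunctata | D. paraguayensis | C | - | in vitro | - |
| Drosophila | Drosophila | tripunctata | D. paramediostriata | B | - | in vitro | - |
| Drosophila | Drosophila | tripunctata | D. tripunctata | B | - | in vitro | - |
| Drosophila | Siphlodora | repleta | D. arizonae | | + | in silico | see S1 Table |
| Drosophila | Siphlodora | repleta | D. buzzatii | C | + | in vitro/in silico | FJ748689*, GQ339579*, GQ339580*, GQ339582*, see S1 Table |
| Drosophila | Siphlodora | repleta | D. hydei | C | + | in vitro | X13304*, X13305* |
| Drosophila | Siphlodora | repleta | D. mercatorum | C | + | in vitro | FJ748693*, FJ748694*, GQ339583*, GQ339584*, GQ339585*, GQ339586* |
| Drosophila | Siphlodora | repleta | D. mojavensis | | + | in silico | see S1 Table |
| Drosophila | Siphlodora | repleta | D. navojoa | | + | in silico | see S1 Table |
| Drosophila | Siphlodora | repleta | D. zottii | | + | in vitro | FJ748703*, GQ339578* |
| Drosophila | Siphlodora | virilis | D. americana | | + | in silico | see S1 Table |
| Drosophila | Siphlodora | virilis | D. virilis | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. ananassae | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. bipectinata | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. elegans | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. erecta | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. ficusphila | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. kikkawai | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. melanogaster | E | + | in vitro/in silico | X14037*, X14173*, see S1 Table |
| Drosophila | Sophophora | melanogaster | D. rhopaloa | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. sechellia | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. simulans | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. suzukii | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. takahashii | | + | in silico | see S1 Table |
| Drosophila | Sophophora | melanogaster | D. yakuba | | + | in silico | see S1 Table |
| Drosophila | Sophophora | obscura | D. miranda | | - | in silico | - |
| Drosophila | Sophophora | obscura | D. persimilis | | - | in silico | - |
| Drosophila | Sophophora | obscura | D. subobscura | | - | in silico | - |
| Drosophila | Sophophora | willistoni | D. willistoni | | + | in silico | see S1 Table |
| Drosophila | Hawaiian Drosophila | - | D. grimshawi | | - | in silico | - |
| Phortica | - | variegata | P. variegata | | - | in silico | - |
| Scaptodrosophila | - | - | S. lebanonensis | | + | in silico | see S1 Table |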

*Sequences used as initial BLASTn queries. Capital letters refer to the fly collector/supplier:

A: Tucson Drosophila Stock Center (currently The National Drosophila Species Stock Center at Cornell University)

B: Dr. Luciano Basso da Silva

C: Dr. Marco Silva Gottschalk

D: MSc. Jonas da Silva Doge

E: Dra. Daniela Cristina De Toni

Species vouchers are available at the Laboratório de Drosophilidae at Universidade Federal do Rio Grande do Sul.

In vitro searches: DNA manipulation, PCR-blot, dot-blot, and sequencing

Genomic DNA was extracted using a phenol-isoamyl-chloroform protocol according to [44]. Approximately 100 adult flies per species were used, macerated in liquid nitrogen with individual, new, sterile grinders. PCR reactions were performed using Micropia primers to amplify the reverse transcriptase (RT) domain within the pol gene, as described in [39]. The following conditions were used for 25 μl PCR reactions: 25 ng of template DNA, 20 pmol of each primer, 0.2 mM of each nucleotide, 1.5 mM MgCl2 and 1 unit of Taq DNA polymerase in 1x polymerase buffer (all from Invitrogen). Amplification parameters were 95°C for 2 min, followed by 35 cycles of 95°C for 30 s, 50–60°C for 30 s and 72°C for 1 min, and a final extension step at 72°C for 10 min. Drosophila hydei genomic DNA was used as a positive control. In order to confirm the homology of the amplified fragments, a PCR-blot was prepared with the obtained PCR amplicons. The PCR products were separated by electrophoresis on a 1% agarose gel and transferred to nylon membranes (Hybond N+®, GE Healthcare), where hybridization was carried out using an 812 bp fragment of Micropia from D. hydei as the probe. This fragment ranges from nucleotide 1,777 to 2,589 of the D. hydei dhMiF2 sequence (GenBank acc. no. X133041), covering part of the RT sequence. Probe labeling and signal detection were performed using the Gene Images™ AlkPhos Direct™ labeling and detection system (GE Healthcare), according to the manufacturer's instructions. The membranes were hybridized at 55°C and exposed for 5 min. A Dot-blot procedure was also performed using genomic DNA. Denaturation was performed using 3 μg of genomic DNA in a final volume of 10 μl, which was directly applied onto a nylon membrane (Hybond N+®, GE Healthcare). As a positive control, 5 ng (in 10 μl) of the dhMiF2 probe was used. Probe labeling, signal detection, and hybridization temperature were as described above. The Dot-blot film was exposed for 3 min. 
For sequencing, PCR amplicons from each species presenting positive signals for Micropia were separated by 1.5% agarose gel electrophoresis and purified using the Illustra GFX™ PCR DNA and Gel Band Purification kit (GE Healthcare) according to the supplier's specifications. The fragments were cloned using the pGEM®-T Easy Vector system (Promega). The recombinant plasmids obtained were subjected to a new PCR reaction using the universal M13 primers at a 55°C annealing temperature. The amplicons were purified using ExoI-SAP (GE Healthcare) and directly sequenced in a MegaBACE™ 500 (GE Healthcare). Forward and reverse strands were sequenced; ambiguities and compressions were resolved through assembly in the Staden Package Gap4 program [45]. GenBank accession numbers are indicated in Table 1.

In silico searches: Genomic analysis

BLAST searches were performed at the NCBI (blast.ncbi.nlm.nih.gov/Blast.cgi) and Flybase (flybase.org/blast/) websites, using default parameters against the “Whole Genome Shotgun Contigs (WGS) database” limited by “organism”, in which each Drosophila species was selected. For D. buzzatii and D. suzukii, searches were performed against the scaffolds databases of the ‘Drosophila buzzatii Genome Project’ website (dbuz.uab.cat/welcome.php) and the ‘Spotted Wing FlyBase’ website (spottedwingflybase.org/), respectively. The searches were completed in January 2018. The initial BLASTn queries consisted of Micropia reverse transcriptase (RT) nucleotide sequences obtained in previous studies [33, 34, 35, 39] and retrieved from GenBank (Table 1). Sequences retrieved during the in silico searches with scores higher than 50 and E-values lower than 1.0E-05 were downloaded, including 2 kb on both sides of each hit. After that, each retrieved sequence was aligned with the set of query sequences using ClustalW, as implemented in MEGA6 software [46]. Sequences that failed to align in this first step of multiple alignment underwent a second alignment step (this time pairwise or even local alignment) against the query sequence that presented the highest score in the BLASTn searches (hereafter “best query” sequence). In this case, fragments presenting less than 300 bp of confirmed homology to their best query sequence were withdrawn from the alignment. Furthermore, after compressing the analyzed region, identical nucleotide sequences recorded for the same species were merged into a single sequence. A codon-based alignment was then performed using Muscle [47] as implemented in MEGA6 software. Gaps present in this matrix were further resolved, in order to leave all sequences in-frame, to obtain the aligned amino acid matrix. All these translated sequences were then used as queries to perform exhaustive tBLASTn searches, using the same strategy described above. 
This two-step BLAST strategy was adopted to ensure that the real diversity of Micropia sequences was retrieved from the genomes, enabling a better representation of these sequences in our data set. Supplementary S1 Table provides a list of BLASTn and tBLASTn results, whereas S1 File provides the set of nucleotide sequences retrieved through “in vitro” and “in silico” searches. The first analyzed matrix encompassed all sequences obtained under these criteria that presented a minimum overlap of 300 bp with the previous nucleotide alignment, after a final codon-based alignment performed in Muscle (first filtering step, resulting in S2 and S3 Files). After completing the matrix, putative functional RT Micropia sequences were identified by translating each unaligned nucleotide sequence in the different reading frames. Once an Open Reading Frame (ORF) was detected, BLASTn searches further confirmed its identity.
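The hit-selection rule described above (score higher than 50, E-value lower than 1.0E-05, plus 2 kb of flanking sequence on each side) can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the `Hit` record, its field names, and the reading of "score" as the BLAST bit score are assumptions introduced here.

```python
from dataclasses import dataclass

MIN_SCORE = 50.0      # threshold stated in the text ("scores higher than 50")
MAX_EVALUE = 1.0e-5   # threshold stated in the text
FLANK = 2000          # 2 kb retrieved on both sides of each hit


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit (field names are assumptions)."""
    subject_id: str
    start: int           # 1-based coordinate on the subject sequence
    end: int             # may be smaller than start for minus-strand hits
    bit_score: float
    evalue: float
    subject_length: int


def passes_thresholds(hit: Hit) -> bool:
    """Keep hits with score > 50 and E-value < 1.0E-05."""
    return hit.bit_score > MIN_SCORE and hit.evalue < MAX_EVALUE


def flanked_window(hit: Hit) -> tuple[int, int]:
    """Extend the hit by FLANK bases per side, clipped to the subject bounds."""
    low, high = sorted((hit.start, hit.end))
    return max(1, low - FLANK), min(hit.subject_length, high + FLANK)


def select_regions(hits: list[Hit]) -> list[tuple[str, int, int]]:
    """Return (subject_id, window_start, window_end) for every retained hit."""
    return [(h.subject_id, *flanked_window(h)) for h in hits if passes_thresholds(h)]
```

Clipping the window at the scaffold ends is a design choice made for this sketch; the text does not say how hits close to contig boundaries were handled.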

Phylogenetic analysis and Micropia subfamilies

Phylogenetic analyses were performed using the amino acid alignment obtained after resolving all gaps and leaving all nucleotide sequences in-frame. Fifty amino acid sequences belonging to each of the five main clades recently established by Bargues and Lerat [15] for the Micropia/Sacco group within Ty3/Gypsy were selected from the alignment provided by the authors. These sequences were included as a “taxonomic framework” to guide conclusions related to new Micropia sequences in our phylogenetic analyses, in which a Copia-like transposable element sequence obtained from the D. melanogaster genome (GenBank accession number X01472) was used as outgroup. This Copia-like retroelement belongs to the Ty1/Copia superfamily of LTR retrotransposons, which is closely allied to the Ty3/Gypsy retrotransposon sequence group [48]. Bayesian phylogenetic analysis (BA) was performed under a mixed model with gamma correction, as implemented in MrBayes3.1.2 software, through Cipres Computational Resources [49]. This Markov Chain Monte Carlo (MCMC) search was run for 10,000,000 generations, with trees saved every 1,000 after a burn-in of 2,500. The Posterior Probability (PP) of each clade on the 50% majority-rule consensus tree was calculated and the resulting tree was visualized in FigTree. The tree thus obtained was used to detect intraspecific sequences sharing a most recent common ancestor (MRCA). In these cases, only the sequence with the shortest branch (the most similar to the inferred MRCA sequence) was maintained as representative of that clade in a new round of BA analysis (second filtering step, resulting in S4 File). The final tree was compared to the species tree, as compiled from previous studies [50, 51, 52, 53, 54, 55, 56], which present only a limited overlap in sampled species. Subfamilies of the Micropia TE sequences were identified using the criterion established by Capy et al. [30], according to which reciprocally monophyletic sequences with less than 30% divergence at the amino acid level can be grouped in the same TE subfamily. This analysis was performed in MEGA6, using the Poisson amino acid substitution model.
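To make the divergence criterion concrete, the sketch below computes a Poisson-corrected amino acid distance, d = -ln(1 - p), where p is the proportion of differing sites between two aligned sequences, and compares it with the 0.3 threshold. It is a minimal illustration, not the MEGA6 implementation: skipping gapped positions pair by pair is an assumption made here, the function names are hypothetical, and the reciprocal monophyly requirement is not checked.

```python
import math

SUBFAMILY_THRESHOLD = 0.3  # divergence threshold from the criterion in the text


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing residues over the ungapped aligned positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: positions with a gap in either sequence are skipped
        compared += 1
        if res_a != res_b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differences / compared


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)


def below_subfamily_threshold(seq_a: str, seq_b: str) -> bool:
    """True when the pair is less divergent than the 0.3 threshold."""
    return poisson_distance(seq_a, seq_b) < SUBFAMILY_THRESHOLD
```

For two ten-residue sequences that differ at two positions, p = 0.2 and d is about 0.223, so the pair falls below the threshold; with three differences, d is about 0.357 and it does not.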

dS and divergence time estimates

Pairwise synonymous distance (dS) values were estimated for Micropia in-frame nucleotide sequences (S5 File) and for three host nuclear gene sequences (S2 Table) using the Nei and Gojobori [57] method, as implemented in MEGA6. Alcohol-dehydrogenase (Adh), alpha-methyldopa (Amd) and dopa-decarboxylase (Ddc) sequences were downloaded from GenBank or retrieved from the species genomes using BLASTn searches (for GenBank or scaffold accession numbers, see S2 Table). In order to test whether the Micropia dS values were significantly lower than those observed for the host nuclear genes, accounting for differences in the number of synonymous sites, a one-tailed Fisher’s exact test was performed using R v.3.5.2 [58]. Divergence times were also estimated in some cases using dS estimates and a synonymous substitution rate of 0.016 substitutions per site per million years, as calculated for Drosophila genes with low codon usage bias [59].
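The two calculations in this section can be illustrated with a short sketch. It rests on assumptions that the text does not spell out: the Fisher's exact test is taken to compare a 2x2 table of substituted versus unsubstituted synonymous sites for the TE and for the host gene, and divergence time is taken from the commonly used relation T = dS / (2r), with r = 0.016 substitutions per site per million years. The analysis in the paper was run in R; the pure-Python functions below are hypothetical stand-ins.

```python
from math import comb

SYNONYMOUS_RATE = 0.016  # substitutions per site per million years, from the text


def fisher_one_tailed_less(a: int, b: int, c: int, d: int) -> float:
    """One-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout: a = TE synonymous substitutions, b = TE unsubstituted
    synonymous sites, c and d = the same counts for the host gene.
    Returns the hypergeometric probability of observing a or fewer
    substitutions in the TE row given the table margins.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lowest = max(0, col1 - row2)  # smallest count the margins allow in cell a
    favourable = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(lowest, a + 1)
    )
    return favourable / comb(row1 + row2, col1)


def divergence_time_myr(ds: float, rate: float = SYNONYMOUS_RATE) -> float:
    """Divergence time in million years, assuming T = dS / (2 * rate)."""
    return ds / (2.0 * rate)
```

Under this assumed relation, a pairwise dS of 0.64 would correspond to roughly 20 million years of divergence.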

Results

A total of 56 Drosophilidae species were analyzed for the presence/absence of Micropia sequences (Table 1). Thirty species were analyzed by in vitro searches and 28 species were analyzed through in silico searches. In vitro and in silico searches allowed the isolation of 363 Micropia sequences plus one outgroup sequence (S3 Table and S1 File), which were further reduced to 247 plus one outgroup sequence (S2 and S3 Files) in the first filtering step. The second filtering step, followed by the inclusion of the Micropia/Sacco sequences characterized by [15], led to an alignment of 151 sequences (S4 File).

Patchy distribution of Micropia sequences in the Drosophilidae species genomes

We identified the presence of distinct Micropia-related sequences in the genomes of 34 Drosophilidae species (Table 1). In vitro signals of Micropia copies were detected in D. melanogaster and in species of the cardini (8 of the 12 species tested) and repleta (4 of the 4 species tested) groups, but in none of the 13 other species tested (Table 1, S1 and S2 Figs). In turn, in silico searches enabled the isolation of Micropia sequences from the genomes of D. busckii, D. albomicans, D. willistoni and S. lebanonensis, and from species of the repleta (4 of the 4 species tested), virilis (2 of the 2 species tested) and melanogaster (12 of the 12 species tested) groups. No Micropia sequence could be found for D. grimshawi (picture wing group), D. funebris, D. immigrans or for any species of the guaramunu, guarani, obscura, and tripunctata groups. Thus, intra-group polymorphism in the presence/absence of Micropia sequences was identified only for the cardini and immigrans groups. Fig 1 shows the species tree indicating the presence and absence of Micropia-related sequences in the genome of each of these species.
Fig 1

Phylogenetic reconstruction of species analyzed in this study.

Phylogenetic reconstruction was based on data compiled from previous studies [50, 51, 52, 53, 54, 55, 56], which have a limited overlap in sampled species. Species names in black represent the presence of Micropia sequences and species names in grey represent the absence of such sequences. Distinct branch colors represent distinct subgenera within the Drosophila genus, and the classification follows [43]. Species groups of the Drosophila genus are also indicated to the right. Scaptodrosophila and Phortica are represented as outgroups of the Drosophila genus. The dashed line represents the potential phylogenetic position of D. zottii, since neither a molecular phylogeny nor any nuclear or mitochondrial gene sequence is available for this species.

Phylogenetic analysis, Micropia diversity, and potential coding sequences

As several intraspecific sequences clustered together in the BA phylogenetic tree obtained for the whole set of Micropia sequences recovered after the first filtering step (S3 Fig), the alignment could be reduced from 248 (S3 File) to 151 sequences (S4 File). The final Micropia phylogenetic tree reinforced the reciprocal monophyly of several sets of sequences and confirmed the identity of the retrieved sequences, which clustered with the Micropia sequences obtained by [15] (Fig 2). Further evaluation of the recovered tree topology reveals the presence of four main clusters, which are listed here in ascending order of divergence within the tree: the first, presenting the Sacco sequences obtained by [15]; the second, grouping representatives of the Blastopia and MDG3 sequences obtained by [15]; the third, presenting the Bicca element recovered by [15]; and the fourth, recovering all the Micropia sequences in a major polytomic clade, including sequences obtained by [15].
Fig 2

Bayesian phylogenetic tree of the Drosophilidae Micropia sequences analyzed in this study after the second filtering step.

The phylogenetic tree was based on amino acid sequences following a mixed evolution model with gamma correction. Bargues and Lerat's sequences [15] were included in the analysis. Numbers from 1 to 20 on the left represent the Micropia subfamilies recovered in our data. Filled circles after Micropia sequence names indicate sequences involved in possible HTT events based on one-tailed Fisher’s exact test involving pairwise comparisons of dS values between Micropia and nuclear genes (Adh in orange, Amd in pink, Ddc in purple; see S6 Table). Stars represent the four best-characterized Micropia elements (D. hydei dhMiF2 and dhMiF8; and D. melanogaster Dm11 and Dm2). The posterior probability of each clade is indicated beside its respective internal branch.

Following the criteria established by Capy et al. [30], we were able to recover 20 potential Micropia subfamilies based on monophyletic sequences (Fig 2) showing amino acid genetic divergences lower than 0.3 (Table 2 and S4 Table). Of these, nine subfamilies are monotypic and represented by a single sequence (subfamilies 2, 6, 8, 13, 16, 17, 18, 19 and 20). With the exception of subfamilies 4 and 15 (which were encountered only in species of the melanogaster group), all the remaining Micropia subfamilies are composed of sequences from species of distinct Drosophila species groups and subgenera.
Table 2

Mean pairwise amino acid genetic distances between Micropia subfamilies.

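| | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Subf. 02 | 0.357 | | | | | | | | | | | | | | | | | | |
| Subf. 03 | 0.313 | 0.345 | | | | | | | | | | | | | | | | | |
| Subf. 04 | 0.327 | 0.343 | 0.297 | | | | | | | | | | | | | | | | |
| Subf. 05 | 0.385 | 0.407 | 0.338 | 0.335 | | | | | | | | | | | | | | | |
| Subf. 06 | 0.436 | 0.397 | 0.412 | 0.423 | 0.315 | | | | | | | | | | | | | | |
| Subf. 07 | 0.410 | 0.442 | 0.324 | 0.341 | 0.376 | 0.376 | | | | | | | | | | | | | |
| Subf. 08 | 0.690 | 0.739 | 0.533 | 0.620 | 0.625 | 0.496 | 0.302 | | | | | | | | | | | | |
| Subf. 09 | 0.407 | 0.521 | 0.328 | 0.360 | 0.368 | 0.351 | 0.181 | 0.433 | | | | | | | | | | | |
| Subf. 10 | 0.421 | 0.407 | 0.341 | 0.345 | 0.381 | 0.400 | 0.217 | 0.451 | 0.237 | | | | | | | | | | |
| Subf. 11 | 0.468 | 0.461 | 0.431 | 0.405 | 0.454 | 0.422 | 0.274 | 0.512 | 0.284 | 0.271 | | | | | | | | | |
| Subf. 12 | 0.435 | 0.425 | 0.360 | 0.368 | 0.480 | 0.441 | 0.366 | 0.635 | 0.384 | 0.361 | 0.426 | | | | | | | | |
| Subf. 13 | 0.664 | 0.401 | 0.580 | 0.574 | 0.549 | 0.751 | 0.518 | 0.586 | 0.584 | 0.535 | 0.573 | 0.437 | | | | | | | |
| Subf. 14 | 0.413 | 0.441 | 0.358 | 0.382 | 0.393 | 0.374 | 0.332 | 0.577 | 0.334 | 0.317 | 0.385 | 0.354 | 0.539 | | | | | | |
| Subf. 15 | 0.381 | 0.512 | 0.346 | 0.363 | 0.398 | 0.406 | 0.323 | 0.586 | 0.318 | 0.324 | 0.385 | 0.302 | 0.440 | 0.304 | | | | | |
| Subf. 16 | 1.499 | 0.857 | 1.345 | 1.425 | 1.338 | 1.460 | 1.437 | 1.368 | 1.401 | 1.359 | 1.302 | 1.484 | 1.408 | 1.391 | 1.507 | | | | |
| Subf. 17 | 0.406 | 0.511 | 0.349 | 0.375 | 0.418 | 0.426 | 0.297 | 0.586 | 0.313 | 0.327 | 0.398 | 0.292 | 0.431 | 0.336 | 0.140 | 1.365 | | | |
| Subf. 18 | 0.401 | 0.449 | 0.330 | 0.375 | 0.418 | 0.390 | 0.339 | 0.611 | 0.312 | 0.318 | 0.391 | 0.288 | 0.430 | 0.310 | 0.149 | 1.533 | 0.137 | | |
| Subf. 19 | 0.376 | 0.505 | 0.334 | 0.351 | 0.371 | 0.390 | 0.266 | 0.598 | 0.285 | 0.287 | 0.337 | 0.301 | 0.465 | 0.309 | 0.119 | 1.375 | 0.157 | 0.160 | |
| Subf. 20 | 0.366 | 0.460 | 0.330 | 0.356 | 0.309 | 0.432 | 0.233 | 0.496 | 0.238 | 0.285 | 0.367 | 0.247 | 0.651 | 0.280 | 0.145 | 1.509 | 0.138 | 0.123 | 0.113 |
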
As a result, there are clear cases of incongruence between the species and TE phylogenies (Figs 1 and 2, respectively), in which Micropia sequences found in the genomes of distantly related species cluster in the same subfamily in the Micropia phylogeny, and copies within a single genome do not share an exclusive common ancestor. For example, subfamily 7 (Fig 2) comprises sequences from the genomes of cardini and repleta group species, belonging to the Drosophila and Siphlodora subgenera, respectively, together with sequences encountered in the genome of D. willistoni, which belongs to the Sophophora subgenus. Concerning the presence of divergent copies within the same genome, the cases of D. buzzatii (repleta group), D. americana (virilis group) and D. willistoni (willistoni group) should be highlighted, since the Micropia sequences present in the genomes of these species are widely spread over the tree, nested in five, six and nine of the subfamilies, respectively. The analysis of potential coding sequences for the Micropia elements shown in the final tree (the sequences of [15] and the outgroup Copia-like sequence were not included in this analysis) shows that approximately 48% of them (48 of 100) putatively encode the reverse transcriptase enzyme (S5 Table). In fact, of the total set of 34 species with Micropia sequences evaluated here, only D. erecta, D. kikkawai, D. mojavensis, and D. polymorpha lack potentially coding sequences.

dS estimates and identification of horizontal transposon transfer (HTT) events

The use of Adh, Amd and Ddc nuclear gene sequences yielded a total of 4,367, 4,370 and 4,558 pairwise dS comparisons, respectively (S6 Table). Micropia dS values were lower than those found for the host nuclear genes in 277 cases (significance in the Fisher’s exact test, with p-value < 0.05, was obtained for 96, 266 and 207 comparisons involving Adh, Amd and Ddc, respectively), revealing patterns incompatible with vertical transposon transmission (VTT). Thus, signals of HTT account for 2.2%, 6.1% and 4.5% of the comparisons performed with Adh, Amd and Ddc, respectively. Fig 2 highlights all species involved in at least one case of significantly lower Micropia dS value. Indeed, only 19 of the 97 Micropia sequences for which the Fisher’s exact test could be performed do not present any signal of involvement in HTT events (the sequences of [15] were not included in this analysis, nor were those from the outgroup and from D. zottii, for which none of the three nuclear genes has been previously characterized). Concerning divergence times, most sequences presenting signals of HTT seem to have diverged within the last 20 million years (S6 Table).
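The percentages quoted above follow directly from the counts of significant comparisons and the totals per gene. The short check below reproduces that arithmetic; the dictionary and function names are illustrative only.

```python
# Counts reported in the text: significant comparisons and totals per host gene.
SIGNIFICANT = {"Adh": 96, "Amd": 266, "Ddc": 207}
TOTAL = {"Adh": 4367, "Amd": 4370, "Ddc": 4558}


def percent_significant(gene: str) -> float:
    """Percentage of pairwise comparisons with a significant HTT signal."""
    return 100.0 * SIGNIFICANT[gene] / TOTAL[gene]


if __name__ == "__main__":
    for gene in TOTAL:
        print(f"{gene}: {percent_significant(gene):.1f}%")
```

Rounded to one decimal place, the three values are 2.2, 6.1 and 4.5, matching the figures given in the text.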

Discussion

Micropia classification

By comparing our data with those of Bargues and Lerat [15], it is possible to show that our non-stringent methodology retrieved sequences belonging to Micropia within the Micropia/Sacco group of the Ty3/Gypsy retrotransposable elements. Within this group, Micropia is recovered as a monophyletic lineage and sister to the Bica group of LTR retroelements. The Bayesian phylogeny of these sequences highlights the existence of a wide array of divergent sequences, which is compatible with the subdivision of Micropia into specific groups. Nevertheless, the taxonomic status of these groups remains a matter of debate. In fact, except for the well-accepted criteria used to classify TEs into classes and subclasses proposed by [60], there is in general no consensus over the criteria adopted to define TE families and subfamilies [61]. Several authors have used different strategies to identify new TE families and subfamilies, based on nucleotide and/or amino acid sequence similarities [30, 48, 62, 63, 64, 65, 66]. Given the abundance and diversity of TEs, a classification for eukaryotic TEs based solely on nucleotide similarities was proposed [48]. Nevertheless, given the absence of evolutionary criteria based on reciprocal monophyly, this system remains widely controversial. So, we adopted here more conservative criteria, according to which different subfamilies are established based on reciprocal monophyly and divergence values higher than 0.3 at the amino acid level [30]. Adopting these criteria, our data show the existence of at least 20 potential Micropia subfamilies that form the reciprocally monophyletic groups or monotypic lineages shown in Fig 2. Several of these subfamilies are spread over distinct Drosophila subgenera and species groups, although only subfamilies 7 and 12 could be sampled across species of Sophophora, Drosophila, and Siphlodora. 
In this sense, most sequences within the Drosophila subgenus species are clustered in subfamily 7, whereas sequences of Siphlodora are highly intermingled in the topology, but are predominantly nested in subfamilies 3, 7, 10 and 12. The other Micropia subfamilies are mostly comprised of sequences within species of the Sophophora subgenus, especially by sequences within the melanogaster group. Interestingly, sequences of Micropia used by [15] are distributed across nine of the 20 subfamilies here established, showing the wide diversity of Micropia sequences in Drosophilidae species genomes.
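
The divergence part of the subfamily criterion (amino acid divergence higher than 0.3) can be illustrated with a short Python sketch. It assumes a simple p-distance, the proportion of differing residues over gap-free aligned columns. The distance measure actually used for S5 Table is not stated in this section, so this choice is an assumption. The sketch also does not test reciprocal monophyly, which requires the phylogenetic tree.

```python
SUBFAMILY_THRESHOLD = 0.3  # amino acid divergence cutoff cited in the text [30]


def aa_p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing residues over aligned columns without gaps.

    Assumption: a plain p-distance stands in for whatever distance
    measure the authors used; both inputs come from the same alignment.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    # Keep only columns where both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no overlapping residues to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def exceeds_subfamily_threshold(seq_a: str, seq_b: str,
                                threshold: float = SUBFAMILY_THRESHOLD) -> bool:
    """True when two aligned sequences are more divergent than the cutoff."""
    return aa_p_distance(seq_a, seq_b) > threshold
```

For example, two hypothetical aligned fragments that differ at 4 of 10 gap-free positions have a distance of 0.4 and would fall on different sides of the cutoff, provided they also form reciprocally monophyletic groups in the tree.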

Micropia evolutionary history

In addition to this pattern of high diversity, our data also show that the evolutionary history of the Micropia retroelement in Drosophila is characterized by several VTT and HTT events. Although VTT may be the predominant form of transmission (94–98% of the events), HTT is clearly an important way for these genomic parasites to evade genomic extinction [17, 18]. In our data, the evidence for HTT in Micropia evolution came from three main sources: (i) the patchy distribution within the Drosophilidae phylogeny, (ii) the incongruence between Micropia and species phylogenies, and (iii) the significantly lower dS values presented by some Micropia sequences in comparison to nuclear host genes [17, 26]. For the first line of evidence, PCR and Dot-Blot analyses provided some interesting results, especially when evaluated together with the results obtained from genomic data to infer presence/absence patterns along the Drosophilidae phylogeny. Sequence analysis was then performed using amino acid data to reconstruct the Micropia phylogenetic relationships and codon-aligned nucleotide data to measure synonymous distances. This whole set of results allows us to propose a hypothesis about the evolution of Micropia sequences within Drosophilidae. The cardini group was the best-represented Drosophila group in our analysis, with 80% of its species analyzed (12 of the 15 described species; [67]). Of these, eight species presented Micropia sequences. Conversely, the melanogaster and repleta groups, for which several species have sequenced genomes, presented the highest percentage of species containing Micropia copies (100%). The number of isolated sequences is generally higher for species belonging to these groups, for which whole genome sequences are frequently available.
Nevertheless, the use of in vitro methodologies to investigate the presence of TEs in species of non-model groups proved to be an important strategy to establish a robust evolutionary hypothesis for the element. For example, using such methodologies we were able to identify the absence of Micropia copies in the genomes of several species belonging to distinct groups (funebris, guaramunu, guarani, immigrans, and tripunctata), thereby confirming the patchy distribution of Micropia in the Drosophila subgenus. The cardini group species showed an interesting Micropia distribution pattern. Micropia sequences are present only in the genomes of species occurring on the mainland, from southern North America to southern South America [68]. The other four species, D. arawakana, D. dunni, D. nigrodunni, and D. similis, which seem to be devoid of Micropia (S1 Fig), are endemic to the Caribbean islands [68]. The clustering of the Micropia sequences from the mainland cardini species and their high similarity in amino acid sequences suggest that the element invaded the genomes of these species around 1.5 mya, which is much more recent than the divergence times estimated for the target species (4–35 mya, as estimated by [52]). Considering this, it is interesting to note that 73% (8 of 11) of the Micropia RT sequences analyzed for the cardini group species seem to be capable of coding for the reverse transcriptase enzyme, which is also evidence in favor of a recent invasion. This invasion apparently occurred through multiple HTTs, as can be inferred from the comparison of pairwise Micropia dS values with the dS values of orthologous nuclear genes. This methodology is able to detect HTTs between closely related species [29]. In fact, all 51 comparisons involving only species of the cardini group showed significantly lower dS values for Micropia than for any of the three evaluated nuclear genes.
Nevertheless, although several HTT events seem to have occurred between species of the cardini group, it is quite probable that the ancestral sequence of this group came from a species belonging to the repleta group (or another related group not analyzed here), for which at least some sequences from subfamily 7 seem to have evolved through VTT. This can be seen, for example, in the failure to reject the null hypothesis of VTT when the dS values of the sequences Dhydei_X13304 and Dbuzzatti_04_2 are compared with those of the host nuclear genes. This pattern is also corroborated by [39]. Several other HTTs might also have occurred within the melanogaster group (53.3% of potential coding sequences), and evidence for these can be found within subfamilies 1, 4, 10, 11 and 14. In subfamily 10, for example, the Micropia copies in the D. melanogaster, D. simulans and D. sechellia genomes are identical, suggesting recent HTT events. Conversely, in subfamily 1, there are clear incongruences between the Micropia and species phylogenies, and a sequence encountered in D. suzukii may have been recently transferred to D. rhopaloa, given the earlier branching of the Micropia sequences from the D. suzukii genome. This event occurred around 5 mya. In fact, these species are included in different subgroups of the melanogaster group, for which divergence times at the same divergence level are older than 10 mya [46]. Interestingly, signals of HTT are less straightforward among species of the repleta group: despite the presence of sequences nested in different Micropia subfamilies, only subfamily 7 presents some evidence of HTT, involving D. hydei, D. buzzatii and D. mercatorum. Such events were dated to approximately 1.25 mya, which is considerably more recent than the divergence times estimated for these species (4–16 mya [52]).
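
The dates quoted above come from the authors' analysis (S6 Table); this section does not describe how they were computed. As a hedged illustration only, a common way to turn a pairwise synonymous distance into a time estimate is the strict molecular clock relation T = dS / (2r), where r is the synonymous substitution rate per site per million years. The function name and the rate used in the example are assumptions, not values taken from the paper.

```python
def divergence_time_my(ds: float, rate_per_site_per_my: float) -> float:
    """Time since two copies diverged, in million years, under a strict clock.

    T = dS / (2 * r): the factor 2 accounts for substitutions accumulating
    independently along both lineages. Both the clock assumption and the
    rate are illustrative choices, not the authors' stated method.
    """
    if ds < 0:
        raise ValueError("dS cannot be negative")
    if rate_per_site_per_my <= 0:
        raise ValueError("the substitution rate must be positive")
    return ds / (2.0 * rate_per_site_per_my)


# Hypothetical inputs: a dS of 0.048 with an assumed rate of 0.016
# substitutions per site per million years.
example_time = divergence_time_my(0.048, 0.016)
print(f"estimated time: {example_time:.2f} My")
```

Under such a relation, a Micropia dS much smaller than the host gene dS for the same species pair translates into a transfer date much younger than the species split, which is the pattern reported for the cardini and repleta groups.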
There are two common features between these events and those presented above for the cardini group: here, too, multiple HTTs can be inferred, and these fall within the same time confidence interval as those discussed above. Moreover, all the evaluated species of both the cardini and the repleta groups occur in the Neotropics [67], which faced severe climatic oscillations during this period [69]. Since it has already been shown that these events possibly changed the distribution of several Drosophila species [70, 71], they may have led to several secondary contacts that created the necessary conditions for HTT. All the HTTs discussed so far occurred between closely related species belonging to the same species group. According to [16], it is expected that the more species sampled within a group, the more HTT events will be discovered, since retrotransposons show low HTT rates between distantly related lineages. Nevertheless, considering the dS comparisons performed within each of the Micropia subfamilies, in association with the incongruences between species and Micropia phylogenies, we were also able to hypothesize the occurrence of at least seven other HTTs involving species from distinct Drosophila groups or even distinct subgenera, as follows:

Subfamily 3: since this subfamily is widely spread in the genomes of species belonging to the subgenus Siphlodora, one HTT must have occurred from a species of the Siphlodora subgenus to D. suzukii, the only species of the melanogaster group with sequences belonging to this Micropia subfamily.
Subfamily 7: the sequences Dhydei_X13304 and X13305 do not present signals of HTT with Dbuzzatti_04_2, so these sequences might be the ancestral copies within this subfamily. Thus, besides the HTTs within the cardini and repleta groups discussed above, and that from one species of the repleta group (possibly D. hydei) to a species of the cardini group, at least one HTT might have occurred from D. buzzatii to D. willistoni.
Subfamily 11: as Damericana_121 does not show signals of HTT in comparison with Dbusckii_03, they might represent ancestral sequences. Thus, at least one HTT might have occurred to species of the melanogaster group.
Subfamily 12: given the absence of HTT signals among several species of the melanogaster group, as well as among species of the Siphlodora subgenus, most of these copies possibly evolved through VTT since the most recent common ancestor (MRCA) of both lineages. Nevertheless, there is evidence of one HTT presumably from D. sechellia to D. willistoni, one from D. ananassae to D. albomicans, and one involving the MRCA of the melanogaster and Siphlodora lineages.
Subfamily 14: this Micropia subfamily is widespread in the melanogaster group, from which an HTT presumably occurred to D. americana.

In conclusion, the evolutionary history of Micropia is based on VTT and HTT events, with a high diversification of sequences leading to the distinct subfamilies detected here, and with some sequences still capable of encoding the RT enzyme. Moreover, species from the repleta and melanogaster groups seem to have played an important role in most HTT events inferred here within Drosophila. The wide distribution range occupied by some species of these groups possibly contributed to these phenomena by providing more opportunities for HTT due to ancient overlapping distributions with other species [16].

In vitro searches for Micropia within genomes.

A: PCR-blot results of species from the cardini and repleta groups. B: Dot-blot on genomic DNA confirming the pattern seen on the PCR-blot. In both cases, the probe used was an 812 bp PCR fragment from the D. hydei dhMiF2 sequence. Control: 5 μl (in 10 μl) of the Micropia probe. (TIF)

Dot-blot on genomic DNA.

The probe used was an 812 bp PCR fragment from the D. hydei dhMiF2 sequence. 1. D. funebris; 2. D. griseolineata; 3. D. maculifrons; 4. D. guaru; 5. D. ornatifrons; 6. D. immigrans; 7. D. bandeirantorum; 8. D. mediodiffusa; 9. D. mediopictoides; 10. D. mediopunctata; 11. D. paraguayensis; 12. D. paramediostriata; 13. D. tripunctata. +: positive control, 5 μl (in 10 μl) of Micropia probe; -: negative control, D. similis DNA. (TIF)

Bayesian phylogenetic tree of the 247 Micropia sequences recovered by our searches within the Drosophilidae species analyzed in this study after the first filtering strategy.

The phylogenetic tree was based on amino acid sequences following a mixed evolution model with gamma correction. Bargues and Lerat’s sequences [15] were included in the analysis. The posterior probability of each clade is indicated beside its respective internal branch. (TIF)

List of BLASTn and tBLASTn results.

Species scaffold: the scaffold in the species genome where the Micropia sequence was found. First nt: first nucleotide in the scaffold where the Micropia RT sequence homologous to our query was detected. Last nt: last nucleotide in the scaffold where the Micropia RT sequence homologous to our query was detected. BLAST id: BLAST identities. E-value: E-value recovered by BLAST searches. Methodology: database and in silico search methodology used to find the Micropia best match query. *Sequences used as queries in the initial BLAST searches. **Sequences remaining after the two-step BLAST search methodology. (XLSX)
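
The authors' response to the reviewers states that hits were kept with an E-value threshold of 1.0E-05 together with a minimum score of 50. The following Python sketch shows how such a filter could be applied to tabular BLAST output. It assumes the standard 12-column BLAST+ tabular layout (`-outfmt 6`, with E-value and bit score in the last two columns) and interprets "score" as the bit score; neither detail is confirmed by the text, and the identifiers in the example are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float
    evalue: float
    bitscore: float


def parse_tabular_hits(lines: Iterable[str]) -> List[Hit]:
    """Parse 12-column tab-separated BLAST output (assumed -outfmt 6)."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[2]),
                        float(fields[10]), float(fields[11])))
    return hits


def filter_hits(hits: Iterable[Hit], max_evalue: float = 1e-5,
                min_score: float = 50.0) -> List[Hit]:
    """Keep hits meeting both thresholds quoted in the authors' response."""
    return [h for h in hits
            if h.evalue <= max_evalue and h.bitscore >= min_score]


# Hypothetical example lines; query and scaffold names are made up.
example_lines = [
    "queryRT\tscaffold_12\t78.5\t640\t130\t4\t1\t640\t1500\t2139\t3e-40\t165.0",
    "queryRT\tscaffold_77\t65.0\t90\t30\t2\t10\t99\t200\t289\t0.002\t38.2",
]
kept = filter_hits(parse_tabular_hits(example_lines))
```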

GenBank accession numbers of nuclear genes used in the dS analysis.

Data not available. (XLSX)

Summary of the number of Micropia sequences recovered in each BLAST search step.

This includes the sequences obtained within the Drosophilidae genomes and the sequences used as queries (*). (XLSX)

Amino acid genetic distances between sequences belonging to the same Micropia subfamily.

Data for each subfamily are in distinct sheets in this Excel file. (XLSX)

Potentially coding sequences and their respective coding frame.

Sequences presenting stop codons are represented by a dash (-). The involvement in HTT was identified by Fisher’s exact test (see S6 Table). (XLSX)

Pairwise comparative analysis of dS values between Micropia and Adh, Amd and Ddc nuclear gene sequences.

Comparisons suggesting horizontal transposon transfer events were statistically tested by a one-tailed Fisher's exact test (Ost). Colors mark p-values lower than 0.05 (see Fig 2) for OstMicropia-Adh (orange), OstMicropia-Amd (pink) and OstMicropia-Ddc (purple). (XLSX)
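
As an illustration of the one-tailed Fisher's exact test mentioned in this caption, the sketch below computes the lower-tail p-value of a 2x2 table from the hypergeometric distribution, using only the Python standard library. The table layout (rows: Micropia and host gene; columns: synonymous differences and unchanged synonymous sites) is one plausible arrangement that we assume for the example; the caption does not specify it, and the counts are hypothetical.

```python
from math import comb


def fisher_exact_less(a: int, b: int, c: int, d: int) -> float:
    """One-tailed Fisher's exact p-value, P(X <= a), for the table

        [[a, b],
         [c, d]]

    with all row and column totals held fixed.
    """
    row1 = a + b          # total of the first row
    col1 = a + c          # total of the first column
    col2 = b + d          # total of the second column
    total = row1 + c + d
    denominator = comb(total, row1)
    k_min = max(0, row1 - col2)  # smallest count the first cell can take
    numerator = sum(comb(col1, k) * comb(col2, row1 - k)
                    for k in range(k_min, a + 1))
    return numerator / denominator


# Hypothetical counts: Micropia shows 5 synonymous differences in 200 sites,
# the host gene shows 40 in 200 sites for the same species pair.
p_value = fisher_exact_less(5, 195, 40, 160)
lower_ds_in_te = p_value < 0.05
```

A small p-value in this layout indicates that the transposon accumulated significantly fewer synonymous differences than the host gene, which is the signal the authors interpret as incompatible with vertical transmission.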

Nucleotide alignment comprising all Micropia sequences retrieved.

The 363 sequences were recovered through in vitro and in silico searches. The sequence used as outgroup (a Copia retroelement sequence from the D. melanogaster genome) was added to this alignment. (FAS)

Alignment of nucleotide sequences.

The sequences from S1 File were filtered to include only those showing a minimum overlap of 300 bp (first filtering strategy), yielding 247 Micropia sequences. The sequence used as outgroup (a Copia retroelement sequence from the D. melanogaster genome) was added to this alignment. (FAS)
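
The first filtering strategy keeps sequences with a minimum overlap of 300 bp. A minimal Python sketch of such a filter is given below. It assumes that overlap is measured as the number of alignment columns in which both a sequence and a chosen reference sequence carry a nucleotide rather than a gap. The caption does not say against what the overlap was measured, so the reference-based definition and all names here are hypothetical.

```python
from typing import Dict


def overlap_length(seq: str, reference: str) -> int:
    """Number of aligned columns where neither sequence has a gap."""
    return sum(1 for s, r in zip(seq, reference) if s != "-" and r != "-")


def filter_by_overlap(alignment: Dict[str, str], reference_id: str,
                      min_overlap: int = 300) -> Dict[str, str]:
    """Keep sequences overlapping the reference by at least min_overlap columns.

    `alignment` maps sequence names to equal-length aligned strings.
    """
    reference = alignment[reference_id]
    return {name: seq for name, seq in alignment.items()
            if overlap_length(seq, reference) >= min_overlap}


# Toy alignment with a lowered threshold so the example stays readable.
toy_alignment = {
    "ref": "ACGTACGTAC",
    "seq_long": "ACGTAC--AC",
    "seq_short": "------GT--",
}
kept_sequences = filter_by_overlap(toy_alignment, "ref", min_overlap=5)
```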

Alignment of amino acid sequences.

This alignment comprises the 247 Micropia amino acid sequences recovered after the first filtering strategy, plus the sequence used as outgroup (a Copia retroelement sequence from the D. melanogaster genome), and was employed for the assessment of reciprocal monophyly patterns regarding sequences retrieved from the same species (see S3 Fig). (FAS)

This alignment comprises the 100 Micropia amino acid sequences recovered after the second filtering strategy, plus the 50 sequences of the Micropia/Sacco group characterized by [15] and the sequence used as outgroup (a Copia retroelement sequence from the D. melanogaster genome), and was employed in the phylogenetic reconstruction of Fig 2. (FAS)

Codon alignment of nucleotide sequences.

This alignment comprises the 100 nucleotide Micropia sequences recovered after the second filtering strategy employed in the dS estimates. (FAS)

19 Aug 2019

PONE-D-19-19755
Evolutionary history and classification of Micropia retroelements in Drosophilidae species
PLOS ONE

Dear Dr Cordeiro,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Oct 03 2019 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. We look forward to receiving your revised manuscript. Kind regards, Ruslan Kalendar, PhD Academic Editor PLOS ONE Journal Requirements: 1. When submitting your revision, we need you to address these additional requirements. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. We note that you have included the phrase “data not shown” in your manuscript. Unfortunately, this does not meet our data sharing requirements. PLOS does not permit references to inaccessible data. We require that authors provide all relevant data within the paper, Supporting Information files, or in an acceptable, public repository. Please add a citation to support this phrase or upload the data that corresponds with these findings to a stable repository (such as Figshare or Dryad) and provide and URLs, DOIs, or accession numbers that may be used to access these data. Or, if the data are not a core part of the research being presented in your study, we ask that you remove the phrase that refers to these data. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. 
The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Partly ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: No Reviewer #2: No ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: No Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The authors of this study have previously published articles regarding transposable elements in the Drosophilidae. This study includes a large taxon sampling. Two techniques were used to sample species for presence or remnants of the Micropia TE. 
One technique was to sample “in vitro” from fly DNA isolated by the authors. The other sampling technique to search through published genomes referred to as “in silico.” The evolution of Micropia elements uses previously established and employed criteria of percent similarity. Overall: Slight editing for word usage and grammar needed. For example Line 28 “detaches” is incorrect usage. I would omit “detaches as” and replace with “is” Line 38 “identified combined” omit “combined” Line 41 “sequences found in” omit “found” Line 50 “McClintock” add ‘s to be “McClintock’s” With regard to the taxon sampling: For the natural populations for 24 Drosophila species - were these field collected by the authors? Have any morphological vouchers been deposited in a collection? Line 132 “Genomic DNA was prepared according to [44].” This paper should briefly summarize ref 44’s DNA prep procedure rather than expecting the reader to chase down publications to evaluate a study. Was the isolation through single fly preps or using a large quantity of flies such as 2 ml volume ground using a grinder or container that is reused. Previous horizontal transfer has been misidentified when a grinder was reused due to the sensitivity of PCR amplification. It was well-thought-out to use of three nuclear genes for comparison for rates of change in sequence divergence in the Micropia TE. Lines 261 and 262 The phylogenetic tree of the species used in this study “…was based on was based on data compiled from [49, 50, 51, 52, 53, 54 and 55].” Was this tree created by stitching together clades from these papers because there is no one taxonomic investigation that overlaps the species in this investigation? If this is correct this should be clearly stated to the reader. Reviewer #2: Cordeiro et al studied the phylogenetic dristribution of micropia sequences and showed that HTT can be an important component of the evolution history of micropia. The manuscript is well structured and the logic is sound. 
However, the method, especially for the in silico part, is too loose and may bias the results. The manuscript is also vague in methods and missing some important information (dS, divergence times, alignment, etc.) In addition, there are many grammar mistakes and writing need to be improved. Therefore, I recommend a major revision. Below are some more specific comments: The matching threshold of In silico searches were not stringent enough and there could be some false positives. I thus recommend the authors blasting with lower e-value threshold. The authors should also provide more details of the In silico search process, e.g. what database was used for blast. It would be better if the authors provide more statistics for the In silico searchs, e.g. how many hits were kept during each Blastn/tBlastn process. Line 50: This sentence read awkwardly. Line 60: The authors need to clarify what LTR stands for. Line 71: grammar mistakes and typo Line 170: The default e-value is usually too high. Line 177: Scores are dependent on gene length. I recommend to report e-value instead. Line 180: This sentence in confusing. Not sure what does it mean. Line 189: The authors need to explain why doing another round of Blast. Line 194: Why translate unaligned sequences? It seems to me that unaligned sequences were not micropia based on sequence similarity. Line 193: The authors need to provide the alignment sequences as a figure/table. Line 200: Manually removing gaps may bias the phylogenetic analyses Line 206: The authors need to justify why D.melanogaster was used as a outgroup. Line 225: It would be better if the authors could provide the dS and divergence times as a table ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. 
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step. 3 Oct 2019 Dear Reviewers, Please find attached the MS entitled “Evolutionary history and classification of Micropia retroelements in Drosophilidae species” (PONE-D-19-19755) which is a thoroughly revised version of the previous submission to PLOS ONE. We thank you for the opportunity as well as for the constructive criticism from both reviewers which allowed major improvements in our study. We followed carefully all recommendations and provide below a point-by-point explanation on how we proceeded in regard to each specific comment of the reviewers. We are confident the current version is much improved and hope it reaches the level of quality expected by PLOS ONE. We tracked the changes suggested by the reviewer by marking them in light blue in the text, and we believe that now all data underlying the findings are fully available as supplementary material. Sincerely, J. 
Cordeiro, on behalf of all authors _____________________ Reviewer #1: The authors of this study have previously published articles regarding transposable elements in the Drosophilidae. This study includes a large taxon sampling. Two techniques were used to sample species for presence or remnants of the Micropia TE. One technique was to sample “in vitro” from fly DNA isolated by the authors. The other sampling technique to search through published genomes referred to as “in silico.” The evolution of Micropia elements uses previously established and employed criteria of percent similarity. Reply: Thank you for all your constructive comments and for the opportunity to improve our manuscript. All requests and suggestions were addressed and we are confident that we now provided a more comprehensive manuscript. Overall: Slight editing for word usage and grammar needed. For example Line 28 “detaches” is incorrect usage. I would omit “detaches as” and replace with “is” Line 38 “identified combined” omit “combined” Line 41 “sequences found in” omit “found” Line 50 “McClintock” add ‘s to be “McClintock’s” Reply: We revised the MS for grammar and professional English use and the suggestions provided by the reviewer were implemented. With regard to the taxon sampling: For the natural populations for 24 Drosophila species - were these field-collected by the authors? Have any morphological vouchers been deposited in a collection? Reply: The species were kept in the Laboratory of Drosophilidae at Universidade Federal do Rio Grande do Sul, Brazil. Some species were obtained through the previous Tucson Drosophila Stock Center (current The National Drosophila Species Stock Center at Cornell University). Some of them were field collected by the researchers Dr. Marco Silva Gottschalk, Dr. Daniela De Toni, Dr. Jonas Döge and Dr. Luciano Basso da Silva during 2000-2009. In these cases, species vouchers are available at the Laboratory of Drosophilidae. 
In this way, we identified all collectors/suppliers in Table 1 and provided the above information in the main text. Line 132 “Genomic DNA was prepared according to [44].” This paper should briefly summarize ref 44’s DNA prep procedure rather than expecting the reader to chase down publications to evaluate a study. Was the isolation through single fly preps or using a large quantity of flies such as 2 ml volume ground using a grinder or container that is reused. Previous horizontal transfer has been misidentified when a grinder was reused due to the sensitivity of PCR amplification. Reply : In the MS, we improved this sentence to “Genomic DNA was extracted through phenol-isoamyl-chloroform protocol according to [44] with approximately 100 adult flies per species macerated in liquid nitrogen using individual new sterile grinders.” Lines 261 and 262 The phylogenetic tree of the species used in this study “…was based on was based on data compiled from [49, 50, 51, 52, 53, 54 and 55].” Was this tree created by stitching together clades from these papers because there is no one taxonomic investigation that overlaps the species in this investigation? If this is correct this should be clearly stated to the reader. Reply : Thanks for calling attention to this issue. Yes, there is only limited species overlap in the available phylogenetic studies so far. We now clearly stated this in the main text and in the legend of Figure 1. __________ Reviewer #2: Cordeiro et al studied the phylogenetic dristribution of micropia sequences and showed that HTT can be an important component of the evolution history of micropia. The manuscript is well structured and the logic is sound. However, the method, especially for the in silico part, is too loose and may bias the results. The manuscript is also vague in methods and missing some important information (dS, divergence times, alignment, etc.) In addition, there are many grammar mistakes and writing need to be improved. 
Therefore, I recommend a major revision. Reply: We want also to thank you for your constructive comments. They were essential to improve our manuscript. All requests, questions, and suggestions were addressed and we are confident that we now provided a more comprehensive manuscript. Below are some more specific comments: The matching threshold of In silico searches were not stringent enough and there could be some false positives. I thus recommend the authors blasting with lower e-value threshold. The authors should also provide more details of the In silico search process, e.g. what database was used for blast. It would be better if the authors provide more statistics for the In silico searchs, e.g. how many hits were kept during each Blastn/tBlastn process. Reply: Thanks for the comments. The Methodology section was improved to address the questions regarding E-value and target databases. Furthermore, basic statistics of the results were provided at the beginning of the Results section. We emphasize that the adopted E-value needed to be less stringent in order to recover divergent sequences and allow a confident description of the Micropia diversity and subdivision. Otherwise, it wouldn’t be possible to suggest a classification scheme for this retroelement. The effectiveness of the adopted strategy can be further accessed by the recovered phylogenetic tree, which supports the reciprocal monophyly of all retrieved Micropia sequences. Line 50: This sentence read awkwardly. Reply: The sentence was improved. Line 60: The authors need to clarify what LTR stands for. Reply: The acronym LTR was clarified through the use of long terminal repeats in the first appearance of LTR. Line 71: grammar mistakes and typo Reply: We improved the MS with English revision also correcting typos. Line 170: The default e-value is usually too high. Reply: We did not use a default E-value. 
A threshold of 1.0E-05 was adopted in our searches together with a minimum score of 50, and this is now clarified in the text.

Line 177: Scores are dependent on gene length. I recommend reporting the E-value instead.

Reply: Since both scores and E-values present some shortcomings, we are now presenting both values.

Line 180: This sentence is confusing. I am not sure what it means.

Reply: The sentence was improved to “Sequences that failed to align in this first multiple alignment steps underwent a second alignment step (this time pairwise or even local alignment) against the query sequence that presented the highest score in the BLASTn searches (hereafter “best query” sequence).” This is certainly an unusual strategy, which was necessary in order to align some divergent sequences.

Line 189: The authors need to explain why they ran another round of BLAST.

Reply: We performed this two-step BLAST strategy to achieve a better representation of the whole diversity of Micropia sequences encountered in Drosophilidae, in order to attain our goal of providing a classification scheme for this retroelement. Accordingly, we added the following sentence to the manuscript: “This two BLAST step strategy was performed to guarantee that the real diversity of Micropia sequences was retrieved from the genomes, enabling a better representation of these sequences in our dataset.”

Line 194: Why translate unaligned sequences? It seems to me that unaligned sequences were not Micropia based on sequence similarity.

Reply: Translation in all reading frames was performed with unaligned sequences in order to identify putatively encoding elements. If aligned sequences were used in this step, encoding sequences that did not present insertions encountered in other sequences could have been erroneously classified as inactive. The effectiveness of the adopted strategy can be further assessed by the recovered phylogenetic tree, which supports the reciprocal monophyly of all Micropia sequences.
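As an illustration of the last reply, translating an unaligned sequence in all six reading frames can be sketched as follows. This is a minimal sketch and not the authors' actual pipeline, whose software is not named here: the function names are hypothetical, the standard genetic code is assumed, and the length of the longest stop-free stretch is used only as a rough, assumed indicator of a putatively encoding copy.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0; a trailing partial codon is dropped
    and any codon with an ambiguous base becomes 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Translate a sequence in all six frames (hypothetical helper).

    Alignment gaps ('-') are stripped first, so the translation is done
    on the unaligned sequence, as described in the reply above.
    """
    seq = seq.upper().replace("-", "")
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


def longest_open_stretch(protein):
    """Length of the longest run of residues without a stop codon ('*')."""
    return max(len(part) for part in protein.split("*"))


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCCAAATGA").items():
        print(name, protein, longest_open_stretch(protein))
```

Because gaps are removed before translation, an insertion present only in other sequences of the alignment cannot shift the reading frame of the copy being evaluated, which is the concern raised in the reply.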
Line 193: The authors need to provide the alignment sequences as a figure/table.

Reply: Thanks for calling attention to this issue. We now provide the nucleotide and amino acid alignments employed in each step of the methodology as .fas FASTA files in the supplementary material (S1 – S5 Files).

Line 200: Manually removing gaps may bias the phylogenetic analyses.

Reply: In TE studies, this is a common strategy that allows amino acid and dS analysis of all obtained sequences [see, for example, Ludwig & Loreto (2007) and Mota et al. (2010)]. As TEs usually present frameshifts, if such a strategy were not adopted, only nucleotide sequences, or amino acid sequences of potentially encoding Micropia sequences, could be analyzed. The first of these possibilities does not usually attain a good phylogenetic resolution, whereas the second only provides a partial description of the evolutionary scenario. Thus, only the manual editing of gaps, which leaves all sequences in frame, made it possible to reach our aim of providing a classification scheme for Micropia sequences (by providing a resolved phylogenetic tree) and to infer putative HTTs (by enabling comparisons of dS estimates).

References:
Ludwig, A., & Loreto, E. L. S. (2007). Evolutionary pattern of the gtwin retrotransposon in the Drosophila melanogaster subgroup. Genetica, 130(2), 161-168.
Mota, N. R., Ludwig, A., da Silva Valente, V. L., & Loreto, E. L. S. (2010). Harrow: new Drosophila hAT transposons involved in horizontal transfer. Insect Molecular Biology, 19(2), 217-228.

Line 206: The authors need to justify why D. melanogaster was used as an outgroup.

Reply: In this study, we used a Copia-like transposable element sequence as the outgroup, as it belongs to a transposable element superfamily (Ty1/Copia) distinct from that of Micropia (Ty3/Gypsy) (Bargues and Lerat 2017). This Copia-like retroelement was first found in the D. melanogaster genome (Saigo et al. 1984). We improved the sentence for better comprehension.
References:
Bargues N, Lerat E. Evolutionary history of LTR retrotransposons among 20 Drosophila species. Mobile DNA. 2017; 8:7.
Saigo K, Kugimiya W, Matsuo Y, Inouye S, Yoshioka K, Yuki S. (1984) Identification of the coding sequence for a reverse transcriptase-like enzyme in a transposable genetic element in Drosophila melanogaster. Nature 312: 659–661.

Line 225: It would be better if the authors could provide the dS and divergence times as a table.

Reply: We now provide the divergence time of Micropia sequences within supplementary S5 Table.

Submitted filename: Response to reviewers.docx

7 Oct 2019

Evolutionary history and classification of Micropia retroelements in Drosophilidae species

PONE-D-19-19755R1

Dear Dr. Cordeiro,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact.
If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,
Ruslan Kalendar, PhD
Academic Editor
PLOS ONE

10 Oct 2019

PONE-D-19-19755R1

Evolutionary history and classification of Micropia retroelements in Drosophilidae species

Dear Dr. Cordeiro:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Ruslan Kalendar
Academic Editor
PLOS ONE