Rosa Maria Cossu, Claudio Casola, Stefania Giacomello, Amaryllis Vidalis, Douglas G Scofield, Andrea Zuccolo.
Abstract
The accumulation and removal of transposable elements (TEs) is a major driver of genome size evolution in eukaryotes. In plants, long terminal repeat (LTR) retrotransposons (LTR-RTs) represent the majority of TEs and form most of the nuclear DNA in large genomes. Unequal recombination (UR) between LTRs leads to removal of intervening sequence and formation of solo-LTRs. UR is a major mechanism of LTR-RT removal in many angiosperms, but our understanding of LTR-RT-associated recombination within the large, LTR-RT-rich genomes of conifers is quite limited. We employ a novel read-based methodology to estimate the relative rates of LTR-RT-associated UR within the genomes of four conifer and seven angiosperm species. We found the lowest rates of UR in the largest genomes studied, conifers and the angiosperm maize. Recombination may also resolve as gene conversion, which does not remove sequence, so we analyzed LTR-RT-associated gene conversion events (GCEs) in Norway spruce and six angiosperms. Opposite the trend for UR, we found the highest rates of GCEs in Norway spruce and maize. Unlike previous work in angiosperms, we found no evidence that rates of UR correlate with retroelement structural features in the conifers, suggesting that another process is suppressing UR in these species. Recent results from diverse eukaryotes indicate that heterochromatin affects the resolution of recombination, by favoring gene conversion over crossing-over, similar to our observation of opposed rates of UR and GCEs. Control of LTR-RT proliferation via formation of heterochromatin would be a likely step toward large genomes in eukaryotes carrying high LTR-RT content.
Keywords: Picea; Pinus; angiosperm; gene conversion; genome size; gymnosperm; recombination suppression; retroelement
Year: 2017 PMID: 29228262 PMCID: PMC5751070 DOI: 10.1093/gbe/evx260
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1.—Method to estimate ratio of solo-long terminal repeats (LTRs) to complete LTR-retrotransposons (RTs) within a species. (I) Retrieve or assemble 3 to 10 paralogs for each LTR-RT group. (II) Extract 50-nt START and END tags from LTRs of paralogs. (III) Find genomic reads matching START and END tags with RepeatMasker (Smit et al. 2015), allowing for mismatches. (IV) For each matching read, extract a 20-nt tract containing 5 nt from the tag and 15 nt flanking sequence. Tracts are taken from the 5′ or 3′ ends of START or END tag matches, respectively. (V) Map each tract to the LTR-RT paralogs collected in (I) using BWA ALN (Li and Durbin 2009), allowing for mismatches. Count the numbers of mapped (M) and unmapped (U) tracts. Genomic reads covering complete LTR-RTs yield tracts that are mapped and unmapped in equal numbers, while genomic reads covering solo LTRs produce only unmapped tracts. (VI) The relative genomic content of solo LTRs to complete LTR-RTs is inferred from the ratio of mapped to unmapped tracts. See “Methods” section for further details and pipeline validation results.
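Steps (IV) and (VI) of the caption above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the 0-based half-open match coordinates, and the estimator (U − M)/M are assumptions. The estimator follows from the caption's statement that complete elements contribute mapped and unmapped tracts in equal numbers while solo LTRs contribute only unmapped ones, so the unmapped excess U − M is attributed to solo LTRs. It further assumes equal read coverage of both element types, and clamping a negative excess to zero is an arbitrary choice made here.

```python
def extract_tract(read, match_start, match_end, tag_kind, tag_nt=5, flank_nt=15):
    """Return the 20-nt tract for one tag match, or None if the read is too short.

    read        -- read sequence, oriented like the LTR (assumption)
    match_start -- 0-based start of the tag match within the read
    match_end   -- 0-based end (exclusive) of the tag match within the read
    tag_kind    -- "START" (tract from the 5' end of the match) or
                   "END" (tract from the 3' end of the match)
    """
    if tag_kind == "START":
        # 15 nt of upstream flank followed by the first 5 nt of the tag
        lo, hi = match_start - flank_nt, match_start + tag_nt
    elif tag_kind == "END":
        # last 5 nt of the tag followed by 15 nt of downstream flank
        lo, hi = match_end - tag_nt, match_end + flank_nt
    else:
        raise ValueError("tag_kind must be 'START' or 'END'")
    if lo < 0 or hi > len(read):
        return None  # not enough flanking sequence on this read
    return read[lo:hi]


def solo_to_complete_ratio(mapped, unmapped):
    """Estimate solo-LTR : complete LTR-RT ratio from tract counts M and U.

    Hypothetical estimator: complete elements give M == U, so the excess
    of unmapped tracts (U - M) is credited to solo LTRs.
    """
    if mapped <= 0:
        raise ValueError("need at least one mapped tract")
    return max(unmapped - mapped, 0) / mapped


# Toy read: 15 nt upstream flank, a 50-nt tag match, 15 nt downstream flank.
upstream = "A" * 15
tag = "CCCCC" + "G" * 40 + "TTTTT"
downstream = "ACGTACGTACGTACG"
read = upstream + tag + downstream

start_tract = extract_tract(read, 15, 65, "START")
end_tract = extract_tract(read, 15, 65, "END")
```

With these toy inputs, `start_tract` holds the 15 upstream bases plus the first 5 tag bases, and `end_tract` holds the last 5 tag bases plus the 15 downstream bases. Counts of M = 100 and U = 300 would give an estimated ratio of 2 solo LTRs per complete element under the stated assumptions.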
Fig. 2.—Ratios of solo-LTRs to complete LTR-RT elements, as a proxy for rates of unequal recombination, from seven angiosperm species and four conifer species versus genome size (log10 axis). For each species, ratios for separate LTR-RT groups are shown together with the total ratio of solo-LTRs to complete LTR-RT elements for all tracts. Shown above Brachypodium distachyon, Vitis vinifera, and Oryza sativa are the numbers of LTR-RT groups from each species with ratios that exceed the upper limit of the y-axis. See supplementary table S1, Supplementary Material online for genome size references and supplementary tables S2 and S3, Supplementary Material online for all LTR-RT group ratios.
Fig. 3.—Proportion of examined LTR-RTs with intraelement gene conversion events (GCEs) between LTRs versus genome size (log10 axis). Pooled results for all identified GCEs are shown, together with separate results for Gscale parameters in order of increasing stringency against mismatches for detection of GCEs between aligned sequences; see “Methods” section for further details. Species are colored as in figure 2.
Fig. 4.—Characteristics of examined LTR-RTs inferred to contain (values on x-axis) or lack (values on y-axis) GCEs; the diagonal dashed lines represent equal values in both cases. Plotted values are within-species means ± standard error. Separate Picea abies values are shown for LTR-RTs in fosmid pool assemblies (filled circles) and the genome assembly (open circles); the latter contains a biased, lower proportion of repetitive sequences than the P. abies genome in vivo, see main text. Arabidopsis thaliana is excluded due to just one observed GCE. Species are colored as in figure 2.
Fig. 5.—Proportion of examined LTR-RTs with intraelement GCEs between LTRs versus the total ratio of solo-LTRs to complete LTR-RT elements, as a proxy for rates of unequal recombination. Proportion of GCEs shown is for all identified GCEs (equivalent to solid dots in fig. 3). Species are colored as in figure 2 and symbol area is proportional to genome size of each species. The correlation among the six small- to medium-genome species is positive (Spearman’s rs = 0.841, P = 0.036) while including the two large-genome species reverses and weakens the correlation to nonsignificance (Spearman’s rs = −0.216, P = 0.61).