Literature DB >> 27507140

Genome-wide variation in recombination rate in Eucalyptus.

Jean-Marc Gion1, Corey J Hudson2,3, Isabelle Lesur4, René E Vaillancourt2, Brad M Potts2, Jules S Freeman5.   

Abstract

BACKGROUND: Meiotic recombination is a fundamental evolutionary process. It not only generates diversity, but influences the efficacy of natural selection and genome evolution. There can be significant heterogeneity in recombination rates within and between species, however this variation is not well understood outside of a few model taxa, particularly in forest trees. Eucalypts are forest trees of global economic importance, and dominate many Australian ecosystems. We studied recombination rate in Eucalyptus globulus using genetic linkage maps constructed in 10 unrelated individuals, and markers anchored to the Eucalyptus reference genome. This experimental design provided the replication to study whether recombination rate varied between individuals and chromosomes, and allowed us to study the genomic attributes and population genetic parameters correlated with this variation.
RESULTS: Recombination rate varied significantly between individuals (range = 2.71 to 3.51 centimorgans/megabase [cM/Mb]), but was not significantly influenced by sex or cross type (F1 vs. F2). Significant differences in recombination rate between chromosomes were also evident (range = 1.98 to 3.81 cM/Mb), beyond those which were due to variation in chromosome size. Variation in chromosomal recombination rate was significantly correlated with gene density (r = 0.94), GC content (r = 0.90), and the number of tandem duplicated genes (r = -0.72) per chromosome. Notably, chromosome level recombination rate was also negatively correlated with the average genetic diversity across six species from an independent set of samples (r = -0.75).
CONCLUSIONS: The correlations with genomic attributes are consistent with findings in other taxa, however, the direction of the correlation between diversity and recombination rate is opposite to that commonly observed. We argue this is likely to reflect the interaction of selection and specific genome architecture of Eucalyptus. Interestingly, the differences amongst chromosomes in recombination rates appear stable across Eucalyptus species. Together with the strong correlations between recombination rate and features of the Eucalyptus reference genome, we maintain these findings provide further evidence for a broad conservation of genome architecture across the globally significant lineages of Eucalyptus.

Entities:  

Keywords:  Crossover; Eucalyptus globulus; Gene density; Genetic diversity; Genomic features; Meiotic recombination

Mesh:

Year:  2016        PMID: 27507140      PMCID: PMC4979139          DOI: 10.1186/s12864-016-2884-y

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Meiotic recombination is a fundamental evolutionary process, which is believed to have been central to the evolutionary success of eukaryotes. When crossover recombination occurs, DNA strands are exchanged between homologous chromosomes, resulting in allele shuffling. This generates novel genetic variation [1], by breaking the associations that occur both within and between linked genes [2]. As this genetic variation is sieved by natural selection, it shapes the adaptive evolution of living organisms and thus, indirectly, genome evolution. Aside from allele shuffling, recent studies have identified four other processes by which recombination directly influences genome evolution [1, 3]. These are: GC-biased gene conversion, whereby recombination promotes GC enrichment at repair sites; Hotspot drive, through which a higher transmission of non-recombinant alleles likely contributes to rapid shifts in the location of recombination hotspots [4, 5]; through the potential Mutagenic effect of recombination itself, which causes structural mutations; and Indel drive, a neutral mechanism that may drive genome contraction by a higher transmission of chromosomal deletions compared to insertions [6, 7]. In combination, these processes enable recombination to substantially influence genome evolution at a variety of genomic scales. Historically, recombination was studied through observation of chiasmata using microscopy [8], or by genetic linkage mapping using phenotypic markers. More recently, recombination has been studied through variation in recombination rate (RR); estimated from genetic distance (i.e., number of recombination events) divided by physical distance (the size in base pairs of DNA over which the genetic distance is measured) using three main approaches, linkage disequilibrium mapping, sperm typing and linkage mapping using molecular markers. The major differences between these approaches are whether they measure historical or current recombination and their applicability to populations, a single, or few individuals [9]. Linkage disequilibrium mapping estimates historical recombination at the population level, and is thus useful for understanding population evolution, but is affected by factors such as genetic drift, demography, mutation rate and natural selection. In contrast, both sperm typing and linkage mapping are little affected by these factors and provide direct estimates of current recombination across a single generation [9]. Linkage mapping is a classic technique widely used to study recombination in selfed or bi-parental families. Direct estimates of RR can be obtained from the genetic distance between pairs of linked markers compared to their physical distance. The resolution of linkage mapping studies is limited by sample size and the density of molecular markers. However, the use of high-throughput marker systems and whole genome sequences potentially allows fine scale study of RR, as well as comparison between sexes, populations and individuals [10]. Understanding variation in RR will contribute to a better understanding of genome evolution and is fundamental to many aspects of genetics. For example, knowing the genomic distribution of RR is important for predicting the potential of populations to respond to environmental change, as well as for breeding using both traditional and genomic approaches. Specifically, characterising RR will help predict the degree of linkage disequilibrium to guide marker densities needed for genomic selection, a technique which is showing promise for both animal and plant breeding [11]. It also has practical applications when trying to isolate the specific genes underlying quantitative trait loci (QTL) that have been detected in linkage mapping or association genetic studies, and for attempts to introduce favourable alleles into breeding lines [12, 13]. Most detailed studies of recombination and linkage disequilibrium have been accomplished in human, yeast, Drosophila or Arabidopsis, which have the required genomic resources (such as good reference genomes and high throughput marker systems). Although many of the factors which influence RR are conserved across eukaryotic organisms, empirical studies consistently report variation in RR between taxa, individuals and across genomes [9, 10, 14]. In view of this variation, it is important to obtain a broad understanding of RR across diverse taxa. This is particularly true for long-lived organisms such as forest trees which often have many genetic features, including low rates of selfing, high heterozygosity and high population genetic diversity [15, 16] which distinguish them from most model species. Trees often define and structure forest ecosystems; play vital ecosystem services, including climate-change mitigation; provide invaluable products; and are widely used for human recreation. Given their environmental and economic significance, understanding the genetic mechanisms which shape the evolution and diversity of trees is of major importance. The genomic resources required to study RR have only recently become available in forest trees [17-19] thus there is little knowledge of the genomic distribution of RR in these organisms. Nonetheless, insights into the patterns of recombination in tree species have been gained from studies of linkage disequilibrium (e.g., Slavov et al. [20], reviewed by Thavamanikumar et al. [21]) and comparative mapping [22]. Assembled reference genomes are available for hardwood forest trees, such as Populus [17], Eucalyptus [19], Quercus [23] and Castanea [24] providing the opportunity to directly estimate genome-wide RR using linkage maps. Recent studies in Eucalyptus have compared genetic to physical map distance and reported RR averaged over the whole genome [25] or for each chromosome [26, 27]. Silva-Junior and Grattapaglia [27] also calculated high resolution estimates of linkage disequilibrium and ‘population scaled recombination’ rate (an estimate of the history of recombination in different genomic segments), and its genomic correlates. However, for direct estimates of RR there is little information regarding variation between individuals, factors influencing this variation, nor its correlation with genomic features in forest trees. Eucalyptus is a diverse genus containing over 700 species across thirteen subgenera [28]. Most species belong to subgenus Symphyomyrtus including the majority of economically important plantation species [29], and the two species used in this study; E. grandis (section Latoangulatae) and E. globulus (section Maidenaria) [28]. Eucalypts generally have a mixed-mating system (i.e., they can reproduce both by selfing and outcrossing), and are mostly animal pollinated. Their outcrossing rate is variable, but generally high and most species are highly polymorphic [29, 30]. For example, recent findings indicate genome-wide diversity in E. grandis [27] is at the high end for plants, in line with other angiosperm trees such as Populus trichocarpa and an order of magnitude higher than most conifers, corroborating previous findings in eucalypt [19, 29]. All eucalypt species studied to date possess the same chromosome number (n = 11), although their genomes vary in size from 360 Mb (Corymbia variegata) to 640 Mb (Eucalyptus grandis) [29]. Despite this variation in genome size, there is a high degree of karyotype stability between species [29], which includes broad conservation of chromosome structure such as the location of centromeres and heterochromatin content within the subgenus Symphyomyrtus [31], and this is supported by comparative mapping studies [32, 33]. Specifically, there was no evidence for gross chromosomal rearrangements between (the 530 Mb genome of) E. globulus and an E. grandis/urophylla consensus map based on nearly 400 shared markers [33]. Re-sequencing of the E. globulus and E. grandis genomes suggested that the differences in genome size between these species are largely attributable to many small changes distributed throughout their genomes [19]. Further, the genomic distribution of diversity within and divergence between eucalypt species in subgenus Symphyomyrtus, appears stable and correlated with genomic architecture [34], providing further evidence for genome stability. In view of this stability, the numbering of chromosomes in most recent molecular studies in eucalypt has followed the consensus linkage map of Brondani et al. [35], and this order was used for the 11 major scaffolds in the E. grandis genome [19]. We here focus on Eucalyptus globulus, and exploit the recently published Eucalyptus grandis reference genome in conjunction with 10 linkage maps constructed with sequenced anchored markers. This experimental design allowed testing of the effects of various factors reported in other species to be associated with chromosomal variation in RR, including chromosome, sex, and cross-type; genomic features such as GC content, density of genes and transposable elements; and the population genetic parameters, genetic diversity and divergence (calculated from six species from subgenus Symphyomyrtus, [34]; see methods for detail).

Results

Variation between individuals

The difference in RR amongst parents (Fig. 1a) was highly significant (F9,90 = 6.1, P < 0.001) and due mainly to the high RR of two F1 parents (F1.4-F and F1.5-M) and low RR in the male and female parents of one of the F2 families (F2.C). The RR of the F1 families (3.1 ± 0.09 cM/Mb) was greater than that of the F2 families (2.8 ± 0.11 cM/Mb), but not significantly so (F1,8 = 3.9, P = 0.085). The F2 families had higher levels of heterozygosity than the F1 (2-tailed t8 = 7.04, P < 0.001). However while negative, the correlation between parental heterozygosity and RR was not significant. There was no significant differences in RR between sexes (F1,8 = 0.39, P = 0.549). The genome-wide RR ranged between 2.71 and 3.51 cM/Mb between the 10 E. globulus parents, and averaged 2.98 cM/Mb (Table 1). These values were calculated across linkage maps ranging in size (i.e., genetic distance, estimated by Haldane’s mapping function) from 1442 to 1787 cM (Table 2), averaging 1545 cM.
Fig. 1

Variation in recombination rate between individuals and chromosomes in Eucalyptus globulus. a Mean genome-wide RR for each of the 10 Eucalyptus globulus individuals. b Mean RR in each chromosome across the 10 individuals. Means with different letters were significantly different in pairwise comparisons following the Tukey-Kramer (honestly significant difference, HSD) adjustment

Table 1

Chromosome and genome scale descriptive statisticsa of the data used to study recombination in Eucalyptus

Level and variableMinMaxMeanCVe
Chromosome averages (n = 11)
 Physical size (Mb)39.0280.1055.100.29
 Diversity (global HHw)b 0.1960.2290.1960.04
 Divergence (global Fst)c 0.3850.4050.3540.03
 GC content (%)35.6038.2937.230.02
 RR (cM/Mb)1.983.812.980.20
Individual genome-wide averages (n = 10)
 Genetic distance (cM)1442178715450.08
 Physical size (Mb)508.3569.5543.10.03
 Genome coverage (%)84.1393.5589.370.03
 Heterozygosityd 6308157080.10
 RR (cM/Mb)2.713.512.980.08

aDiversity and divergence were estimated by Hudson et al. [34]. The remaining variables were estimated from our E. globulus data and the E. grandis genome [19]

bThe diversity within 6 Eucalyptus species, including E. globulus

cDivergence between the 6 Eucalyptus species

dEstimated from the total number of polymorphic markers derived from the DArT assay (see Methods)

eCoefficient of variation

Table 2

Detailed summary of the linkage maps used in this study

Scaffold ID1234567891011
CrossPa Scaffold Size (Mb)b 40.364.280.142.074.753.952.474.339.039.345.5
F1.1FInterval Number88156171012111058
Total length (cM)1221701749912519914720613291122
Phys. Size (Mb)36.158.067.640.671.052.651.764.737.226.541.5
Mean RR (cM/Mb)3.382.932.572.441.763.782.843.193.553.442.94
MInterval Number7101681110914849
Total length (cM)1501761598212917310512899113139
Phys. Size (Mb)38.154.873.034.172.349.241.856.435.129.142.3
Mean RR (cM/Mb)3.933.212.182.401.783.522.512.272.823.883.29
F1.4FInterval Number8919616913138312
Total length (cM)17619918498145183180224118117163
Phys. Size (Mb)37.455.375.836.072.345.349.959.334.325.443.9
Mean RR4.713.602.432.722.014.043.613.773.444.613.72
MInterval Number91017412812166716
Total length (cM)1531441421101431541151778299137
Phys. Size (Mb)39.351.874.534.966.338.951.271.933.531.041.5
Mean RR (cM/Mb)3.902.781.913.152.163.962.252.462.453.193.30
F1.5FInterval Number7717710911149911
Total length (cM)115142136101109165152170104104144
Phys. Size (Mb)34.447.967.935.659.440.251.965.336.127.642.1
Mean RR (cM/Mb)3.352.962.002.841.844.112.932.602.883.773.42
MInterval Number614127121181071111
Total length (cM)139157184116166181173206109136151
Phys. Size (Mb)31.656.871.635.771.052.651.166.436.732.142.3
Mean RR (cM/Mb)4.402.762.573.252.343.443.383.102.974.243.57
F2.JFInterval Number6131781411101591010
Total length (cM)107167139101129163101158105162168
Phys. Size (Mb)36.156.076.136.769.743.347.266.434.836.743.4
Mean RR (cM/Mb)2.962.981.832.751.853.762.142.383.024.423.87
MInterval Number10917121314141581416
Total length (cM)12916515392141164142181121130124
Phys. Size (Mb)36.155.075.432.372.953.152.371.737.533.242.7
Mean RR (cM/Mb)3.573.002.032.851.933.092.712.523.233.922.90
F2.CFInterval Number513169141510141198
Total length (cM)105164160110144168133137117120118
Phys. Size (Mb)35.061.169.738.067.052.251.164.738.035.042.3
Mean RR (cM/Mb)3.002.682.302.902.153.222.602.123.083.432.79
MInterval Number6111811121214178812
Total length (cM)11917213992136189121178102113124
Phys. Size (Mb)32.661.576.239.169.452.050.372.338.335.642.3
Mean RR (cM/Mb)3.652.801.822.351.963.642.412.462.663.182.93

aP = Parent, F = Female, M = Male bPhys. Size = Physical Size

Variation in recombination rate between individuals and chromosomes in Eucalyptus globulus. a Mean genome-wide RR for each of the 10 Eucalyptus globulus individuals. b Mean RR in each chromosome across the 10 individuals. Means with different letters were significantly different in pairwise comparisons following the Tukey-Kramer (honestly significant difference, HSD) adjustment Chromosome and genome scale descriptive statisticsa of the data used to study recombination in Eucalyptus aDiversity and divergence were estimated by Hudson et al. [34]. The remaining variables were estimated from our E. globulus data and the E. grandis genome [19] bThe diversity within 6 Eucalyptus species, including E. globulus cDivergence between the 6 Eucalyptus species dEstimated from the total number of polymorphic markers derived from the DArT assay (see Methods) eCoefficient of variation Detailed summary of the linkage maps used in this study aP = Parent, F = Female, M = Male bPhys. Size = Physical Size

Variation between chromosomes

The 11 chromosomes of E. globulus differed markedly in RR. Averaged across parents, chromosome-wide RR ranged from 1.98 to 3.81 cM/Mb (Table 1). The difference between chromosome averages was highly significant when tested using parents as replicates (F10,90 = 34.0, P < 0.001), and was also significant (F10,89 = 92.7, P < 0.001) after removing the correlated effect of chromosome size (see discussion below). The variation between chromosomes was mainly due to the high RR in chromosomes 10, 1 and 6 and low RR in chromosomes 3 and 5 (Fig. 1b). The difference between chromosomes was moderately to highly positively correlated with previously published estimates for E. grandis x urophylla consensus maps ([26] n = 11, Pearson r = 0.90, P < 0.001; [25] n = 11, Pearson r = 0.92, P < 0.001) as well as parental maps [27] of E. grandis (n = 11, Pearson r = 0.72, P = 0.012) and E. urophylla (n = 11, Pearson r = 0.55) although the latter was not significant (P = 0.081). In no case was the interaction of chromosome with either sex (F10,80 = 1.17, P = 0.327) or cross type (F10,80 = 1.03, P = 0.426) significant, indicating that the chromosomal differences in RR were stable across sex and cross-type in this study. Several genomic features of the E. grandis reference genome were correlated with the chromosomal variation in RR (Table 3). Chromosomes with high RR were smaller, had higher GC content, higher gene density, lower frequencies of several categories of transposable elements and fewer tandemly duplicated genes (Table 4). While the association with transposable elements as a whole was not significant (r = −0.59, P = 0.056), significant negative correlations were observed for specific classes of transposable elements (DNA transposable elements r = −0.76, P = 0.007 and uncategorised elements, i.e., those that could not be assigned to any specific transposable element class, r = −0.97, P < 0.001), and superfamilies within both the retrotransposons (long interspersed nuclear elements/short interspersed nuclear elements [LINE/SINE] r = −0.78, P = 0.005) and DNA transposable elements (helitrons r = −0.69, P = 0.019). Chromosomes with high RR also tended to have a lower density of tandem gene duplications (r = −0.72, P = 0.012). However, most of these genomic features were inter-correlated (Table 3) and when the effect of gene density (the most highly correlated feature in Table 3) was removed by calculation of partial correlation coefficients, correlations between RR and all other factors listed in Table 3 became insignificant. Partialling out the effect of chromosome size had little effect on the correlation between RR and other genome features, and the correlations involving gene density (partial r = 0.87, P < 0.001) and %GC (partial r = 0.76, P = 0.010) remained significant.
Table 3

Pearson correlations among chromosome attributes (n = 11)

RRChromosome lengthGC contentGene densityTransposable element densityNumber of tandem duplicated genesDiversity
Chromosome length (bp)−0.74**
GC content (%)0.90***−0.76**
Gene density0.94***−0.70*0.91***
Transposable element density−0.59ns0.32ns−0.29ns−0.62*
Number of tandem duplicated genes−0.72*0.53ns−0.70*−0.75**0.46ns
Diversity (Global HHW)a −0.75**0.61*−0.82**−0.85***0.41ns0.78**
Divergence (Global Fst)b 0.90***−0.84**0.95***0.92***−0.40ns−0.79**−0.91***

aThe diversity within 6 Eucalyptus species, including E. globulus

bDivergence between the 6 Eucalyptus species

Significance levels are: *** P < 0.001, ** P < 0.01, * P < 0.05, ns P ≥ 0.05

Table 4

Chromosome mean values for the main attributes used in correlation analysesa

ChromosomeRecombination rate (cM/Mb)Chromosome length (Mbp)b GC content (%)Gene densityc Transposable element densityd Number of tandem duplicated genese Diversity (Global HHW)Divergence (Global Fst)
13.6840.300.38063.55401.860.0880.1810.400
22.9764.240.36453.05392.640.0920.2060.376
32.1680.090.35742.84416.240.1030.2290.354
42.7741.980.37253.03430.050.0850.1880.395
51.9874.730.35344.35418.500.1000.2150.359
63.6653.890.37771.49361.550.0770.1750.400
72.7452.450.36852.70412.180.0920.2190.373
82.6974.330.37556.09439.170.0900.1890.383
93.0139.020.37561.00398.210.0960.1940.391
103.8139.360.38369.77406.390.0900.1880.405
113.2745.510.37767.35389.910.0880.1650.403

aAll chromosome attributes are from the Eucalyptus grandis reference genome [19], except diversity and divergence which were estimated by Hudson et al. [34]

bEquivalent to 'Scaffold Size' in Table 2

c Number of genes per Mbp

d Number of transposable elements per Mbp

e The proportion of annotated genes which are tandem duplicates

Pearson correlations among chromosome attributes (n = 11) aThe diversity within 6 Eucalyptus species, including E. globulus bDivergence between the 6 Eucalyptus species Significance levels are: *** P < 0.001, ** P < 0.01, * P < 0.05, ns P ≥ 0.05 Chromosome mean values for the main attributes used in correlation analysesa aAll chromosome attributes are from the Eucalyptus grandis reference genome [19], except diversity and divergence which were estimated by Hudson et al. [34] bEquivalent to 'Scaffold Size' in Table 2 c Number of genes per Mbp d Number of transposable elements per Mbp e The proportion of annotated genes which are tandem duplicates Chromosomal variation in RR was also associated with differences in marker diversity (assuming Hard-Weinberg equilibrium [H]) and species divergence (measured as the proportion of the genetic variance in a sub population relative to the total genetic variance [F]). It was highly negatively correlated with the pooled genetic diversity (global H) within the six Eucalyptus species studied by Hudson et al. [34] (n = 11, r = −0.75, P = 0.008). The correlation between chromosomal RR and H of each individual species, including E. globulus, were similar in size and all also negative (r = −0.60 to −0.74). The trends in global diversity were also evident regardless of whether markers were intra- or intergenic. However, for global H the correlation was only significant for intragenic markers (r = −0.65, P = 0.032). The global F amongst the six species studied by Hudson et al. [34] was positively correlated with RR at the chromosome level (r = 0.90, P < 0.001). This relationship with F was also found for both intragenic (r = 0.93, P < 0.001) and intergenic (r = 0.75 P = 0.008) markers.

Discussion

This is the first study in forest trees to present direct, genome-wide, estimates of RR from multiple mapping pedigrees. We show RR varies between individuals as well as between chromosomes. The relative differences amongst chromosomes appear stable across eucalypt species, and are correlated with genomic features commonly related to RR in diverse taxa; including GC content, gene density and the density of specific transposable elements. Significantly, we show that the differences in RR are negatively correlated with species-wide genetic diversity at the chromosome level. The direction of this correlation is the opposite to that commonly observed (see reviews by Smukowski and Noor [9], and Cutter and Payseur [36]) and, we argue, is likely to reflect an interaction of selection and genome architecture. The variation we found in genome-wide RR between individuals is consistent with reports in other studies [9, 37, 38] and may be due to environmental or genetic factors. RR has been shown to be affected by the environment in both plants and animals [39-41]. However, the role of the environment cannot be clarified in the present study as our parents come from different environments as well as being genetically distinct. Genetic factors which may influence RR, include sex and cross type. Sex specific differences have been reported in some taxa, with elevated RR often occurring in the gender subject to gametic selection [42]. However, no difference in RR between sexes was evident in E. globulus. There was a signal in our data, albeit not significant, that cross type could explain some of the variation between individuals, with lower average RR in the F2’s compared with the F1’s. However further study will be required to validate this. Irrespective of whether the genome-wide variation in RR we observed between individuals is due to genetic or environmental factors, it is clear that representative maps of the RR landscape in Eucalyptus species will need to integrate results from multiple crosses and individuals. Our genome-wide average RR for E. globulus (2.98 cM/Mb) was higher than estimates from E. grandis (2.17 cM/Mb), E. urophylla (2.24 cM/Mb) [27], and E. grandis/urophylla consensus maps (1.58 cM/Mb and 1.95 cM/Mb; [25] and [26], respectively). However, differences in the scale over which RR was estimated is likely to have influenced the different estimates. Differences in genome coverage also no doubt influenced the estimates of RR between studies, particularly in the case of Kullan et al. [25], which estimated RR based on 152 windows of approximately 1 cM across the genome. Direct comparison of the estimates from different studies is also problematic because different mapping techniques produce different genetic distances and thus different estimates of genetic to physical distance. Our estimates of genetic distance (cM) were derived from Joinmap software, using the maximum likelihood algorithm, which uses Haldane’s mapping function. The estimates of genetic distance used by Petroli et al. [26] were derived from RECORD software [43]. RECORD produces distances very similar to the Joinmap maximum likelihood algorithm, if Haldane’s mapping function is used [44]. However, Silva-junior and Grattapaglia [27] used Kosambi’s mapping function, which produces smaller estimates of genetic distance relative to Haldane’s function. Similarly, both Kullan et al. [25] and Petroli et al. [26] used the regression algorithm and Kosambi’s mapping function in Joinmap, which produce considerably smaller genetic distances than Joinmap’s maximum likelihood algorithm. Therefore, the greater estimate of genome-wide RR in our study may simply reflect greater estimates of genetic distance due to methodological issues. Clearly, further study is required to draw robust conclusions regarding genome-wide differences in RR within and between eucalypt species, ideally using linkage maps constructed from similar techniques in multiple representatives of each species. There was substantial variation in RR between chromosomes in our study, indeed chromosome was the most significant explanatory variable in our analysis of variance. Few studies in plants have examined genome-wide RR using multiple crosses, which are needed to provide the replication to test for significant differences between chromosomes. Nonetheless, chromosomal variation in RR has been demonstrated in a range of taxa [45, 46], including plants such as maize in which intraspecific variation was of a similar magnitude to our findings [14]. Chromosomal variation in RR is likely to at least in part reflect variation in chromosome size, due to the requirement for at least one crossover per chromosome per meiosis [45]. However, in our study the variation in RR between chromosomes remained significant after removing the correlated effect of chromosome size, arguing that other aspects of genomic architecture also contribute to this variation. Despite the significant differences in genome-wide RR between individuals, three lines of evidence suggest that the relative RR in each chromosome is highly stable; not only within E. globulus, but between Eucalyptus species, and across evolutionary time. First, the differences between chromosomes in E. globulus were evident even when the significant effect of variation amongst the individuals was removed; and there was no significant interaction between chromosome and either sex or cross type within E. globulus. Indeed, the difference in RR amongst chromosomes was nearly two fold, whereas that amongst individuals was only 1.3 fold. Second, the high positive correlation between our chromosomal estimates of RR from E. globulus (section Maidenaria) and those in E. grandis/urophylla (section Latoangulatae) [25, 26] suggests the relative chromosomal differences in RR are stable across these Eucalyptus sections, despite an estimated 10 to15 Ma since their divergence [47]. Third, a positive correlation between GC content and RR is expected based on studies in other taxa (see below). The high positive chromosome-level correlation between GC content (from the E. grandis genome) and RR (measured from E. globulus map distance against the E. grandis genome), also points to stability of RR between E. grandis and E. globulus. Further, because our estimate of RR is from a single generation of recombination, while GC content in E. grandis reflects historical RR, the high positive correlation also points to a long term stability of RR at the chromosome level.

Association with genomic features and population genetic parameters

The chromosome level correlation between GC content and RR we observe is consistent with the theory of ‘GC-biased gene conversion’, whereby DNA repair during recombination promotes enrichment in GC bases, accounting for the positive correlation between RR and GC content seen in most eukaryotes studied to date [9, 48]. The particularly strong positive correlation observed between chromosome level RR and GC in E. globulus may reflect the unusually high GC content seen in the family Myrtales, compared with other eudicots [48]. In contrast, Arabidopsis is relatively GC poor and GC content and RR are uncorrelated, in line with a general pattern of greater GC enrichment in more derived plant genomes [48]. Our results also suggest a strong positive correlation between chromosomal estimates of RR and gene density. This finding is consistent with the positive correlation between RR and gene density observed at the whole-genome level in a range of taxa [9]. In plants such as maize, rice, wheat, and A. thaliana much of this correlation is attributable to the low density of genes in recombination-poor heterochromatin [49]. While in vertebrate genomes, Nam and Ellegren [7] hypothesise neutral processes associated with recombination, including a bias in deletions over insertions, contribute to a more compact genome structure and greater gene density in regions of elevated RR. However, the correlation between gene density and RR varies in strength and direction amongst taxa [9, 36, 50]. The positive correlation we observe at the chromosome level is particularly strong relative to other reports. Consistent with our findings, gene density was positively correlated with historical recombination rate in E. grandis (i.e., population scaled recombination rate, measured in 100 kb windows across the genome; [27]), suggesting the relationship between gene density and RR may be widespread in Eucalyptus, at least across the subgenus Symphyomyrtus. A negative correlation between RR and some categories of transposable elements was also evident in our study. Transposable elements comprise over 50 % of most eukaryotes genomes and have a significant impact on genome evolution [51, 52]. In particular, it is well accepted that the accumulation of transposable elements is a major factor driving genome expansion [53]. The relative density of transposable elements across the genome reflects a balance between transposable element insertion and deletion. Transposable element deletion is believed to be primarily driven by recombinational processes, consistent with the observed transposable element accumulation in areas of low recombination in many taxa [49, 54] and in line with the correlation we observe. The mechanisms contributing to transposable elements accumulation in regions of low RR are the subject of some debate. Leading hypothesis involve relaxation of various forms of selection in these regions (see Dolgin and Charlesworth [55]); biased insertion of transposable elements in areas of low RR [56]; and a general deletion bias associated with recombination [7]. Regardless of the causative factors, in our study it is possible that in chromosomes with relatively low RR, a greater density of transposable elements has contributed to a reduced gene density, and thus contributed to the positive correlation we observe between RR and gene density at the chromosome level. A notable finding to emerge from this study is our strong negative correlation between RR at the chromosome-level and species-wide genetic diversity. The common trend in animals is for a positive association between RR and genetic diversity, whereas the association in plants is more variable, but generally zero or positive [36]. The relationship between RR and genetic diversity may be affected by the scale and type of measures of RR (e.g., chromosome averages versus smaller genomic regions) and genetic diversity (i.e., within individuals, populations or species) both of which differ widely between studies [9]. Nevertheless, only in Oryza species (wild and domesticated rice) have negative correlations between RR and genetic diversity been previously reported [36]. In O. sativa, this association involved 100 kb intervals across three chromosomes (r = −0.183 and −0.314; for O. sativa ssp. indica and ssp. japonica, respectively) [57]. The more commonly observed positive correlation between RR and genetic diversity is often attributed to the impact of linked selection [9, 36]. Linked selection predicts a positive relationship between RR and genetic diversity due to the homogenising effects of selection being spread further in regions of low recombination [58]. However, as noted above there is disparity in this relationship between species [36] suggesting other factors also influence how selection impacts the genomic distribution of diversity, such as gene density [59]. Gene dense regions will be on average subject to greater selection, reducing diversity. Therefore, a more general prediction for the effect of selection across genomes is that the distribution of genetic diversity will be related to the density of elements under selection, i.e., gene density (assuming genes are the most likely targets of selection) [59]. This was the case in a recent study of E. grandis, where nucleotide diversity (at the population level) was negatively correlated with gene density in 100 kb windows across the E. grandis genome; and amongst the genomic features analysed, gene density yielded the highest correlation with nucleotide diversity, far exceeding the strength of the correlation between historical recombination rate and nucleotide diversity [27]. Thus when gene density and RR are strongly positively correlated, as in E. globulus, this may counter the influence of RR on linked selection, disrupting the relationship between RR and genetic diversity [36]. Indeed, there is growing evidence that the effects of linked selection are not confined to areas of low recombination but are often pervasive throughout genomes [36, 60], including in the forest tree poplar [61]. In support of this hypothesis, many plants exhibit a positive correlation between RR and gene density and a weak to no correlation between RR and genetic diversity [36, 60, 62]. In the case of Oryza sativa [57] and E. globulus the strong negative correlation between RR and diversity is likely to reflect, at least in part, the strong correlation between gene density and RR. However, in the case of eucalypts the contribution of other factors, such as their high genetic diversity [19, 27, 29] (see introduction) other facets of genome architecture, and the history of population size within each species, requires further investigation. Further, we cannot completely dismiss the possibly that other mechanisms are involved. For example, diversity per se could be affecting RR. It is well known that sequence divergence between individuals can inhibit recombination in adjacent regions of the genome, associated with the DNA mis-match repair system [63, 64] and this process could also contribute to the negative correlation between RR and diversity we observe.

Conclusions

Comparison of our results with previously published estimates provides evidence for stability in chromosomal level recombination rates across eucalypt species. Together with the previously reported stability in karyotypes, genome synteny/colinearity and the genomic distribution of population diversity and divergence across the globally significant lineages of Eucalyptus, our finding bode well for the transfer of genomic information between species within these lineages. We also demonstrate that the differences in RR between chromosomes are negatively correlated with species-wide genetic diversity at the chromosome level, which is opposite to the correlation observed in most other taxa. While our findings may in part reflect the scale of investigation, these insights into the recombination landscape of Eucalyptus provide testable hypotheses for future research in forest trees. Future research will investigate whether chromosomal differences in recombination rate are conserved between more distantly related eucalypt species, as well as investigating correlations between RR and genomic attributes at finer genomic scales.

Methods

Linkage map construction and comparison with physical map

Recombination rate (RR) was estimated in 10 unrelated genotypes, which were the parents of 5 full-sib pedigrees. These were all inter-provenance crosses and included three F1 (each with 183 to 184 individuals) and two ‘outbred F2’ (172 and 503 individuals) pedigrees. Together, these pedigrees sampled a diverse section of the natural distribution of E. globulus (see Hudson et al. [33], Freeman et al. [65]). All of the pedigrees had been previously used for linkage map construction [33, 65] and the progenies genotyped with diversity array technology ([DArT], DArT P/L Ltd Canberra, Australia) [66, 67] and microsatellite markers as described in Hudson et al. [33] and Freeman et al. [65]. To estimate RR in this study, we firstly rebuilt each of the 10 parental linkage maps using a subset of the markers used in the original studies. This was done to maximise the accuracy of RR estimates from the genome-anchored markers. The subset of markers were selected based on having a known and unique physical location on the BRASUZ1 reference genome of E. grandis [19] based on basic local alignment search tool (BLAST) searches, and linkage mapping criteria as described below. For BLAST searches, adaptors were removed from the DArT sequences using Cutadapt V1.2.1 [68], then BlastN searches were performed in the Blastall V2.2.26 suite (alignment length > = 50 bp; e-value = 10−50). Two BLAST analyses were conducted. Firstly, an auto-blast amongst all marker sequences was performed to identify redundant markers. Secondly, the physical location of non-redundant markers was determined through a BLAST search against the reference genome. For the latter, BLAST results were classified according to the criteria developed by Salse et al. [69]: the cumulative percentage of sequence identity (CIP) and the cumulative alignment percentage (CALP), based on the length of high-scoring segment pairs (HSP) and the sum of all HSP lengths (AL). For each marker, the most plausible position on the BRASUZ1 reference genome was identified based on the CIP value with a maximum CALP (markers with CALP > 200 % were removed) which minimised RR. Of the non-redundant markers anchored to unique genome positions, four main mapping criteria were used to select the markers used to estimate RR. Ordered by priority, these criteria were: i) genome coverage; ii) segregation; iii) linkage statistics; and iv) RR. Markers with weak linkage statistics were excluded, with the exception of a few markers which were included to maximise genome coverage, provided their map order matched the physical map. Uniparentally inherited markers segregating 1:1 were selected preferentially to biparentally inherited markers segregating 3:1 (preventing auto-correlations when comparing recombination estimates in parental maps from the same pedigree), although markers segregating 3:1 were selected in cases where they increased map coverage. Linkage analysis was performed using the maximum likelihood algorithm in Join-Map v.4 [70]. Map distance in cM was calculated using Haldane’s mapping function. To estimate RR (cM/Mb), we aligned linkage groups (LGs) against the 11 main scaffolds of BRASUZ1. MapChart 2.2 software [71] was used to compare physical and genetic maps. Markers were accepted when the RR between adjacent markers was less than 25 cM/Mb, since greater RR probably reflects genotyping error(s).

Variation in RR and relationships with genome features and population genetic parameters

RR was estimated by comparing 2,452 meioses (i.e., 1,226 progenies in total across the 5 pedigrees × 2 parents in each pedigree; see above) in E. globulus over the 640 Mb of the reference genome of E. grandis. Variation in RR was studied at two different levels: within chromosomes and genome-wide. At the chromosome level, two estimates of RR were used: i) a RR for each chromosome of each parent, calculated as the ratio of the total LG length in cM to the corresponding physical size per scaffold in Mb; and ii) an average chromosome RR (n = 11) across all 10 parents. The RR for each chromosome of each parent (n = 110; Table 2) was used to test the significance of the parent and chromosome main effects, using a fixed effect model with no interaction and Type 3 Sum of Squares. The chromosome averages (n =11) were used for studying the (Pearson’s) correlation with other chromosomes features (chromosome average data for the main features are presented in Table 4). These analyses were undertaken using PROC MIXED and PROC CORR of SAS respectively. At the genome-wide level, for each parent, we used the total genetic distance and the corresponding physical size, to estimate an average RR. Heterozygosity, diversity and divergence indexes were correlated against RR at the chromosome level. Relative heterozygosity was estimated for each individual and at the chromosome level for each individual, from the total number of polymorphic markers derived from the DArT assay. Diversity and divergence indexes were derived from independent samples [34]. These diversity indexes were calculated from six Eucalyptus species from subgenus Symphyomyrtus, using ~ 90 range-wide samples per species. The six species belong to three sections within Symphyomyrtus: section Latoangulatae (E. grandis and E. urophylla), section Maidenaria (E. globulus, E. nitens and E. dunnii) and section Exsertaria (E. camaldulensis). These species are among the most important hardwood plantations species world-wide and are representative of the sections (lineages) that contain the other economically important eucalypt species [29]. We used the average diversity (H) in each species, as well as the average diversity within, and divergence (F) between, the six species in our correlation analyses at the chromosome level. The ‘global’ estimates (i.e., those averaged across six species) of diversity and divergence were also analysed after partitioning the markers into those within genes and those >5 kb from a gene (i.e., intra- versus intergenic markers; following [34]). We also tested for relationships between RR in E. globulus and genomic features of the Eucalyptus reference genome [19], based on our estimates of average RR for each chromosome, using Pearson’s correlation. These chromosomal features included: physical size of each E. grandis chromosome scaffold (in Mb), the GC content, average gene density, the number of tandem duplicated genes and the density of transposable elements (also broken into subcategories of DNA transposable elements, retrotransposons and uncategorised elements, and superfamilies within the DNA transposable elements and retrotransposon subcategories).

Abbreviations

AL, the sum of all HSP lengths; BLAST, basic local alignment search tool; CALP, the cumulative alignment percentage; CIP, the cumulative percentage of sequence identity; cM, centimorgan; CV, coefficient of variation; DArT, diversity array technology; F, the proportion of the genetic variance in a sub population relative to the total genetic variance; H, genetic diversity assuming Hard-Weinberg equilibrium; HSP, the length of high-scoring segment pairs; LGs, linkage groups; LINE, long interspersed nuclear element; Mb, megabase; QTL, quantitative trait loci; RR, recombination rate; SINE, short interspersed nuclear element
  56 in total

1.  The recombination landscape in Arabidopsis thaliana F2 populations.

Authors:  P A Salomé; K Bomblies; J Fitz; R A E Laitinen; N Warthmann; L Yant; D Weigel
Journal:  Heredity (Edinb)       Date:  2011-11-09       Impact factor: 3.821

2.  RECORD: a novel method for ordering loci on a genetic linkage map.

Authors:  Hans Van Os; Piet Stam; Richard G F Visser; Herman J Van Eck
Journal:  Theor Appl Genet       Date:  2005-10-14       Impact factor: 5.699

3.  Mechanisms of recent genome size variation in flowering plants.

Authors:  Jeffrey L Bennetzen; Jianxin Ma; Katrien M Devos
Journal:  Ann Bot       Date:  2005-01       Impact factor: 4.357

4.  Two-parameter characterization of chromosome-scale recombination rate.

Authors:  Wentian Li; Jan Freudenberg
Journal:  Genome Res       Date:  2009-09-14       Impact factor: 9.043

Review 5.  Recombination rate variation in closely related species.

Authors:  C S Smukowski; M A F Noor
Journal:  Heredity (Edinb)       Date:  2011-06-15       Impact factor: 3.821

Review 6.  Forest tree genomics: growing resources and applications.

Authors:  David B Neale; Antoine Kremer
Journal:  Nat Rev Genet       Date:  2011-02       Impact factor: 53.242

7.  Flammable biomes dominated by eucalypts originated at the Cretaceous-Palaeogene boundary.

Authors:  Michael D Crisp; Geoffrey E Burrows; Lyn G Cook; Andrew H Thornhill; David M J S Bowman
Journal:  Nat Commun       Date:  2011-02-15       Impact factor: 14.919

8.  Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster.

Authors:  D J Begun; C F Aquadro
Journal:  Nature       Date:  1992-04-09       Impact factor: 49.962

9.  Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

Authors:  Jing Wang; Nathaniel R Street; Douglas G Scofield; Pär K Ingvarsson
Journal:  Genetics       Date:  2015-12-31       Impact factor: 4.562

10.  High-density linkage mapping in a pine tree reveals a genomic region associated with inbreeding depression and provides clues to the extent and distribution of meiotic recombination.

Authors:  Emilie Chancerel; Jean-Baptiste Lamy; Isabelle Lesur; Céline Noirot; Christophe Klopp; François Ehrenmann; Christophe Boury; Grégoire Le Provost; Philippe Label; Céline Lalanne; Valérie Léger; Franck Salin; Jean-Marc Gion; Christophe Plomion
Journal:  BMC Biol       Date:  2013-04-18       Impact factor: 7.431

View more
  12 in total

1.  The evolution of recombination rates in finite populations during ecological speciation.

Authors:  James Reeve; Daniel Ortiz-Barrientos; Jan Engelstädter
Journal:  Proc Biol Sci       Date:  2016-10-26       Impact factor: 5.349

Review 2.  Coevolution between transposable elements and recombination.

Authors:  Tyler V Kent; Jasmina Uzunović; Stephen I Wright
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2017-12-19       Impact factor: 6.237

3.  Mutation Load in Sunflower Inversions Is Negatively Correlated with Inversion Heterozygosity.

Authors:  Kaichi Huang; Kate L Ostevik; Cassandra Elphinstone; Marco Todesco; Natalia Bercovich; Gregory L Owens; Loren H Rieseberg
Journal:  Mol Biol Evol       Date:  2022-05-03       Impact factor: 8.800

4.  Genome-wide recombination rate variation in a recombination map of cotton.

Authors:  Chao Shen; Ximei Li; Ruiting Zhang; Zhongxu Lin
Journal:  PLoS One       Date:  2017-11-27       Impact factor: 3.240

5.  Genomic Scans across Three Eucalypts Suggest that Adaptation to Aridity is a Genome-Wide Phenomenon.

Authors:  Dorothy A Steane; Brad M Potts; Elizabeth H McLean; Lesley Collins; Barbara R Holland; Suzanne M Prober; William D Stock; René E Vaillancourt; Margaret Byrne
Journal:  Genome Biol Evol       Date:  2017-02-01       Impact factor: 3.416

6.  Genome-Wide Association Study of Major Agronomic Traits in Foxtail Millet (Setaria italica L.) Using ddRAD Sequencing.

Authors:  Vandana Jaiswal; Sarika Gupta; Vijay Gahlaut; Mehanathan Muthamilarasan; Tirthankar Bandyopadhyay; Nirala Ramchiary; Manoj Prasad
Journal:  Sci Rep       Date:  2019-03-22       Impact factor: 4.379

Review 7.  Variation in recombination frequency and distribution across eukaryotes: patterns and processes.

Authors:  Jessica Stapley; Philine G D Feulner; Susan E Johnston; Anna W Santure; Carole M Smadja
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2017-12-19       Impact factor: 6.237

8.  The Evolutionary Consequences of Transposon-Related Pericentromer Expansion in Melon.

Authors:  Jordi Morata; Marc Tormo; Konstantinos G Alexiou; Cristina Vives; Sebastián E Ramos-Onsins; Jordi Garcia-Mas; Josep M Casacuberta
Journal:  Genome Biol Evol       Date:  2018-06-01       Impact factor: 3.416

9.  Maximising recombination across macadamia populations to generate linkage maps for genome anchoring.

Authors:  Kirsty S Langdon; Graham J King; Abdul Baten; Ramil Mauleon; Peter C Bundock; Bruce L Topp; Catherine J Nock
Journal:  Sci Rep       Date:  2020-03-19       Impact factor: 4.379

10.  High-Density Linkage Maps Based on Genotyping-by-Sequencing (GBS) Confirm a Chromosome-Level Genome Assembly and Reveal Variation in Recombination Rate for the Pacific Oyster Crassostrea gigas.

Authors:  Xiaoshen Yin; Alberto Arias-Pérez; Tevfik Hamdi Kitapci; Dennis Hedgecock
Journal:  G3 (Bethesda)       Date:  2020-12-03       Impact factor: 3.154

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.