Bagdevi Mishra1,2, Bartosz Ulaszewski3, Joanna Meger3, Jean-Marc Aury4, Catherine Bodénès5, Isabelle Lesur-Kupin5,6,7, Markus Pfenninger1, Corinne Da Silva4, Deepak K Gupta1,2,8, Erwan Guichoux5, Katrin Heer7,9, Céline Lalanne5, Karine Labadie4, Lars Opgenoorth7, Sebastian Ploch1, Grégoire Le Provost5, Jérôme Salse10, Ivan Scotti11, Stefan Wötzel1,2, Christophe Plomion5, Jaroslaw Burczyk3, Marco Thines1,2,8.
Abstract
The European Beech is the dominant climax tree in most regions of Central Europe and is valued for its ecological versatility and hardwood timber. Even though a draft genome has been published recently, higher resolution is required for studying aspects of genome architecture and recombination. Here, we present a chromosome-level assembly of the more than 300-year-old reference individual, Bhaga, from the Kellerwald-Edersee National Park (Germany). Its nuclear genome of 541 Mb was resolved into 12 chromosomes varying in length between 28 and 73 Mb. Multiple nuclear insertions of parts of the chloroplast genome were observed, with one region on chromosome 11 spanning more than 2 Mb, in which fragments up to 54,784 bp long and covering the whole chloroplast genome were inserted randomly. Unlike in Arabidopsis thaliana, ribosomal cistrons are present in Fagus sylvatica only in four major regions, in line with FISH studies. On most assembled chromosomes, telomeric repeats were found at both ends, while centromeric repeats were found to be scattered throughout the genome apart from their main occurrence per chromosome. The genome-wide distribution of SNPs was evaluated using a second individual from Jamy Nature Reserve (Poland). SNPs, repeat elements and duplicated genes were unevenly distributed in the genomes, with one major anomaly on chromosome 4. The genome presented here adds to the available highly resolved plant genomes and we hope it will serve as a valuable basis for future research on genome architecture and for understanding the past and future of European Beech populations in a changing climate.
Keywords: Fagaceae; Hi-C; SNPs; chromosomes; genome architecture; genomics; repeat elements
Year: 2022 PMID: 35211148 PMCID: PMC8862710 DOI: 10.3389/fgene.2021.691058
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1. The more than 300-year-old Fagus sylvatica reference individual Bhaga on a cliff over the Edersee in the Kellerwald-Edersee National Park (Germany).
Comparison of BUSCO completeness between previously available Fagaceae genomes and the present study (Fagus sylvatica V2).
| Species | Complete genes (%) | Single genes (%) | Duplicated genes (%) | Fragmented genes (%) | Missing genes (%) |
|---|---|---|---|---|---|
| | 97.4 | 90.3 | 7.1 | 1.3 | 1.3 |
| | 96.6 | 85.6 | 11 | 1.8 | 1.6 |
| | 92.4 | 88.8 | 3.7 | 1.5 | 6.1 |
| | 93.5 | 87.6 | 5.9 | 1.0 | 5.5 |
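The BUSCO categories in the table are linked by simple arithmetic: complete genes are the sum of single and duplicated genes, and complete, fragmented and missing genes together make up 100%. A minimal Python sketch of that consistency check on the four rows above follows; the function name and the 0.15 percentage-point rounding tolerance are assumptions made for illustration and are not part of the study.

```python
# Rows copied from the table: (complete, single, duplicated, fragmented, missing), in %.
BUSCO_ROWS = [
    (97.4, 90.3, 7.1, 1.3, 1.3),
    (96.6, 85.6, 11.0, 1.8, 1.6),
    (92.4, 88.8, 3.7, 1.5, 6.1),
    (93.5, 87.6, 5.9, 1.0, 5.5),
]


def busco_row_is_consistent(row, tol=0.15):
    """Return True if a row satisfies both BUSCO identities within `tol`.

    `tol` is an assumed allowance for values rounded to one decimal place.
    """
    complete, single, duplicated, fragmented, missing = row
    parts_ok = abs(complete - (single + duplicated)) <= tol
    total_ok = abs(complete + fragmented + missing - 100.0) <= tol
    return parts_ok and total_ok


results = [busco_row_is_consistent(row) for row in BUSCO_ROWS]
print(results)
```

The third row deviates by 0.1 percentage points in the single-plus-duplicated identity, which is compatible with rounding of the reported values.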
Distribution of exons in Fagus sylvatica in comparison to Juglans regia and Arabidopsis thaliana.
| Species | Minimum exons/gene | First quartile | Mean exons/gene | Median exons/gene | Third quartile | Maximum exons/gene |
|---|---|---|---|---|---|---|
| | 1 | 2 | 4.916 | 4 | 7 | 70 |
| | 1 | 2 | 5.301 | 4 | 7 | 70 |
| | 1 | 1 | 5.299 | 4 | 7 | 79 |
FIGURE 2. Locations of probable centromeric repeats (red lines) and telomeric repeats (blue lines) on the chromosomes.
FIGURE 3. Chloroplast genome insertions within 100 kb windows on the chromosomes. Each chromosome is represented as three rows: the first with insertions more than 100 bp in length, the second with insertions more than 1 kb and the third with insertions more than 10 kb.
FIGURE 4. Mitochondrion genome insertions within 100 kb windows on the chromosomes. Each chromosome is represented as three rows: the first with insertions more than 100 bp in length, the second with insertions more than 1 kb and the third with insertions more than 10 kb.
FIGURE 5. Repeat regions, coding regions, and regions coding for genes present within 100 kb windows on the chromosomes.
FIGURE 6. Homozygous and heterozygous SNPs in Fagus sylvatica present within 100 kb windows on the chromosomes.
FIGURE 7. Distribution of homozygous and heterozygous SNPs in non-overlapping 100 kb windows.
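Several of the figures summarise features in non-overlapping 100 kb windows along each chromosome. The following Python sketch shows one way such per-window counts could be computed from a list of 1-based positions. It is not the pipeline used in the study; the function name and the example positions are hypothetical.

```python
from collections import Counter

WINDOW_SIZE = 100_000  # 100 kb, as stated in the figure captions


def count_per_window(positions, window_size=WINDOW_SIZE):
    """Count 1-based positions in each non-overlapping window.

    Window index 0 covers bases 1..window_size, index 1 the next
    window_size bases, and so on.
    """
    counts = Counter()
    for pos in positions:
        counts[(pos - 1) // window_size] += 1
    return counts


# Hypothetical SNP positions on one chromosome (not data from the study).
example_positions = [1, 99_999, 100_000, 100_001, 250_000, 250_500]
window_counts = count_per_window(example_positions)

for index in sorted(window_counts):
    start = index * WINDOW_SIZE + 1
    end = (index + 1) * WINDOW_SIZE
    print(f"{start}-{end}: {window_counts[index]}")
```

Subtracting 1 before the integer division keeps position 100,000 in the first window rather than the second, which matters when coordinates are 1-based as in VCF files.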
Size of the full-sib families identified from pedigree reconstruction.
| Candidate father | Size of the full-sib family |
|---|---|
| MSSB | 47 |
| MSSH | 68 |
| SSP01 | 24 |
| SSP02 | 27 |
| SSP03 | 4 |
| SSP04 | 10 |
| SSP05 | 16 |
| SSP06 | 13 |
| SSP07 | 9 |
| SSP08 | 17 |
| SSP09 | 12 |
| SSP10 | 9 |
| SSP11 | 17 |
| SSP12 | 86 |
| SSP13 | 15 |
| SSP14 | 10 |
| SSP15 | 2 |
| SSP16 | 13 |
| SSP17 | 3 |
| SSP18 | 8 |
| sum | 410 |
Summary statistics for three Fagus sylvatica unigene sets. The last column gives the number of homologous proteins (BLASTX, E-value 10⁻⁵) against the most complete Fagaceae proteome to date (25,808 proteins), that of Quercus robur (Plomion et al., 2018).
| Reference | Technologies | Assembler | # Contigs in the unigene | Identified oak proteins | # Contigs with identified proteins |
|---|---|---|---|---|---|
| Lesur et al., 2015 | Sanger, 454 Roche | MIRA | 21,057 | 22,684 | 16,512 |
| Muller et al., 2017 | Illumina | CLCBio | 44,335 | 24,804 | 24,480 |
| This study | Illumina, ONT | Velvet, Oases | 34,987 | 24,826 | 22,347 |
| | | | 33,013 (≥200 bp) | 24,811 | 21,886 |
In addition to Illumina and ONT RNAseq, contigs obtained from Lesur et al., 2015 were also included in the analysis. This first unigene provided a total of 609 transcripts to the new reference unigene.
Transcripts longer than 200 bp are available online (ENA, accession HBVZ01000000). Smaller contigs are available upon request.
Characteristics of the combined maternal linkage map in terms of genetic size (cM) and number of SNP markers for each linkage group (LG).
| LG | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Size (cM) | 279 | 152 | 224 | 137 | 168 | 192 | 146 | 172 | 182 | 171 | 186 | 64 | 140 | 2,213 |
| # of SNPs | 37 | 30 | 56 | 36 | 49 | 24 | 24 | 22 | 29 | 15 | 22 | 16 | 8 | 368 |
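The totals and the marker density per linkage group follow directly from the table. A short Python sketch, using only the values above, recomputes the totals and reports the linkage group with the largest average map length per SNP. The variable names and the choice of "cM per SNP" as a crude density measure are assumptions for illustration.

```python
# Values copied from the table above, in order LG1..LG13.
SIZE_CM = [279, 152, 224, 137, 168, 192, 146, 172, 182, 171, 186, 64, 140]
N_SNPS = [37, 30, 56, 36, 49, 24, 24, 22, 29, 15, 22, 16, 8]

total_cm = sum(SIZE_CM)
total_snps = sum(N_SNPS)

# Crude density measure: genetic length divided by marker count, per LG.
cm_per_snp = [size / n for size, n in zip(SIZE_CM, N_SNPS)]

# Index of the most sparsely covered linkage group, converted to a 1-based LG number.
sparsest_lg = max(range(len(cm_per_snp)), key=cm_per_snp.__getitem__) + 1

print(total_cm, total_snps)
print(f"LG{sparsest_lg}: {cm_per_snp[sparsest_lg - 1]:.1f} cM per SNP")
```

The recomputed totals match the "Total" column (2,213 cM and 368 SNPs).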
FIGURE 8. Example of the high collinearity between homologous maternal (MSSB) linkage group #4 obtained from the analysis of three sets of offspring: xMSSH and xSSP12 correspond to the two largest full-sib families and x182 corresponds to the cosegregation analysis of their mapped markers in the 182 half-sibs.
Number of SNP markers of a given linkage group (LG) aligned to a specified scaffold (Bhaga_i) of the Fagus sylvatica assembly.
| Bhaga_1 | Bhaga_2 | Bhaga_3 | Bhaga_4 | Bhaga_5 | Bhaga_6 | Bhaga_7 | Bhaga_8 | Bhaga_9 | Bhaga_10 | Bhaga_11 | Bhaga_12 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LG1 | 2 | 26 | ||||||||||
| LG2 | 1 | 1 | 23 | |||||||||
| LG3 | 42 | |||||||||||
| LG4 | 1 | 1 | 22 | 1 | 1 | 1 | ||||||
| LG5 | 42 | |||||||||||
| LG6 | 1 | 16 | 1 | |||||||||
| LG7 | 16 | 1 | 1 | 2 | ||||||||
| LG8 | 1 | 15 | ||||||||||
| LG9 | 1 | 25 | ||||||||||
| LG10 | 1 | 12 | 1 | |||||||||
| LG11 | 20 | |||||||||||
| LG12 | 1 | 1 | 1 | 10 | ||||||||
| LG13 | 1 | 1 | 1 | 2 |