Bartosz Ulaszewski, Joanna Meger, Bagdevi Mishra, Marco Thines, Jarosław Burczyk.
Abstract
Growing amounts of genomic data and more efficient assembly tools are advancing organelle genomics at an unprecedented scale. Genomic resources are increasingly used for phylogenetic analyses of many plant species, but are less frequently used to investigate within-species variability and phylogeography. In this study, we investigated the genetic diversity of Fagus sylvatica, an important broadleaved tree species of European forests, based on complete chloroplast genomes of 18 individuals sampled widely across the species distribution. Our results confirm the hypothesis of low cpDNA diversity in European beech. The chloroplast genome size was remarkably stable (158,428 ± 37 bp). The polymorphic markers (12 microsatellites (SSRs), four SNPs and one indel) were found only in the single copy regions, while the inverted repeat regions were monomorphic in both length and sequence, suggesting highly efficient suppression of mutation. The within-individual analysis of polymorphisms revealed more than 9000 markers, distributed between genic and non-genic regions in proportion to the size of those regions. However, an investigation of the frequency of alternate alleles indicated that this diversity likely originated from nuclear-encoded plastome remnants (NUPTs). Phylogeographic and Mantel correlation analyses based on the complete chloroplast genomes showed clustering of individuals according to geographic distance in the first distance class, suggesting that the novel markers, and in particular the cpSSRs, could provide a more detailed picture of beech population structure in Central Europe.
Keywords: European beech; SNP; complete chloroplast genome; heteroplasmy; indel; microsatellite; population genomics
Year: 2021 PMID: 34573338 PMCID: PMC8468245 DOI: 10.3390/genes12091357
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Origin of sampled individuals and sequencing data volume.
| No. | Origin or Individual Name | Country | Latitude | Longitude | Number of Read Pairs | NCBI Accession Number | SRA Accession Number |
|---|---|---|---|---|---|---|---|
| 1 | Bhaga | Germany | 51.169167 N | 8.963056 E | [ | MW531753 | N/A |
| 2 | Jamy | Poland | 53.586019 N | 18.935019 E | [ | MW537046 | SAMN08948264 |
| 3 | Gdańsk | Poland | 54.383262 N | 18.516724 E | 3,777,769 | MW566769 | SAMN18917950 |
| 4 | Foret des Colettes | France | 46.183328 N | 2.949992 E | 4,899,373 | MW566771 | SAMN18917951 |
| 5 | Limitaciones | Spain | 42.818059 N | 2.249663 W | 6,210,877 | MW566772 | SAMN18917952 |
| 6 | Glorup | Denmark | 55.184748 N | 10.681238 E | 20,891,953 | MW566770 | SAMN18917953 |
| 7 | Łopuchówko | Poland | 52.583300 N | 17.083339 E | 5,114,816 | MW566774 | SAMN18917954 |
| 8 | Hasbruch | Germany | 53.120708 N | 8.4302740 E | 4,650,347 | MW566776 | SAMN18917955 |
| 9 | Bieszczady NP | Poland | 49.117093 N | 22.579103 E | 3,046,013 | MW566773 | SAMN18917956 |
| 10 | Eisenach | Germany | 50.087605 N | 10.106152 E | 4,461,792 | MW566778 | SAMN18917957 |
| 11 | Morbach | Germany | 50.740891 N | 6.980116 E | 5,833,195 | MW566784 | SAMN18917958 |
| 12 | Ehingen | Germany | 48.399106 N | 9.500861 E | 5,632,928 | MW566775 | SAMN18917959 |
| 13 | Veneto | Italy | 46.133489 N | 12.216683 E | 7,741,036 | MW566783 | SAMN18917960 |
| 14 | Cesky Krumlov | Czechia | 48.850035 N | 14.250406 E | 7,853,097 | MW566777 | SAMN18917961 |
| 15 | Brzeziny | Poland | 51.836489 N | 19.601247 E | 7,349,714 | MW566779 | SAMN18917962 |
| 16 | Smolenice | Slovakia | 48.485171 N | 17.372687 E | 5,072,400 | MW566782 | SAMN18917963 |
| 17 | Fantanele | Romania | 46.416750 N | 26.466475 E | 6,584,825 | MW566780 | SAMN18917964 |
| 18 | Fläming | Germany | 52.133389 N | 12.583406 E | 7,423,489 | MW566781 | SAMN18917965 |
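The coordinates in this table are the input for any distance-based analysis of the sample. As a minimal sketch, assuming great-circle (haversine) distances on a sphere of radius 6371 km, the pairwise geographic distances between sampling sites could be computed as follows. Values suffixed N are read as latitude and values suffixed E or W as longitude (west is negative). The function name `haversine_km` and the choice of the haversine formula are assumptions of this sketch; the source does not state how geographic distances were obtained.

```python
import math
from itertools import combinations

# (name, latitude in degrees N, longitude in degrees E), copied from the table.
# Limitaciones lies west of Greenwich, so its longitude is negative.
SITES = [
    ("Bhaga", 51.169167, 8.963056),
    ("Jamy", 53.586019, 18.935019),
    ("Gdańsk", 54.383262, 18.516724),
    ("Foret des Colettes", 46.183328, 2.949992),
    ("Limitaciones", 42.818059, -2.249663),
    ("Glorup", 55.184748, 10.681238),
    ("Łopuchówko", 52.583300, 17.083339),
    ("Hasbruch", 53.120708, 8.4302740),
    ("Bieszczady NP", 49.117093, 22.579103),
    ("Eisenach", 50.087605, 10.106152),
    ("Morbach", 50.740891, 6.980116),
    ("Ehingen", 48.399106, 9.500861),
    ("Veneto", 46.133489, 12.216683),
    ("Cesky Krumlov", 48.850035, 14.250406),
    ("Brzeziny", 51.836489, 19.601247),
    ("Smolenice", 48.485171, 17.372687),
    ("Fantanele", 46.416750, 26.466475),
    ("Fläming", 52.133389, 12.583406),
]

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One entry per unordered pair of individuals: 18 * 17 / 2 = 153 pairs.
pairwise_km = {
    (a[0], b[0]): haversine_km(a[1], a[2], b[1], b[2])
    for a, b in combinations(SITES, 2)
}
```

The resulting dictionary can then be binned into distance classes or converted into a square matrix for a Mantel-type analysis.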
Statistics for main chloroplast genome elements. LSC: large single copy region; SSC: small single copy region; IR-A/IR-B: inverted repeat regions A and B.
| Origin or Individual Name | Read Coverage | NCBI Accession Number | Total Size (bp) | LSC (bp) | SSC (bp) | IR-A/IR-B (bp each) |
|---|---|---|---|---|---|---|
| Bhaga | - | MW531753 | 158,458 | 87,702 | 19,010 | 25,873 |
| Jamy | - | MW537046 | 158,462 | 87,705 | 19,011 | 25,873 |
| Gdańsk | 253x | MW566769 | 158,456 | 87,699 | 19,011 | 25,873 |
| Colettes | 498x | MW566771 | 158,391 | 87,634 | 19,011 | 25,873 |
| Limitaciones | 491x | MW566772 | 158,461 | 87,704 | 19,011 | 25,873 |
| Glorup | 356x | MW566770 | 158,461 | 87,704 | 19,011 | 25,873 |
| Łopuchówko | 212x | MW566774 | 158,461 | 87,704 | 19,011 | 25,873 |
| Hasbruch | 267x | MW566776 | 158,462 | 87,705 | 19,011 | 25,873 |
| Bieszczady NP | 211x | MW566773 | 158,426 | 87,669 | 19,011 | 25,873 |
| Eisenach | 105x | MW566778 | 158,456 | 87,699 | 19,011 | 25,873 |
| Morbach | 350x | MW566784 | 158,463 | 87,706 | 19,011 | 25,873 |
| Ehingen | 91x | MW566775 | 158,446 | 87,689 | 19,011 | 25,873 |
| Veneto | 625x | MW566783 | 158,463 | 87,706 | 19,011 | 25,873 |
| Cesky Krumlov | 300x | MW566777 | 158,462 | 87,705 | 19,011 | 25,873 |
| Brzeziny | 521x | MW566779 | 158,462 | 87,705 | 19,011 | 25,873 |
| Smolenice | 86x | MW566782 | 158,430 | 87,674 | 19,010 | 25,873 |
| Fantanele | 157x | MW566780 | 158,462 | 87,705 | 19,011 | 25,873 |
| Fläming | 306x | MW566781 | 158,464 | 87,705 | 19,013 | 25,873 |
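The table can be checked for internal consistency: each total should equal LSC + SSC + 2 × IR, because the inverted repeat occurs twice. The sketch below performs that check and summarizes the spread of total sizes; the variable names are illustrative and the values are copied from the table.

```python
import statistics

# (name, total, LSC, SSC, IR) in bp, copied from the table above.
ROWS = [
    ("Bhaga", 158458, 87702, 19010, 25873),
    ("Jamy", 158462, 87705, 19011, 25873),
    ("Gdańsk", 158456, 87699, 19011, 25873),
    ("Colettes", 158391, 87634, 19011, 25873),
    ("Limitaciones", 158461, 87704, 19011, 25873),
    ("Glorup", 158461, 87704, 19011, 25873),
    ("Łopuchówko", 158461, 87704, 19011, 25873),
    ("Hasbruch", 158462, 87705, 19011, 25873),
    ("Bieszczady NP", 158426, 87669, 19011, 25873),
    ("Eisenach", 158456, 87699, 19011, 25873),
    ("Morbach", 158463, 87706, 19011, 25873),
    ("Ehingen", 158446, 87689, 19011, 25873),
    ("Veneto", 158463, 87706, 19011, 25873),
    ("Cesky Krumlov", 158462, 87705, 19011, 25873),
    ("Brzeziny", 158462, 87705, 19011, 25873),
    ("Smolenice", 158430, 87674, 19010, 25873),
    ("Fantanele", 158462, 87705, 19011, 25873),
    ("Fläming", 158464, 87705, 19013, 25873),
]

# Rows whose total does not equal LSC + SSC + two copies of the IR.
inconsistent = [
    name for name, total, lsc, ssc, ir in ROWS
    if total != lsc + ssc + 2 * ir
]

totals = [row[1] for row in ROWS]
smallest, largest = min(totals), max(totals)
midrange = (smallest + largest) / 2      # centre of the observed range
half_range = (largest - smallest) / 2    # half-width of the observed range
mean_size = statistics.mean(totals)
sd_size = statistics.stdev(totals)       # sample standard deviation

print(f"inconsistent rows: {inconsistent}")
print(f"range: {smallest}-{largest} bp, midrange {midrange} ± {half_range}")
print(f"mean {mean_size:.1f} bp, sample SD {sd_size:.1f} bp")
```

All 18 rows satisfy the identity. The observed range is 158,391 to 158,464 bp, whose midpoint and half-width (158,427.5 and 36.5 bp) round to the 158,428 ± 37 bp quoted in the abstract; this is an observation about the tabulated numbers, not a statement of how the authors derived that figure.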
General characteristics of chloroplast microsatellite markers in 18 F. sylvatica individuals.
| | Mononucleotide | Dinucleotide | Pentanucleotide | Complex | Total |
|---|---|---|---|---|---|
| Monomorphic | 93 | 2 | 4 | 27 | 126 |
| Polymorphic | 4 | - | - | 8 | 12 |
| Total | 97 | 2 | 4 | 35 | 138 |
Basic information of polymorphic chloroplast microsatellites; marker ratio—number of individuals associated with a particular marker variant; region types: LSC—Large Single Copy; SSC—Small Single Copy.
| No. | Starting Position (bp) * | Type | Region | Marker Ratio | Flanking Annotation |
|---|---|---|---|---|---|
| 1 | 4363 | Complex | SSC | 17/1 | ndhA (exon II) ↔ ndhA (exon I) |
| 2 | 8012 | Complex | SSC | 16/1/1 | psaC ↔ ndhD |
| 3 | 11,476 | Mononucleotide (A) | SSC | 17/1 | trnL ↔ rpl32 |
| 4 | 12,583 | Mononucleotide (T) | SSC | 17/0 ** | rpl32 ↔ ndhF |
| 5 | 46,142 | Complex | LSC | 16/1/1 | matK ↔ trnQ |
| 6 | 46,952 | Complex | LSC | 11/2/2/1/1/1 | matK ↔ trnQ |
| 7 | 50,589 | Mononucleotide (A) | LSC | 17/1 | trnG (exon I) ↔ trnG (exon II) |
| 8 | 55,923 | Complex | LSC | 16/2 | atpH ↔ atpI |
| 9 | 70,097 | Complex | LSC | 16/2 | rpoB ↔ trnC |
| 10 | 92,043 | Mononucleotide (A) | LSC | 16/2 | trnG (exon II) ↔ trnG (exon I) |
| 11 | 105,126 | Complex | LSC | 12/5/1 | ycf4 ↔ cemA |
| 12 | 107,580 | Complex | LSC | 17/1 | petA ↔ psbJ |
* according to the Bhaga reference; ** marker absent in an individual
Summary of the variant sites detected in the 18 chloroplast genomes, region types: LSC—Large Single Copy; SSC—Small Single Copy.
| No. | Position (bp) * | Marker Type | Region | Consensus | Alternative | Area | Marker Ratio | Flanking Annotation |
|---|---|---|---|---|---|---|---|---|
| 1 | 12,587 | SNP | SSC | T | C | noncoding | 17/1 | rpl32 ↔ ndhF |
| 2 | 46,985 | SNP | LSC | G | A | noncoding | 17/1 | tRNA-K ↔ tRNA-Q |
| 3 | 71,204 | SNP | LSC | G | T | noncoding | 9/9 | tRNA-C ↔ petN |
| 4 | 80,558 | Indel | LSC | T | - | noncoding | 17/1 | psbZ ↔ tRNA-G |
| 5 | 112,198 | SNP | LSC | A | C | noncoding | 17/1 | psaJ ↔ rpl3 |
* the position (bp) is referred to the Bhaga genome
Summary statistics of within-individual polymorphisms detected in regions of the 16 chloroplast genome assemblies. LSC: large single copy region; SSC: small single copy region; IR-A/IR-B: inverted repeat regions A and B.
| | LSC | SSC | IR-A | IR-B |
|---|---|---|---|---|
| Avg. variant depth | 349x | 360x | 477x | 477x |
| Avg. alternative var. depth | 18.7x | 16.1x | 18.4x | 18.5x |
| Number of unique positions | 5348 | 1161 | 1257 | 1262 |
| SNP | 76.8% | 80.9% | 83.7% | 84.1% |
| Indel | 10.2% | 8.8% | 9.6% | 9.4% |
| Complex | 8.2% | 6.2% | 3.1% | 3.1% |
| MNP | 0.2% | 0.3% | 0.8% | 0.7% |
| Mix | 4.6% | 3.9% | 2.7% | 2.7% |
| Coding | 48.6% | 67.3% | 62.9% | 63.1% |
| Non-coding | 51.4% | 32.7% | 37.1% | 36.9% |
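The depth figures in this table give a rough sense of how rare the alternate alleles are, which is what the NUPT interpretation in the abstract rests on. The sketch below is a back-of-the-envelope calculation from the tabulated averages only. It divides average alternative depth by average variant depth per region, which is a ratio of averages and not the per-site allele frequency the authors would have examined.

```python
# Per-region averages and counts copied from the table above.
REGIONS = {
    "LSC":  {"depth": 349.0, "alt_depth": 18.7, "positions": 5348},
    "SSC":  {"depth": 360.0, "alt_depth": 16.1, "positions": 1161},
    "IR-A": {"depth": 477.0, "alt_depth": 18.4, "positions": 1257},
    "IR-B": {"depth": 477.0, "alt_depth": 18.5, "positions": 1262},
}

# Total number of unique polymorphic positions across all four regions.
total_positions = sum(r["positions"] for r in REGIONS.values())

# Approximate share of reads supporting the alternative allele per region.
alt_fraction = {
    name: r["alt_depth"] / r["depth"] for name, r in REGIONS.items()
}

print(f"total unique positions: {total_positions}")
for name, frac in alt_fraction.items():
    print(f"{name}: about {frac:.1%} of reads carry the alternative allele")
```

The positions sum to 9028, consistent with the "more than 9000 markers" reported in the abstract, and the approximate alternative-allele share stays below 6% in every region.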
Figure 1. Share of markers in coding and non-coding regions in relation to the size share of coding and non-coding elements in the main genome regions. LSC: large single copy region; SSC: small single copy region; IR-A/IR-B: inverted repeat regions A and B.
Figure 2. Phylogenetic relationships among 18 F. sylvatica individuals, as inferred using Maximum Likelihood, with F. crenata, F. japonica, and F. engleriana as outgroup. Numbers on nodes indicate percentages of bootstrap support from 1000 bootstrap replicates. The genetic distance between the F. sylvatica individuals and the outgroup was shortened by 0.04.
Summary of Mantel’s test statistics calculated within consecutive distance classes.
| Class | Boundary max (km) | Number of Pairs | Mantel r | p |
|---|---|---|---|---|
| 1 | 250 | 11 | 0.286 | 0.011 |
| 2 | 500 | 31 | 0.106 | 0.361 |
| 3 | 750 | 46 | 0.121 | 0.144 |
| 4 | 1000 | 27 | −0.016 | 0.760 |
| 5 | 1250 | 15 | −0.004 | 0.900 |
| 6 | 1500 | 11 | −0.023 | 0.374 |
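For readers unfamiliar with the statistic, the following is a minimal sketch of a plain one-sided Mantel permutation test: the Pearson correlation between the off-diagonal entries of two distance matrices, with significance assessed by jointly permuting the rows and columns of one matrix. It is not the per-distance-class procedure that produced the table above, and the function names, the permutation count, and the five-point toy data are all hypothetical.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(geo, gen, n_perm=999, seed=0):
    """Return (r, p) for a one-sided Mantel test of geo against gen."""
    rng = random.Random(seed)
    n = len(geo)
    x = upper_triangle(geo)
    r_obs = pearson(x, upper_triangle(gen))
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        # Permute rows and columns of gen together to keep it a valid
        # distance matrix while breaking its link to geo.
        permuted = [[gen[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= r_obs - 1e-12:
            hits += 1
    # The observed arrangement counts as one permutation.
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: five sites on a line, with genetic distance
# exactly proportional to geographic distance.
positions = [0.0, 1.0, 3.0, 6.0, 10.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.001 * d for d in row] for row in geo]

r, p = mantel(geo, gen)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

With real data, `geo` would hold the pairwise geographic distances between sampling sites and `gen` the pairwise genetic distances between the chloroplast genomes; restricting the comparison to pairs within one distance class would be a further step that this sketch does not implement.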