Matt Huff, Josiah Seaman, Di Wu, Tetyana Zhebentyayeva, Laura J Kelly, Nurul Faridi, Charles D Nelson, Endymion Cooper, Teodora Best, Kim Steiner, Jennifer Koch, Jeanne Romero Severson, John E Carlson, Richard Buggs, Margaret Staton.
Abstract
Green ash (Fraxinus pennsylvanica) is the most widely distributed ash tree in North America. Once common, it has experienced high mortality from the non-native invasive emerald ash borer (EAB; Agrilus planipennis). A small percentage of native green ash trees that remain healthy in long-infested areas, termed "lingering ash," display partial resistance to the insect, indicating that breeding and propagating populations with higher resistance to EAB may be possible. To assist in ash breeding, ecology and evolution studies, we report the first chromosome-level assembly from the genus Fraxinus for F. pennsylvanica with over 99% of bases anchored to 23 haploid chromosomes, spanning 757 Mb in total, composed of 49.43% repetitive DNA, and containing 35,470 high-confidence gene models assigned to 22,976 Asterid orthogroups. We also present results of range-wide genetic variation studies, the identification of candidate genes for important traits including potential EAB-resistance genes, and an investigation of comparative genome organization among Asterids based on this reference genome platform. Residual duplicated regions within the genome probably resulting from a recent whole genome duplication event in Oleaceae were visualized in relation to wild olive (Olea europaea var. sylvestris). We used our F. pennsylvanica chromosome assembly to construct reference-guided assemblies of 27 previously sequenced Fraxinus taxa, including F. excelsior. Thus, we present a significant step forward in genomic resources for research and protection of Fraxinus species.
Keywords: Fraxinus; comparative genomics; emerald ash borer; genome annotation; genome assembly; green ash; whole genome duplication
Year: 2021 PMID: 34748273 PMCID: PMC9299157 DOI: 10.1111/1755-0998.13545
Source DB: PubMed Journal: Mol Ecol Resour ISSN: 1755-098X Impact factor: 8.678
FIGURE 1 High‐density consensus genetic linkage map of Fraxinus pennsylvanica. (a) Consensus genetic map of the PE0048 × PE0248 F. pennsylvanica cross, composed of 4193 SNPs. Black tick marks represent markers segregating in either the female or the male parent. The vertical scale on the left shows genetic map distances in centimorgans (cM). The coloured scale at the bottom shows variation in marker density (cM per locus) across the linkage groups. (b) Alignment of sequence‐based genetic markers from the high‐resolution linkage map (bottom) to the chromosomes of the F. pennsylvanica genome assembly (top)
Number of markers per linkage group for the consensus PE0048 × PE0248 genetic linkage map used for scaffolding and verification of Fraxinus pennsylvanica genome assembly
| LG | Distance (cM) | SNPs | Marker density (cM per locus) |
|---|---|---|---|
| LG1 | 71.3 | 226 | 0.32 |
| LG2 | 104.5 | 247 | 0.42 |
| LG3 | 85.3 | 182 | 0.47 |
| LG4 | 83.3 | 183 | 0.46 |
| LG5 | 101.4 | 275 | 0.37 |
| LG6 | 75.6 | 177 | 0.43 |
| LG7 | 63.9 | 202 | 0.32 |
| LG8 | 83.7 | 179 | 0.47 |
| LG9 | 77.8 | 224 | 0.35 |
| LG10 | 82.2 | 218 | 0.39 |
| LG11 | 64.6 | 181 | 0.36 |
| LG12 | 84.9 | 208 | 0.41 |
| LG13 | 71.9 | 241 | 0.30 |
| LG14 | 71.5 | 156 | 0.46 |
| LG15 | 82.9 | 142 | 0.58 |
| LG16 | 49.6 | 94 | 0.53 |
| LG17 | 65.7 | 185 | 0.35 |
| LG18 | 59.8 | 148 | 0.40 |
| LG19 | 56.7 | 128 | 0.44 |
| LG20 | 57.9 | 122 | 0.47 |
| LG21 | 64.1 | 153 | 0.42 |
| LG22 | 62.7 | 159 | 0.39 |
| LG23 | 54.5 | 163 | 0.33 |
| Total | 1675.9 | 4193 | 0.40 |
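The marker density column appears to be map length divided by SNP count. The sketch below recomputes it for a few linkage groups and for the whole map, using values copied from the table. The function name and the choice of rows are illustrative only, and the published figures may have been derived from unrounded distances, so individual rows can differ in the last digit.

```python
# (map length in cM, number of SNPs), copied from the linkage map table.
linkage_groups = {
    "LG1": (71.3, 226),
    "LG15": (82.9, 142),
    "LG16": (49.6, 94),
    "LG23": (54.5, 163),
}


def marker_density(length_cm: float, n_snps: int) -> float:
    """Average genetic distance per marker, in cM per locus."""
    return length_cm / n_snps


# Per-group densities, rounded to two decimals as in the table.
densities = {
    name: round(marker_density(length, snps), 2)
    for name, (length, snps) in linkage_groups.items()
}

# Whole-map density from the totals row.
total_density = round(marker_density(1675.9, 4193), 2)

for name, value in densities.items():
    print(f"{name}: {value:.2f} cM per locus")
print(f"Whole map: {total_density:.2f} cM per locus")
```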
Summary of the Fraxinus pennsylvanica genome assembly
| Statistic | Value |
|---|---|
| Total length | 756,791,288 |
| No. of scaffolds | 110 |
| Average length | 6,879,920.8 |
| Largest scaffold | 56,547,140 |
| Smallest scaffold | 10,000 |
| Number of Ns | 91,729,614 |
| Chromosome length | 755,065,760 |
| Chromosome scaffolds (%) | 99.77% |
| GC (%) | 34.40% |
| Repetitive elements (%) | 48.80% |
| Protein coding gene models | 35,470 |
| Reads mapped (%) | 88.97% |
| Collinear markers (%) | 97.40% |
| BUSCO description | Number in genome |
| Complete BUSCOs (C) | 1566 (97.03%) |
| Complete and single‐copy BUSCOs (S) | 1345 (83.33%) |
| Complete and duplicated BUSCOs (D) | 221 (13.69%) |
| Fragmented BUSCOs (F) | 25 (1.55%) |
| Missing BUSCOs (M) | 23 (1.43%) |
| Total BUSCO groups searched | 1614 (100%) |
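Several entries in the summary table follow from others by simple arithmetic. As a rough consistency check, the sketch below recomputes the average scaffold length, the share of bases in chromosome scaffolds, and the complete BUSCO percentage from the reported counts. Variable names are illustrative and nothing here reproduces the authors' pipeline.

```python
# Counts copied from the assembly summary table.
total_length = 756_791_288       # bp in all scaffolds
n_scaffolds = 110
chromosome_length = 755_065_760  # bp in the chromosome scaffolds

# Derived assembly statistics.
average_length = total_length / n_scaffolds
chromosome_pct = 100 * chromosome_length / total_length

# BUSCO categories: single-copy, duplicated, fragmented, missing.
busco = {"single": 1345, "duplicated": 221, "fragmented": 25, "missing": 23}
busco_total = sum(busco.values())
busco_complete = busco["single"] + busco["duplicated"]
busco_complete_pct = 100 * busco_complete / busco_total

print(f"Average scaffold length: {average_length:,.1f} bp")
print(f"Bases in chromosome scaffolds: {chromosome_pct:.2f}%")
print(f"Complete BUSCOs: {busco_complete} of {busco_total} ({busco_complete_pct:.2f}%)")
```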
FIGURE 2 Fluorescence in situ hybridization of green ash metaphase chromosomes with 35S rDNA (green signals) and 5S rDNA synthetic oligo probes (red signals). (a) The major 35S rDNA locus (green arrows) is colocalized and overlapping with the 5S rDNA locus (red arrows); arrowheads point at the minor 35S rDNA locus. (b) Image (a) was captured under reduced DAPI intensity, providing increased contrast for the green and red fluorochromes and showing overlapping region of 35S and 5S rDNA (see inset). Scale bar = 2 µm. (c) Close‐up and visual summary of major 35S locus FISH results. Cen, centromere; LA, long arm; NOR, nucleolus organizer region; SA, short arm; SAT, satellite
FIGURE 3 Characterization of genome duplications in Fraxinus pennsylvanica. (a) Distribution of Ks values of F. pennsylvanica and other Asterid species. (b) Visualization of blocks of synteny within the green ash genome probably due to an Oleaceae‐specific WGD. Lines link paralogues with a Ks value of ≤0.25. (c) Block diagram of internal synteny between ash chromosomes
Categories of identified SNPs
| Total number of SNPs | 28,005 | 100% |
|---|---|---|
| Transversion | ||
| A/C | 2810 | 10.03% |
| A/T | 3467 | 12.38% |
| C/G | 2058 | 7.35% |
| G/T | 2806 | 10.02% |
| Transition | ||
| A/G | 8402 | 30.00% |
| C/T | 8462 | 30.22% |
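A common summary of a table like this is the transition/transversion (Ts/Tv) ratio. The record does not report one, so the short sketch below derives it directly from the counts above. It is an illustration, not a value taken from the paper.

```python
# SNP counts by substitution class, copied from the table.
snp_counts = {
    "A/C": 2810, "A/T": 3467, "C/G": 2058, "G/T": 2806,  # transversions
    "A/G": 8402, "C/T": 8462,                            # transitions
}

# Transitions swap purine with purine or pyrimidine with pyrimidine.
TRANSITION_CLASSES = {"A/G", "C/T"}

transitions = sum(n for cls, n in snp_counts.items() if cls in TRANSITION_CLASSES)
transversions = sum(n for cls, n in snp_counts.items() if cls not in TRANSITION_CLASSES)
total_snps = transitions + transversions
ts_tv_ratio = transitions / transversions

print(f"Transitions: {transitions}, transversions: {transversions}")
print(f"Total SNPs: {total_snps}, Ts/Tv = {ts_tv_ratio:.2f}")
```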
FIGURE 4 Population structure of the accessions. (a) STRUCTURE results for genetic variation at K = 2 for 85 individuals representing 56 provenances. The coloured plot represents the estimates of Q (the estimated proportion of an individual's ancestry from each subpopulation). From left to right, the first 18 provenances (MO_177–OH_141) were classed as “pure southern,” provenances 19–29 (IL_169–NY_373) as “admixed” and provenances 30–56 (MI_293–Man_513) as “pure northern.” (b) Scatter plot of principal component axis one (PC1) and axis two (PC2) based on genotype data of 85 samples. The x‐axis is PC1, which explained 10.8% of the variation, and the y‐axis is PC2, which explained 1.6% of the variation. Samples are coloured according to the STRUCTURE results. Orange, blue and green represent the “pure northern,” “pure southern” and “admixed” sets of individuals, respectively, from the STRUCTURE analysis (i.e., Q). (c) Geographical distribution of the seed source provenance locations of the trees, with each location labelled with the same colouring scheme as in (b)
Genetic (SNP) variation statistics for the RADseq individuals, grouped according to inferred subpopulation structure; FST scores reflect subpopulation diversity compared to the total population diversity
| Subpopulation | | | | |
|---|---|---|---|---|
| Pure northern | 0.2022 | 0.2186 | 0.0012 | 0.0750 |
| Admixed | 0.2436 | 0.2485 | 0.0123 | 0.0197 |
| Pure southern | 0.2141 | 0.2327 | 0.0036 | 0.0799 |
Pairwise FST values among “pure” and “admixed” provenances
| | Admixed | Pure northern |
|---|---|---|
| Pure northern | 0.040 | — |
| Pure southern | 0.041 | 0.111 |
FIGURE 5 Manhattan plots. Significant loci associated with (a) budburst, (b) survival after EAB infestation, (c) foliage coloration, and (d) height. Each dot represents an SNP. The red horizontal line indicates the Bonferroni‐corrected significance threshold, −log10(p) = 5.76. SNPs with FDR below 0.5 are circled
Genomic information on the significant marker‐trait associations in Fraxinus pennsylvanica assembly version 1.4
| Traits | Marker | Chr. | Position (bp) | Gene ID | p-value | FDR | Allelic effect (%) | Candidate gene |
|---|---|---|---|---|---|---|---|---|
| Survival after EAB infestation | 11179_87 | 12 | 19,832,226 | Fp_g28098 | 1.51E‐07 | 0.004 | 1.336 | RGG repeats nuclear RNA binding protein A‐like |
| | 25645_98 | 3 | 5,788,464 | Fp_g7050 | 2.95E‐06 | 0.04 | 1.419 | Trihelix transcription factor ASIL2‐like |
| | 60870_103 | 10 | 14,733,501 | g23587 (filtered by gFACs) | 4.97E‐06 | 0.045 | 1.407 | ND |
| Height_1979 | 52671_77 | 11 | 2,580,511 | Fp_g25431 | 1.50E‐06 | 0.041 | 0.052 | Uncharacterized protein LOC111393761 |
| Foliage colour | 4493_23 | 21 | 16,892,681 | Fp_g45726 | 1.06E‐06 | 0.029 | −1.976 | bZIP transcription factor 68‐like |
| | 21238_22 | 23 | 10,624,963 | n/a | 3.66E‐06 | 0.05 | 4.394 | ND |
| Bud burst | 77390_92 | 7 | 26,316,397 | Fp_g28596 | 1.34E‐07 | 0.002 | −5.752 | Uncharacterized protein LOC111368578 |
| | 32755_118 | 2 | 8,992,315 | Fp_g4611 | 1.95E‐07 | 0.002 | −5.581 | Uncharacterized protein LOC111390290 |
| | 32756_125 | 2 | 8,992,444 | Fp_g4611 | 1.95E‐07 | 0.002 | −5.581 | Uncharacterized protein LOC111390290 |
| | 22279_18 | 23 | 3,577,167 | None | 1.38E‐06 | 0.009 | 5.859 | ND |
| | 62631_57 | 10 | 28,431,749 | None | 1.82E‐06 | 0.01 | 6.137 | ND |
| | 29105_13 | 17 | 5,653,169 | Fp_g38410 | 2.54E‐06 | 0.012 | −3.365 | Probable pectinesterase/pectinesterase inhibitor 12 isoform X1 |
| | 73875_59 | 18 | 15,502,133 | n/a | 1.02E‐05 | 0.039 | −2.102 | ND |
| | 37899_120 | 8 | 14,147,071 | g19520 (filtered by gFACs) | 1.55E‐05 | 0.048 | 4.336 | ND |
| | 21997_49 | 23 | 7,186,335 | Fp_g48399 | 1.57E‐05 | 0.048 | 3.388 | Transcription factor CYCLOIDEA‐like |
ND, SNP location was not within an annotated gene.
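To relate the table to the Manhattan plots, the values given in scientific notation can be converted to the −log10 scale and compared with the Bonferroni line of 5.76 quoted in the Figure 5 caption. The sketch below does this for the three EAB survival markers. It assumes that column holds raw p-values, which the record does not state explicitly.

```python
import math

# Bonferroni line on the -log10(p) scale, as quoted in the Figure 5 caption.
BONFERRONI_NEG_LOG10 = 5.76

# (marker, value from the scientific-notation column). ASSUMPTION: these are
# raw p-values; the column is not explicitly labelled in this record.
survival_hits = [
    ("11179_87", 1.51e-07),
    ("25645_98", 2.95e-06),
    ("60870_103", 4.97e-06),
]


def neg_log10(p: float) -> float:
    """Convert a p-value to the scale used on a Manhattan plot's y-axis."""
    return -math.log10(p)


# Flag which markers clear the Bonferroni line.
above_bonferroni = {
    marker: neg_log10(p) >= BONFERRONI_NEG_LOG10 for marker, p in survival_hits
}

for marker, p in survival_hits:
    status = "above" if above_bonferroni[marker] else "below"
    print(f"{marker}: -log10(p) = {neg_log10(p):.2f} ({status} the Bonferroni line)")
```

Under that assumption, only marker 11179_87 would clear the Bonferroni line, while the other two would be supported by the FDR criterion alone.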
Comparison of Fraxinus excelsior genome statistics before (version 0.5) and after (version 0.7) reference‐guided assembly and reannotation
| | BATG version 0.5 | BATG version 0.7 |
|---|---|---|
| Total scaffolds | 89,514 | 71,971 |
| Assembly size (Mbp) | 867.5 | 869.2 |
| N50 | 103,995 | 30,774,430 |
| Ns | 149,164,818 (17.19%) | 150,919,118 (17.36%) |
| Complete BUSCOs (all) | 1436 (88.97%) | 1532 (94.92%) |
| Complete BUSCOs (single copy) | 1248 (77.32%) | 1343 (83.21%) |
| Complete BUSCOs (duplicated) | 188 (11.65%) | 189 (11.71%) |
| Fragmented BUSCOs | 79 (4.90%) | 38 (2.35%) |
| Missing BUSCOs | 99 (6.13%) | 44 (2.73%) |
| Total BUSCOs searched | 1614 | 1614 |
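The large jump in N50 between versions reflects how the statistic is defined: it is the length of the scaffold at which the running total, taken from longest to shortest, first reaches half the assembly size. The sketch below implements that standard definition on made-up scaffold lengths. The toy numbers are hypothetical and are not taken from either BATG assembly.

```python
def n50(lengths: list[int]) -> int:
    """Length L such that scaffolds of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: the running total always reaches half")


# Hypothetical example: ten equal fragments, then the same bases after six
# of the fragments have been joined into one scaffold.
fragmented = [10] * 10
scaffolded = [60, 10, 10, 10, 10]

print(n50(fragmented))  # N50 of the fragmented toy assembly
print(n50(scaffolded))  # N50 after joining, with the same total length
```

Joining fragments leaves the total length unchanged but moves most bases into longer scaffolds, which is why scaffolding against a reference can raise N50 sharply without adding sequence.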
FIGURE 6 Reference‐guided assembly of Fraxinus taxa genomes. (a) Pairing of a phylogenetic tree of available genomes in the genus Fraxinus and a bar chart illustrating percentage placement of base pairs of the original genomes to the green ash genome. F. pennsylvanica's placement is in orange to denote phylogenetic location relative to other species. (b) Illustration of the total complete BUSCOs identified before and after RagTag scaffolding. Abbreviations: “Subsp. ang…” = subspecies angustifolia, “Subsp. syri…” = subspecies syriaca, and “Subsp. oxy…” = subspecies oxycarpa. The red bars indicate additional, complete BUSCOs detected after reference‐guided scaffolding
Orthogroup statistics of genes from six Asterids
| | | | | | | | Total |
|---|---|---|---|---|---|---|---|
| Total genes | 35,470 | 50,684 | 28,140 | 25,574 | 35,768 | 32,118 | 207,754 |
| Number of genes in orthogroups | 32,239 (90.90%) | 45,008 (88.80%) | 25,534 (90.70%) | 23,062 (90.20%) | 30,009 (83.90%) | 28,488 (88.70%) | 184,340 (88.70%) |
| Number of species‐specific orthogroups | 675 | 939 | 580 | 530 | 1071 | 1290 | 5085 |
| Number of genes in species‐specific orthogroups | 1766 (4.98%) | 8194 (16.17%) | 2658 (9.45%) | 2361 (9.23%) | 5083 (14.21%) | 6130 (19.09%) | 26,192 (12.61%) |
| Number of unassigned genes | 3231 (9.1%) | 5676 (11.2%) | 2606 (9.3%) | 2512 (9.8%) | 5759 (16.1%) | 3630 (11.3%) | 23,414 (11.3%) |
| Predicted gene duplication events | 7292 | 19,050 | 7904 | 5761 | 11,282 | 11,239 | 62,528 |
Ash to olive chromosomal synteny defined by single‐copy orthologues
| Ash chromosome | Olive chromosome |
|---|---|
| Chr01 | Chr10 |
| Chr02 | Chr06 (RC) |
| Chr03 | Chr18 |
| Chr04 | Chr13 (RC) |
| Chr05 | Chr11 (RC) |
| Chr06 | Chr07 |
| Chr07 | Chr03 (RC) |
| Chr08 | Chr02 (RC) |
| Chr09 | Chr01 |
| Chr10 | Chr15 (RC) |
| Chr11 | Chr04 |
| Chr12 | Chr18, Chr19 |
| Chr13 | Chr12 |
| Chr14 | Chr22 |
| Chr15 | Chr14 (RC) |
| Chr16 | Chr20 (RC) |
| Chr17 | Chr17 |
| Chr18 | Chr16 |
| Chr19 | Chr09 |
| Chr20 | Chr21 |
| Chr21 | Chr08 |
| Chr22 | Chr05 (RC), Chr16 |
| Chr23 | Chr23 |
RC indicates the chromosome is in the reverse complemented orientation in the wild olive genome (Unver et al., 2017).
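For readers comparing coordinates across the two genomes, an "(RC)" entry means that sequence and positions on the olive chromosome run in the opposite direction relative to ash. The sketch below shows what reverse complementing does to a sequence and to a 1-based position. The function names and the toy sequence are illustrative and are not part of the published analysis.

```python
# Translation table mapping each base to its complement (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the opposite-strand sequence, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def rc_position(pos: int, chrom_length: int) -> int:
    """Map a 1-based position onto the reverse-complemented chromosome."""
    if not 1 <= pos <= chrom_length:
        raise ValueError("position lies outside the chromosome")
    return chrom_length - pos + 1


toy = "ATGCN"  # hypothetical sequence
flipped = reverse_complement(toy)
print(flipped)
print(rc_position(1, len(toy)))
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check when flipping chromosomes before a synteny comparison.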