Li Wang1, Jianguo Zhang1,2, Dan Peng3, Yang Tian1, Dandan Zhao1, Wanning Ni1, Jinhua Long1, Jinhua Li1, Yanfei Zeng1,2, Zhiqiang Wu3, Yiyun Tang4, Zhaoshan Wang1,2.
Abstract
The olive tree (Olea europaea L.) is the most iconic fruit crop of the Mediterranean Basin. Since the plant was introduced to China in the 1960s, the summer-rain climate has made it susceptible to pathogens, leading to several olive diseases. Olea europaea L. subsp. cuspidata is native to Yunnan Province, China. It has smaller fruit, lower oil content, and higher resistance than subsp. europaea, which makes subsp. cuspidata a critical germplasm resource for investigation. Here, a high-quality genome of subsp. cuspidata, 1.38 Gb in size, was assembled and anchored onto 23 pseudochromosomes with an anchoring (mounting) rate of 85.57%. The assembly shows 96.6% completeness [benchmarking universal single-copy orthologs (BUSCO)], with a contig N50 of 14.72 Mb and a scaffold N50 of 52.68 Mb, a substantial improvement over previously assembled olive genomes. Evaluation of the assembly showed that 92.31% of resequencing reads and an average of 96.52% of assembled transcripts could be aligned to the genome. We found that a positively selected gene, evm.model.Chr16.1133, was also identified in the transcriptome analysis. This gene is a susceptibility gene that negatively regulates disease resistance. Furthermore, we identified the genus Cercospora, which causes leaf spot disease, in the infected leaves. The high-quality chromosome-level genomic information presented here may facilitate the conservation and utilization of germplasm resources of this subspecies and provide an essential genetic basis for further research into the differences in oil content and resistance between subsp. cuspidata and subsp. europaea.
Keywords: Olea europaea; demographic history; genome assembly; natural selection; susceptibility gene
Year: 2022 PMID: 35656016 PMCID: PMC9152427 DOI: 10.3389/fpls.2022.879822
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 6.627
Statistics of assembled subsp. cuspidata genome.
| Term | Contig size (bp) | Contig number | Scaffold size (bp) | Scaffold number |
| N90 | 350,652 | 257 | 885,961 | 52 |
| N80 | 2,313,398 | 94 | 34,033,801 | 21 |
| N70 | 6,370,930 | 55 | 40,664,807 | 17 |
| N60 | 11,521,701 | 40 | 44,152,604 | 14 |
| N50 | 14,716,965 | 30 | 52,676,021 | 11 |
| Max length (bp) | 38,043,138 | | 90,127,509 | |
| Total size (bp) | 1,379,115,243 | | 1,379,304,243 | |
| Total number | | 3,073 | | 2,695 |
| Average length (bp) | 448,784.65 | | 511,801.20 | |
FIGURE 1. Genome-wide Hi-C interaction heatmap and genomic landscape. (A) Hi-C interaction heatmap among the 23 chromosomes of the subsp. cuspidata genome. (B) Genomic landscape of subsp. cuspidata chromosomes. From the outer ring inward, the tracks show the assembled pseudochromosomes, gene density, GC content, repeat content, SNP density, and gene collinearity.
Statistics of chromosomal level assembly of subsp. cuspidata.
| Chr ID | Length (bp) | Chr ID | Length (bp) | Chr ID | Length (bp) |
| Chr1 | 90,127,509 | Chr9 | 57,282,915 | Chr17 | 40,664,807 |
| Chr2 | 83,097,257 | Chr10 | 52,971,700 | Chr18 | 39,899,167 |
| Chr3 | 70,287,963 | Chr11 | 52,676,021 | Chr19 | 37,263,953 |
| Chr4 | 64,129,678 | Chr12 | 47,592,757 | Chr20 | 37,211,276 |
| Chr5 | 61,350,988 | Chr13 | 45,546,967 | Chr21 | 34,033,801 |
| Chr6 | 59,983,315 | Chr14 | 44,152,604 | Chr22 | 31,166,573 |
| Chr7 | 58,685,853 | Chr15 | 42,848,148 | Chr23 | 29,903,841 |
| Chr8 | 58,506,042 | Chr16 | 40,951,526 | | |
| Total chromosome-level contig length | 1,180,334,661 | | | | |
| Total contig length | 1,379,304,243 | | | | |
| Chromosome length/Total length | 85.57% | | | | |
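The summary rows of the table above can be re-derived from the per-chromosome lengths. The following is a minimal sketch, not the authors' pipeline: the function names `anchoring_rate` and `nx` are illustrative, and the Nx calculation assumes that every unanchored scaffold is shorter than the shortest chromosome, which is consistent with the reported scaffold N90 but is not stated explicitly in the source.

```python
# Chromosome lengths in bp (Chr1..Chr23), copied from the table above.
CHR_LENGTHS = [
    90_127_509, 83_097_257, 70_287_963, 64_129_678, 61_350_988, 59_983_315,
    58_685_853, 58_506_042, 57_282_915, 52_971_700, 52_676_021, 47_592_757,
    45_546_967, 44_152_604, 42_848_148, 40_951_526, 40_664_807, 39_899_167,
    37_263_953, 37_211_276, 34_033_801, 31_166_573, 29_903_841,
]
# "Total contig length" row of the table (bp).
TOTAL_LENGTH = 1_379_304_243


def anchoring_rate(chr_lengths, total_length):
    """Percentage of the assembly placed on pseudochromosomes."""
    return 100.0 * sum(chr_lengths) / total_length


def nx(lengths, total_length, x):
    """Return (Nx, Lx): the length and rank of the sequence at which the
    cumulative sum of lengths, sorted longest first, reaches x% of the total.

    Assumption: sequences missing from `lengths` (the unanchored scaffolds)
    are all shorter than the ones listed, so they cannot change the answer.
    """
    threshold = total_length * x / 100.0
    running = 0
    for rank, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= threshold:
            return length, rank
    raise ValueError("listed sequences do not reach the requested fraction")


if __name__ == "__main__":
    print(f"anchored: {sum(CHR_LENGTHS):,} bp "
          f"({anchoring_rate(CHR_LENGTHS, TOTAL_LENGTH):.2f}%)")
    for x in (50, 60, 70, 80):
        length, rank = nx(CHR_LENGTHS, TOTAL_LENGTH, x)
        print(f"scaffold N{x}: {length:,} bp (L{x} = {rank})")
```

Under these assumptions the sketch reproduces the 85.57% anchoring rate and the scaffold N50 to N80 values reported in the assembly statistics. N90 cannot be derived this way, because the 23 chromosomes alone cover less than 90% of the assembly.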
Completeness assessment of subsp. cuspidata genome by BUSCO.
| Library | eudicotyledons_odb10 |
| Complete BUSCOs (C) | 2048 |
| Complete and single-copy BUSCOs (S) | 1717 |
| Complete and duplicated BUSCOs (D) | 331 |
| Fragmented BUSCOs (F) | 24 |
| Missing BUSCOs (M) | 49 |
| Total BUSCO groups searched | 2121 |
| Summary (Complete BUSCOs/Total BUSCOs) | 96.6% |
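The 96.6% summary figure follows directly from the counts in the table. As a small worked check, the sketch below recomputes each BUSCO category as a percentage of the groups searched; `busco_summary` is a hypothetical helper written for this illustration, not part of the BUSCO software.

```python
def busco_summary(single, duplicated, fragmented, missing):
    """Convert BUSCO category counts into percentages of all groups searched.

    Complete (C) is the sum of single-copy (S) and duplicated (D);
    the total is C + fragmented (F) + missing (M).
    """
    complete = single + duplicated
    total = complete + fragmented + missing

    def pct(count):
        return round(100.0 * count / total, 1)

    return {
        "total": total,
        "C": pct(complete),
        "S": pct(single),
        "D": pct(duplicated),
        "F": pct(fragmented),
        "M": pct(missing),
    }


if __name__ == "__main__":
    # Counts taken from the BUSCO table above.
    summary = busco_summary(single=1717, duplicated=331, fragmented=24, missing=49)
    print(summary)
```

With the table's counts the categories add up to the 2,121 groups searched, and the complete fraction rounds to the reported 96.6%.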
Statistics of TE annotated repeat sequences in subsp. cuspidata genome.
| Class | Sub-Class | Type | Length (bp) | Percent (%) |
| Class I | LTR | Ty1/Copia | 137,408,274 | 9.96 |
| | | Ty3/Gypsy | 205,089,955 | 14.87 |
| | | unknown | 63,995,280 | 4.64 |
| | Non-LTR | LINE | 1,905,667 | 0.14 |
| | | unknown | 423,876 | 0.03 |
| Class II | TIR | CACTA | 18,631,143 | 1.35 |
| | | Mutator | 347,273,079 | 25.18 |
| | | PIF/Harbinger | 19,703,347 | 1.43 |
| | | Tc1/Mariner | 2,351,832 | 0.17 |
| | | hAT | 28,621,867 | 2.08 |
| | Non-TIR | helitron | 48,941,818 | 3.55 |
| Total | | | 960,043,533 | 69.61 |
Statistics of functional annotation of protein-coding genes in subsp. cuspidata genome.
| Database | Annotated gene number | Percent (%) |
| GO | 26,012 | 57.60 |
| KEGG | 8,327 | 18.44 |
| KOG | 8,941 | 19.80 |
| SwissProt | 33,018 | 73.12 |
| Pfam annotation | 32,739 | 72.50 |
| Nr annotation | 45,146 | 99.98 |
FIGURE 2. Phylogenetic relationships and divergence times among species. Pie charts show the proportion of gene families that are expanded (red), contracted (blue), and conserved (yellow) across all gene families in the 14 species. The red number at each node represents the bootstrap value. The number in parentheses at each internal node indicates the estimated divergence time interval (in millions of years).
FIGURE 3. Population history analysis of subsp. cuspidata and “Arbequina”. SMC++ estimates of the changes in effective population size (Ne) for subsp. cuspidata and “Arbequina”, and of the split time between subsp. cuspidata and “Arbequina”.
FIGURE 4. Whole-genome duplication (WGD) analysis. (A) Ks distribution analysis. Peaks in intraspecies Ks distributions indicate whole-genome polyploidization events, and peaks in interspecies Ks distributions indicate speciation events. (B) The 4DTv distribution of gene pairs in subsp. cuspidata and other genomes. The x-axis is the 4DTv value, and the y-axis is the proportion of genes with that 4DTv value.
FIGURE 5. Dot-plot synteny diagram of chromosomes in subsp. cuspidata and “Arbequina”.
FIGURE 6. Petal diagram of the gene families for six oil species. The number in the center is the number of gene families shared by all species, and the number on each petal is the number of gene families unique to that species.
FIGURE 7. The leaves of the three subspecies. The symptoms of infected leaves are identical in “Arbequina” and “Arbosana”. (A,B) Front and back of the infected and healthy leaves, respectively. (C) Front and back of the healthy leaves of subsp. cuspidata.
Statistics of the FPKM values for evm.model.Chr16.1133.
| Species | Healthy leaves (FPKM per sample) | Mean | Infected leaves (FPKM per sample) | Mean |
| ‘ | 1.948, 0.999, 2.433 | 1.793 | 1.047, 0.240, 0.388 | 0.558 |
| ‘ | 2.871, 2.239, 4.341 | 3.150 | 1.212, 0.677, 0.566 | 0.818 |
| subsp. cuspidata | 0.449, 0.870, 0.431 | 0.583 | - | - |
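The mean columns of the FPKM table can be checked against the per-sample values. The sketch below is only an arithmetic illustration: the keys `cultivar_1` and `cultivar_2` are placeholders, because the cultivar names are truncated in this record, and `mean_fpkm` and `infected_to_healthy_ratio` are hypothetical helpers rather than anything from the original analysis.

```python
from statistics import mean

# Per-sample FPKM values for evm.model.Chr16.1133, copied from the table above.
# "cultivar_1" and "cultivar_2" are placeholder labels (names truncated in the source).
FPKM = {
    "cultivar_1": {"healthy": [1.948, 0.999, 2.433], "infected": [1.047, 0.240, 0.388]},
    "cultivar_2": {"healthy": [2.871, 2.239, 4.341], "infected": [1.212, 0.677, 0.566]},
    "subsp_cuspidata": {"healthy": [0.449, 0.870, 0.431], "infected": []},
}


def mean_fpkm(values):
    """Mean FPKM rounded to three decimals, or None when no samples exist."""
    return round(mean(values), 3) if values else None


def infected_to_healthy_ratio(sample):
    """Ratio of mean infected to mean healthy FPKM, or None if not measurable."""
    if not sample["infected"] or not sample["healthy"]:
        return None
    return mean(sample["infected"]) / mean(sample["healthy"])


if __name__ == "__main__":
    for name, sample in FPKM.items():
        ratio = infected_to_healthy_ratio(sample)
        ratio_text = "n/a" if ratio is None else f"{ratio:.2f}"
        print(name, mean_fpkm(sample["healthy"]),
              mean_fpkm(sample["infected"]), ratio_text)
```

The recomputed means match the table. In both cultivars the mean FPKM in infected leaves is lower than in healthy leaves, and no infected samples are listed for subsp. cuspidata.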
Genetic diversity of evm.model.Chr16.1133 in 29 cuspidata and 25 cultivar individuals.
| Species | Tajima’s D | θπ | Polymorphic sites |
| subsp. cuspidata | – | – | – |
| Cultivars | 0.929 | 0.003 | 21 |