Biao Han, Longxin Wang, Yang Xian, Xiao-Man Xie, Wen-Qing Li, Ye Zhao, Ren-Gang Zhang, Xiaochun Qin, De-Zhu Li, Kai-Hua Jia.
Abstract
Quercus variabilis (Fagaceae) is an ecologically and economically important deciduous broadleaved tree species native to and widespread in East Asia. It is a valuable woody species and an indicator of local forest health, and occupies a dominant position in forest ecosystems in East Asia. However, genomic resources from Q. variabilis are still lacking. Here, we present a high-quality Q. variabilis genome generated by PacBio HiFi and Hi-C sequencing. The assembled genome size is 787 Mb, with a contig N50 of 26.04 Mb and scaffold N50 of 64.86 Mb, comprising 12 pseudo-chromosomes. The repetitive sequences constitute 67.6% of the genome, of which the majority are long terminal repeats, accounting for 46.62% of the genome. We used ab initio, RNA sequence-based and homology-based predictions to identify protein-coding genes. A total of 32,466 protein-coding genes were identified, of which 95.11% could be functionally annotated. Evolutionary analysis showed that Q. variabilis was more closely related to Q. suber than to Q. lobata or Q. robur. We found no evidence for species-specific whole genome duplications in Quercus after the species had diverged. This study provides the first genome assembly and the first gene annotation data for Q. variabilis. These resources will inform the design of further breeding strategies, and will be valuable in the study of genome editing and comparative genomics in oak species.
Keywords: Hi-C sequencing; PacBio HiFi sequencing; Quercus variabilis; comparative genomics; genome assembly
Year: 2022 PMID: 36212310 PMCID: PMC9538376 DOI: 10.3389/fpls.2022.1001583
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 6.627
The statistics for genome assembly of six Quercus species.
| | Q. variabilis | Q. acutissima | Q. mongolica | Q. robur | Q. lobata | Q. suber |
|---|---|---|---|---|---|---|
| Sequencing platform | DNBSEQ, PacBio Sequel II, Hi-C | PacBio, 10X Genomics | Illumina, PacBio, Hi-C | Illumina, Roche 454 | Illumina, PacBio, Hi-C | Illumina |
| Assembly level | Chromosome | Chromosome | Chromosome | Chromosome | Chromosome | Scaffold |
| Total contig length (Mb) | 796 | 756 | 810 | 790 | | 934 |
| Number of contigs | 327 | 770 | 645 | 22,615 | | 36,760 |
| N50 of contigs (Mb) | | 1.44 | 2.64 | 0.07 | 1.9 | 0.08 |
| Total scaffold length (Mb) | 796 | 758 | 810 | 814 | 847 | 953 |
| Number of scaffolds | 245 | 388 | 330 | 1,409 | 2,014 | 23,344 |
| N50 of scaffolds (Mb) | 64.9 | 2.9 | 66.7 | 1.3 | | 0.5 |
| Number of chromosomes | 12 | 12 | 12 | 12 | 12 | |
| Total chromosome length (Mb) | 787 | 750 | 775 | 717 | 811 | |
| % Sequence anchored on chromosome | | | 96 | 96 | 96 | 0 |
| Complete BUSCOs (%) | | 91 | 93 | 91 | 95 | 95 |
Data not shown in the original articles; numbers in bold represent the best in each category.
Information on the genome assemblies of Q. acutissima, Q. mongolica, Q. robur, Q. lobata, and Q. suber was taken from previous reports (Plomion et al., 2018; Ramos et al., 2018; Ai et al., 2020; Fu et al., 2022; Sork et al., 2022).
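The contig and scaffold N50 values reported for these assemblies follow the usual definition: the length of the sequence at which the cumulative length, counted from the longest sequence downward, first reaches half of the total assembly length. The sketch below illustrates that definition only. It is not the authors' pipeline, the function name `n50` is hypothetical, and the toy lengths are invented for illustration rather than taken from any of the six assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (arbitrary units), not real data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

For the toy list the total is 300, so the running sum reaches the halfway mark of 150 at the second-longest contig. A higher N50 for a similar total length means the assembly is made of fewer, longer pieces, which is why the 26.04 Mb contig N50 of Q. variabilis stands out against the other contig N50 values in the table.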
Figure 1. Overview of Quercus variabilis genome assembly and annotation. (A) Genome-wide chromatin interaction analysis of the Q. variabilis genome based on Hi-C data. (B) The distribution of gene and repeat sequences (DNA, LINE, LTR_Copia, LTR_Gypsy, and LTR_other) across Chr12. The height of the bars represents the distribution density of these categories and the window size was set to 50 kb.
Figure 2. Genome functional annotation and evolution of Q. variabilis. (A) Venn diagram showing shared and unique gene functional annotations among InterPro, KEGG, SwissProt, KOG and NR databases. (B) Phylogenetic tree, divergence time and gene family expansion and contraction among 14 plant species. (C) Ks distribution of Q. variabilis-Q. lobata, Q. variabilis-Q. robur, Q. variabilis-Q. suber, Q. variabilis-Q. variabilis and P. persica-P. persica based on orthologous and paralogous gene pairs.
Figure 3. Gene ontology (GO) functional enrichment analysis of the unique genes of Q. variabilis.
Figure 4. Synteny blocks identified between Q. variabilis and Q. lobata, and Q. variabilis and Q. robur.
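The density tracks in Figure 1B summarize how many features of each category fall into consecutive 50 kb windows along a chromosome. A minimal sketch of that kind of binning follows. It assumes 0-based feature start coordinates and assigns each feature to the window containing its start; the record does not state which tool or counting rule the authors used, and the function name `window_counts` and the example coordinates are hypothetical.

```python
def window_counts(starts, chrom_length, window=50_000):
    """Count features per fixed-size window along one chromosome.

    starts       -- 0-based start coordinates of the features
    chrom_length -- chromosome length in bp
    window       -- window size in bp (50 kb, as in the figure caption)
    """
    if chrom_length <= 0 or window <= 0:
        raise ValueError("chrom_length and window must be positive")
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = [0] * n_windows
    for start in starts:
        if not 0 <= start < chrom_length:
            raise ValueError(f"start {start} lies outside the chromosome")
        counts[start // window] += 1
    return counts


# Hypothetical gene starts on a 150 kb toy chromosome.
print(window_counts([0, 49_999, 50_000, 120_000], 150_000))
```

Running the same function once per category (genes, DNA transposons, LINEs, and each LTR class) gives one list of bar heights per track.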