Yingfeng Niu1, Guohua Li1, Shubang Ni1, Xiyong He1, Cheng Zheng1, Ziyan Liu1, Lidan Gong1, Guanghong Kong1, Wei Li2, Jin Liu1.
Abstract
Macadamia is an evergreen tree belonging to the Proteaceae family. The two commercial macadamia species, Macadamia integrifolia and M. tetraphylla, are highly prized for their edible kernels. The M. integrifolia genome was recently sequenced, but the genome of M. tetraphylla has not yet been published, which limits biological research and breeding in this species. This study reports a high-quality genome sequence of M. tetraphylla based on Oxford Nanopore Technologies sequencing and high-throughput chromosome conformation capture (Hi-C). An assembly of 750.87 Mb with a scaffold N50 of 51.11 Mb was generated, close to the genome sizes of 740 and 758 Mb estimated by flow cytometry and k-mer analysis, respectively. Genome annotation indicated that 61.42% of the genome is composed of repetitive sequences, with long terminal repeat retrotransposons accounting for 34.95% of the genome. A total of 31,571 protein-coding genes were predicted, of which 92.59% were functionally annotated. The average gene length was 6,055 bp. Comparative genome analysis revealed that gene families associated with defense response, lipid transport, steroid biosynthesis, triglyceride lipase activity, and fatty acid metabolism are expanded in the M. tetraphylla genome. The distribution of fourfold synonymous third-codon transversion (4DTv) rates indicated a recent whole-genome duplication event in M. tetraphylla. Genomic and transcriptomic analysis identified 187 genes encoding 33 crucial oil biosynthesis enzymes, providing a comprehensive map of macadamia lipid biosynthesis. In addition, the 55 identified WRKY genes were preferentially expressed in roots compared with other tissues. The genome sequence of M. tetraphylla provides new insights for breeding novel varieties and for the genetic improvement of agronomic traits.
Keywords: Hi-C; Macadamia tetraphylla; fatty acid biosynthesis; nanopore sequencing; whole genome duplication
Year: 2022 PMID: 35281801 PMCID: PMC8906886 DOI: 10.3389/fgene.2022.835363
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Summary of the M. tetraphylla genome assembly and annotation.
| Assembly | |
| Sequencing Depth (×) | 89.93 |
| Estimated genome size (Mb) | 758 |
| Assembled sequence length (Mb) | 750.54 |
| Scaffold N50 (bp) | 51,109,939 |
| Contig N50 (bp) | 1,182,547 |
| Annotation | |
| Number of predicted protein-coding genes | 31,571 |
| Average gene length (bp) | 6,055 |
| tRNAs | 1,286 |
| rRNAs | 542 |
| snoRNAs | 74 |
| snRNAs | 251 |
| Transposable elements (%) | 61.42 |
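The scaffold and contig N50 values in the table summarize assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembled bases. The following is a minimal Python sketch of that definition, not the pipeline used in the study; the function name `nx` and the toy lengths are illustrative assumptions.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx statistic (N50 when fraction=0.5) for sequence lengths.

    Hypothetical helper for illustration; `lengths` is any iterable of
    positive scaffold or contig lengths in base pairs.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")

    threshold = sum(ordered) * fraction
    running_total = 0
    # Walk from the longest sequence down until the cumulative length
    # reaches the requested fraction of the total assembly length.
    for length in ordered:
        running_total += length
        if running_total >= threshold:
            return length


# Toy example (lengths are made up, not taken from the assembly).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = nx(toy_lengths)        # N50
toy_n90 = nx(toy_lengths, 0.9)   # N90
```

Applied to the real scaffold lengths of an assembly (for example, parsed from a FASTA index), the same function would yield the N50 and N90 values that tools such as assembly-stats display.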
FIGURE 1. Landscape of the macadamia genome. (A) Visualization of assembly statistics (https://github.com/rjchallis/assembly-stats): the inner radius (highlighted in red) represents the length of the longest scaffold; the radial axis, originating at the circumference, indicates scaffold length; the N50 and N90 scaffold lengths are indicated by dark and light orange arcs, respectively. The cumulative number of scaffolds within a given percentage of the genome is plotted in purple. The outermost circular layer shows the base composition at the given coverage of the genome. (B) Hi-C contact map of the macadamia genome. (C) Circos plot of the macadamia genome. Tracks from outside to inside are the 14 chromosomes of M. tetraphylla, gene density (measured in 1000-kb sliding windows), transposable element (TE) density, Gypsy-type LTR retrotransposon density, Copia-type LTR retrotransposon density, and DNA transposable element density. Syntenic blocks within chromosomes of the macadamia genome are displayed with connecting lines in different colors.
FIGURE 2. Evolution of the macadamia genome. (A) Venn diagram showing shared and unique gene families among macadamia and other plant species. (B) Comparative genomic analysis of macadamia and other plant species. (C) Distribution of 4DTv for pairs of syntenic paralogs.
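Panel C relies on 4DTv, the fraction of fourfold-degenerate third-codon positions at which two aligned coding sequences differ by a transversion (purine to pyrimidine or the reverse). The sketch below computes the uncorrected statistic for one pair of sequences. It assumes codon-aligned, gap-free, in-frame sequences and the standard genetic code; the function name `raw_4dtv` and the example sequences are hypothetical. Published analyses often apply a substitution-model correction, which is omitted here, and this is not presented as the authors' method.

```python
# First two bases of the eight codon families that are fourfold degenerate
# in the standard genetic code (Leu CTN, Val GTN, Ser TCN, Pro CCN,
# Thr ACN, Ala GCN, Arg CGN, Gly GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def raw_4dtv(cds_a, cds_b):
    """Return the uncorrected 4DTv for two codon-aligned coding sequences.

    Returns None when the pair shares no fourfold-degenerate site.
    Hypothetical helper for illustration only.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")

    sites = 0
    transversions = 0
    for start in range(0, len(cds_a), 3):
        codon_a = cds_a[start:start + 3].upper()
        codon_b = cds_b[start:start + 3].upper()
        # A site counts only if both codons belong to the same
        # fourfold-degenerate family.
        if codon_a[:2] != codon_b[:2] or codon_a[:2] not in FOURFOLD_PREFIXES:
            continue
        base_a, base_b = codon_a[2], codon_b[2]
        if base_a not in BASES or base_b not in BASES:
            continue  # skip ambiguous bases such as N
        sites += 1
        # A transversion swaps a purine for a pyrimidine or vice versa.
        if (base_a in PURINES) != (base_b in PURINES):
            transversions += 1

    return transversions / sites if sites else None


# Made-up pair: CTA/CTC is a transversion, GGA/GGG a transition,
# CCA/CCA is identical, so one of three fourfold sites is a transversion.
example = raw_4dtv("CTAGGACCA", "CTCGGGCCA")
```

Computing this value for every pair of syntenic paralogs and plotting the resulting distribution gives a histogram of the kind the caption describes, in which a peak reflects a burst of duplication at a similar evolutionary age.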
FIGURE 3. Expression levels of oil biosynthesis-related genes. Acetyl-CoA is converted into C16 and C18 fatty acids in the plastid. TAG is synthesized in the endoplasmic reticulum and packed into oil bodies. The isozymes and metabolites involved in oil biosynthesis are colored in red and black, respectively. The expression levels of oil-biosynthesis genes in leaf, young flower, mature flower, root, and bark are presented in the heat map.
FIGURE 4. Genome-wide investigation of the WRKY gene family. (A) Unrooted phylogenetic tree of WRKY domains from the macadamia genome. (B) Exon-intron structure of WRKY genes. (C) Expression profiles of the WRKY genes.