Talambedu Usha, Sushil Kumar Middha, Dinesh Babu, Arvind Kumar Goyal, Anupam J Das, Deepti Saini, Aditya Sarangi, Venkatesh Krishnamurthy, Mothukapalli Krishnareddy Prasannakumar, Deepak Kumar Saini, Kora Rudraiah Sidhalinghamurthy.
Abstract
The wonder fruit pomegranate (Punica granatum, family Lythraceae) is one of India's economically important fruit crops and can grow in agro-climatic conditions ranging from tropical to temperate. This study reports a high-quality de novo draft hybrid genome assembly of the diploid Punica cultivar "Bhagwa" and identifies its genomic features. This cultivar is the most common among farmers because of its high sustainability, glossy red color, soft seeds, and nutraceutical properties with high market value. The draft genome assembly is about 361.76 Mb (N50 = 40 Mb), ∼9.0 Mb more than the genome size estimated by flow cytometry. The genome is 90.9% complete; transposable elements occupy only 26.68% of it, and SSRs occur at a relative abundance of 369.93 SSRs/Mb. A total of 30,803 proteins and their putative functions were predicted. Comparative whole-genome analysis revealed Eucalyptus grandis as the nearest neighbor. KEGG-KAAS annotations indicated an abundance of genes involved in the biosynthesis of flavonoids, phenylpropanoids, and secondary metabolites, which are responsible for various medicinal properties of pomegranate, including anticancer, antihyperglycemic, antioxidant, and anti-inflammatory activities. The genome and gene annotations provide new insights into the pharmacological properties of the secondary metabolites synthesized in pomegranate. They will also serve as a valuable resource for mining biosynthetic pathways for key metabolites, novel genes, and variations associated with disease resistance, which can facilitate the breeding of new varieties with high yield and superior quality.
Keywords: Punica granatum (cultivar Bhagwa); flavonoids biosynthesis; hybrid assembly; next-generation sequencing; oxford nanopore; phenylpropanoid pathway; whole genome
Year: 2022 PMID: 35646087 PMCID: PMC9130716 DOI: 10.3389/fgene.2022.786825
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. Overview of the study from the isolation of DNA to assembly and analysis of the genome.
Statistics of draft genome assembly of P. granatum.
| Statistics without reference | | |
|---|---|---|
| Estimated genome size (Mb) | 313.18 | 352.38 |
| Assembled genome size (Mb) | 320.31 | 361.76 |
| Number of scaffolds (≥1 kb) | 473 | 122 |
| N50 scaffold length (Mb) | 39.02 | 39.79 |
| Longest scaffold (Mb) | 54.25 | 86.06 |
| Total size of assembled contigs (Mb) | 320.33 | 361.76 |
| Number of contigs (≥1 kb) | 661 | 7,640 |
| Number of contigs (≥50,000 bp) | 399 | 1,555 |
| N50 contig length (kb) | 4,489.929 | 61.031 |
| Largest contig (kb) | 14,772.832 | 487.599 |
| Total length (bp) | 320,494,280 | 361,760,465 |
| N50 length (Mb) | 39.95 | 40.75 |
| GC (%) | 40.38 | 38.86 |
| Number of genes | 33,594 | 30,803 |
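The N50 values reported above follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch of that calculation is given below; the function name `n50` and the toy lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running total reaches half of the
        # assembly is the N50.
        if running >= half:
            return length


# Hypothetical toy assembly (total 300, half 150): 80 + 70 reaches 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Applied to the real scaffold or contig lengths of an assembly (for example, parsed from a FASTA index), the same function would give the N50 rows of the table.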
Qualitative analysis of draft genome assembly.
| Measures | Draft assembly | Dabenzi | Taishanhong | Tunisia 2019 | AG2017 |
|---|---|---|---|---|---|
| No. (percentage) of complete BUSCOs (C) | 2,114 (90.9%) | 2,156 (92.7%) | 2,159 (92.8%) | 2,150 (92.5%) | 2,114 (90.9%) |
| No. (percentage) of complete and single-copy BUSCOs (S) | 2,012 (86.5%) | 2,096 (90.1%) | 2,099 (90.2%) | 2,069 (89.0%) | 2,062 (88.7%) |
| No. (percentage) of complete and duplicated BUSCOs (D) | 102 (4.4%) | 60 (2.6%) | 60 (2.6%) | 81 (3.5%) | 52 (2.2%) |
| No. (percentage) of fragmented BUSCOs (F) | 91 (3.9%) | 79 (3.4%) | 76 (3.3%) | 84 (3.6%) | 93 (4.0%) |
| No. (percentage) of missing BUSCOs (M) | 121 (5.2%) | 91 (3.9%) | 91 (3.9%) | 92 (3.9%) | 119 (5.1%) |
| Total BUSCO groups searched | 2,326 | 2,326 | 2,326 | 2,326 | 2,326 |
BUSCO (Benchmarking Universal Single-Copy Orthologs) results for the draft assembly and for Dabenzi, Taishanhong, isolate Tunisia 2019, and strain AG2017 of P. granatum.
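The BUSCO categories in the table are related by simple arithmetic: complete = single-copy + duplicated, and complete + fragmented + missing = total groups searched. The sketch below checks this for the first data column of the table; the class name `BuscoCounts` and its methods are illustrative assumptions, not part of the BUSCO software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuscoCounts:
    single: int
    duplicated: int
    fragmented: int
    missing: int

    @property
    def complete(self) -> int:
        # Complete BUSCOs are the single-copy plus the duplicated ones.
        return self.single + self.duplicated

    @property
    def total(self) -> int:
        return self.complete + self.fragmented + self.missing

    def percent(self, count: int) -> float:
        return round(100 * count / self.total, 1)


# Counts copied from the first data column of the table above.
first_column = BuscoCounts(single=2012, duplicated=102, fragmented=91, missing=121)
print(first_column.complete, first_column.total)
print(first_column.percent(first_column.complete))
```

With these counts the sketch reproduces the tabulated 2,114 complete BUSCOs out of 2,326 groups, i.e. 90.9%.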
Types of transposable elements identified in the P. granatum genome. The total for interspersed repeats at the bottom of the table (26.68%) is the sum of retroelements (13.69%), DNA transposons (11.23%), and unclassified elements (1.76%); these values are highlighted.
| Types of transposable element | Number of elements | Length occupied (bp) | Percentage of sequence |
|---|---|---|---|
| Retroelements | 112,404 | 49,529,468 | **13.69** |
| SINEs | 13,048 | 1,498,787 | 0.41 |
| LINEs | 25,893 | 6,693,226 | 1.85 |
| (i) CRE/SLACS | 0 | 0 | 0 |
| (ii) L2/CR1/Rex | 0 | 0 | 0 |
| (iii) R1/LOA/Jockey | 21 | 5,707 | 0 |
| (iv) R2/R4/NeSL | 0 | 0 | 0 |
| (v) RTE/Bov-B | 0 | 0 | 0 |
| (vi) L1/CIN4 | 3,843 | 1,720,126 | 0.48 |
| LTR elements | 73,463 | 41,337,455 | 11.43 |
| (i) BEL/Pao | 0 | 0 | 0 |
| (ii) Ty1/Copia | 8,660 | 5,573,403 | 1.54 |
| (iii) Gypsy/DIRS1 | 3,262 | 2,504,146 | 0.69 |
| (iv) Retroviral | 0 | 0 | 0 |
| DNA transposons | 182,314 | 40,633,158 | **11.23** |
| (i) hobo-Activator | 683 | 418,562 | 0.12 |
| (ii) Tc1-IS630-Pogo | 24 | 11,586 | 0 |
| (iii) En-Spm | 0 | 0 | 0 |
| (iv) MuDR-IS905 | 0 | 0 | 0 |
| (v) PiggyBac | 0 | 0 | 0 |
| (vi) Tourist/Harbinger | 954 | 607,183 | 0.17 |
| (vii) Other (Mirage, P-element, Transib) | 0 | 0 | 0 |
| Rolling-circles | 835 | 750,813 | 0.21 |
| Unclassified | 30,347 | 6,356,886 | **1.76** |
| Total interspersed repeats | | 96,519,512 | **26.68** |
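The percentages in this table can be cross-checked against the base-pair lengths. The sketch below assumes that each percentage is the occupied length divided by the total assembly length of 361,760,465 bp from the assembly statistics table; that denominator is an inference from the numbers, and the names `ASSEMBLY_BP`, `TE_BP`, and `percent_of_assembly` are illustrative.

```python
# Assumed denominator: "Total length" of the 361.76 Mb assembly.
ASSEMBLY_BP = 361_760_465

# Lengths occupied (bp), copied from the table above.
TE_BP = {
    "Retroelements": 49_529_468,
    "DNA transposons": 40_633_158,
    "Unclassified": 6_356_886,
}


def percent_of_assembly(bp: int, assembly_bp: int = ASSEMBLY_BP) -> float:
    """Share of the assembly occupied by `bp` bases, in percent."""
    return round(100 * bp / assembly_bp, 2)


total_bp = sum(TE_BP.values())
for name, bp in TE_BP.items():
    print(f"{name}: {percent_of_assembly(bp)}%")
print(f"Total interspersed repeats: {total_bp} bp, {percent_of_assembly(total_bp)}%")
```

Under that assumption the three class lengths add up to the tabulated 96,519,512 bp, and the computed shares match the 13.69%, 11.23%, 1.76%, and 26.68% quoted in the caption.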
FIGURE 2. Classification and distribution of microsatellites (SSRs) identified in the P. granatum genome. (A) Proportions of microsatellites with different motif types. P1: mono-nucleotide repeats; P2: di-nucleotide repeats; P3: tri-nucleotide repeats; P4: tetra-nucleotide repeats; P5: penta-nucleotide repeats; P6: hexa-nucleotide repeats; C: complex (number of SSRs involved in compound formation). (B) Percentage of hypervariable class I and variable class II microsatellites in the P. granatum genome. (C) Frequency distribution of the most frequently occurring SSR motif families.
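To make the P1 to P6 motif types and the class I/class II split concrete, the sketch below scans a sequence for perfect microsatellites with a regular expression. It is not the tool or the settings used in the study: the minimum repeat counts in `MIN_REPEATS` and the 20 bp cutoff between class I and class II are assumed, commonly used conventions, compound (C) SSRs are not handled, and all names are illustrative.

```python
import re

# Assumed minimum repeat counts per motif length (mono- to hexa-nucleotide).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
CLASS_I_MIN_BP = 20  # assumed cutoff: class I >= 20 bp, class II shorter


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count, ssr_class) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. 'AA' is reported as a mono-nucleotide repeat instead
            length = match.end() - match.start()
            ssr_class = "I" if length >= CLASS_I_MIN_BP else "II"
            hits.append((match.start(), motif, length // k, ssr_class))
    return sorted(hits)


def ssrs_per_mb(n_ssrs: int, sequence_bp: int) -> float:
    """Relative abundance expressed as SSRs per megabase."""
    return n_ssrs / (sequence_bp / 1_000_000)


# Hypothetical toy sequence with a di-, a mono-, and a tri-nucleotide repeat.
toy = "GG" + "AT" * 10 + "CC" + "A" * 12 + "GT" + "AAG" * 5 + "C"
for hit in find_ssrs(toy):
    print(hit)
```

On a whole assembly, `ssrs_per_mb` applied to the number of hits and the assembly length gives a density in the same units as the 369.93 SSRs/Mb reported in the abstract, although the value depends on the thresholds chosen.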
FIGURE 3. Bar chart showing the gene annotations of the functional classes in each of the three major categories of gene ontology classification: biological process (BP), cellular component (CC), and molecular function (MF).
FIGURE 4. Phenylpropanoid biosynthesis pathway in P. granatum. Numbers 1 to 17 represent the enzymes that catalyze the respective reactions. 1) PAL; phenylalanine ammonia-lyase [EC:4.3.1.24]; 2) 4CL; 4-coumarate--CoA ligase [EC:6.2.1.12]; 3) CCR; cinnamoyl-CoA reductase [EC:1.2.1.44]; 4) CYP73A; trans-cinnamate 4-monooxygenase [EC:1.14.13.11]; 5) E2.1.1.68; caffeic acid 3-O-methyltransferase [EC:2.1.1.68]; 6) CYP84A; ferulate-5-hydroxylase [EC:1.14.-.-]; 7) E2.1.1.104; caffeoyl-CoA O-methyltransferase [EC:2.1.1.104]; 8) E2.3.1.133; shikimate O-hydroxycinnamoyltransferase [EC:2.3.1.133]; 9) CYP98A; coumaroylquinate (coumaroylshikimate) 3′-monooxygenase [EC:1.14.13.36]; 10) E1.1.1.195; cinnamyl-alcohol dehydrogenase [EC:1.1.1.195]; 11) E1.11.1.7; peroxidase [EC:1.11.1.7]; 12) UGT72E; coniferyl-alcohol glucosyltransferase [EC:2.4.1.111]; 13) bglB; beta-glucosidase [EC:3.2.1.21]; 14) CSE; caffeoylshikimate esterase [EC:3.1.1.-]; 15) REF1; coniferyl-aldehyde dehydrogenase [EC:1.2.1.68]; 16) serine carboxypeptidase-like 19 [EC:3.4.16.- 2.3.1.91]; 17) eugenol synthase [EC:1.1.1.318].
FIGURE 5. Flavonoid biosynthetic pathway found in P. granatum. Numbers 1 to 14 represent the enzymes that catalyze the respective reactions. 1) CHS; chalcone synthase [EC:2.3.1.74]; 2) E5.5.1.6; chalcone isomerase [EC:5.5.1.6]; 3) E1.14.11.9; naringenin 3-dioxygenase [EC:1.14.11.9]; 4) FLS; flavonol synthase [EC:1.14.11.23]; 5) DFR; bifunctional dihydroflavonol 4-reductase/flavanone 4-reductase [EC:1.1.1.219 1.1.1.234]; 6) E1.14.13.21; flavonoid 3′-monooxygenase [EC:1.14.13.21]; 7) ANR; anthocyanidin reductase [EC:1.3.1.77]; 8) CYP75A; flavonoid 3′,5′-hydroxylase [EC:1.14.13.88]; 9) LAR; leucoanthocyanidin reductase [EC:1.17.1.3]; 10) E2.3.1.133; shikimate O-hydroxycinnamoyltransferase [EC:2.3.1.133]; 11) CYP98A; coumaroylquinate (coumaroylshikimate) 3′-monooxygenase [EC:1.14.13.36]; 12) E2.1.1.104; caffeoyl-CoA O-methyltransferase [EC:2.1.1.104]; 13) CYP73A; trans-cinnamate 4-monooxygenase [EC:1.14.13.11]; 14) E1.14.11.19; leucoanthocyanidin dioxygenase [EC:1.14.11.19].
FIGURE 6. Venn diagram of shared orthologous gene families in Punica granatum, Eucalyptus grandis, Malus domestica, Vitis vinifera, and Arabidopsis thaliana. The number of gene families is listed in each component.
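Each component of such a Venn diagram counts the gene families found in exactly one combination of species. A minimal sketch of that bookkeeping follows; the function name `venn_regions` and the family identifiers in the toy input are hypothetical and carry no counts from the study.

```python
from collections import Counter


def venn_regions(families_by_species):
    """Count gene families per exact combination of species containing them."""
    membership = {}
    for species, families in families_by_species.items():
        for family in families:
            membership.setdefault(family, set()).add(species)
    # Each family contributes to exactly one Venn component.
    return Counter(frozenset(species) for species in membership.values())


# Hypothetical orthologous-family identifiers for three of the five species.
toy = {
    "P. granatum": {"OG1", "OG2", "OG3", "OG5"},
    "E. grandis": {"OG1", "OG2", "OG4"},
    "V. vinifera": {"OG1", "OG3", "OG4"},
}
regions = venn_regions(toy)
for combo, count in sorted(regions.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(combo), count)
```

The core shared by all species is the component keyed by the full species set, and species-specific families are the components keyed by a single species.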
FIGURE 7. Heatmap of ANIm percentage identity, with species-level assignments and isolate identifiers, as indicated at source, given as row and column labels. Cells in the heatmap corresponding to 95% ANIm sequence identity are colored red. Blue cells correspond to ANIm comparisons indicating that the corresponding organisms do not belong to the same species. Color intensity fades as the comparisons approach 95% ANIm sequence identity. Color bars above and to the left of the heatmap correspond to source species-level assignments for each isolate in the analysis. Hierarchical clustering of the analysis results in two dimensions is represented by dendrograms, constructed by simple linkage of ANIm percentage identities.