Weicai Song, Zimeng Chen, Li He, Qi Feng, Hongrui Zhang, Guilin Du, Chao Shi, Shuo Wang.
Abstract
Benincasa hispida (wax gourd) is an important Cucurbitaceae crop of considerable economic and medicinal value. Here, we report the de novo assembly and annotation of the complete chloroplast (cp) genome of wax gourd, which is 156,758 bp long. The quadripartite structure of the chloroplast genome comprises a large single-copy (LSC) region of 86,538 bp and a small single-copy (SSC) region of 18,060 bp, separated by a pair of inverted repeats (IRa and IRb) of 26,080 bp each. Comparison analyses among B. hispida and three other species from Benincaseae presented a significant conversion regarding nucleotide content, genome structure, codon usage, synonymous and non-synonymous substitutions, putative RNA editing sites, microsatellites, and oligonucleotide repeats. A divergence analysis of the species within Benincaseae showed that the LSC and SSC regions are much more variable than the IR regions. Notable IR contractions and expansions were observed, suggesting differences in genome size, gene duplication and deletion, and the presence of pseudogenes. Intergenic regions, such as trnR-UCU-atpA and atpH-atpI, were identified as highly divergent. Two types of phylogenetic analysis, based on the complete cp genome and on 72 genes, suggested sister relationships between B. hispida and Citrullus, Lagenaria, and Cucumis. Variations from and consistency with previous studies regarding phylogenetic relationships are discussed. The cp genome of B. hispida provides valuable genetic information for the detection of molecular markers, research on taxonomic discrepancies, and the inference of the phylogenetic relationships of Cucurbitaceae.
Keywords: Benincasa hispida; chloroplast genome; comparative analysis; divergence region; phylogenetic
Year: 2022 PMID: 35328015 PMCID: PMC8954987 DOI: 10.3390/genes13030461
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Chloroplast genome general features of Benincasa hispida.
| Characteristics | Value |
|---|---|
| Size (base pair, bp) | 156,758 |
| LSC length (bp) | 86,538 |
| SSC length (bp) | 18,060 |
| IR length (bp) | 26,080 |
| Number of genes | 131 |
| Number of protein-coding genes | 86 |
| Number of tRNA genes | 37 |
| Number of rRNA genes | 8 |
| Duplicate genes | 18 |
| GC content, total (%) | 37.2 |
| GC content, LSC (%) | 35 |
| GC content, SSC (%) | 31.7 |
| GC content, IR (%) | 42.9 |
| GC content, CDS (%) | 37.9 |
| GC content, rRNA (%) | 55.2 |
| GC content, tRNA (%) | 53.2 |
| GC content, all genes (%) | 39.4 |
| Protein-coding part (CDS) (% bp) | 51.1 |
| All genes (% bp) | 71.6 |
| Non-coding region (% bp) | 28.4 |
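
The reported lengths and gene counts can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original study; the variable names are illustrative, and the only assumption is the standard quadripartite layout (one LSC, one SSC, and two IR copies).

```python
# Region lengths in bp, copied from the general-features table.
LSC = 86_538
SSC = 18_060
IR = 26_080            # length of ONE inverted-repeat copy
REPORTED_TOTAL = 156_758

# Assumption: quadripartite layout, i.e. LSC + SSC + IRa + IRb.
assembled_total = LSC + SSC + 2 * IR

# Gene counts by class, copied from the same table.
gene_counts = {"protein_coding": 86, "tRNA": 37, "rRNA": 8}
gene_total = sum(gene_counts.values())

print(assembled_total == REPORTED_TOTAL)  # True
print(gene_total)                         # 131
```

Both checks agree with the table: the four regions sum to the reported genome size, and the three gene classes sum to the reported 131 genes.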
Figure 1. Gene map of the Benincasa hispida chloroplast genome. The genes drawn outside and inside of the circle are transcribed in clockwise and counterclockwise directions, respectively. Genes are colored based on their function. The borders of the chloroplast genome are defined with LSC, SSC, IRa, and IRb. The dashed gray color of the inner circle shows the GC content, whereas the lighter gray color shows AT content. Asterisks mark genes that have introns.
Genes predicted in the chloroplast genome of Benincasa hispida. The number of asterisks after the gene names indicates the number of introns contained in the genes.
| Category of Genes | Group of Genes | Gene Name |
|---|---|---|
| Photosynthesis-related genes | Large subunit of rubisco | |
| | Photosystem I | |
| | Assembly/stability of photosystem I | |
| | Photosystem II | |
| | ATP synthase | |
| | Cytochrome b6/f complex | |
| | Cytochrome c synthesis | |
| | NADH dehydrogenase | |
| Transcription and translation related genes | RNA polymerase subunits/transcription | |
| | Small subunit of ribosomal proteins | |
| | Large subunit of ribosomal proteins | |
| | Translation initiation factor | |
| RNA genes | Ribosomal RNA | |
| | Transfer RNA | |
| Other genes | RNA processing | |
| | Carbon metabolism | |
| | Fatty acid synthesis | |
| | Proteolysis | |
| | Component of TIC complex | |
| | Hypothetical proteins | |
* Gene with one intron, ** gene with two introns, (*2) gene with two copies.
Figure 2. Comparison of microsatellites and oligonucleotide repeats in the chloroplast genomes of Benincaseae species. (A) The number of SSRs in the three main regions of the chloroplast genome. LSC: large single-copy region, SSC: small single-copy region, IR: inverted repeat region. (B) The density of the SSRs in the IGSs (intergenic sequences) and gene regions. (C) The number of different types of SSR. Mono- represents mononucleotide SSRs, Di- represents dinucleotide SSRs, etc. (D) Different types of oligonucleotide repeat. F: forward repeats, P: palindromic repeats, R: reverse repeats, C: complementary repeats. (E) The number of oligonucleotide repeats in different regions. LSC: large single-copy region, SSC: small single-copy region, IR: inverted repeat region, LSC/IR: repeat sequences spanning the LSC and IR regions, etc. (F) The number of repeats in different repeat units.
Figure 3. Comparison of junctions between the LSC, SSC, and IRs among eight species. Numbers above indicate the distance in bp between the ends of the genes and the border sites (distances are not to scale in this figure).
Figure 4. Sequence identity plot comparing the chloroplast genomes among Benincaseae species using mVISTA, with Lagenaria siceraria set as the reference. Pink bars represent conserved noncoding sequences (CNS), and white peaks represent genome divergence. The y-axis represents the percentage identity (shown: 50–100%).
Likelihood ratio tests of five potential genes under positive selection.
| Gene Name | Models | np | ln L | Likelihood Ratio Test (p-value) | Positively Selected Sites: AA-Site | Positively Selected Sites: Score |
|---|---|---|---|---|---|---|
| | M8 (beta & ω > 1) | 10 | −2173.400149 | 0.007931755 | 159 W | 0.984 * |
| | M7 (beta) | 8 | −2178.23703 | | | |
| | M8 (beta & ω > 1) | 10 | −926.578492 | 0.070969008 | | |
| | M7 (beta) | 8 | −929.224004 | | | |
| | M8 (beta & ω > 1) | 10 | −877.349259 | 0.030217199 | 158 Q | 0.971 * |
| | M7 (beta) | 8 | −880.848603 | | | |
| | M8 (beta & ω > 1) | 10 | −6139.981658 | 0.156641895 | | |
| | M7 (beta) | 8 | −6141.835451 | | | |
| | M8 (beta & ω > 1) | 10 | −9230.970637 | 0.063743376 | | |
| | M7 (beta) | 8 | −9233.723527 | | | |
np represents the number of parameters in the model; ln L represents the log-likelihood value; *: empirical Bayes values > 0.95.
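
The likelihood ratio test values in the table can be reproduced from the two log-likelihoods of each gene. The sketch below is an illustration rather than the authors' pipeline: it assumes the test statistic 2 × (ln L of M8 − ln L of M7) is compared against a chi-square distribution with 2 degrees of freedom (the difference in np, 10 − 8), for which the survival function has the closed form exp(−x/2). The function name `lrt_p_value` is hypothetical.

```python
import math

def lrt_p_value(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood ratio test, ASSUMING 2 degrees of freedom.

    For a chi-square distribution with df = 2 the survival function is
    exp(-x / 2), so no statistics library is required.
    """
    # Test statistic: twice the log-likelihood gain of the richer model.
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return math.exp(-statistic / 2.0)

# (ln L of M7, ln L of M8) for the five genes, in table order.
rows = [
    (-2178.23703, -2173.400149),
    (-929.224004, -926.578492),
    (-880.848603, -877.349259),
    (-6141.835451, -6139.981658),
    (-9233.723527, -9230.970637),
]

p_values = [lrt_p_value(m7, m8) for m7, m8 in rows]
for p in p_values:
    print(f"{p:.6f}")
```

Under this assumption the computed values match the tabulated column to six decimal places, and only the first and third genes fall below 0.05. These are also the two rows that list a positively selected site.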
Figure 5. Nucleotide diversity (π) values among the Benincaseae species.
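
For readers unfamiliar with π, it is conventionally defined as the average number of pairwise nucleotide differences per site across a set of aligned sequences. The sketch below illustrates that definition on toy data; it is not the tool or window settings used in the study, the function name `nucleotide_diversity` is hypothetical, and gaps or ambiguous bases are treated as ordinary characters for simplicity.

```python
from itertools import combinations

def nucleotide_diversity(aligned: list[str]) -> float:
    """Average pairwise differences per site for equal-length aligned sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(aligned, 2))
    # Count mismatching columns for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(seq1, seq2)) for seq1, seq2 in pairs
    )
    # Average over pairs, then normalise by alignment length.
    return total_diffs / (len(pairs) * length)

# Toy alignment (made-up data): 4 differences over 3 pairs and 4 sites.
toy = ["ACGT", "ACGA", "ACTT"]
pi = nucleotide_diversity(toy)
print(round(pi, 4))  # 0.3333
```

In a genome-wide plot, the same quantity is typically computed in sliding windows along the alignment so that peaks mark the most divergent regions.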
Figure 6. Maximum likelihood (ML) tree of Cucurbitales. (A) The phylogenetic tree constructed from the complete chloroplast genomes of 23 species. (B) The phylogenetic tree built with 72 genes. The positions of Benincasa hispida are marked with green triangles. Numbers above branches are bootstrap values; bootstrap values higher or lower than those of the other tree are marked in red or blue, respectively. Glycine max was set as the root in both trees.