Literature DB >> 29449878

Complete chloroplast genome of the medicinal plant Amomum compactum: gene organization, comparative analysis, and phylogenetic relationships within Zingiberales.

Ming-Li Wu1,2, Qing Li3, Jiang Xu1, Xi-Wen Li1.   

Abstract

BACKGROUND: Amomum compactum is one of the basic species of the traditional herbal medicine amomi fructus rotundus, with great pharmacology effect. The system position of A. compactum is not clear yet, and the introduction of this plant has been hindered by many plant diseases. However, the correlational molecular studies are relatively scarce.
METHODS: The total chloroplast (cp) DNA was extracted according to previous studies, and then sequenced by 454 GS FLX Titanium platform. Sequence assembly was complished by Newbler. Genome annotation was preformed by CPGAVAS and tRNA-SCAN. Then, general characteristics of the A. compactum cp genome and genome comparsion with three Zingiberaceae species was analyzed by corresponding softwares. Additionally, phylogenetical trees were reconstructed, based on the shared protein-coding gene sequences among 15 plant taxa by maximum parsimony (MP) and maximum likelihood (ML) methods.
RESULTS: The A. compactum cp genome with a classic quadripartite structure, consisting of a pair of reverse complement repeat regions (IRa/IRb) of 29,824 bp, a large single copy (LSC, 88,535 bp) region as well as a small single copy (SSC, 15,370 bp) region, is 163,553 bp in total size. The total GC content of this cp genome is 36.0%. The A. compactum cp genome owns 135 functional genes, that 113 genes are unique, containing eighty protein-coding genes, twenty-nine tRNA (transfer RNA) genes and four rRNA (ribosomal RNA) genes. Codon usage of the A. compactum cp genome is biased toward codons ending with A/T. Total 58 SSR loci and 24 large repeats are detected in the A. compactum cp genome. Relative to three other Zingiberaceae cp genomes, the A. compactum cp genome exhibits an obvious expansion in the IR regions. In A. compactum cp genome, the ycf1 pseudogene is 2969 bp away from the IRa/SSC border, whereas in other Zingiberaceae species, it is only 4-5 bp away from the IRa/SSC border. Comparative cp genome sequences analysis of A. compactum with other Zingiberaceae reveals that the gene order and gene content differ slightly among Zingiberaceae species. The phylogenetic analysis based on 67 protein-coding gene sequences supports the phylogenetic position of A. compactum.
CONCLUSIONS: The study has identified unique features of the A. compactum cp genome which would be helpful for us to understand the cp genome evolution and offer useful information for phylogenetics and further studies of this traditional medicinal plant.

Entities:  

Keywords:  Amomum compactum; Chloroplast genome; High-throughput sequencing technology; Phylogeny; SSR

Year:  2018        PMID: 29449878      PMCID: PMC5811967          DOI: 10.1186/s13020-018-0164-2

Source DB:  PubMed          Journal:  Chin Med        ISSN: 1749-8546            Impact factor:   5.455


Background

Chloroplasts can provide necessary energy for plants growth as photosynthetic organelles, which also participate in other major life activities such as starch storage, sugar synthesis and many critical biological metabolic pathways. As circular DNA molecules, cp genomes mainly vary from 120 to 160 kb in size with a typical quadripartite organization in angiosperms [1]. Two reverse complement copies of IR region (20–28 kb) separate the whole cp genome into a LSC region (80–90 kb) and a SSC region (16–27 kb) [2]. In angiosperms, cp genomes usually encode approximately 80 unique proteins, 30 tRNAs and four rRNAs. Previous studies have corroborated that cp gene order, gene content, and genome organization are highly conserved in plants [3, 4]. Owing to the high conservation and monolepsis, cp genomes are widely used in species identification, phyletic evolution studies and genetic engineering. The availability of whole cp genomes has helped to resolve phylogenetic relationships among major clades of angiosperms with greater accuracy [5, 6]. Nevertheless, with the number of cp genomes increasing, gene losses, structural rearrangements and IR contractions/expansions have been reported, which can also be exploited for the reconstruction of plant phylogenies [7-9]. Amomum compactum (genus Amomum, family Zingiberaceae) is one of the basic species of the traditional Chinese medicine amomi fructus rotundus, which is mainly produced in Vietnam and Thailand and is cultivated as a medicinal plant in the Guangdong, Guangxi and Yunnan provinces of China with great pharmacology effect. However, bacterial wilt, damping-off, leaf spot and other major plant diseases have become a severe obstacle for the introduction of this plant. Many plants belonging to the Zingiberaceae family are used as important seasoning and medicinal plants, such as Zingiber officinale, Amomum villosum, Curcuma longa, Zingiber mioga, Elettaria cardamomum, and Alpinia officinarum. In addition, previous studies have shown that the efficacy, chemical composition and pharmacological effects among the five genera of Zingiberaceae are strongly correlated. It is of great significance and broad interest to investigate the genetic relationships of traditional Chinese medicinal plants to find alternative medicinal plants. With the number of whole cp genomes in the Zingiberaceae increasing, the cp genome sequences of other species in Zingiberaceae are becoming easier to be assembled. However, studies of amomi fructus rotundus are scarce both inside and outside China, especially molecular studies. This study reports the assembly, annotation and structural analysis of A. compactum cp genome for the first time. And to reveal the structure of this cp genome, we compare the organization (IR expansion/contraction and divergent regions) of complete cp genomes between A. compactum and other Zingiberaceae species. We also provide the result of phylogenetic analyses on basis of 67 protein-coding gene sequences from A. compactum and 14 monocot cp genomes.

Methods

DNA extraction and sequencing

Fresh A. compactum leaves were acquired from cultivated bases in Guangdong Province, China. The total cp DNA was extracted from roughly 100 g of leaves through an improved method by Li et al. [10]. The quality of cp DNA was checked by Nanodrop-2000 spectrometer (Nanodrop Technologies, Wilmington, DE, USA), and agarose gel electrophoresis. Pure cp DNA was used for shotgun library construction with 454 GS FLX Titanium platform. The obtained SFF file was preprocessed by trimming short (L < 50 bp) and low-quality (Q < 20) reads. Trimmed reads were assembled using Newbler V2.6 (GS FLX De Novo Assembler Software). In order to verify the assembly, the four junctional regions were further confirmed by Sanger sequencing.

Genome assembly and annotation

Preliminary gene annotation of this cp genome was performed by CpGAVAS, a program available online (http://www.herbalgenomics.org/0506/cpgavas) [11]. The position of each gene was then manually corrected by Apollo [12] after alignment to the reference genomes by MEGA 5.0. In addition, according to start and stop codons, minor revisions were performed. The tRNAs were further confirmed by the online tool tRNAscan-SE with default settings (http://lowelab.ucsc.edu/tRNAscan-SE/). [13]. Then, the circular map of this cp genome was accomplished by OrganellarGenomeDRAW program (http://ogdraw.mpimp-golm.mpg.de/) [14]. Finally, the complete cp genome of A. compactum was submitted to NCBI GenBank database (Accession Number: MG000589).

Sequence analyses

Relative synonymous codon usage (RSCU) values, which were used to research the features of variations in synonymous and nonsynonymous codon usage by disregarding the composition impact of amino acid, were determined using MEGA 6.0 [15]. Additionally, GC content and codon usage were determined by MEGA 6.0. SSRs (simple sequence repeats) loci were detected by MISA software (http://pgrc.ipk-gatersleben.de/misa/), with following thresholds: ten, six, five, five, five, and five repeat units for mono-nucleotide, di-nucleotide, tri-nucleotide, tetra-nucleotide, penta-nucleotide, and hexa-nucleotide SSRs, respectively. To analyze the repeat structure, REPuter [16] (http://bibiserv.techfak.uni-bielefeld.de/reputer/) was performed to detect forward (direct) and palindromic (inverted) repeats in the cp genome. The minimum repeat unit was set to 30 bp in length, the identity of repeats was set to > 90%, and the Hamming distance equals three. All identified results were verified and redundant repeats were manually removed.

Genome comparison

Pairwise alignments of several cp genome sequences were conducted by MUMmer [17], and the dot plots were drawn using a Perl script. The complete cp genomes of A. compactum and three other Zingiberaceae species (Additional file 1), Curcuma flaviflora (KR967361), Curcuma roscoeana (KF601574), and Zingiber spectabile (JX088661), were used for comparative analysis by mVISTA program (http://genome.lbl.gov/vista/index.shtml) [18] in Shuffle-LAGAN mode. A. compactum was set as the reference.

Phylogenomic analysis

To examine the phylogenetic position of A. compactum, 14 complete chloroplast genomes were downloaded from NCBI. The 67 shared protein-coding gene sequences were extracted using a Python script and aligned separately by ClustalW2. Phylogenetical trees were reconstructed based on 67 concatenated protein-coding gene sequences by MP and ML methods. The best-fitting model was filtrated by jModelTest 2.1.7 through the Akaike information criterion (AIC) [19]. The MP tree was reconstructed by PAUP ver. 4.0b10 [20] with a heuristic search, while ML analysis was calculated by RAxML-HPC 2.7.6.3 on XSEDE in the CIPRES Science Gateway (http://www.phylo.org/) with default parameters. Based on APGIII, Fritillaria cirrhosa was set as an outgroup. Both MP and ML analyses used 1000 bootstrap replicates. The Minimum Standards of Reporting Checklist includes details of the experimental design, statistics, and resources used in this study.

Results and discussion

General characteristics of the A. compactum cp genome

The complete cp genome sequence of A. compactum is 163,553 bp in length with a obvious quadripartite structure (Fig. 1). A pair of inverted region (IR) with 29,824 bp in length partition the rest sequence into a LSC region (88,535 bp) and a SSC region (15,370 bp) (Table 1). The universal GC content of this cp sequence was 36.0%, which has been reported to act a significant role in evolution of genomic structures. Nevertheless, the overall GC content is unequally distributed across the cp genome, which is lowest in SSC region (29.8%) but highest in IR regions (41.1%), followed by LSC region (33.7%).
Fig. 1

A. compactum cp genome map. Genes drawn outside the circle are counterclockwise, whereas inside are transcribed clockwise. Genes are color-coded according to different functional groups. The darker gray represents GC content in the inner circle, conversely the lighter one represents AT content

Table 1

Base composition in the A. compactum cp genome

T(U)%C%A%G%Length (bp)
LSC33.817.232.516.588,535
IR28.819.830.121.329,824
SSC34.315.635.914.215,370
Total32.318.331.717.8163,553
CDS31.617.231.519.879,701
 1st position2418.231.326.726,567
 2nd position3220.230.017.426,567
 3rd position3913.133.115.326,567

CDS protein-coding regions

A. compactum cp genome map. Genes drawn outside the circle are counterclockwise, whereas inside are transcribed clockwise. Genes are color-coded according to different functional groups. The darker gray represents GC content in the inner circle, conversely the lighter one represents AT content Base composition in the A. compactum cp genome CDS protein-coding regions As shown in Fig. 1, the A. compactum cp genome totally encodes 135 functional genes, that 113 are unique, containing eighty protein-coding genes, twenty-nine tRNAs and four rRNAs (Table 2). Among the functional genes, all rRNAs, eight tRNAs and seven protein-coding genes are duplicated in IR regions. The LSC region includes 60 protein-coding genes and 21 tRNAs, whereas the SSC region includes 11 protein-coding genes and one tRNA gene. Among the protein-coding genes, 72 are single-copy, whereas eight are duplicated. Among the tRNA genes, 20 are single-copy genes and nine are duplicated. Among the 113 unique genes, 13 include one intron (eight protein-coding and five tRNAs) and three (ycf3, clpP, and rps12) include two introns (Table 2). Unusually, the rps12 gene is trans-spliced, of which the 5′ end is situated in LSC region whereas two replicative 3′ ends are located in IRa and IRb regions respectively. What’s more, the ndhA gene contains the longest intron region (1033 bp).
Table 2

Gene content of the A. compactum cp genome

Gene categoryGene groupGene name
Self-replicationrRNA genesrrn16c, rrn23c, rrn5c, rrn4.5c
tRNA genestrnH-GUGc, trnK-UUUa, trnQ-UUG, trnS-GCU, trnC-GCA, trnD-GUC, trnY-GUA, trnE-UUC, trnR-UCU, trnT-GGU, trnS-UGA, trnG-GCCc, trnfM-CAU, trnS-GGA, trnT-UGU, trnL-UAAa, trnF-GAA, trnV-UACa, trnW-CCA, trnP-UGG, trnI-CAUc, trnL-CAAc, trnV-GACc, trnI-GAUa, c, trnA-UGCa, c, trnR-ACGc, trnN-GUUc, trnL-UAG, trnM-CAU
Small subunit of ribosomerps4, rps14, rps18, rps2, rps12b, c, rps11, rps8, rps3, rps19, rps7c, rps15, rps16a
Large subunit of ribosomerpl33, rpl20, rpl36, rpl14, rpl16a, rpl22, rpl2a, c, rpl23c, rpl32
DNA dependent RNA polymeraserpoB, rpoC1a, rpoC2, rpoA
Translational initiation factor infA
Genes for photosynthesisSubunits of NADH dehydrogenasendhAa, ndhBa, c, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
Subunits of photosystem IpsaA, psaB, psaC, psaI, psaJ, ycf3b, ycf4
Subunits of photosystem IIpsbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ
Subunits of cytochrome b/f complexpetN, petA, petL, petG, petBa, petD
Subunits of ATP synthaseatpI, atpH, atpFa, atpA, atpE, atpB
Large subunit of rubisco rbcL
Genes of unknown functionOpen reading frames (ORF, ycf)ycf1, ycf15c, ycf2c
Pseudogenes ycf1

aGene with one intron

bGene with two introns

cGene with two copies

Gene content of the A. compactum cp genome aGene with one intron bGene with two introns cGene with two copies The protein-coding gene sequences are 79,701 bp in length, which comprise 26,567 codons. And the usage frequency of codon was counted and exhibited in Table 3. In protein-coding sequences (CDSs), the AT content are 55.3% at the first codon positions, 62.0% at the second codon positions and 72.1% at the third codon positions, respectively (Table 1). Most protein-coding genes in land plant cp genomes use the standard ATG as the initiation codon. However, in the A. compactum cp genome, two genes use alternatives to ATG as start codon, as following: ATC for ndhD and ATA for rpl2. Relative synonymous codon usage (RSCU) is a statistics of uneven usage of synonymous and nonsynonymous codons in the coding sequences. An RSCU value < 1.00 indicates that the use of a codon is less frequent than expected, whereas a codon used more frequently will attain an RSCU value > 1.00. A total of 96.7% (29/30) of preferred synonymous codons, i.e., RSCU values > 1, end with A/U, whereas 90.6% (29/32) of non-preferred synonymous codons, i.e., RSCU values < 1, end with G/C. This codon usage pattern is similar with other reported cp genomes [21, 22], which might be driven by the high proportion of A/T. The usage of the start codon (ATG) and UGG (coding TRP) show no bias (RSCU value = 1).
Table 3

Codon-anticodon recognition patterns and codon usage in the A. compactum cp genome

Amino acidCodonNo.RSCUtRNAAmino acidCodonCountRSCUtRNA
PheUUU9711.31TyrUAU8111.57
PheUUC5160.69trnF-GAATyrUAC2210.43trnY-GUA
LeuUUA8921.96trnL-UAAStopUAA481.66
LeuUUG5591.23trnL-CAAStopUAG220.76
LeuCUU5671.25HisCAU5191.6
LeuCUC1810.4HisCAC1290.4trnH-GUG
LeuCUA3810.84trnL-UAGGlnCAA7061.54trnQ-UUG
LeuCUG1510.33GlnCAG2100.46
IleAUU11461.47AsnAAU9891.55
IleAUC4260.55trnI-GAUAsnAAC2890.45trnN-GUU
IleAUA7630.98trnI-CAULysAAA11141.49trnK-UUU
MetAUG6141trn(f)M-CAULysAAG3830.51
ValGUU5211.45AspGAU8751.64
ValGUC1590.44trnV-GACAspGAC1920.36trnD-GUC
ValGUA5591.56trnV-UACGluGAA11251.53trnE-UUC
ValGUG1940.54GluGAG3500.47
SerUCU5981.74CysUGU2321.56
SerUCC3370.98trnS-GGACysUGC660.44trnC-GCA
SerUCA4121.2trnS-UGAStopUGA170.59
SerUCG1820.53TrpUGG4521trnW-CCA
ProCCU4421.62ArgCGU3651.37trnR-ACG
ProCCC2020.74ArgCGC860.32
ProCCA3251.19trnP-UGGArgCGA3421.29
ProCCG1200.44ArgCGG1130.43
ThrACU5371.57ArgAGA5191.95trnR-UCU
ThrACC2370.7trnT-GGUArgAGG1680.63
ThrACA4331.27trnT-UGUSerAGU4301.25
ThrACG1570.46SerAGC1020.3trnS-GCU
AlaGCU6261.82GlyGGU6041.39
AlaGCC2030.59GlyGGC1410.33trnG-GCC
AlaGCA4341.26trnA-UGCGlyGGA7141.65
AlaGCG1120.33GlyGGG2760.64

RSCU relative synonymous codon usage

Codon-anticodon recognition patterns and codon usage in the A. compactum cp genome RSCU relative synonymous codon usage

Repeat and SSR analysis

SSRs are a class of tandemly repeated sequences that consists of 1–6 nucleotide repeat units. SSRs are important in plant typing and widely developed as molecular genetic markers for species identification. Total 58 SSRs loci were found in the A. compactum cp genome (Table 4), and 47 SSRs were only composed of A/T bases. Furthermore, 10 SSRs were composed of di-nucleotide (AT/TA) repeats, and one SSR was composed of trinucleotide (ATA) repeats. Obviously, the SSRs in the A. compactum cp genome were rich in A/T, which has been reported in many plant families [23-25]. Among these SSRs, 17 SSRs were situated in protein-coding genes and one was located in a tRNA gene. Furthermore, five were in coding regions and 12 in intronic regions. No tetra-, penta- or hexa-nucleotide repeats over 15 bp long was detected. REPuter allowed us to identify 24 repeats, including 13 forward and 11 palindromic repeats (Table 5). Almost all repeats were situated in the intronic and intergenic regions, although few of them were situated in protein-coding regions [26]. As reported in other genomes, the gene richest in repeats was ycf2, carrying two direct and two palindromic repeats.
Table 4

Simple sequence repeats in the A. compactum cp genome

cpSSR IDRepeat motifLength (bp)StartEndRegionAnnotation
1(T)101039753984LSCtrnK-UUU
2(A)101043284337LSC
3(TA)61249004911LSC
4(A)101052875296LSCrps16 intron
5(A)111162536263LSC
6(TA)61266096620LSC
7(A)101072047213LSC
8(AT)61275217532LSC
9(A)101077007709LSC
10(T)121286338644LSC
11(A)131314,88514,897LSC
12(T)101017,47417,483LSC
13(A)101019,83119,840LSC rpoC2
14(T)111124,12124,131LSCrpoC1 intron
15(A)101028,80228,811LSC
16(A)151529,01329,027LSC
17(A)111130,86830,878LSC
18(T)101035,12935,138LSC
19(TA)71438,63238,645LSC
20(A)121239,29239,303LSC
21(A)121247,48147,492LSC
22(T)101048,98648,995LSC
23(A)101050,23650,245LSC
24(AT)71450,39550,408LSC
25(T)101051,82951,838LSC
26(T)111152,70952,719LSC
27(ATA)51554,34554,359LSC
28(A)111154,56254,572LSC
29(T)101058,77858,787LSC
30(T)111159,26959,279LSC
31(A)121260,91960,930LSC
32(T)101061,62161,630LSC
33(AT)61263,48963,500LSC
34(A)121268,71568,726LSC
35(AT)102069,26669,285LSC
36(T)101070,71670,725LSC
37(A)101072,60072,609LSC rps18
38(TA)71474,09474,107LSCrps12 intron
39(A)101074,56974,578LSCclpP intron
40(T)111174,84574,855LSCclpP intron
41(T)101075,10875,117LSCclpP intron
42(T)101075,57275,581LSCclpP intron
43(T)101075,83175,840LSCclpP intron
44(A)101079,17779,186LSC
45(AT)61279,75179,762LSCpetB intron
46(T)101086,40786,416LSCrpl16 intron
47(T)111188,97088,980IRa
48(T)1010116,573116,582IRa ycf1
49(A)1111120,872120,882SSC
50(T)1111121,055121,065SSC
51(A)1111128,865128,875SSCndhA intron
52(T)1010129,188129,197SSCndhA intron
53(AT)612131,778131,789SSC
54(T)1111133,103133,113SSC
55(T)1212133,236133,247SSC
56(T)1111133,374133,384SSC ycf1
57(A)1010135,507135,516IRb ycf1
58(A)1111163,109163,119IRb
Table 5

Long repeat sequences in A. compactum cp genome

IDRepeat start 1TypeSize (bp)Repeat start 2Mismatch (bp)E valueGeneRegion
13990P343996− 34.12E−06trnK-UUU (intron)LSC
28768P3148,057− 31.98E−04IGS; trnS-GGALSC
310,522F3039,347− 37.15E−04trnG-GCC (intron)LSC
431,322P3231,352− 35.46E−05IGSLSC
532,991F3033,020− 37.15E−04IGSLSC
639,660P3239,70104.08E−10IGSLSC
741,551F5843,775− 37.54E−20psaB; psaALSC
841,595F3743,819− 22.39E−09psaB; psaALSC
963,481P31126,101− 31.98E−04IGSLSC; SSC
1063,481F31126,106− 31.98E−04IGSLSC; SSC
1163,487F3269,264− 35.46E−05IGSLSC
1267,809P3167,864− 26.83E−06IGSLSC
1371,632F3071,65906.53E−09IGSLSC
1472,281F4272,302− 31.21E−10 rps18 LSC
1591,249F4691,299− 12.10E−16trnI-CAU; IGSIRa
1691,249P46160,743− 12.10E−16trnI-CAU; IGSIRa; IRb
1791,299P46160,793− 12.10E−16IGSIRa; IRb
1893,917F3093,938− 37.15E−04 ycf2 IRa
1993,917P30158,120− 37.15E−04 ycf2 IRa; IRb
2093,938P30158,141− 37.15E−04 ycf2 IRa; IRb
21121,695P30121,723− 37.15E−04IGSSSC
22158,122F30158,143− 37.15E−04 ycf2 IRb
23160,743F46160,793− 12.10E−16IGSIRb
24160,762F30160,812− 37.15E−04IGSIRb

F forward, P palindromic, IGS intergenic space

Simple sequence repeats in the A. compactum cp genome Long repeat sequences in A. compactum cp genome F forward, P palindromic, IGS intergenic space

IR expansion/contraction in the A. compactum cp genome

The variations of angiosperm cp genomes in length are mainly because of the contraction and expansion of boundary regions between the IR regions with single copy (SC) regions. A minute comparison of junctional regions between the IR and SC boundaries among A. compactum, C. flaviflora, C. roscoeana, and Z. spectabile is presented in Fig. 2. In addition, a size comparison of cp genome among the four Zingiberaceae species is shown in Additional file 2: Table S1. In spite of the alike lengths of IR regions in these four species (from 25,618 to 29,824 bp), few IR contractions/expansions were still detected. rpl22, ycf1 and rps19 pseudogenes with various lengths were situated in IRb/LSC or IRb/SSC boundaries. The borderline of the IRb/LSC junction was situated in left side of the rps19 gene in examined cp genomes, except in Z. spectabile, which resulted from the contraction of the IRa region in the Z. spectabile cp genome. By contrast, the ycf1 pseudogene was situated in the left side of the IRa-SSC border and was 4–5 bp away from the IRa-SSC borderline, except in the A. compactum cp genome. The size of the ycf1 pseudogene was 918 bp in A. compactum, 1068 bp in C. flaviflora and C. roscoeana, and 924 bp in Z. spectabile. In addition, in the A. compactum cp genome, the ycf1 pseudogene was 2969 bp away from the IRa-SSC borderline, that indicated the expansion of the IR region. The trnH gene was situated in LSC region, except in Z. spectabile cp genome, where it was situated in SSC region and was 136 bp away from the IRb-LSC borderline.
Fig. 2

Comparison of the border positions of the LSC, SSC, and IR regions among four complete Zingiberaceae chloroplast genomes. Gene names are indicated in boxes, and their lengths in the corresponding regions are displayed above the boxes

Comparison of the border positions of the LSC, SSC, and IR regions among four complete Zingiberaceae chloroplast genomes. Gene names are indicated in boxes, and their lengths in the corresponding regions are displayed above the boxes

Comparison with other Zingiberaceae cp genomes

Three sequences representing the Zingiberaceae (C. flaviflora, C. roscoeana and Z. spectabile) were selected for comparison with A. compactum. Pairwise cp genome alignments between A. compactum and other three cp genomes regained a high degree of synteny (Additional file 3: Figure S1, Additional file 4: Figure S1 and Additional file 5: Figure S3). To detect the divergent regions in the cp genome, this study compared the sequence identities among four Zingiberaceae cp genomes by mVISTA, using the annotation of A. compactum as a reference. The multiple sequences alignment showed the coding regions are highly conserved, however the non-coding regions are divergent (Fig. 3). As an example, the intergenic sequences between the trnT-GGU–psbD, rps16–trnQ-UUG, atpH–atpI, trnE-UUC–trnT-GGU, trnT-UGU–trnL-UAA, petA–psbL and psaC–ndhE regions were highly divergent, parts of which have been also reported as divergent sequences in other plant. Obviously, the LSC region and SSC region were more divergent than IR regions.
Fig. 3

Sequence comparison of the A. compactum, C. flaviflora, C. roscoeana and Z. spectabile cp genomes generated by mVISTA. Black lines designate regions of sequence identity by a 50% identity cutoff with A. compactum. Dashed rectangles indicate highly divergent regions of A. compactum compared with C. flaviflora, C. roscoeana and Z. spectabile

Sequence comparison of the A. compactum, C. flaviflora, C. roscoeana and Z. spectabile cp genomes generated by mVISTA. Black lines designate regions of sequence identity by a 50% identity cutoff with A. compactum. Dashed rectangles indicate highly divergent regions of A. compactum compared with C. flaviflora, C. roscoeana and Z. spectabile

Phylogenetic analysis

Cp genomes are widely employed in the study of evolution through phylogenetics. To examine the phylogenetic position of A. compactum and its relationship within Zingiberales, MP and ML phylogenetical analyses were performed based on 67 protein-coding gene sequences from 15 plant taxa, including A. compactum, as sequenced in the study. The total alignment was 51,452 bp in length. The results are presented in Figs. 4 and 5. The basic topologies were similar in the MP and ML analyses, but there were few differences. Bootstrap values were all extremely high, and nine of the 12 nodes with bootstrap values of ≥ 90% were found in MP tree, whereas eight of 12 nodes were found in ML tree with 100% bootstrap values. The Zingiberaceae species A. compactum, C. flaviflora, C. roscoeana and Z. spectabile were grouped in both MP and ML phylogenetic trees with 100% bootstrap values. In the MP trees, the four Zingiberaceae species composed a unique clade and were separated from the rest of Zingiberales with high bootstrap values in every node. By contrast, the ML tree was mainly separated into two clades, one of which included Strelitziaceae, Heliconiaceae, Musaceae and Lowiaceae species, whereas another included Zingiberaceae, Costaceae, Cannaceae and Marantaceae species. However, the Zingiberaceae and Costaceae species were grouped with a very low bootstrap value (15%) in the ML tree. These phylogenetic results strongly support the position of A. compactum and provide some helpful hints about relationships within the order Zingiberales.
Fig. 4

MP phylogenetical tree of 15 Zingiberales species, on basis of 67 protein-coding gene sequences in the cp genomes. Bootstrap values are indicated upon the branches

Fig. 5

ML phylogenetical tree of 15 Zingiberales species, on basis of 67 protein-coding gene sequences in the cp genome. Bootstrap values are indicated upon the branches

MP phylogenetical tree of 15 Zingiberales species, on basis of 67 protein-coding gene sequences in the cp genomes. Bootstrap values are indicated upon the branches ML phylogenetical tree of 15 Zingiberales species, on basis of 67 protein-coding gene sequences in the cp genome. Bootstrap values are indicated upon the branches

Conclusion

The research assembled, annotated and analyzed the whole cp genome of A. compactum, which reveals that the cp genome of A. compactum shares a quadruple structure, gene order, GC content, and codon usage features, similar to those of other land plant cp genomes. This Amomum cp genome was compared with three available Zingiberaceae cp genomes, while the genome structure and composition are similar. Also phylogenetic analysis provides new insight into phyletic evolution of this genus. Our research will contribute to species identification and evolutionary mechanisms required for the further study of A. compactum. Additional file 1. Minimum Standards of Reporting Checklist. Additional file 2: Table S1. Size comparison of A. compactum cp genomic regions with those of 3 other Zingiberaceae cp genomes. Additional file 3: Figure S1. Chloroplast genomic alignment between A. compactum and C. flaviflora (KR967361). Additional file 4: Figure S2. Chloroplast genomic alignment between A. compactum and C. roscoeana (KF601574). Additional file 5: Figure S3. Chloroplast genomic alignment between A. compactum and Z. spectabile (JX088661).
  21 in total

1.  Many parallel losses of infA from chloroplast DNA during angiosperm evolution with multiple independent transfers to the nucleus.

Authors:  R S Millen; R G Olmstead; K L Adams; J D Palmer; N T Lao; L Heggie; T A Kavanagh; J M Hibberd; J C Gray; C W Morden; P J Calie; L S Jermiin; K H Wolfe
Journal:  Plant Cell       Date:  2001-03       Impact factor: 11.277

2.  jModelTest: phylogenetic model averaging.

Authors:  David Posada
Journal:  Mol Biol Evol       Date:  2008-04-08       Impact factor: 16.240

3.  MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods.

Authors:  Koichiro Tamura; Daniel Peterson; Nicholas Peterson; Glen Stecher; Masatoshi Nei; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2011-05-04       Impact factor: 16.240

4.  tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence.

Authors:  T M Lowe; S R Eddy
Journal:  Nucleic Acids Res       Date:  1997-03-01       Impact factor: 16.971

5.  Characterization of chloroplast DNA microsatellites from Saccharum spp and related species.

Authors:  D M Melotto-Passarin; E V Tambarussi; K Dressano; V F De Martin; H Carrer
Journal:  Genet Mol Res       Date:  2011-09-12

6.  [High-throughput pyrosequencing of the complete chloroplast genome of Magnolia officinalis and its application in species identification].

Authors:  Xi-Wen Li; Zhi-Gang Hu; Xiao-Han Lin; Qing Li; Huan-Huan Gao; Guo-An Luo; Shi-Lin Chen
Journal:  Yao Xue Xue Bao       Date:  2012-01

7.  Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns.

Authors:  Robert K Jansen; Zhengqiu Cai; Linda A Raubeson; Henry Daniell; Claude W Depamphilis; James Leebens-Mack; Kai F Müller; Mary Guisinger-Bellian; Rosemarie C Haberle; Anne K Hansen; Timothy W Chumley; Seung-Bum Lee; Rhiannon Peery; Joel R McNeal; Jennifer V Kuehl; Jeffrey L Boore
Journal:  Proc Natl Acad Sci U S A       Date:  2007-11-28       Impact factor: 11.205

8.  Complete chloroplast genome sequence of poisonous and medicinal plant Datura stramonium: organizations and implications for genetic engineering.

Authors:  Yang Yang; Yuanye Dang; Dang Yuanye; Qing Li; Li Qing; Jinjian Lu; Lu Jinjian; Xiwen Li; Li Xiwen; Yitao Wang; Wang Yitao
Journal:  PLoS One       Date:  2014-11-03       Impact factor: 3.240

9.  CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences.

Authors:  Chang Liu; Linchun Shi; Yingjie Zhu; Haimei Chen; Jianhui Zhang; Xiaohan Lin; Xiaojun Guan
Journal:  BMC Genomics       Date:  2012-12-20       Impact factor: 3.969

10.  OrganellarGenomeDRAW--a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets.

Authors:  Marc Lohse; Oliver Drechsel; Sabine Kahlau; Ralph Bock
Journal:  Nucleic Acids Res       Date:  2013-04-22       Impact factor: 16.971

View more
  9 in total

1.  Characteristics analysis of the complete Wurfbainia villosa chloroplast genome.

Authors:  Wenli An; Jing Li; Zerui Yang; Yuying Huang; Song Huang; Xiasheng Zheng
Journal:  Physiol Mol Biol Plants       Date:  2020-03-19

2.  Medical Plants and Immunological Regulation.

Authors:  Cheng Xiao; Qingdong Guan; Yong Tan; Lifei Hou; Wuxiang Xie
Journal:  J Immunol Res       Date:  2018-06-12       Impact factor: 4.818

3.  Plastid genome and composition analysis of two medical ferns: Dryopteris crassirhizoma Nakai and Osmunda japonica Thunb.

Authors:  Liang Xu; Yanping Xing; Bing Wang; Chunsheng Liu; Wenquan Wang; Tingguo Kang
Journal:  Chin Med       Date:  2019-03-14       Impact factor: 5.455

4.  Comparative genomics of 11 complete chloroplast genomes of Senecioneae (Asteraceae) species: DNA barcodes and phylogenetics.

Authors:  Andrew Wanyoike Gichira; Sheila Avoga; Zhizhong Li; Guangwan Hu; Qingfeng Wang; Jinming Chen
Journal:  Bot Stud       Date:  2019-08-22       Impact factor: 2.787

5.  Comparison and Phylogenetic Analysis of Chloroplast Genomes of Three Medicinal and Edible Amomum Species.

Authors:  Yingxian Cui; Xinlian Chen; Liping Nie; Wei Sun; Haoyu Hu; Yulin Lin; Haitao Li; Xilong Zheng; Jingyuan Song; Hui Yao
Journal:  Int J Mol Sci       Date:  2019-08-19       Impact factor: 5.923

6.  Comparative and Phylogenetic Analyses of Ginger (Zingiber officinale) in the Family Zingiberaceae Based on the Complete Chloroplast Genome.

Authors:  Yingxian Cui; Liping Nie; Wei Sun; Zhichao Xu; Yu Wang; Jing Yu; Jingyuan Song; Hui Yao
Journal:  Plants (Basel)       Date:  2019-08-12

7.  The plastome sequence of Bactris gasipaes and evolutionary analysis in tribe Cocoseae (Arecaceae).

Authors:  Raquel Santos da Silva; Charles Roland Clement; Eduardo Balsanelli; Valter Antonio de Baura; Emanuel Maltempi de Souza; Hugo Pacheco de Freitas Fraga; Leila do Nascimento Vieira
Journal:  PLoS One       Date:  2021-08-24       Impact factor: 3.240

8.  Complete chloroplast genome sequence of Amomum villosum and comparative analysis with other Zingiberaceae plants.

Authors:  Li Yang; Chong Feng; Miao-Miao Cai; Jie-Hu Chen; Ping Ding
Journal:  Chin Herb Med       Date:  2020-09-16

9.  Comparative analysis of the complete chloroplast genome sequences of six species of Pulsatilla Miller, Ranunculaceae.

Authors:  Tingting Zhang; Yanping Xing; Liang Xu; Guihua Bao; Zhilai Zhan; Yanyun Yang; Jiahao Wang; Shengnan Li; Dachuan Zhang; Tingguo Kang
Journal:  Chin Med       Date:  2019-11-28       Impact factor: 5.455

  9 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.