Xiao-Fei Liu, Gen-Fa Zhu, Dong-Mei Li, Xiao-Jing Wang.
Abstract
Spathiphyllum is an important tropical plant grown as a small potted ornamental in South China, with an annual output value of hundreds of millions of yuan. In this study, we sequenced and analyzed the complete nucleotide sequence of the Spathiphyllum 'Parrish' chloroplast genome. The chloroplast genome is 168,493 bp in length and includes a pair of inverted repeat (IR) regions (IRa and IRb, each 31,600 bp) separated by a small single-copy (SSC, 15,799 bp) region and a large single-copy (LSC, 89,494 bp) region. Our annotation revealed that the S. 'Parrish' chloroplast genome contains 132 genes, including 87 protein-coding genes, 37 transfer RNA genes, and 8 ribosomal RNA genes. In the repeat structure analysis, we detected 281 simple sequence repeats (SSRs) in the S. 'Parrish' chloroplast genome: 223 mononucleotide, 28 dinucleotide, 12 trinucleotide, 11 tetranucleotide, 6 pentanucleotide, and 1 hexanucleotide repeats. In addition, we identified 50 long repeats, comprising 18 forward repeats, 13 reverse repeats, 17 palindromic repeats, and 2 complementary repeats. Single nucleotide polymorphism (SNP) and insertion/deletion (indel) analyses against the chloroplast genome of the related species S. cannifolium revealed 962 SNPs and 158 indels (90 insertions and 68 deletions) in the S. 'Parrish' chloroplast genome. Phylogenetic analysis of five species found S. 'Parrish' to be more closely related to S. kochii than to S. cannifolium. This study identified the characteristics of the S. 'Parrish' chloroplast genome, which will facilitate species identification and phylogenetic analysis within the genus Spathiphyllum.
Year: 2019 PMID: 31644545 PMCID: PMC6808432 DOI: 10.1371/journal.pone.0224038
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Gene map of S. 'Parrish'.
Genes lying outside the circle are transcribed in a clockwise direction, whereas genes inside are transcribed in a counterclockwise direction. Different colors denote known functional groups. The GC and AT contents of the genome are denoted by dashed darker and lighter gray in the inner circle. LSC, SSC, and IR indicate large single-copy, small single-copy, and inverted repeat regions, respectively.
Summary of the S. 'Parrish' chloroplast genome features.
| Attribute | Value |
|---|---|
| Genome size (bp)/GC content (%) | 168,493/36.19 |
| Coding gene number/size (bp) | 87/85,110 |
| tRNA gene number/size (bp) | 37/2,814 |
| rRNA gene number/size (bp) | 8/9,050 |
| LSC size (bp)/percent/GC content (%) | 89,494/53.11/34.72 |
| SSC size (bp)/percent/GC content (%) | 15,799/9.37/29.35 |
| IR size (bp)/percent/GC content (%) | 31,600/18.75/39.98 |
| Intron size (bp)/percent | 17,754/10.53 |
| Intergenic spacer size (bp)/percent | 53,685/31.86 |
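The region sizes in the table can be cross-checked with simple arithmetic: one LSC, one SSC, and two IR copies should add up to the genome size, and each "percent" entry is the region's share of the full genome. The following is a minimal Python sketch of that check; the variable and function names are illustrative and not taken from the study.

```python
# Region sizes in bp, copied from the summary table above.
LSC, SSC, IR = 89_494, 15_799, 31_600
GENOME = 168_493


def percent(part, whole=GENOME):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# A quadripartite chloroplast genome holds two IR copies (IRa and IRb).
total = LSC + SSC + 2 * IR

# Share of the genome taken by each region (one IR copy for "IR").
shares = {name: percent(size) for name, size in
          [("LSC", LSC), ("SSC", SSC), ("IR", IR)]}

if __name__ == "__main__":
    print("sum of regions equals genome size:", total == GENOME)
    for name, value in shares.items():
        print(f"{name}: {value:.3f}%")
```

The computed shares agree with the tabulated percentages to within 0.01 percentage points.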
List of annotated genes in the S. 'Parrish' chloroplast genome.
| Function | Genes |
|---|---|
| RNAs, transfer | |
| RNAs, ribosomal | |
| Transcription and splicing | |
| Translation, ribosomal proteins | |
| Small subunit | |
| Large subunit | |
| Photosynthesis | |
| ATP synthase | |
| Photosystem I | |
| Photosystem II | |
| Calvin cycle | |
| Cytochrome complex | |
| NADH dehydrogenase | |
| Others | |
* Genes containing one intron;
** Genes containing two introns;
a Duplicated gene (genes present in the IR regions).
Codon usage in S. 'Parrish'.
| Codon | Count | RSCU | Codon | Count | RSCU | Codon | Count | RSCU | Codon | Count | RSCU |
|---|---|---|---|---|---|---|---|---|---|---|---|
| UUU(F) | 1054 | 1.29 | UCU(S) | 609 | 1.62 | UAU(Y) | 832 | 1.56 | UGU(C) | 256 | 1.54 |
| UUC(F) | 574 | 0.71 | UCC(S) | 389 | 1.03 | UAC(Y) | 234 | 0.44 | UGC(C) | 77 | 0.46 |
| UUA(L) | 882 | 1.82 | UCA(S) | 496 | 1.32 | UAA(*) | 36 | 1.24 | UGA(*) | 22 | 0.76 |
| UUG(L) | 602 | 1.24 | UCG(S) | 176 | 0.47 | UAG(*) | 29 | 1 | UGG(W) | 502 | 1 |
| CUU(L) | 614 | 1.27 | CCU(P) | 454 | 1.54 | CAU(H) | 523 | 1.51 | CGU(R) | 373 | 1.29 |
| CUC(L) | 207 | 0.43 | CCC(P) | 228 | 0.77 | CAC(H) | 172 | 0.49 | CGC(R) | 96 | 0.33 |
| CUA(L) | 411 | 0.85 | CCA(P) | 366 | 1.24 | CAA(Q) | 768 | 1.52 | CGA(R) | 377 | 1.31 |
| CUG(L) | 187 | 0.39 | CCG(P) | 133 | 0.45 | CAG(Q) | 241 | 0.48 | CGG(R) | 124 | 0.43 |
| AUU(I) | 1186 | 1.5 | ACU(T) | 572 | 1.55 | AAU(N) | 1075 | 1.53 | AGU(S) | 465 | 1.24 |
| AUC(I) | 444 | 0.56 | ACC(T) | 253 | 0.68 | AAC(N) | 334 | 0.47 | AGC(S) | 124 | 0.33 |
| AUA(I) | 747 | 0.94 | ACA(T) | 476 | 1.29 | AAA(K) | 1208 | 1.5 | AGA(R) | 590 | 2.05 |
| AUG(M) | 639 | 1 | ACG(T) | 177 | 0.48 | AAG(K) | 402 | 0.5 | AGG(R) | 171 | 0.59 |
| GUU(V) | 553 | 1.45 | GCU(A) | 643 | 1.78 | GAU(D) | 935 | 1.6 | GGU(G) | 611 | 1.31 |
| GUC(V) | 213 | 0.56 | GCC(A) | 223 | 0.62 | GAC(D) | 235 | 0.4 | GGC(G) | 174 | 0.37 |
| GUA(V) | 544 | 1.43 | GCA(A) | 421 | 1.17 | GAA(E) | 1129 | 1.49 | GGA(G) | 764 | 1.64 |
| GUG(V) | 216 | 0.57 | GCG(A) | 155 | 0.43 | GAG(E) | 388 | 0.51 | GGG(G) | 312 | 0.67 |
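Relative synonymous codon usage (RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below, in Python, recomputes RSCU for three codon families from the table (Phe, Leu, and the stop codons); the function name `rscu` and the row layout are illustrative assumptions, not part of the original analysis.

```python
from collections import defaultdict

# (codon, amino acid, count) for three complete codon families from the table.
ROWS = [
    ("UUU", "F", 1054), ("UUC", "F", 574),
    ("UAA", "*", 36), ("UAG", "*", 29), ("UGA", "*", 22),
    ("UUA", "L", 882), ("UUG", "L", 602), ("CUU", "L", 614),
    ("CUC", "L", 207), ("CUA", "L", 411), ("CUG", "L", 187),
]


def rscu(rows):
    """Compute RSCU = count * (number of synonymous codons) / (family total)."""
    families = defaultdict(list)
    for codon, amino_acid, count in rows:
        families[amino_acid].append((codon, count))

    values = {}
    for members in families.values():
        family_total = sum(count for _, count in members)
        n_synonymous = len(members)
        for codon, count in members:
            values[codon] = count * n_synonymous / family_total
    return values


if __name__ == "__main__":
    for codon, value in rscu(ROWS).items():
        print(f"{codon}: {value:.2f}")
```

Every codon of a family must be supplied for the result to be meaningful, because the family size is inferred from the rows given. Values above 1 indicate codons used more often than expected under equal usage.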
The length of exons and introns in genes with introns in the S. 'Parrish' chloroplast genome.
| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| | LSC | 42 | 2569 | 37 | | |
| | LSC | 24 | 746 | 48 | | |
| | LSC | 35 | 527 | 50 | | |
| | LSC | 37 | 597 | 38 | | |
| | IR | 42 | 943 | 35 | | |
| | IR | 38 | 805 | 35 | | |
| | LSC | 26 | 542 | 232 | 126 | |
| | LSC | 197 | 1099 | 40 | | |
| | LSC | 386 | 829 | 154 | | |
| | LSC | 1639 | 725 | 455 | | |
| | LSC | 141 | 787 | 209 | 777 | 124 |
| | LSC | 275 | 649 | 277 | 701 | 66 |
| | LSC | 6 | 55 | 642 | | |
| | LSC | 8 | 747 | 475 | | |
| | LSC | 399 | 1195 | 9 | | |
| | IR | 443 | 652 | 391 | | |
| | IR | 778 | 650 | 782 | | |
| | IR | 42 | 33 | 411 | | |
| | SSC | 518 | 1136 | 562 | | |
* The rps12 gene is a trans-spliced gene with the 5’ end located in the LSC region and the duplicated 3’ ends in the IR regions.
Fig 2. Analysis of repeated sequences in five Araceae chloroplast genomes.
(A) Totals of four repeat types; (B) frequency of forward repeats by length; (C) frequency of reverse repeats by length; (D) frequency of palindromic repeats by length; (E) frequency of complementary repeats by length.
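The four repeat types differ in how the second copy of a repeat relates to the first. The Python sketch below illustrates this under the commonly used definitions (forward: identical copy; reverse: copy read backwards; complementary: base-wise complement; palindromic: reverse complement). The text here does not state which tool or definitions the authors used, so these definitions and the helper names are assumptions.

```python
# Base-wise complement table for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def partner_forms(unit):
    """Return the sequence a second copy must match for each repeat type."""
    return {
        "forward": unit,
        "reverse": unit[::-1],
        "complementary": unit.translate(COMPLEMENT),
        "palindromic": unit[::-1].translate(COMPLEMENT),
    }


def classify_pair(first, second):
    """List every repeat type under which `second` pairs with `first`."""
    return [name for name, form in partner_forms(first).items()
            if form == second]


if __name__ == "__main__":
    for other in ("AACG", "GCAA", "TTGC", "CGTT"):
        print("AACG", other, classify_pair("AACG", other))
```

A pair can satisfy more than one definition when the unit is itself symmetric (for example, `AT` is its own reverse complement), which is why `classify_pair` returns a list.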
Fig 3. Analysis of SSRs in the five Araceae chloroplast genomes.
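SSRs are tandem repeats of a short motif (here, one to six nucleotides). As a rough illustration of how such repeats can be located, the Python sketch below scans a sequence with regular expressions. The minimum repeat counts in `MIN_REPEATS` are hypothetical placeholders, since this text does not give the thresholds or the software used in the study, and the function names are likewise illustrative.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a shorter unit repeated (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d)
                   for d in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. 'AA' x 6, which is really a mononucleotide repeat.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    demo = "G" + "A" * 12 + "C" + "AT" * 6 + "C" + "AAG" * 4 + "T"
    for start, motif, copies in find_ssrs(demo):
        print(start, motif, copies)
```

Because regex matches do not overlap, a repeat that begins inside a run already consumed by a non-primitive match can be reported one copy short or missed; dedicated SSR tools handle such edge cases more carefully.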
Fig 4. Comparison of the LSC, SSC, and IR regions among five chloroplast genomes.
Boxes above the main line indicate the adjacent border genes. The figure is not to scale with respect to sequence length, and shows only relative changes at or near the IR/SC borders.
Fig 5. Comparison of five chloroplast genomes using mVISTA.
Gray arrows and thick black lines above the alignment indicate gene orientation. Purple bars represent exons, blue bars represent rRNA or tRNA genes, and pink bars represent noncoding sequences (NCS). The y-axis represents the percent identity (shown: 50–100%).
Fig 6. (A) Phylogenetic tree of 15 species reconstructed from whole chloroplast genome sequences by the maximum likelihood method. (B) Phylogenetic tree of 15 species derived from all protein-coding genes by the maximum likelihood method. Raphanus sativus and Brassica napus were used as outgroups. Bootstrap support values are shown near each node.