Rui-Sen Lu, Pan Li, Ying-Xiong Qiu.
Abstract
The genus Cardiocrinum (Endlicher) Lindley (Liliaceae) comprises three herbaceous perennial species that are distributed in East Asian temperate-deciduous forests. Although all three Cardiocrinum species have horticultural and medicinal uses, no studies on species identification or molecular phylogenetic analysis of this genus have been reported. Here, we report the complete chloroplast (cp) genome sequence of each Cardiocrinum species, obtained using Illumina paired-end sequencing technology. The cp genomes of C. giganteum, C. cathayanum, and C. cordatum were found to be 152,653, 152,415, and 152,410 bp in length, respectively. Each includes a pair of inverted repeat (IR) regions (26,364-26,500 bp) separated by a large single-copy (LSC) region (82,186-82,368 bp) and a small single-copy (SSC) region (17,309-17,344 bp). Each cp genome contained the same 112 unique genes, consisting of 30 transfer RNA genes, 4 ribosomal RNA genes, and 78 protein-coding genes. Gene content, gene order, AT content, and IR/SC boundary structures were almost the same among the three Cardiocrinum cp genomes, yet their lengths varied due to contraction/expansion of the IR/SC borders. Simple sequence repeat (SSR) analysis further indicated that A/T mononucleotides were the most abundant SSRs in these cp genomes. A total of 45, 57, and 45 repeats were identified in C. giganteum, C. cathayanum, and C. cordatum, respectively. Six cpDNA markers (rps19, rpoC2-rpoC1, trnS-psbZ, trnM-atpE, psaC-ndhE, ycf15-ycf1) with more than 0.95% variable sites were identified. Phylogenomic analyses of the complete cp genomes and 74 protein-coding genes strongly supported the monophyly of Cardiocrinum and a sister relationship between C. cathayanum and C. cordatum. The availability of these cp genomes provides valuable genetic information for further population genetics and phylogeography studies on Cardiocrinum.
Keywords: Cardiocrinum; Liliaceae; chloroplast genome; genomic structure; phylogenomics; taxonomic identification
Year: 2017; PMID: 28119727; PMCID: PMC5222849; DOI: 10.3389/fpls.2016.02054
Source DB: PubMed; Journal: Front Plant Sci; ISSN: 1664-462X; Impact factor: 5.753
The basic characteristics of the three Cardiocrinum cp genomes.
| Characteristic | C. giganteum | C. cathayanum | C. cordatum |
|---|---|---|---|
| Clean reads | 16,593,274 | 17,071,940 | 16,590,680 |
| Average read length (bp) | 125 | 125 | 125 |
| Number of contigs | 26,859 | 17,157 | 20,859 |
| Total length of contigs (bp) | 11,547,060 | 6,383,866 | 8,245,296 |
| N50 length of contigs (bp) | 391 | 351 | 366 |
| Total cpDNA size (bp) | 152,653 | 152,415 | 152,410 |
| LSC length (bp) | 82,344 | 82,368 | 82,186 |
| SSC length (bp) | 17,309 | 17,319 | 17,344 |
| IR length (bp) | 26,500 | 26,364 | 26,440 |
| Total CDS length (bp) | 72,870 | 72,201 | 72,846 |
| Total tRNA length (bp) | 2,879 | 2,880 | 2,881 |
| Total rRNA length (bp) | 9,046 | 9,046 | 9,050 |
| Total GC content (%) | 37.1 | 37.1 | 37.1 |
| GC content of LSC (%) | 34.9 | 34.9 | 34.9 |
| GC content of SSC (%) | 30.8 | 30.9 | 30.9 |
| GC content of IR (%) | 42.5 | 42.5 | 42.5 |
| Total number of genes | 132 | 132 | 132 |
| Protein-coding genes | 78 | 78 | 78 |
| rRNA genes | 4 | 4 | 4 |
| tRNA genes | 30 | 30 | 30 |
| Duplicated genes | 20 | 20 | 20 |
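The lengths and gene counts reported above can be cross-checked with simple arithmetic: a quadripartite cp genome is one LSC, one SSC, and two IR copies, and the total gene count should equal the unique genes plus the duplicated ones. The sketch below is only an illustrative consistency check. The values are copied from the table, the column order (C. giganteum, C. cathayanum, C. cordatum) is assumed to follow the abstract, and the function name is hypothetical.

```python
# Region lengths (bp) copied from the table above.
# Column order is assumed to match the abstract.
genomes = {
    "C. giganteum": {"total": 152_653, "lsc": 82_344, "ssc": 17_309, "ir": 26_500},
    "C. cathayanum": {"total": 152_415, "lsc": 82_368, "ssc": 17_319, "ir": 26_364},
    "C. cordatum": {"total": 152_410, "lsc": 82_186, "ssc": 17_344, "ir": 26_440},
}


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Length of a cp genome built from one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


for name, g in genomes.items():
    computed = quadripartite_length(g["lsc"], g["ssc"], g["ir"])
    status = "matches" if computed == g["total"] else "differs from"
    print(f"{name}: LSC + SSC + 2*IR = {computed:,} bp ({status} reported total)")

# Gene counts: unique genes plus those duplicated in the IR regions.
unique_genes = 78 + 4 + 30          # protein-coding + rRNA + tRNA
total_genes = unique_genes + 20     # add the duplicated genes
print(f"unique genes = {unique_genes}, total genes = {total_genes}")
```

For all three genomes the region lengths sum to the reported total, and the gene counts agree with both the abstract (112 unique genes) and the table (132 genes in total).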
Figure 1. Gene maps of the three Cardiocrinum cp genomes. (A) C. giganteum; (B) C. cathayanum; (C) C. cordatum. Genes shown on the outside of the circle are transcribed clockwise, and genes inside are transcribed counter-clockwise. Genes belonging to different functional groups are color-coded. The darker gray in the inner circle corresponds to GC content, and the lighter gray corresponds to AT content.
Gene composition of the Cardiocrinum cp genomes.
| Ribosomal RNAs | |
| Transfer RNAs | |
| Photosystem I | |
| Photosystem II | |
| Cytochrome | |
| ATP synthase | |
| Rubisco | |
| NADH dehydrogenase | |
| ATP-dependent protease subunit P | |
| Chloroplast envelope membrane protein | |
| Ribosomal proteins (large subunit) | |
| Ribosomal proteins (small subunit) | |
| RNA polymerase | |
| Miscellaneous proteins | |
| Hypothetical proteins & conserved reading frames | |
| Pseudogenes | |
Indicates the genes containing a single intron.
Indicates the genes containing two introns; (× 2) indicates genes duplicated in the IR regions; pseudogene is represented by .
Figure 2. Comparison of LSC, IR, and SSC junction positions among the three Cardiocrinum cp genomes.
Figure 3. Sequence identity plots among the three Cardiocrinum cp genomes. Annotated genes are displayed along the top. The vertical scale represents the percent identity between 50 and 100%. Genome regions are color coded as exon, intron, and conserved non-coding sequences (CNS).
Figure 4. The nucleotide variability (Pi) values were compared among the three Cardiocrinum cp genomes.
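Nucleotide variability (Pi) is commonly computed as the average proportion of sites that differ between each pair of aligned sequences, often within a sliding window along the alignment. The following is a minimal sketch of that general definition, not the authors' pipeline: the function names, the toy alignment, and the window and step sizes are all hypothetical, and gaps are treated as ordinary characters for simplicity.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(aligned, 2))
    # Count mismatching columns for every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def sliding_pi(aligned: list[str], window: int, step: int) -> list[tuple[int, float]]:
    """Pi for each full window; returns (window start, Pi) tuples."""
    length = len(aligned[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in aligned]))
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment of three sequences (illustrative only, not real cp data).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
print(round(nucleotide_diversity(toy), 4))
for start, pi in sliding_pi(toy, window=5, step=5):
    print(start, round(pi, 4))
```

Windows with high Pi correspond to the variable regions that are candidates for markers, such as the six cpDNA markers listed in the abstract.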
Figure 5. Analysis of repeated sequences in the three Cardiocrinum cp genomes. (A) Frequency of repeats by length; (B) Frequency of repeat types; (C) Summary of the shared repeats among the Cardiocrinum cp genomes.
Figure 6. Simple sequence repeats (SSRs) in the three Cardiocrinum cp genomes. (A) Numbers of SSRs by length; (B) Distribution of SSR loci. IGS: intergenic spacer region.
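An SSR is a short motif repeated in tandem, so a simple way to illustrate the idea is a back-referencing regular expression that matches a motif followed by copies of itself. The sketch below is an assumption-laden illustration rather than the tool used in the study: the function name is hypothetical, the minimum repeat thresholds are arbitrary, and the sequences are made up.

```python
import re


def find_ssrs(seq: str, motif_len: int, min_repeats: int) -> list[tuple[int, str, int]]:
    """Return (start, motif, repeat count) for tandem repeats of a given motif length."""
    # ([ACGT]{k}) captures one motif; \1{n-1,} requires at least n-1 further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if motif_len > 1 and len(set(motif)) == 1:
            continue  # e.g. "AA" repeated is really a mononucleotide run
        hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits


# Made-up sequence containing an A run of 12 and a T run of 10.
toy = "GC" + "A" * 12 + "CGCG" + "T" * 10 + "GAT"
print(find_ssrs(toy, motif_len=1, min_repeats=10))

# Made-up sequence containing six tandem copies of "AT".
print(find_ssrs("GG" + "AT" * 6 + "CC", motif_len=2, min_repeats=5))
```

Counting the motifs returned for `motif_len=1` across a whole cp genome would show whether A/T mononucleotides dominate, as reported in the abstract.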
Figure 7. Phylogenetic relationships of the three Cardiocrinum species. Numbers above the lines represent ML bootstrap values and BI posterior probabilities. The phylogenetic tree based on 74 protein-coding genes is completely consistent with this topology.