Hai-Ying Liu, Yan Yu, Yi-Qi Deng, Juan Li, Zi-Xuan Huang, Song-Dong Zhou.
Abstract
Lilium henrici Franchet, which belongs to the family Liliaceae, is an endangered plant native to China. The wild populations of L. henrici have been largely reduced by habitat degradation or loss. In our study, we determined the whole chloroplast genome sequence of L. henrici and compared its structure with those of other Lilium (including Nomocharis) species. The chloroplast genome of L. henrici is circular and 152,784 bp in length. The large single copy and small single copy regions are 82,429 bp and 17,533 bp in size, respectively, and each inverted repeat is 26,411 bp in size. The L. henrici chloroplast genome contains 116 different genes, including 78 protein coding genes, 30 tRNA genes, 4 rRNA genes, and 4 pseudogenes. There were 51 SSRs detected in the L. henrici chloroplast genome sequence. Comparison of the L. henrici chloroplast genome with other Lilium (including Nomocharis) chloroplast genomes shows little variation in sequence length and gene content, the only differences being in three pseudogenes. Phylogenetic analysis revealed that N. pardanthina was a sister species to L. henrici. Overall, the L. henrici genomic resources and the comparative analysis of Lilium chloroplast genomes provided here will benefit evolutionary study and phylogenetic reconstruction of the genus Lilium, as well as molecular barcoding in population genetics.
Keywords: Liliaceae; Lilium henrici; chloroplast genome; comparative analysis; phylogeny
Year: 2018 PMID: 29861452 PMCID: PMC6100032 DOI: 10.3390/molecules23061276
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Gene map of the Lilium henrici chloroplast genome. The genes drawn outside and inside the outer circle are transcribed clockwise and counter-clockwise, respectively. Genes of different functional groups are color coded. GC content and AT content are represented on the inner circle by darker gray and lighter gray, respectively.
Genome features of L. henrici complete chloroplast genome.
| Region | Chloroplast Features |
|---|---|
| Chloroplast genome size (bp) | 152,784 |
| LSC (bp) | 82,429 |
| SSC (bp) | 17,533 |
| IR (bp) | 26,411 |
| Total GC contents (%) | 37.0 |
| LSC GC contents (%) | 34.83 |
| SSC GC contents (%) | 30.59 |
| IR GC contents (%) | 42.50 |
| No. of total/unique genes | 136/116 |
| Total CDS length (bp) | 113,441 |
| Intergenic spacer (bp) | 39,343 |
| Protein-coding genes | 78 |
| tRNAs | 30 |
| rRNAs | 4 |
| Genes duplicated | 20 |
| Genes with intron(s) | 18 |
| Genes with a single intron | 15 |
| Genes with two introns | 3 |
| tRNAs with intron(s) | 6 |
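
The figures in this table can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the original analysis: it assumes the reported IR length is per copy (two copies per genome), and the variable and function names are hypothetical. It recomputes the genome size from the region lengths, the overall GC content as a length-weighted mean of the regional values, and the sum of the CDS and intergenic spacer lengths.

```python
# Region lengths (bp) and GC percentages copied from the genome features table.
# Assumption: the IR length is reported per copy, and the genome has two copies.
REGIONS = {
    "LSC": {"length": 82_429, "gc": 34.83, "copies": 1},
    "SSC": {"length": 17_533, "gc": 30.59, "copies": 1},
    "IR": {"length": 26_411, "gc": 42.50, "copies": 2},
}
REPORTED_SIZE = 152_784      # chloroplast genome size (bp)
CDS_LENGTH = 113_441         # total CDS length (bp)
INTERGENIC_LENGTH = 39_343   # intergenic spacer (bp)


def genome_size(regions):
    """Sum the region lengths, counting each region once per copy."""
    return sum(r["length"] * r["copies"] for r in regions.values())


def weighted_gc(regions):
    """Overall GC percentage as a length-weighted mean of regional GC."""
    gc_bases = sum(
        r["length"] * r["copies"] * r["gc"] / 100 for r in regions.values()
    )
    return 100 * gc_bases / genome_size(regions)


size = genome_size(REGIONS)
gc = weighted_gc(REGIONS)

print("LSC + SSC + 2 x IR =", size)
print("length-weighted GC (%) =", round(gc, 1))
print("CDS + intergenic =", CDS_LENGTH + INTERGENIC_LENGTH)
```

Under these assumptions all three quantities agree with the reported values: the region lengths and the CDS plus intergenic lengths both sum to 152,784 bp, and the weighted GC content rounds to 37.0%.
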
List of genes present in the L. henrici chloroplast genome.
| Classification of Genes | Group | Number |
|---|---|---|
| RNA genes | Ribosomal RNAs | 8 |
| | Transfer RNAs | 38 |
| Protein genes | Photosynthesis | |
| | Photosystem I | 5 |
| | Photosystem II | 15 |
| | Cytochrome | 6 |
| | ATP synthase | 6 |
| | Rubisco | 1 |
| | NADH dehydrogenase | 12 |
| | ATP-dependent protease subunit P | 1 |
| | Chloroplast envelope membrane protein | 1 |
| Ribosomal proteins | Large units | 11 |
| | Small units | 14 |
| Transcription/translation | RNA polymerase | 4 |
| Miscellaneous proteins | | 3 |
| Hypothetical proteins & conserved reading frame | | 5 |
| Pseudogenes | | 6 |
| Total | | 136 |
Figure 2. Analysis of simple sequence repeats (SSRs) in twenty Lilium (including Nomocharis) chloroplast genome sequences. (A) Number of different SSR types detected in the twenty chloroplast genome sequences; (B) Presence of different SSR types among all SSRs of the twenty chloroplast genome sequences; (C) Number of SSRs in the LSC, IR, and SSC regions of the twenty chloroplast genome sequences; (D) Number of common SSRs in the twenty chloroplast genome sequences.
Codon usage for L. henrici chloroplast genome.
| Amino Acid | Codon | Number | RSCU | Amino Acid | Codon | Number | RSCU |
|---|---|---|---|---|---|---|---|
| Phe | UUU | 794 | 1.33 | Ser | UCU | 446 | 1.66 |
| | UUC | 396 | 0.67 | | UCC | 261 | 0.97 |
| Leu | UUA | 757 | 2.08 | | UCA | 336 | 1.25 |
| | UUG | 421 | 1.16 | | UCG | 142 | 0.53 |
| | CUU | 463 | 1.27 | Pro | CCU | 335 | 1.55 |
| | CUC | 137 | 0.38 | | CCC | 190 | 0.88 |
| | CUA | 288 | 0.79 | | CCA | 245 | 1.13 |
| | CUG | 118 | 0.32 | | CCG | 95 | 0.44 |
| Ile | AUU | 893 | 1.42 | Thr | ACU | 434 | 1.62 |
| | AUC | 354 | 0.56 | | ACC | 189 | 0.7 |
| | AUA | 638 | 1.02 | | ACA | 343 | 1.28 |
| Met | AUG | 510 | 1 | | ACG | 108 | 0.4 |
| Val | GUU | 439 | 1.5 | Ala | GCU | 506 | 1.75 |
| | GUC | 142 | 0.48 | | GCC | 181 | 0.63 |
| | GUA | 435 | 1.48 | | GCA | 335 | 1.16 |
| | GUG | 157 | 0.54 | | GCG | 136 | 0.47 |
| Tyr | UAU | 686 | 1.64 | Cys | UGU | 181 | 1.5 |
| | UAC | 153 | 0.36 | | UGC | 60 | 0.5 |
| Ter | UAA | 27 | 1.53 | Ter | UGA | 13 | 0.74 |
| | UAG | 13 | 0.74 | Trp | UGG | 383 | 1 |
| His | CAU | 408 | 1.59 | Arg | CGU | 284 | 1.37 |
| | CAC | 106 | 0.41 | | CGC | 80 | 0.39 |
| Gln | CAA | 571 | 1.5 | | CGA | 278 | 1.34 |
| | CAG | 190 | 0.5 | | CGG | 103 | 0.5 |
| Asn | AAU | 828 | 1.57 | Ser | AGU | 348 | 1.29 |
| | AAC | 226 | 0.43 | | AGC | 81 | 0.3 |
| Lys | AAA | 848 | 1.51 | Arg | AGA | 390 | 1.88 |
| | AAG | 273 | 0.49 | | AGG | 110 | 0.53 |
| Asp | GAU | 692 | 1.6 | Gly | GGU | 466 | 1.27 |
| | GAC | 174 | 0.4 | | GGC | 166 | 0.45 |
| Glu | GAA | 856 | 1.51 | | GGA | 586 | 1.6 |
| | GAG | 275 | 0.49 | | GGG | 245 | 0.67 |
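
The RSCU (relative synonymous codon usage) column can be derived from the Number column. The sketch below uses the standard definition, in which a codon's observed count is divided by the count expected if all synonymous codons for that amino acid were used equally. It is a minimal illustration and not the authors' pipeline: the function name `rscu` and the choice of three codon families (Phe, Ala, and the stop codons labelled Ter) are assumptions made for the example.

```python
from collections import defaultdict

# (amino acid, codon, count) for three families, copied from the codon usage table.
CODON_COUNTS = [
    ("Phe", "UUU", 794), ("Phe", "UUC", 396),
    ("Ala", "GCU", 506), ("Ala", "GCC", 181),
    ("Ala", "GCA", 335), ("Ala", "GCG", 136),
    ("Ter", "UAA", 27), ("Ter", "UAG", 13), ("Ter", "UGA", 13),
]


def rscu(codon_counts):
    """Return {codon: RSCU}, where RSCU = count * family_size / family_total."""
    totals = defaultdict(int)  # total codon count per amino acid
    sizes = defaultdict(int)   # number of synonymous codons per amino acid
    for amino_acid, _, count in codon_counts:
        totals[amino_acid] += count
        sizes[amino_acid] += 1
    return {
        codon: count * sizes[amino_acid] / totals[amino_acid]
        for amino_acid, codon, count in codon_counts
    }


values = rscu(CODON_COUNTS)
for amino_acid, codon, count in CODON_COUNTS:
    print(amino_acid, codon, count, round(values[codon], 2))
```

By construction, the RSCU values within a family sum to the number of synonymous codons in that family (2 for Phe, 4 for Ala, 3 for the stop codons), which gives a quick consistency check on any row of the table. Values above 1 mark codons used more often than expected under equal usage.
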
Figure 3. Sequence alignment of twenty Lilium (including Nomocharis) chloroplast genomes, with L. longiflorum as a reference. The y-axis indicates the percent identity between 50% and 100%. Colored genome regions represent protein coding regions, rRNA coding regions, tRNA coding regions, or conserved noncoding sequences (CNS).
Figure 4. Molecular phylogenetic tree of the family Liliaceae based on the complete chloroplast genomes of 25 species. The tree was constructed using the maximum likelihood (ML) method and the GTR + I + G model.