Qi Chen1, Xiaobo Wu1, Dequan Zhang1,2.
Abstract
Fritillaria cirrhosa D. Don, whose bulb is used in a well-known traditional Chinese medicine to relieve cough and eliminate phlegm, is one of the most important medicinal plants of Fritillaria L. The species is widely distributed across the alpine regions of southwestern China and shows complex morphological variation across its range. A series of new, closely related species have been described on the basis of obscure morphological differences. As a result, F. cirrhosa and its closely related species constitute a taxonomically complex group, and it is difficult to identify these species accurately or to reveal their phylogenetic relationships using traditional taxonomy. Molecular markers and gene fragments have been adopted, but they do not provide sufficient phylogenetic resolution in the genus. Here, we report the complete chloroplast genome sequences of F. cirrhosa and its closely related species, obtained using next generation sequencing (NGS) technology. The eight plastid genomes ranged from 151,058 bp to 152,064 bp in length and each consisted of 115 genes. Gene content, gene order, GC content, and IR/SC boundary structures were highly similar among these genomes. SSRs and five types of large repeat sequences were identified; their total numbers ranged from 73 to 79 and from 63 to 75 among the genomes, respectively. Six highly divergent regions were identified that could be used as potential genetic markers for Fritillaria. Phylogenetic analyses revealed that the eight Fritillaria species clustered into three clades with strong support and that F. cirrhosa was closely related to F. przewalskii and F. sinica. Overall, this study indicates that the complete chloroplast genome sequence is an efficient tool for identifying species in taxonomically complex groups and exploring their phylogenetic relationships.
Keywords: Closely related species; Complete chloroplast genome; Fritillaria cirrhosa D. Don; Phylogenetic relationship; Taxonomically complex groups
Year: 2019 PMID: 31497389 PMCID: PMC6708372 DOI: 10.7717/peerj.7480
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Distribution of Fritillaria cirrhosa and its closely related species.
The distribution area of each species is drawn according to the records of Luo & Chen (1996), Liu, Wang & Chen (2009) and some existing voucher specimens (http://www.cvh.ac.cn/). Photos of representative living plants of eight Fritillaria species: (A) F. cirrhosa, (B) F. sichuanica, (C) F. taipaiensis, (D) F. yuzhongensis, (E) F. unibracteata, (F) F. przewalskii, (G) F. sinica, (H) F. dajinensis. Topographic digital elevation model (DEM) data were acquired from the USGS website (https://glovis.usgs.gov/app?tour) with a 90-m spatial resolution grid.
Summary of complete chloroplast genomes for eight Fritillaria species.
| Species | Total length (bp) | Large single copy (LSC, bp) | Small single copy (SSC, bp) | Inverted repeat (IR, bp) | GC% | Total genes | Protein coding genes | tRNA | rRNA | Accession number in GenBank |
|---|---|---|---|---|---|---|---|---|---|---|
| | 151,998 | 81,755 | 17,545 | 26,349 | 36.9% | 115 | 78 | 30 | 4 | |
| | 151,958 | 81,726 | 17,542 | 26,345 | 37.0% | 115 | 78 | 30 | 4 | |
| | 151,983 | 81,744 | 17,539 | 26,350 | 36.9% | 115 | 78 | 30 | 4 | |
| | 151,058 | 81,339 | 17,539 | 26,090 | 37.0% | 115 | 78 | 30 | 4 | |
| | 151,707 | 81,451 | 17,552 | 26,352 | 37.0% | 115 | 78 | 30 | 4 | |
| | 151,645 | 81,417 | 17,526 | 26,351 | 37.0% | 115 | 78 | 30 | 4 | |
| | 152,064 | 81,827 | 17,537 | 26,350 | 36.9% | 115 | 78 | 30 | 4 | |
| | 151,991 | 81,723 | 17,540 | 26,364 | 36.9% | 115 | 78 | 30 | 4 | |
Figure 2. Gene map of Fritillaria chloroplast genomes.
Genes shown outside the circle are transcribed clockwise, and genes shown inside the circle are transcribed counter-clockwise. Genes belonging to different functional groups are color-coded. In the inner circle, the darker gray corresponds to GC content and the lighter gray corresponds to AT content.
Base composition in Fritillaria cirrhosa chloroplast genome.
| Region | T/U% | C% | A% | G% | AT% | Length (bp) |
|---|---|---|---|---|---|---|
| Genome | 31.9 | 18.8 | 31.1 | 18.1 | 63.1 | 151,998 |
| LSC | 33.3 | 17.9 | 31.9 | 17.0 | 65.1 | 81,755 |
| SSC | 35.0 | 16.1 | 34.5 | 14.4 | 69.5 | 17,545 |
| IR | 28.5 | 20.5 | 29.0 | 22.0 | 57.5 | 26,349 |
| tRNA | 25.0 | 23.7 | 21.9 | 29.4 | 46.9 | 2,877 |
| rRNA | 18.9 | 23.5 | 26.0 | 31.5 | 45.0 | 9,052 |
| Protein Coding genes | 31.7 | 17.3 | 31.0 | 20.0 | 62.7 | 68,234 |
| 1st position codon | 24.6 | 18.1 | 30.9 | 26.4 | 55.5 | 22,745 |
| 2nd position codon | 32.2 | 19.9 | 29.9 | 18.1 | 62.0 | 22,745 |
| 3rd position codon | 38.3 | 14.0 | 32.1 | 15.6 | 70.4 | 22,744 |
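The percentages in the base composition table are simple base frequencies. A minimal sketch of how such values could be computed is given below, assuming the sequence is available as a plain string; the function names `base_composition` and `codon_position_composition` are hypothetical and are not taken from the study, which does not state the software used for this table.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return T, C, A, G and AT percentages plus the number of counted bases.

    U is treated as T; gaps and ambiguous symbols are ignored.
    """
    seq = seq.upper().replace("U", "T")
    counts = Counter(base for base in seq if base in "ACGT")
    n = sum(counts.values())
    if n == 0:
        raise ValueError("sequence contains no unambiguous bases")
    pct = {base: 100.0 * counts[base] / n for base in "TCAG"}
    pct["AT"] = pct["A"] + pct["T"]
    pct["length"] = n
    return pct


def codon_position_composition(cds: str) -> list:
    """Base composition at the 1st, 2nd and 3rd codon positions.

    Assumes `cds` is an in-frame coding sequence starting at codon position 1.
    """
    return [base_composition(cds[offset::3]) for offset in range(3)]
```

For example, `base_composition("AATTGC")["AT"]` evaluates to two thirds of 100, since four of the six bases are A or T.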
Gene contents in eight Fritillaria chloroplast genomes.
| Category for gene | Group of genes | Name of genes |
|---|---|---|
| Self-replication | Large subunit of ribosome | |
| | Small subunit of ribosome | |
| | DNA dependent RNA polymerase | |
| | rRNA gene | |
| | tRNA gene | |
| Genes for photosynthesis | Subunits of photosystem I | |
| | Subunits of photosystem II | |
| | Subunits of NADH-dehydrogenase | |
| | Subunits of cytochrome b/f complex | |
| | Subunit for ATP synthase | |
| | Large subunit of rubisco | |
| Other genes | Translational initiation factor | |
| | Maturase | |
| | Protease | |
| | Envelope membrane protein | |
| | Subunit of acetyl-CoA-carboxylase | |
| | C-type cytochrome synthesis gene | |
| | Open reading frames (ORF, ycf) | |
Notes.
The I label after a gene name indicates that the gene is located in the IR regions. Intron-containing genes are indicated by one asterisk.
Figure 3. Analysis of simple sequence repeats (SSRs) in eight Fritillaria cp genomes.
(A) Number of different SSR types detected in nine genomes; (B) frequency of SSR motifs in different repeat types of the F. cirrhosa cp genome; (C) frequency of identified SSRs in LSC, SSC, and IR regions; (D) frequency of identified SSRs in IGS, CDS, and intron.
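SSR detection of the kind summarized in this figure can be illustrated with a regular-expression scan. The sketch below is not the tool used in the study, and the minimum repeat numbers in `MIN_REPEATS` (10, 5, 4, 3, 3, 3 for mono- to hexanucleotide motifs) are assumed values chosen for illustration; the function name `find_ssrs` is likewise hypothetical.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq: str, min_repeats: dict = MIN_REPEATS) -> list:
    """Return sorted (start, motif, copies) tuples for perfect SSRs in `seq`.

    `start` is 0-based. Motifs that are themselves repeats of a shorter
    unit (e.g. "AA" or "ATAT") are skipped so each SSR is reported once,
    under its shortest motif.
    """
    seq = seq.upper()
    hits = []
    for unit_len in sorted(min_repeats):
        copies_needed = min_repeats[unit_len]
        # One motif copy followed by at least (copies_needed - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, copies_needed - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            is_compound = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if is_compound:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)
```

Because `finditer` returns non-overlapping matches, an SSR that begins inside a skipped run could be missed; a production tool would handle such overlaps explicitly.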
Figure 4. Analysis of large repeat sequences in eight Fritillaria cp genomes.
(A) Total of five repeat types; (B) frequency of tandem repeats in IGS, CDS, and intron; (C) frequency of tandem repeats by length; (D) frequency of palindromic repeats by length; (E) frequency of forward repeats by length; (F) frequency of reverse repeats by length.
Figure 5. Visualized alignment of nine Fritillaria cp genomes.
VISTA-based identity plot showing sequence identity among eight Fritillaria species using Fritillaria cirrhosa D. Don as a reference. The thick black line shows the inverted repeats (IRs) in the chloroplast genomes.
Figure 6. Comparison of LSC, SSC, and IR border regions among eight Fritillaria cp genomes.
Colored boxes represent the positions of genes.
Number of nucleotide substitutions and sequence distance in eight complete chloroplast genomes.
| | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | - | 311 | 112 | 314 | 335 | 310 | 117 | 311 |
| | 0.0021 | - | 328 | 95 | 290 | 261 | 331 | 52 |
| | 0.0007 | 0.0022 | - | 317 | 340 | 314 | 81 | 328 |
| | 0.0021 | 0.0006 | 0.0021 | - | 277 | 252 | 320 | 105 |
| | 0.0022 | 0.0019 | 0.0023 | 0.0018 | - | 169 | 337 | 294 |
| | 0.0021 | 0.0017 | 0.0021 | 0.0017 | 0.0011 | - | 313 | 261 |
| | 0.0008 | 0.0022 | 0.0005 | 0.0021 | 0.0022 | 0.0021 | - | 333 |
| | 0.0021 | 0.0003 | 0.0022 | 0.0007 | 0.0020 | 0.0017 | 0.0022 | - |
Notes.
The upper triangle shows number of nucleotide substitutions and the lower triangle indicates genetic distance in complete cp genomes among species.
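A matrix with this layout can be built from aligned sequences as sketched below. The sketch uses the uncorrected proportion of differing sites (p-distance) and ignores alignment columns with gaps or ambiguous bases; both choices are assumptions, since the distance model behind the published values is not stated here, and the function names are hypothetical.

```python
def pairwise_differences(a: str, b: str) -> tuple:
    """Return (substitutions, compared_sites) for two aligned sequences.

    Only columns where both sequences carry A, C, G or T are compared.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    substitutions = 0
    compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            if x != y:
                substitutions += 1
    return substitutions, compared


def distance_matrix(seqs: list) -> list:
    """Upper triangle: substitution counts. Lower triangle: p-distances.

    Diagonal cells are left as None.
    """
    n = len(seqs)
    matrix = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            subs, sites = pairwise_differences(seqs[i], seqs[j])
            matrix[i][j] = subs
            matrix[j][i] = subs / sites if sites else float("nan")
    return matrix
```

With whole-genome alignments, each sequence would be one row of the alignment, in the same order as the rows and columns of the table.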
Variable site analysis in Fritillaria chloroplast genomes.
| Region | Number of sites | Number of variable sites | Number of parsimony-informative sites | Nucleotide diversity |
|---|---|---|---|---|
| Complete cp genome | 152,707 | 728 | 342 | 0.00172 |
| LSC | 82,378 | 514 | 243 | 0.00223 |
| SSC | 17,582 | 162 | 74 | 0.00332 |
| IR | 26,372 | 27 | 13 | 0.00038 |
| Protein coding genes | 68,709 | 237 | 112 | 0.00129 |
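The quantities in this table follow standard definitions: a variable site has at least two different bases in an alignment column, and a parsimony-informative site has at least two different bases that each occur in at least two sequences. The sketch below applies these definitions to a list of aligned strings and computes nucleotide diversity as the mean pairwise proportion of differences; the gap handling and the function names are assumptions and may differ from the software used in the study.

```python
from collections import Counter
from itertools import combinations

VALID = set("ACGT")


def site_statistics(alignment: list) -> tuple:
    """Return (total_sites, variable_sites, parsimony_informative_sites)."""
    rows = [row.upper() for row in alignment]
    variable = 0
    informative = 0
    for column in zip(*rows):
        counts = Counter(base for base in column if base in VALID)
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each seen in >= 2 sequences.
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    return len(rows[0]), variable, informative


def nucleotide_diversity(alignment: list) -> float:
    """Mean proportion of differing sites over all sequence pairs."""
    rows = [row.upper() for row in alignment]
    total = 0.0
    pairs = 0
    for a, b in combinations(rows, 2):
        compared = [(x, y) for x, y in zip(a, b) if x in VALID and y in VALID]
        if compared:
            total += sum(x != y for x, y in compared) / len(compared)
            pairs += 1
    return total / pairs if pairs else 0.0
```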
Figure 7. Sliding window analysis of eight Fritillaria cp genomes (window length: 600 bp, step size: 200 bp).
X-axis: position of the midpoint of a window; Y-axis: nucleotide diversity of each window.
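A sliding-window scan with the window length and step size given in the caption can be sketched as follows. The helper `pi` computes nucleotide diversity as the mean pairwise proportion of differences, skipping gapped or ambiguous columns; this is an assumed simplification, the function names are hypothetical, and the reported midpoint is an approximation of the convention used by the plotting software.

```python
from itertools import combinations

VALID = set("ACGT")


def pi(alignment: list) -> float:
    """Mean pairwise proportion of differing sites among aligned sequences."""
    total = 0.0
    pairs = 0
    for a, b in combinations(alignment, 2):
        compared = [(x, y) for x, y in zip(a, b) if x in VALID and y in VALID]
        if compared:
            total += sum(x != y for x, y in compared) / len(compared)
            pairs += 1
    return total / pairs if pairs else 0.0


def sliding_window_pi(alignment: list, window: int = 600, step: int = 200) -> list:
    """Return (midpoint, pi) for each full window along the alignment."""
    rows = [row.upper() for row in alignment]
    length = len(rows[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [row[start:start + window] for row in rows]
        midpoint = start + window // 2
        results.append((midpoint, pi(chunk)))
    return results
```

Peaks in the resulting series correspond to the highly divergent regions that such an analysis is meant to reveal.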
Figure 8. Phylogenetic relationships of nine Fritillaria species inferred from Bayesian inference (BI), maximum parsimony (MP), and maximum likelihood (ML) analyses of different datasets.
(A) Chloroplast genome (containing only one IR); (B) LSC region; (C) SSC region; (D) protein coding region. Numbers above nodes are support values, with Bayesian posterior probability (PP) values on the left, MP bootstrap values in the middle, and ML bootstrap values on the right.