Qiu-Jie Li, Na Su, Ling Zhang, Ru-Chang Tong, Xiao-Hui Zhang, Jun-Ru Wang, Zhao-Yang Chang, Liang Zhao, Daniel Potter.
Abstract
Pulsatilla (Ranunculaceae) consists of about 40 species, many of which have horticultural and/or medicinal value. However, wild Pulsatilla species are difficult to recognize and identify. Universal molecular markers have been used to identify these species, but they provided insufficient phylogenetic signal. Here, we compared the complete chloroplast genomes of seven Pulsatilla species. The chloroplast genomes of Pulsatilla were very similar, ranging in length from 161,501 to 162,669 bp. Eight highly variable regions and potential sources of molecular markers, such as simple sequence repeats, large repeat sequences, and single nucleotide polymorphisms, were identified; these are valuable for studies of infra- and inter-specific genetic diversity. The number of SNPs differentiating any two Pulsatilla chloroplast genomes ranged from 112 to 1214 and provided sufficient data for species delimitation. Phylogenetic trees based on different data sets were largely consistent with one another, although the IR region, the SSC region, and the barcode combination rbcL + matK + trnH-psbA produced slightly different results. Phylogenetic relationships within Pulsatilla were clearly resolved using the complete cp genome sequences. Overall, this study provides plentiful chloroplast genomic resources, which will help identify members of this taxonomically challenging group in further investigations.
Year: 2020 PMID: 33188288 PMCID: PMC7666119 DOI: 10.1038/s41598-020-76699-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Table 1. Voucher information and GenBank accession numbers for Pulsatilla and outgroups.
| Taxon | Location | Date | Herbarium | Accession | SRA accession |
|---|---|---|---|---|---|
| | Europe MoaiJuni | 1904 | US | MN025344 | SRR12822481 |
| | U.S.S.R. | 1957 | US | MN025347 | SRR12822474 |
| | America Graubunden | 1936 | US | MN025343 | SRR12822486 |
| | America Siskiyou California | 1943 | US | MN025348 | SRR12822484 |
| | America Albany county | 1898 | US | MN025346 | SRR12822477 |
| | America | – | US | MN025345 | SRR12822480 |
| | China | 2014 | WUK | MN025349 | SRR12822482 |
| | NA | NA | NA | KR_297058 | NA |
| | NA | NA | NA | KR_297060 | NA |
| | NA | NA | NA | KR_297062 | NA |
| | NA | NA | NA | MG_001341 | NA |
| | NA | NA | NA | MH_205609 | NA |
Species with asterisks were collected in this study, whereas the others were obtained from GenBank.
NA, not applicable.
Table 2. Summary of complete chloroplast genomes of Pulsatilla.
| Species | Number of reads | Average depth of coverage (×) | Size (bp) | LSC length (bp) | SSC length (bp) | IR length (bp) | Coding (bp) | Non-coding (bp) | GC% |
|---|---|---|---|---|---|---|---|---|---|
| | 13,012,108 | 194.8 | 161,501 | 81,672 | 17,431 | 31,199 | 78,377 | 78,377 | 37.6 |
| | 11,405,519 | 151.5 | 161,743 | 81,653 | 17,648 | 31,221 | 81,800 | 81,800 | 37.6 |
| | 15,816,765 | 367.1 | 162,669 | 82,149 | 17,688 | 31,416 | 81,246 | 81,246 | 37.6 |
| | 12,554,200 | 276.9 | 161,764 | 81,615 | 17,755 | 31,197 | 79,089 | 79,089 | 37.6 |
| | 9,251,830 | 257.5 | 162,051 | 81,860 | 17,771 | 31,210 | 79,280 | 79,280 | 37.6 |
| | 13,337,339 | 503.4 | 161,936 | 81,866 | 17,702 | 31,184 | 80,206 | 80,206 | 37.6 |
| | 6,468,944 | 410.1 | 162,064 | 81,688 | 17,908 | 31,234 | 79,114 | 79,114 | 37.5 |
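The region lengths in the table above can be cross-checked against the total size: a quadripartite chloroplast genome holds one LSC, one SSC and two IR copies, so the size should equal LSC + SSC + 2 × IR. The following is a minimal Python sketch of that arithmetic using the values from the table rows; the function and variable names are illustrative and do not come from the paper.

```python
# Values copied from the table rows above: (size, LSC, SSC, IR), all in bp.
ROWS = [
    (161_501, 81_672, 17_431, 31_199),
    (161_743, 81_653, 17_648, 31_221),
    (162_669, 82_149, 17_688, 31_416),
    (161_764, 81_615, 17_755, 31_197),
    (162_051, 81_860, 17_771, 31_210),
    (161_936, 81_866, 17_702, 31_184),
    (162_064, 81_688, 17_908, 31_234),
]


def quadripartite_size(lsc, ssc, ir):
    """Total length of a genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Collect every row whose reported size disagrees with the computed one.
mismatches = [
    row for row in ROWS
    if quadripartite_size(row[1], row[2], row[3]) != row[0]
]
print(f"{len(ROWS) - len(mismatches)} of {len(ROWS)} rows are consistent")
```

All seven rows satisfy the identity, which supports reading the three unlabeled length columns as LSC, SSC and IR.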
Figure 1. Gene map of the Pulsatilla chloroplast genome. The dashed area in the inner circle indicates the GC content of the chloroplast genome. LSC, SSC and IR denote the large single copy, small single copy and inverted repeat regions, respectively. Genes belonging to different functional groups are color-coded as indicated by the icons in the lower left corner. The red lines outside the genes mark three inversions.
Figure 2. Comparison of the LSC, SSC, and IR region borders among the seven Pulsatilla chloroplast genomes.
Figure 3. Visualized alignment of the eight Pulsatilla chloroplast genomes. The mVISTA-based identity plots show the sequence identity among the seven chloroplast genomes, with P. chinensis serving as a reference. Blue represents coding regions, and pink represents non-coding regions.
Table 3. Numbers of nucleotide substitutions and sequence distances (Pi) among eleven complete cp genomes.
| Genome | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | – | 0.0028 | 0.0027 | 0.0029 | 0.0030 | 0.0035 | 0.0024 | 0.0034 | 0.0028 | 0.0058 | 0.0064 |
| 2 | 563 | – | 0.0033 | 0.0034 | 0.0034 | 0.0038 | 0.0029 | 0.0039 | 0.0033 | 0.0063 | 0.0068 |
| 3 | 558 | 642 | – | 0.0005 | 0.0011 | 0.0015 | 0.0019 | 0.0028 | 0.0010 | 0.0046 | 0.0052 |
| 4 | 572 | 638 | 112 | – | 0.0011 | 0.0015 | 0.0019 | 0.0028 | 0.0010 | 0.0046 | 0.0051 |
| 5 | 652 | 676 | 205 | 229 | – | 0.0009 | 0.0020 | 0.0029 | 0.0008 | 0.0048 | 0.0053 |
| 6 | 685 | 735 | 290 | 303 | 200 | – | 0.0024 | 0.0034 | 0.0013 | 0.0052 | 0.0058 |
| 7 | 470 | 516 | 378 | 383 | 392 | 547 | – | 0.0020 | 0.0019 | 0.0050 | 0.0055 |
| 8 | 662 | 750 | 586 | 577 | 604 | 701 | 414 | – | 0.0029 | 0.0059 | 0.0064 |
| 9 | 650 | 735 | 215 | 233 | 198 | 284 | 463 | 527 | – | 0.0046 | 0.0052 |
| 10 | 1212 | 1214 | 915 | 903 | 916 | 1007 | 964 | 1128 | 908 | – | 0.0045 |
| 11 | 1207 | 1293 | 979 | 968 | 1006 | 1081 | 1046 | 1219 | 930 | 869 | – |
The lower triangle shows the number of nucleotide substitutions, and the upper triangle shows the sequence distance between complete cp genomes.
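To illustrate the two quantities reported in this matrix, the sketch below counts substitutions between two aligned sequences and converts the count to a proportion of compared sites (an uncorrected p-distance). This is a generic illustration, not the software or distance model used in the study; skipping sites with gaps or ambiguous bases is an assumption made here, and the sequences are toy data.

```python
VALID_BASES = set("ACGT")


def pairwise_differences(seq_a, seq_b):
    """Return (substitutions, compared_sites) for two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            compared += 1
            if a != b:
                substitutions += 1
    return substitutions, compared


def p_distance(seq_a, seq_b):
    """Proportion of compared sites that differ between two sequences."""
    substitutions, compared = pairwise_differences(seq_a, seq_b)
    return substitutions / compared if compared else float("nan")


# Toy alignment: one gapped column is ignored, two of eight sites differ.
toy_a = "ACGT-ACGT"
toy_b = "ACCTAACGA"
print(pairwise_differences(toy_a, toy_b), p_distance(toy_a, toy_b))
```

Applied to every pair of genomes in a whole-genome alignment, the counts would fill one triangle of such a matrix and the proportions the other.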
Figure 4. Sliding window analysis of the whole chloroplast genomes of Pulsatilla taxa.
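A sliding window analysis of this kind computes nucleotide diversity (Pi) in successive windows along an alignment, so that peaks flag highly variable regions. The sketch below is a simplified illustration: Pi is taken as the mean proportion of differing sites over all sequence pairs, gaps are skipped per pair, and the window and step sizes in the toy call are arbitrary. None of these choices is taken from the paper, which does not state its settings here.

```python
from itertools import combinations

VALID = set("ACGT")


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of sequences."""
    per_pair = []
    for a, b in combinations(seqs, 2):
        # Compare only columns where both sequences have an unambiguous base.
        compared = [(x, y) for x, y in zip(a, b) if x in VALID and y in VALID]
        if compared:
            diffs = sum(1 for x, y in compared if x != y)
            per_pair.append(diffs / len(compared))
    return sum(per_pair) / len(per_pair) if per_pair else 0.0


def sliding_window_pi(seqs, window, step):
    """Return a list of (window midpoint, Pi) along an alignment."""
    seqs = [s.upper() for s in seqs]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    profile = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        profile.append((start + window // 2, nucleotide_diversity(chunk)))
    return profile


# Toy alignment: the first window is invariant, the second is variable.
toy = ["AAAAAAAA", "AAAATAAA", "AAAATTAA"]
profile = sliding_window_pi(toy, window=4, step=4)
for midpoint, pi in profile:
    print(midpoint, round(pi, 4))
```

Windows whose Pi stands well above the genome-wide background are the natural candidates for marker regions.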
Table 4. Sequence characteristics of the eight highly variable regions among eleven complete cp genomes of Pulsatilla.
| Region | Aligned length | Variable sites (No.) | Variable sites (%) | Indels (No.) | Indel length range | Nucleotide diversity (Pi) |
|---|---|---|---|---|---|---|
| | 986 | 22 | 2.23 | 11 | 1–44 | 0.01497 |
| | 1985 | 62 | 3.12 | 23 | 1–79 | 0.00998 |
| | 1484 | 61 | 4.11 | 18 | 1–34 | 0.01502 |
| | 1469 | 70 | 4.77 | 24 | 1–139 | 0.01140 |
| | 732 | 39 | 5.33 | 10 | 1–87 | 0.04368 |
| | 2757 | 196 | 7.11 | 69 | 1–43 | 0.02212 |
| | 5807 | 148 | 2.55 | 18 | 1–24 | 0.00802 |
| | 2345 | 86 | 3.67 | 8 | 1–18 | 0.00813 |
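The percentage of variable sites in the table above is the number of variable sites divided by the aligned length, times 100. The short Python sketch below recomputes it from the table values as a consistency check; the names are illustrative only.

```python
# Values copied from the table rows above:
# (aligned length, number of variable sites, reported percentage).
ROWS = [
    (986, 22, 2.23),
    (1985, 62, 3.12),
    (1484, 61, 4.11),
    (1469, 70, 4.77),
    (732, 39, 5.33),
    (2757, 196, 7.11),
    (5807, 148, 2.55),
    (2345, 86, 3.67),
]


def variable_site_percent(aligned_length, variable_sites):
    """Variable sites as a percentage of the aligned length."""
    return 100 * variable_sites / aligned_length


# A row is inconsistent if it differs from the reported value by more
# than the rounding error of two decimal places.
inconsistent = [
    row for row in ROWS
    if abs(variable_site_percent(row[0], row[1]) - row[2]) >= 0.005
]
print(f"{len(ROWS) - len(inconsistent)} of {len(ROWS)} rows match")
```

Every reported percentage agrees with the recomputed value to two decimal places.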
Figure 5. Analyses of repeated sequences in the seven newly sequenced chloroplast genomes. (A) Number of five repeat types; (B) frequency of four repeats by length; (C) frequency of microsatellites by base composition; (D) frequency of microsatellites by type; (E) frequency of microsatellites by length; (F) number of all repeats by location.
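Microsatellites (simple sequence repeats) such as those summarized in this figure are runs of a short motif repeated in tandem. The sketch below shows one simple way to locate them with regular expressions. The minimum repeat counts (10 for mononucleotides, 5 for dinucleotides, 4 for trinucleotides) are assumptions chosen for illustration, not the thresholds used in the study, and the function name is hypothetical.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only).
MIN_REPEATS = {1: 10, 2: 5, 3: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are a shorter unit repeated, e.g. "AA" or "AAA",
            # because those runs are already reported at the shorter length.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence: a 12-base A run and six tandem copies of "AT".
toy = "GC" + "A" * 12 + "GCGG" + "AT" * 6 + "CCG"
print(find_ssrs(toy))
```

Tallying the hits by motif, length and genome region would give counts of the kind plotted in panels (C) to (F).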
Figure 6. Phylogenetic relationships of the eleven Pulsatilla species inferred by maximum likelihood (ML) from different data sets: the whole chloroplast genome, rbcL + matK + trnH-psbA, the LSC region, the coding region, the SSC region, the IR region, and the concatenation of the eight highly variable regions listed in Table 4. Numbers above nodes are support values, with ML bootstrap values on the left and MP bootstrap values on the right.