Ning Zhang, David L Erickson, Padmini Ramachandran, Andrea R Ottesen, Ruth E Timme, Vicki A Funk, Yan Luo, Sara M Handy.
Abstract
Echinacea is a common botanical used in dietary supplements, primarily to treat upper respiratory tract infections and to support immune function. The genus Echinacea is currently thought to comprise nine species. Because molecular divergence among sister species is very low, traditional DNA barcoding has not succeeded in differentiating Echinacea species. Here, we present the use of full chloroplast genomes to distinguish between all nine reported species. Total DNA was extracted from specimens stored at the National Museum of Natural History, Smithsonian Institution, which had been collected from the wild with species identification documented by experts in the field. We used Next Generation Sequencing (NGS) and CLC Genomics Workbench to assemble complete chloroplast genomes for all nine species. Full chloroplast genomes unambiguously differentiated all nine species, whereas core DNA barcoding markers offered very few single nucleotide polymorphisms (SNPs). The number of SNPs between any two Echinacea chloroplast genomes ranged from 181 to 910, providing robust data for unambiguous species delimitation. Implications for DNA-based species identification assays derived from chloroplast genome sequences are discussed in light of product safety, adulteration and quality issues.
Year: 2017 PMID: 28303008 PMCID: PMC5428300 DOI: 10.1038/s41598-017-00321-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
The nine species sampled in this study and information on the chloroplast genome assembly.
| Species | Raw data size (MB) | Number of reads | Size of reads (bp) | Coverage of chloroplast genome | Size of chloroplast genome (bp) | Accession number |
|---|---|---|---|---|---|---|
| | 2,531 | 10,394,828 | 2 × 300 | 40 | 151,913 | KX548224 |
| | 2,437 | 10,966,208 | 2 × 250 | 51 | 151,926 | KX548225 |
| | 434 | 1,814,356 | 2 × 250 | 20 | 151,877 | KX548223 |
| | 832 | 4,078,614 | 2 × 250 | 33 | 151,883 | KX548218 |
| | 1,692 | 6,202,480 | 2 × 300 | 51 | 151,837 | KX548217 |
| | 472 | 1,923,846 | 2 × 250 | 31 | 151,912 | KX548220 |
| | 545 | 2,198,622 | 2 × 250 | 28 | 151,886 | KX548219 |
| | 878 | 3,338,742 | 2 × 300 | 65 | 151,935 | KX548221 |
| | 483 | 1,941,430 | 2 × 250 | 22 | 151,860 | KX548222 |
Figure 1. Gene map of the Echinacea purpurea chloroplast genome. Genes shown outside the circle are transcribed clockwise and those inside are transcribed counterclockwise. Genes belonging to different functional groups are color-coded as indicated by the icons in the lower left corner. The dashed area in the inner circle indicates the GC content of the chloroplast genome. LSC, SSC and IR denote large single copy, small single copy and inverted repeat, respectively.
Number (below the diagonal) and percentage (above the diagonal) of differences among nine Echinacea chloroplast genomes.
| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| | | 0.12% | 0.23% | 0.18% | 0.44% | 0.52% | 0.51% | 0.50% | 0.56% |
| | 181 | | 0.20% | 0.18% | 0.48% | 0.55% | 0.55% | 0.55% | 0.60% |
| | 345 | 308 | | 0.16% | 0.45% | 0.54% | 0.53% | 0.54% | 0.60% |
| | 273 | 276 | 247 | | 0.41% | 0.50% | 0.50% | 0.50% | 0.55% |
| | 672 | 727 | 685 | 629 | | 0.47% | 0.45% | 0.45% | 0.53% |
| | 787 | 837 | 827 | 765 | 711 | | 0.29% | 0.20% | 0.31% |
| | 772 | 835 | 813 | 764 | 677 | 445 | | 0.24% | 0.31% |
| | 768 | 830 | 827 | 767 | 689 | 309 | 365 | | 0.23% |
| | 849 | 910 | 908 | 842 | 811 | 469 | 478 | 350 | |
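The relationship between the difference counts and the percentages in the table above can be illustrated with a simple pairwise comparison of aligned sequences. The following is a minimal sketch, not the authors' pipeline: it assumes the genomes are already aligned to equal length, it skips columns where either sequence has a gap (an assumption, since the record does not state how gaps were treated), and the function names, sample names and toy sequences are hypothetical.

```python
from itertools import combinations


def count_differences(seq_a: str, seq_b: str) -> int:
    """Count aligned columns where two sequences carry different bases.

    Assumption: columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a != "-" and b != "-"
    )


def pairwise_table(alignment: dict[str, str]) -> dict[tuple[str, str], tuple[int, float]]:
    """Return {(name_a, name_b): (difference count, percent of alignment length)}."""
    table = {}
    for (name_a, seq_a), (name_b, seq_b) in combinations(alignment.items(), 2):
        n = count_differences(seq_a, seq_b)
        table[(name_a, name_b)] = (n, 100.0 * n / len(seq_a))
    return table


# Toy alignment with hypothetical sample names (not real Echinacea data).
toy = {
    "sample_1": "ACGTACGTAC",
    "sample_2": "ACGTACGAAC",
    "sample_3": "ACG-ACGATC",
}
result = pairwise_table(toy)
for pair, (n, pct) in result.items():
    print(pair, n, f"{pct:.1f}%")
```

On the scale of the table, 181 differences over a genome of roughly 151,900 bp correspond to about 0.12%, which matches the smallest percentage reported.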
The 10 most-divergent coding regions among nine Echinacea species.
| Genes | Length (bp) | Variable sites | Indels | Percentage of identical sites (%) | Timme |
|---|---|---|---|---|---|
| | 5,049 | 31 | 4 | 99.0 | √ |
| | 405 | 3 | 0 | 99.3 | |
| | 1,009 | 4 | 1 | 99.3 | |
| | 3,198 | 7 | 1 | 99.3 | |
| | 483 | 3 | 0 | 99.4 | |
| | 1,282 | 6 | 0 | 99.4 | √ |
| | 1,458 | 7 | 0 | 99.5 | |
| | 2,232 | 11 | 0 | 99.5 | √ |
| | 501 | 3 | 0 | 99.6 | |
| | 252 | 1 | 0 | 99.6 | |
The 25 most-divergent non-coding regions among nine Echinacea species.
| Genes | Length (bp) | Variable sites | Indels | Percentage of identical sites (%) | Timme | Shaw |
|---|---|---|---|---|---|---|
| | 138 | 2 | 3 | 81.9 | | |
| | 144 | 4 | 5 | 86.8 | √ | |
| | 312 | 0 | 2 | 86.9 | | |
| | 72 | 0 | 2 | 88.9 | | |
| | 904 | 4 | 7 | 89.9 | √ | √ |
| | 603 | 5 | 8 | 90.9 | √ | |
| | 539 | 3 | 4 | 90.9 | √ | |
| | 392 | 3 | 3 | 91.6 | | |
| | 205 | 3 | 3 | 91.7 | | |
| | 388 | 3 | 1 | 92.5 | √ | |
| | 1,270 | 11 | 8 | 92.9 | √ | |
| | 234 | 2 | 4 | 93.2 | √ | |
| | 385 | 8 | 4 | 93.2 | √ | |
| | 304 | 1 | 3 | 93.4 | √ | |
| | 246 | 1 | 3 | 93.6 | | |
| | 998 | 9 | 7 | 93.9 | √ | √ |
| | 910 | 8 | 4 | 94.0 | √ | |
| | 783 | 2 | 5 | 94.1 | | |
| | 221 | 5 | 2 | 94.6 | √ | |
| | 203 | 1 | 3 | 94.6 | | |
| | 747 | 6 | 5 | 94.9 | | |
| | 396 | 0 | 2 | 94.9 | | |
| | 259 | 0 | 2 | 95.0 | | |
| | 580 | 3 | 2 | 95.0 | | |
| | 233 | 1 | 1 | 95.3 | | |
Figure 2. The maximum likelihood (ML) tree of Echinacea reconstructed using chloroplast genomes. Numbers on branch nodes are bootstrap values. The branch connecting the outgroup Parthenium argentatum and the nine Echinacea species was collapsed.
Figure 3. ML trees reconstructed using matK + rbcL (left) and using chloroplast genomes (right). Numbers are bootstrap values; branches with bootstrap values <50% are collapsed. These two phylogenies show the power of chloroplast genomes for delimitation of Echinacea species when compared with core DNA barcodes.
Figure 4. ML trees reconstructed using ITS (a) and ITS + trnH-psbA (b). Numbers are bootstrap values; branches with bootstrap values <50% are collapsed. Both phylogenies show the lack of resolution among Echinacea species using either combination of markers.
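The captions for Figures 3 and 4 state that branches with bootstrap values below 50% are collapsed. The following is a minimal sketch of what that operation does to a tree, assuming a simple in-memory tree structure; the `Node` class, the function names and the toy taxa A to D are hypothetical and do not represent the software or data used in the study.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str = ""
    support: Optional[float] = None  # bootstrap value for internal nodes
    children: list["Node"] = field(default_factory=list)


def collapse_low_support(node: Node, threshold: float = 50.0) -> Node:
    """Return a copy of the tree with weakly supported internal nodes removed.

    A collapsed node's children are attached directly to its parent,
    producing a polytomy. The root itself is never collapsed.
    """
    new_children = []
    for child in node.children:
        collapsed = collapse_low_support(child, threshold)
        is_internal = bool(collapsed.children)
        if is_internal and collapsed.support is not None and collapsed.support < threshold:
            new_children.extend(collapsed.children)  # splice grandchildren in
        else:
            new_children.append(collapsed)
    return Node(node.name, node.support, new_children)


def to_newick(node: Node) -> str:
    """Serialize the tree as a Newick-style string without the trailing ';'."""
    if not node.children:
        return node.name
    inner = ",".join(to_newick(c) for c in node.children)
    label = "" if node.support is None else f"{node.support:g}"
    return f"({inner}){label}"


# Toy tree: clade (A,B) has support 35, clade (C,D) has support 98.
tree = Node(children=[
    Node(support=35, children=[Node("A"), Node("B")]),
    Node(support=98, children=[Node("C"), Node("D")]),
])
collapsed = collapse_low_support(tree)
print(to_newick(tree))       # ((A,B)35,(C,D)98)
print(to_newick(collapsed))  # (A,B,(C,D)98)
```

In the toy tree the weakly supported (A,B) grouping disappears and A and B attach directly to the root, which is how an unresolved relationship appears in a collapsed tree.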
Sampling in this Echinacea study.
| Species | Voucher | Year collected |
|---|---|---|
| | US 2349097 | 1958 |
| | US 1468035 | 1930 |
| | US 980416 | 1916 |
| | US 2233063 | 1948 |
| | US 1653013 | 1935 |
| | US 2235164 | 1955 |
| | US 3360860 | 1998 |
| | US 2802433 | 1974 |
| | US 2349080 | 1960 |