Fazeeda N Hosein, Nigel Austin, Shobha Maharaj, Winston Johnson, Luke Rostant, Amanda C Ramdass, Sephra N Rampersad.
Abstract
The islands of the Caribbean are considered to be a "biodiversity hotspot." Collectively, a high level of endemism for several plant groups has been reported for this region. Biodiversity conservation should, in part, be informed by taxonomy, population status, and distribution of flora. One taxonomic impediment to species inventory and management is correct identification, as conventional morphology-based assessment is subject to several caveats. DNA barcoding can be a useful tool to quickly and accurately identify species and has the potential to prompt the discovery of new species. In this study, the ability of DNA barcoding to confirm the identities of 14 endangered endemic vascular plant species in Trinidad was assessed using three DNA barcodes (matK, rbcL, and rpoC1). Herbarium identifications were previously made for all species under study. The matK, rbcL, and rpoC1 markers successfully amplified target regions for seven of the 14 species. rpoC1 sequences required extensive editing and were unusable. rbcL primers gave the cleanest reads; however, matK appeared to be superior to rbcL on several of the parameters assessed, including level of DNA polymorphism in the sequences, genetic distance, reference library coverage based on BLASTN statistics, direct sequence comparisons within "best match" and "best close match" criteria, and degree of clustering with moderate to strong bootstrap support (>60%) in neighbor-joining tree-based comparisons. The performance of both markers seemed to be species-specific based on the parameters examined. Overall, the Trinidad sequences were accurately identified to the genus level for all endemic plant species successfully amplified and sequenced using both matK and rbcL markers. DNA barcoding can contribute to taxonomic and biodiversity research and will complement efforts to select taxa for various molecular ecology and population genetics studies.
Keywords: DNA barcoding; biodiversity conservation; endemics; molecular genetics
Year: 2017 PMID: 28944019 PMCID: PMC5606854 DOI: 10.1002/ece3.3220
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Plant species collection data
| Family | Species | IUCN Status |
|---|---|---|
| Araceae | Philodendron simmondsii | Endangered |
| Aristolochiaceae | Aristolochia boosi | Endangered |
| Begoniaceae | | Critically endangered |
| Caesalpiniaceae | | Near endangered |
| Celastraceae | Maytenus monticola | Near threatened |
| Clusiaceae | | Least concern |
| Clusiaceae | | Deficient data |
| Clusiaceae | | Endangered |
| Cyperaceae | | Vulnerable |
| Euphorbiaceae | | Vulnerable |
| Xyridaceae | | Critically endangered |
| Asclepiadaceae | Metastelma freemani | Endangered |
| Aquifoliaceae | Ilex arimensis | Least concern |
| Myrtaceae | | Near endangered |
IUCN Status—International Union for Conservation of Nature (IUCN) Red List categories.
Figure 1. Location map of endemic vascular plant species sampled in this study. The white shaded areas indicate elevation; the majority of endemic species included in this study were located in mountainous regions in North Trinidad.
Primer data
| Marker/Barcode | Primer (F/R) | Primer sequence (5′‐3′) | Average amplicon size, bp (range) |
|---|---|---|---|
| rpoC1 | 2F | GGCAAAGAGGGAAGATTTCG | 494 (462–556) |
| | 4R | CCATAAGCATATCTTGAGTTGG | |
| matK | 3F | CGTACAGTACTTTTGTGTTTACGAG | 794 (656–861) |
| | 1R | ACCCAGTCCATCTGGAAATCTTGGTTC | |
| rbcL | rbcLa_F | ATGTCACCACAAACAGAGACTAAAGC | 704 (702–883) |
| | rbcLa_R | GTAAAATCAAGTCCACCRCG | |
PCR amplification and sequencing success
| Species | Primer success | PCR amplification after optimization | Best sequence reads | Worst sequence reads |
|---|---|---|---|---|
| | | 100% | | |
| | | 100% | | |
| | None | N/A | N/A | N/A |
| | | 100% | | |
| | None | N/A | N/A | N/A |
| | | 100% | | |
| | None | N/A | N/A | N/A |
| | | 100% | | |
| | | 100% | | |
| | None | N/A | N/A | N/A |
| | | 100% | | |
| | None | N/A | N/A | N/A |
| | None | N/A | N/A | N/A |
| | None | N/A | N/A | N/A |
Best sequence reads—clear reads without incorporation of ambiguous bases.
Worst sequence reads—sequence reads with numerous ambiguous bases, base deletion or addition, premature termination of sequence.
DNA polymorphism data for the matK barcode
| Marker | DNA polymorphism parameter | | | | | |
|---|---|---|---|---|---|---|
| matK | # sequences | 81 | 89 | 65 | 57 | 101 |
| | Aligned sequence length (nt) | 815 | 698 | 628 | 799 | 715 |
| | # monomorphic sites | 582 | 657 | 640 | 581 | 615 |
| | # polymorphic sites | 214 | 41 | 128 | 188 | 82 |
| | # singleton sites | 58 | 21 | 69 | 75 | 31 |
| | # parsimony informative sites | 150 | 20 | 59 | 113 | 50 |
| | # indel sites | 24 | 0 | 0 | 30 | 54 |
| | # mutations (Eta) | 55 | 41 | 140 | 223 | 88 |
| | # nucleotide differences (k) | 29.206 | 3.061 | 11.641 | 30.031 | 7.754 |
| | Nucleotide diversity (π) | 0.039 | 0.004 | 0.015 | 0.038 | 0.011 |
| | Conservation threshold (CT) | 0.83 | 1 | 0.93 | 0.83 | 0.98 |
| | Sequence conservation (C) | 0.734 | 0.941 | 0.833 | 0.734 | 0.884 |
| | Conservation | NCRF | Region 1 = 0.022 (nt370–429); Region 2 = 0.004 (nt435–518) | Region = 0.003 (nt655–745) | Region = 0.011 (nt1–83) | NCRF |
NCRF, No conserved region found.
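The nucleotide diversity (π) values in the table are a per-site version of the average number of pairwise nucleotide differences. The following is a minimal Python sketch of that relationship, assuming columns containing gaps or ambiguous bases are dropped before counting. The function name `nucleotide_diversity` and the toy sequences are illustrative assumptions, not taken from the study, and dedicated population-genetics software may count sites differently.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Return (k, pi) for a list of aligned DNA sequences.

    k  = average number of nucleotide differences over all sequence pairs
    pi = k divided by the number of alignment columns that were compared
    Columns with gaps or ambiguous bases are removed (an assumption here).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    seqs = [s.upper() for s in seqs]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # Keep only columns where every sequence has an unambiguous base.
    columns = [col for col in zip(*seqs) if all(b in "ACGT" for b in col)]
    if not columns:
        raise ValueError("no comparable sites")

    pairs = list(combinations(range(len(seqs)), 2))
    total_diffs = sum(
        sum(1 for col in columns if col[i] != col[j]) for i, j in pairs
    )
    k = total_diffs / len(pairs)
    pi = k / len(columns)
    return k, pi


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT"]  # hypothetical data
    k, pi = nucleotide_diversity(toy_alignment)
    print(f"k = {k:.3f}, pi = {pi:.3f}")
```

Because π is normalized by the number of sites compared, it allows markers with different alignment lengths to be compared more directly than raw difference counts.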
DNA polymorphism data for the rbcL barcode
| Marker | DNA polymorphism parameter | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| rbcL | # sequences | 52 | 98 | 35 | 101 | 97 | 30 | 71 |
| | Aligned sequence length (nt) | 527 | 516 | 503 | 527 | 521 | 517 | 526 |
| | # monomorphic sites | 463 | 492 | 448 | 512 | 460 | 481 | 436 |
| | # polymorphic sites | 43 | 24 | 55 | 14 | 61 | 36 | 82 |
| | # singleton sites | 17 | 10 | 28 | 9 | 18 | 18 | 14 |
| | # parsimony informative sites | 26 | 14 | 27 | 5 | 43 | 18 | 68 |
| | # indel sites | 21 | 0 | 0 | 0 | 0 | 0 | 8 |
| | # mutations (Eta) | 49 | 26 | 59 | 14 | 74 | 36 | 96 |
| | # nucleotide differences (k) | 6.675 | 3.146 | 4.747 | 1.402 | 10.189 | 5.524 | 8 |
| | Nucleotide diversity (π) | 0.013 | 0.006 | 0.009 | 0.003 | 0.019 | 0.011 | 0.043 |
| | Conservation threshold (CT) | 1 | 1 | 0.99 | 1 | 0.98 | 1 | 0.94 |
| | Sequence conservation (C) | 0.915 | 0.953 | 0.891 | 0.973 | 0.893 | 0.93 | 0.844 |
| | Conservation | Region 1 = 0.029 (nt32–69); Region 2 = 0.048 (nt72–104); Region 3 = 0.011 (nt151–198); Region 4 = 0.039 (nt274–308); Region 5 = 0.039 (nt403–437) | Region 1 = 0.006 (nt28–125); Region 2 = 0.041 (nt372–434) | NCRF | Region = 0.016 (nt53–186) | NCRF | Region 1 = 0.039 (nt30–72); Region 2 = 0.036 (nt74–117); Region 3 = 0.028 (nt239–285); Region 4 = 0.028 (nt455–501) | Region = 0.002 (nt13–92) |
NCRF, No conserved region found.
Figure 2. (a–e) Neighbor‐joining tree for five species based on matK sequences. Clustering of all query sequences of species under study was inferred using the neighbor‐joining method in MEGA6. The condensed tree (50% bootstrap consensus tree) showing only clustering topology is presented, and the percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 replicates) is indicated next to the branches. The genetic distances were computed using the Kimura 2‐parameter method and are in units of the number of base substitutions per site. All positions containing gaps and missing data were eliminated. a, Aristolochia boosi; b, Ilex arimensis; c, Maytenus monticola; d, Metastelma freemani; e, Philodendron simmondsii.
Figure 3. (a–f) Neighbor‐joining tree for six species based on rbcL sequences. Clustering of all query sequences of species under study was inferred using the neighbor‐joining method in MEGA6. The condensed tree (50% bootstrap consensus tree) showing only clustering topology is presented, and the percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 replicates) is indicated next to the branches. The genetic distances were computed using the Kimura 2‐parameter method and are in units of the number of base substitutions per site. All positions containing gaps and missing data were eliminated. a, Aristolochia boosi; b, Clusia aripoensis; c, Ilex arimensis; d, Maytenus monticola; e, Metastelma freemani; f, Philodendron simmondsii.
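The figure captions state that genetic distances were computed with the Kimura 2‐parameter (K‐2‐P) method. As a rough illustration of what that distance measures, the sketch below applies the standard K‐2‐P formula, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions between two aligned sequences. This is not the MEGA6 implementation used in the study: the function name `k2p_distance` is an assumption, and gapped or ambiguous sites are skipped pair by pair here, whereas the captions describe removing such positions from the whole alignment.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Returns substitutions per site. Sites where either sequence has a gap
    or an ambiguous base are ignored (pairwise deletion, an assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Hypothetical 10-nt fragments differing by one transition (A -> G).
    d = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
    print(f"K2P distance: {d:.4f} substitutions per site ({100 * d:.2f}%)")
```

Multiplying the result by 100 gives a percentage, the unit in which K‐2‐P distances and thresholds are reported in the tables that follow.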
Kimura 2‐parameter threshold data and sequence matches in the reference library
| Marker | Species | K‐2‐P pairwise distance and threshold (%) | Sequences with at least one matching sequence in the dataset | Sequences with at least one matching conspecific sequence in the dataset | Sequences with a closest match at 0% |
|---|---|---|---|---|---|
| matK | Aristolochia boosi | 4.96 | 80 | 48 | 29 |
| | Maytenus monticola | 2.92 | 63 | 18 | 21 |
| | Metastelma freemani | 0.25 | 56 | 8 | 11 |
| | Philodendron simmondsii | 0.53 | 100 | 76 | 76 |
| | Ilex arimensis | 0.72 | 84 | 63 | 68 |
| rbcL | Aristolochia boosi | 1.72 | 52 | 30 | 32 |
| | Maytenus monticola | 0.79 | 101 | 43 | 65 |
| | Metastelma freemani | 0.19 | 34 | 16 | 28 |
| | Philodendron simmondsii | 0.57 | 97 | 32 | 55 |
| | Ilex arimensis | 1.16 | 83 | 31 | 59 |
| | | 1.75 | 30 | 20 | 19 |
| | Clusia aripoensis | 0.19 | 71 | 37 | 58 |
Species identification success based on best match and best close match analyses
| Marker | Species | Best match: correct identification, n (%) | Best match: ambiguous identification, n (%) | Best match: incorrect identification, n (%) | Best close match: correct identification, n (%) | Best close match: ambiguous identification, n (%) | Best close match: incorrect identification, n (%) | Trinidad species ID classification |
|---|---|---|---|---|---|---|---|---|
| matK | Aristolochia boosi | 34 (42.5) | 13 (16.25) | 33 (41.25) | 34 (42.5) | 13 (16.25) | 33 (41.25) | Ambiguous |
| | Maytenus monticola | 12 (18.75) | 24 (37.5) | 28 (43.75) | 12 (18.75) | 24 (37.5) | 28 (43.75) | Ambiguous |
| | Metastelma freemani | 8 (14.28) | 13 (23.21) | 35 (62.5) | 8 (14.28) | 13 (23.21) | 35 (62.5) | Ambiguous |
| | Philodendron simmondsii | 17 (17.0) | 66 (66.0) | 17 (17.0) | 17 (17.0) | 66 (66.0) | 13 (13.0) | Ambiguous |
| | Ilex arimensis | 16 (19.04) | 58 (69.04) | 10 (11.90) | 16 (19.04) | 58 (69.04) | 10 (11.90) | Ambiguous |
| rbcL | Aristolochia boosi | 11 (21.15) | 28 (53.84) | 13 (25.0) | 11 (21.15) | 28 (53.84) | 13 (25.0) | Ambiguous |
| | Maytenus monticola | 13 (12.87) | 60 (59.4) | 28 (27.72) | 13 (12.87) | 59 (58.41) | 27 (26.73) | Ambiguous |
| | Metastelma freemani | 0 (0) | 29 (85.29) | 5 (14.7) | 0 (0) | 24 (70.58) | 4 (11.76) | Ambiguous |
| | Philodendron simmondsii | 6 (6.18) | 56 (57.73) | 35 (36.08) | 6 (6.18) | 50 (51.54) | 33 (34.02) | Ambiguous |
| | Ilex arimensis | 3 (3.61) | 68 (81.92) | 12 (14.45) | 3 (3.61) | 68 (81.92) | 12 (14.45) | Ambiguous |
| | | 4 (13.33) | 19 (63.33) | 7 (23.33) | 4 (13.33) | 19 (63.33) | 7 (23.33) | Ambiguous |
| | Clusia aripoensis | 24 (33.8) | 36 (50.7) | 11 (15.49) | 24 (33.8) | 28 (39.43) | 6 (8.45) | No match |
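The categories in the table above (correct, ambiguous, incorrect, no match) can be illustrated with a small classifier. The sketch below follows a simplified reading of the best match and best close match criteria as they are commonly described in the barcoding literature: a query takes the name of its nearest reference sequence; equally near references from different species make the call ambiguous; and under best close match a query whose nearest reference lies beyond the distance threshold is left unidentified. The function name `classify_query`, the input format, and the tie handling are assumptions, since this section does not describe the software or exact rules the authors used.

```python
def classify_query(true_species, hits, threshold=None):
    """Classify one query under best match or best close match.

    true_species: the name the query is expected to carry.
    hits: list of (reference_species, distance) pairs for the query.
    threshold: None for best match; a maximum distance (same units as
               the hit distances) for best close match.
    Returns "correct", "ambiguous", "incorrect", or "no match".
    """
    if not hits:
        return "no match"

    best_distance = min(distance for _, distance in hits)
    if threshold is not None and best_distance > threshold:
        return "no match"  # nearest reference is too distant

    # All species that share the smallest distance (exact ties only).
    nearest_species = {sp for sp, distance in hits if distance == best_distance}
    if len(nearest_species) > 1:
        return "ambiguous"
    return "correct" if nearest_species == {true_species} else "incorrect"


if __name__ == "__main__":
    # Hypothetical hits; distances are in percent.
    hits = [("Species A", 0.40), ("Species B", 0.90)]
    print(classify_query("Species A", hits))                  # best match
    print(classify_query("Species A", hits, threshold=0.25))  # best close match
```

Under this reading, applying a threshold can only move sequences out of the three identification categories, never between them, which is one way counts under best close match can be lower than under best match.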
Clustering analysis of matK sequences based on K‐2‐P genetic distances and the neighbor‐joining algorithm
| Marker | Species | Reference dataset sequence coverage | K‐2‐P pairwise distance and threshold (%) | Query coverage %; Similarity % | Placement of Trinidad sequence | Bootstrap score (bs) for Trinidad placement | Overall cluster support | Polytomies present (Yes/No) |
|---|---|---|---|---|---|---|---|---|
| matK | Aristolochia boosi | Representative (100% belonged to Aristolochia genus) | 4.96 | 99%–100%; 96%–98% | Clustered with A. maxima and A. ovalifolia | 94% | Majority >90% | No |
| | Ilex arimensis | Representative (100% belonged to Ilex genus) | 0.72 | 94%–99%; 99% | | <50% | Majority <50%; but 12 clusters with bs >60% | Yes |
| | Maytenus monticola | 50% of sequences belonged to Maytenus or to Euonymus genus, which are synonyms | 2.92 | 98%–100%; 99% | Clustered with only Maytenus sequences | 96% | Maytenus cluster 96%; Euonymus cluster 88%; all other sequences clustered according to genus with high bs support (>75%) | No |
| | Metastelma freemani | Not representative (14% of sequences belonged to either Metastelma or Ditassa genus, which are synonyms) | 0.25 | 99%; 96%–99% | Clustered with all other Metastelma and Ditassa sequences | 86% | All other sequences clustered according to genus with high bs support (>75%) | No |
| | Philodendron simmondsii | Representative (90% of sequences belonged to either Philodendron or Homalomena genus, which are synonyms) | 0.53 | 100%; 99%–100% | Clustered with P. radiatum sequences | 62% | Majority of clusters had low bs support (bs <50%) but five clusters had bs >60% | Yes |
Clustering analysis of rbcL sequences based on K‐2‐P genetic distances and the neighbor‐joining algorithm
| Marker | Species | Reference dataset sequence coverage | K‐2‐P pairwise distance and threshold (%) | Query coverage %; Similarity % | Placement of Trinidad sequence | Bootstrap score | Overall cluster support | Polytomies present (Yes/No) |
|---|---|---|---|---|---|---|---|---|
| rbcL | Aristolochia boosi | Representative (100% of sequences belonged to Aristolochia genus) | 1.72 | 97%–99%; 97%–99% | Clustered with A. maxima and A. tonduzii with low bs support | <50% | Most other clusters were moderately supported (bs >60%) | Yes |
| | Ilex arimensis | Representative (100% of sequences belonged to Ilex genus) | 1.16 | 99%; 99%–100% | Clustered with low bs support | <50% | Majority of clusters had low bs support (<50%); only four clusters had bs >60% | Yes |
| | Maytenus monticola | Not representative (14% of sequences identified as belonging to Maytenus genus) | 0.79 | 98%–100%; 99% | Clustered with only Maytenus sequences but with low bs support | <50% | Majority of clusters had low bs support (<50%) | Yes |
| | Metastelma freemani | Not representative (29% of sequences identified as belonging to either Metastelma or Cynanchum genus) | 0.19 | 98%–100%; 99% | Trinidad sequence positioned in a separate cluster from all other sequences | <50% | Majority of clusters had low bs support (<50%) | Yes |
| | Philodendron simmondsii | Not representative (13% of sequences belonged to either Philodendron or Homalomena genus) | 0.57 | 99%; 98%–100% | Clustered with other Philodendron and Homalomena sequences with high bs support | 73% | Majority of main clusters had moderate bs support (>60%) | Yes |
| | | Representative (100% of sequences belonged to Acalypha genus) | 1.75 | 97%–100%; 99%–100% | Clustered with one of two main clusters; the first consisted of the majority of sequences but with no calculated bs score; the second of the two clusters was highly supported (99%) | Not available | Main clusters had moderate bs support (>60%); majority of clusters, however, had low bs support (>50%) | Yes |
| | Clusia aripoensis | Not representative (8.5% of sequences belonged to Clusia genus) | 0.19 | 100%; 94% | Did not cluster with any other species, including the 6 belonging to Clusia genus | Not available | Main clusters had moderate to high bs support (>60%–90%); clusters were mostly genus‐specific | Yes |