Jie Liu, Jim Provan, Lian-Ming Gao, De-Zhu Li.
Abstract
Although DNA barcoding has become a useful tool for species identification and biodiversity surveys in plant sciences, there remains little consensus concerning appropriate sampling strategies and the treatment of indels. To address these two issues, we sampled 39 populations of nine Taxus species across their entire ranges, with two to three individuals randomly sampled per population. We sequenced one core DNA barcode (matK) and three supplementary regions (trnH-psbA, trnL-trnF and ITS) for all samples to test the effects of sampling design and the utility of indels. Our results suggested that increasing within-population sampling did not change the clustering of individuals, and within-population P-distances were zero for most populations in all regions. Based on the markers tested here, comparison of methods either including or excluding indels indicated that discrimination and nodal support of monophyletic groups were significantly increased when indels were included. Thus we concluded that one individual per population was adequate to represent the within-population variation in these species for DNA barcoding, and that intraspecific sampling was best focused on representing the entire ranges of certain taxa. We also found that indels occurring in the chloroplast trnL-trnF and trnH-psbA regions were informative for differentiating among closely related taxa, and we propose that indel-coding methods be considered in future DNA barcoding projects for closely related plant species at or below the generic level.
Keywords: DNA barcoding; Taxus; indel (gap) coding; noncoding chloroplast regions; sampling strategy
Year: 2012 PMID: 22942731 PMCID: PMC3430262 DOI: 10.3390/ijms13078740
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Figure 1. Unrooted neighbour-joining (NJ) tree based on the P-distance of the five DNA barcoding loci used. Bootstrap values are shown along the branch for each clade. Scale bar represents base substitutions per site.
Summary of data sets by indel coding scheme for alignment and analyses.
| Data set | Indel treatment | No. of indels | Aligned length | No. VC (%) | No. PIC (%) | Mean interspecific distance (range) | Mean intraspecific distance (range) |
|---|---|---|---|---|---|---|---|
| | - | 0 | 1533 | 15 (0.98) | 15 (0.98) | 0.0027 (0.00065–0.0046) | 0.00006 (0–0.00024) |
| | - | 11 | 869 | 25 (2.88) | 22 (2.53) | 0.0063 (0.00083–0.0063) | 0.00034 (0–0.0011) |
| | SIC | | 880 | 36 (4.09) | 29 (3.30) | - | |
| | MCIC | | 874 | 30 (3.43) | 26 (2.98) | - | |
| | - | 13 | 1321 | 17 (1.29) | 13 (0.98) | 0.0075 (0–0.013) | 0.00046 (0–0.0014) |
| | SIC | | 1334 | 30 (2.25) | 22 (1.65) | - | |
| | MCIC | | 1330 | 26 (1.96) | 21 (1.57) | - | |
| ITS | - | 5 | 1143 | 51 (4.46) | 44 (3.85) | 0.010 (0.0046–0.015) | 0.00042 (0–0.00096) |
Notes: Char, character; PIC, parsimony-informative character; VC, variable character; SIC, simple indel coding; MCIC, modified complex indel coding. All data sets have 103 taxa.
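The aligned lengths in the table are consistent with simple indel coding appending one binary character per indel (869 + 11 = 880 and 1321 + 13 = 1334). The following is a minimal Python sketch of that coding step, not the software used in the study. It assumes the usual SIC convention that each gap with a distinct start and end becomes one presence/absence character, and that a sequence with a longer gap containing the coded one is scored as inapplicable (`?`). The function names are hypothetical, and terminal gaps are treated like internal gaps for simplicity.

```python
def gap_intervals(seq):
    """Return (start, end) pairs, end exclusive, for each run of '-' in seq."""
    intervals, start = [], None
    for i, ch in enumerate(seq):
        if ch == "-" and start is None:
            start = i
        elif ch != "-" and start is not None:
            intervals.append((start, i))
            start = None
    if start is not None:
        intervals.append((start, len(seq)))
    return intervals


def simple_indel_coding(alignment):
    """Code every distinct gap in the alignment as one character per sequence.

    '1' = the sequence has exactly this gap
    '0' = the sequence does not have this gap
    '?' = the sequence has a longer gap containing this one (inapplicable)
    """
    gaps_per_seq = [gap_intervals(s) for s in alignment]
    indels = sorted({g for gaps in gaps_per_seq for g in gaps})
    coded = []
    for gaps in gaps_per_seq:
        row = ""
        for start, end in indels:
            if (start, end) in gaps:
                row += "1"
            elif any(gs <= start and end <= ge for gs, ge in gaps):
                row += "?"
            else:
                row += "0"
        coded.append(row)
    return indels, coded


# Toy alignment (made-up sequences, not data from the study).
alignment = ["ACGTACGT", "AC--ACGT", "A----CGT"]
indels, coded = simple_indel_coding(alignment)
```

Appending the coded characters to each aligned sequence lengthens the matrix by the number of distinct indels, which is the pattern seen in the SIC rows.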
Estimates of average evolutionary divergence over sequence pairs within population and between population levels, and mean intraspecific distance.
| Lineage | N | NH | Within-population distance | Between-population distance | Intraspecific distance | NH | Within-population distance | Between-population distance | Intraspecific distance | NH | Within-population distance | Between-population distance | Intraspecific distance | NH (ITS) | Within-population distance (ITS) | Between-population distance (ITS) | Intraspecific distance (ITS) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Hengduan type | 11 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0–0.00088 | 0–0.00088 | 0.00038 |
| Qinling type | 9 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 3 | 0–0.00188 | 0–0.00188 | 0.00094 | 1 | 0 | 0 | 0 |
| | 14 | 3 | 0 | 0–0.00067 | 0.00024 | 4 | 0–0.00124 | 0–0.00369 | 0.00095 | 2 | 0 | 0–0.0387 | 0.0014 | 2 | 0–0.00351 | 0–0.00357 | 0.00050 |
| | 9 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 4 | 0–0.00527 | 0–0.00527 | 0.0019 |
| | 6 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0–0.00088 | 0.00047 |
| | 12 | 3 | 0–0.00065 | 0–0.00130 | 0.00031 | 2 | 0–0.00249 | 0–0.00249 | 0.00041 | 3 | 0–0.00376 | 0–0.00564 | 0.00094 | 1 | 0 | 0 | 0 |
| | 12 | 1 | 0 | 0 | 0 | 3 | 0–0.00125 | 0–0.00125 | 0.00060 | 1 | 0 | 0 | 0 | 6 | 0–0.00264 | 0–0.00264 | 0.00096 |
| | 22 | 1 | 0 | 0 | 0 | 8 | 0–0.00124 | 0–0.00369 | 0.0011 | 10 | 0–0.00362 | 0–0.00519 | 0.00084 | 3 | 0–0.00176 | 0–0.00176 | 0.00026 |
| Tonkin type | 8 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
Notes: N, number of individuals; NH, number of haplotypes.
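To make the within-population and between-population columns concrete, here is a short Python sketch of how pairwise P-distances (proportion of differing sites) could be split by population for one lineage. It illustrates the idea only and is not the authors' pipeline: the function names and the toy sequences are hypothetical, and sites where either sequence has a gap are simply skipped.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, ignoring sites where either has a gap."""
    compared = diffs = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        diffs += x != y
    return diffs / compared if compared else float("nan")


def split_distances(samples):
    """samples: list of (population_label, aligned_sequence) for one lineage.

    Returns two lists of pairwise P-distances: within and between populations.
    """
    within, between = [], []
    for (pop_a, seq_a), (pop_b, seq_b) in combinations(samples, 2):
        d = p_distance(seq_a, seq_b)
        (within if pop_a == pop_b else between).append(d)
    return within, between


# Toy data: two individuals from population P1, one from P2 (made up).
samples = [
    ("P1", "ACGTACGTAC"),
    ("P1", "ACGTACGTAC"),
    ("P2", "ACGTACGTAT"),
]
within, between = split_distances(samples)
intraspecific_mean = sum(within + between) / len(within + between)
```

In this toy case the two P1 individuals are identical, so the within-population distance is zero and all variation lies between populations, which mirrors the pattern reported for most lineages in the table.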
Bootstrap values (%) of different lineages based on different indel coding approaches conducted in MEGA 4.0.
| Region | Indel coding scheme | Hengduan type | Qinling type | | | | | | | Tonkin type | Resolution (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | CD | 77 | n.d. | 97 | 83 | 76 | 76 | n.d. | 87 | 77 | 77.8 (7/9) |
| | PWD | 80 | n.d. | 98 | 84 | 90 | 79 | n.d. | 86 | 73 | 77.8 (7/9) |
| | SIC | 79 | 58 | 98 | 94 | 74 | 84 | n.d. | 91 | 81 | 88.9 (8/9) |
| | MCIC | 72 | 77 | 99 | 97 | 80 | 86 | 40 | 91 | 89 | 100 (9/9) |
| | CD | n.d. | n.d. | n.d. | n.d. | 62 | n.d. | n.d. | n.d. | n.d. | 11.1 (1/9) |
| | PWD | n.d. | n.d. | n.d. | 64 | 95 | n.d. | n.d. | 93 | n.d. | 33.3 (3/9) |
| | SIC | 58 | n.d. | n.d. | 64 | 96 | n.d. | n.d. | 93 | n.d. | 44.4 (4/9) |
| | MCIC | 55 | n.d. | n.d. | 64 | 96 | n.d. | n.d. | 98 | n.d. | 44.4 (4/9) |
Notes: CD, complete deletion; PWD, pairwise deletion; n.d., taxa not distinguished.
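The two gap-excluding treatments differ in which sites they discard: complete deletion drops an alignment column if any sequence has a gap there, while pairwise deletion drops a site only for the pair being compared. The Python sketch below illustrates that difference on a made-up alignment, along with the arithmetic behind the Resolution (%) column (lineages distinguished out of nine). It is an illustration under those assumptions, not a reimplementation of MEGA, and the function names are hypothetical.

```python
def complete_deletion(alignment):
    """Drop every alignment column in which any sequence has a gap."""
    keep = [i for i in range(len(alignment[0]))
            if all(seq[i] != "-" for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def p_distance_pairwise(a, b):
    """P-distance over sites where neither of the two sequences has a gap."""
    sites = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not sites:
        return float("nan")
    return sum(x != y for x, y in sites) / len(sites)


def resolution_percent(bootstrap_row):
    """Percentage of lineages distinguished, i.e. entries that are not 'n.d.'."""
    resolved = sum(1 for value in bootstrap_row if value != "n.d.")
    return round(100 * resolved / len(bootstrap_row), 1)


# Toy alignment (made up): the third sequence carries a two-base gap.
alignment = ["ACGTAC", "ACGTAT", "AC--AT"]

# Complete deletion: the gapped columns are removed for every sequence first.
cd = complete_deletion(alignment)
d_cd = p_distance_pairwise(cd[0], cd[1])

# Pairwise deletion: sequences 1 and 2 have no gaps, so all six sites count.
d_pwd = p_distance_pairwise(alignment[0], alignment[1])
```

For the first CD row of the table, `resolution_percent(["77", "n.d.", "97", "83", "76", "76", "n.d.", "87", "77"])` gives 77.8, matching the 7/9 entry.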