Shu Wu, Jie Xiong, Yuhe Yu.
Abstract
Biodiversity studies are commonly conducted using 18S rRNA genes. In this study, we compared the inter-species divergence of the variable regions (V1-9) within the copepod 18S rRNA gene and tested their resolution at different taxonomic levels. Our results indicate that the 18S rRNA gene is a good molecular marker for the study of copepod biodiversity. Our conclusions are as follows: 1) 18S rRNA genes are highly conserved within species (intra-species similarities are close to 100%) and could aid species-level analyses, but with some limitations; 2) nearly-whole-length sequences and some partial regions (around V2, V4, and V9) of the 18S rRNA gene can be used to discriminate between samples at both the family and order levels (with a success rate of about 80%); 3) compared with other regions, V9 has a higher resolution at the genus level (with an identification success rate of about 80%); and 4) V7 is the most divergent in length and would be a good candidate marker for the phylogenetic study of Acartia species. This study also evaluated the correlation between similarity thresholds and the accuracy of using nuclear 18S rRNA genes to classify organisms in the subclass Copepoda. We suggest that sample identification accuracy should be considered when a molecular sequence divergence threshold is used for taxonomic identification, and that the lowest similarity threshold should be determined based on a pre-designated level of acceptable accuracy.
Year: 2015 PMID: 26107258 PMCID: PMC4479608 DOI: 10.1371/journal.pone.0131498
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Entropy plot calculated based on site variability in an alignment of 184 copepod 18S rRNA genes, and the distribution of variable regions.
A trend line represents the mean variability for successive windows of 20 positions.
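The entropy profile and trend line described in this caption can be approximated with a short Python sketch. It is illustrative only: the use of Shannon entropy with the natural logarithm, the treatment of gap characters as an ordinary state, and the reading of "successive windows" as overlapping sliding windows are all assumptions, and the function names are hypothetical rather than taken from the study.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (natural log) of the characters in one alignment column.

    Assumption: every character, including a gap symbol, counts as a state.
    """
    counts = Counter(column)
    total = sum(counts.values())
    return sum((n / total) * math.log(total / n) for n in counts.values())


def entropy_profile(alignment):
    """Entropy of every column in a list of equal-length aligned sequences."""
    return [column_entropy(column) for column in zip(*alignment)]


def windowed_mean(values, window=20):
    """Mean of each sliding window of `window` consecutive positions."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

Calling `windowed_mean(entropy_profile(alignment))` on a list of aligned sequences yields one trend value per window start; a fully conserved column contributes an entropy of 0.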
Table 1. Sequence characteristics of the 18S rRNA gene from 8 species in the genus Acartia, and 184 other copepod species.
| Molecules | Length of Acartia (bp) | Length (bp) | GC (%) | Nt | Nv | Nc | PI (%PI) |
|---|---|---|---|---|---|---|---|
| 18S | 1,689–1,787 | 1,694–1,742 | 48.5 | 1,878 | 974 | 854 | 784 (41.7) |
| TCs | 849–869 | 859–862 | 49.1 | 864 | 258 | 603 | 178 (20.6) |
| CVs | 829–938 | 834–881 | 47.8 | 1,014 | 716 | 251 | 606 (59.8) |
| V1 | 12–23 | 22–24 | 41.3 | 24 | 21 | 3 | 18 (75) |
| V2 | 184–214 | 189–213 | 45.9 | 245 | 178 | 59 | 149 (60.8) |
| V3 | 91–92 | 90–93 | 47.3 | 96 | 55 | 38 | 45 (46.9) |
| V4 | 213–242 | 219–256 | 45.9 | 286 | 188 | 70 | 165 (57.7) |
| V5 | 71–75 | 70–74 | 53.9 | 75 | 64 | 11 | 52 (69.3) |
| V7 | 86–141 | 74–93 | 48.0 | 100 | 68 | 30 | 57 (57) |
| V8 | 62–67 | 62–68 | 54.4 | 76 | 51 | 21 | 44 (57.9) |
| V9 | 85–97 | 91–98 | 49.4 | 112 | 91 | 19 | 76 (67.9) |
18S, copepod 18S rRNA genes with the two ends trimmed (all the variable regions were intact); TCs, total conserved region sequences in 18S; CVs, combination of all variable regions; V1-9, variable regions; GC, GC content (%); Nt, total number of sites compared; Nv, total number of variable sites; Nc, total number of conserved sites; PI, parsimony-informative sites.
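The site counts in this table can be illustrated with a small Python sketch. It uses the standard definition of a parsimony-informative site (at least two character states, each present in at least two sequences). Ignoring gap characters when counting states is an assumption, and `classify_sites` is a hypothetical helper, not the software used in the study. The %PI values in the table are consistent with PI divided by Nt (for example, 784 / 1,878 ≈ 41.7%).

```python
from collections import Counter


def classify_sites(alignment, gap="-"):
    """Count total, conserved, variable, and parsimony-informative sites.

    Assumption: gap characters are ignored when counting states, so a
    column made up only of gaps adds to Nt but to neither Nc nor Nv.
    """
    n_total = n_conserved = n_variable = n_informative = 0
    for column in zip(*alignment):
        n_total += 1
        counts = Counter(base for base in column if base != gap)
        if len(counts) == 1:
            n_conserved += 1
        elif len(counts) > 1:
            n_variable += 1
            # Informative: at least two states, each seen in >= 2 sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                n_informative += 1
    return {
        "Nt": n_total,
        "Nc": n_conserved,
        "Nv": n_variable,
        "PI": n_informative,
        "%PI": round(100 * n_informative / n_total, 1),
    }
```

For a toy alignment such as `["AAGT", "AAGA", "ACTA", "ACTA"]`, the last column is variable but not parsimony-informative, because one of its two states occurs in a single sequence.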
Fig 2. A comparison of the nucleotide divergences of the copepod 18S rRNA genes of different regions.
A total of 184 18S rRNA genes representing 184 copepod species (excluding species in the genus Acartia) were used to calculate pairwise genetic distances as p-distances. Values are shown as means ± SD. Variable names are consistent with Table 1.
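A p-distance is the proportion of compared sites at which two aligned sequences differ. The Python sketch below shows one way to compute it and to summarise all pairwise values as mean and SD. Skipping sites where either sequence has a gap (pairwise deletion) and using the sample standard deviation are assumptions; the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites with a gap in either sequence are not compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def pairwise_summary(alignment):
    """Mean and sample SD of p-distances over all sequence pairs.

    Needs at least three sequences so that there are two or more distances.
    """
    distances = [p_distance(a, b) for a, b in combinations(alignment, 2)]
    return mean(distances), stdev(distances)
```

Running `pairwise_summary` separately on the columns of each region (V1 to V9, TCs, CVs) would give one mean and SD per region, which is the kind of comparison the figure describes.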
Table 2. Copepod 18S rRNA genes used in the statistical similarity analysis.
| 18S rRNA genes | V regions | Lengths of V regions (bp) | Lengths of sections (bp) | Primer pairs for sequence amplification | Number of sequences | Number of orders | Number of families | Number of genera | Number of species |
|---|---|---|---|---|---|---|---|---|---|
| Nearly-whole-length | V1-9 | 738–785 | 1701–1851 | | 226 | 7 | 70 | 134 | 185 |
| Section 1 | V1-3 | 302–327 | 481–544 | Euk1A [ | 320 | 7 | 76 | 169 | 265 |
| Section 2 | V4-5 | 291–329 | 537–612 | Ami6F1 [ | 410 | 7 | 82 | 179 | 309 |
| Section 3 | V7 | 83–108 | 125–150 | #3 [ | 300 | 7 | 79 | 177 | 284 |
| Section 4 | V8 | 62–68 | 158–165 | F–P [ | 390 | 7 | 82 | 178 | 301 |
| Section 5 | V9 | 91–98 | 107–139 | 1391F [ | 307 | 7 | 80 | 172 | 262 |
Fig 3. The boxplot distribution of sequence similarities for different 18S rRNA gene regions in the intra-species (S), inter-species but intra-genus (G), inter-genus but intra-family (F), inter-family but intra-order (O), and inter-order (I) categories.
The central boxes represent the middle three quarters of the data. The total number of samples in each plot is listed at the bottom of each box.
Table 3. The best and lowest similarity thresholds for taxonomic resolution.
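| Sequences | V regions | Best S/G (threshold/success, %) | Best G/F (threshold/success, %) | Best F/O (threshold/success, %) | Best O/I (threshold/success, %) | Lowest intra-species (threshold/accuracy, %) | Lowest intra-genus (threshold/accuracy, %) | Lowest intra-family (threshold/accuracy, %) | Lowest intra-order (threshold/accuracy, %) |
|---|---|---|---|---|---|---|---|---|---|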
| Nearly-whole-length | V1-9 | 99.9/93 | 98.4/64 | 94.2/86 | 90.2/81 | 100/97 | 100/100 | 97/96 | 93/96 |
| Section 1 | V1-3 | 99.4/63 | 95.4/82 | 90.6/82 | 98/95 | 93/96 | |||
| Section 2 | V4-5 | 99.7/66 | 97.8/67 | 92.6/83 | 88.6/78 | 100/99 | 97/96 | 92/95 | |
| Section 3 | V7 | 99.3/82 | 94.6/73 | 89.9/68 | 86.0/68 | 100/96 | 96/97 | 91/96 | |
| Section 4 | V8 | 99.8/62 | 95.7/89 | 90.6/68 | 98/96 | 95/97 | |||
| Section 5 | V9 | 99.3/67 | 96.6/79 | 90.2/78 | 83.2/77 | 100/95 | 96/96 | 90/96 | |
* 95% is the lowest accuracy accepted in this case.
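One way to read the "lowest similarity threshold" columns is as the smallest similarity cutoff at which identification accuracy still meets a pre-designated minimum (95% here). The Python sketch below illustrates that idea under stated assumptions: accuracy at a cutoff is taken to be the fraction of sequence pairs at or above the cutoff that belong to the same taxon, and candidate cutoffs are scanned from low to high. This is not the authors' procedure, which is not given in this excerpt, and the function names are hypothetical.

```python
def accuracy_at(pairs, threshold):
    """Fraction of pairs with similarity >= threshold that share a taxon.

    `pairs` is a list of (similarity_percent, same_taxon) tuples.
    Returns None when no pair reaches the threshold.
    """
    selected = [same for similarity, same in pairs if similarity >= threshold]
    if not selected:
        return None
    return sum(selected) / len(selected)


def lowest_threshold(pairs, candidates, min_accuracy=0.95):
    """Smallest candidate cutoff whose accuracy is at least `min_accuracy`.

    Returns (threshold, accuracy), or None if no candidate qualifies.
    """
    for threshold in sorted(candidates):
        accuracy = accuracy_at(pairs, threshold)
        if accuracy is not None and accuracy >= min_accuracy:
            return threshold, accuracy
    return None
```

Under this reading, a blank cell would correspond to `lowest_threshold` returning `None`: no cutoff, even 100% similarity, reaches the accepted accuracy for that region and taxonomic level.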