Kin Onn Chan, Stefan T Hertwig, Dario N Neokleous, Jana M Flury, Rafe M Brown.
Abstract
BACKGROUND: The 16S mitochondrial rRNA gene is the most widely sequenced molecular marker in amphibian systematic studies, making it comparable to the universal CO1 barcode that is more commonly used in other animal groups. However, studies employ different primer combinations that target different lengths/regions of the 16S gene, ranging from complete gene sequences (~1500 bp) to short fragments (~500 bp), the latter being the most widely used. Sequences of different lengths are often concatenated, compared, and/or jointly analyzed to infer phylogenetic relationships, estimate genetic divergence (p-distances), and justify the recognition of new species (species delimitation), making the 16S gene region, by far, the most influential molecular marker in amphibian systematics. Despite this ubiquitous and multifarious use, no study has evaluated the congruence and performance of the different fragment lengths.
Keywords: Branch support; Genetic distance; Missing data; Phylogenetics; Species delimitation; Systematics; p-distance
Year: 2022 PMID: 35346025 PMCID: PMC8959075 DOI: 10.1186/s12862-022-01994-y
Source DB: PubMed Journal: BMC Ecol Evol ISSN: 2730-7182
Summary statistics for each dataset (Full, Medium, Short) and pairwise comparisons of uncorrected p-distances illustrated in Fig. 3
| | Full | Medium | Short |
|---|---|---|---|
| Summary | | | |
| Length | 1495 bp | 874 bp | 516 bp |
| No. variable sites | 835 | 476 | 266 |
| No. PIS | 736 | 416 | 239 |
| Proportion PIS | 0.49 | 0.47 | 0.46 |
| Pairwise | | | |
| | 0.093 ± 0.003 (0.088‒0.100) | 0.097 ± 0.002 (0.092‒0.101) | 0.104 ± 0.003 (0.098‒0.111) |
| | 0.121 ± 0.002 (0.115‒0.124) | 0.130 ± 0.003 (0.123‒0.134) | 0.124 ± 0.007 (0.116‒0.141) |
| | 0.190 ± 0.003 (0.181‒0.200) | 0.191 ± 0.004 (0.182‒0.199) | 0.186 ± 0.003 (0.177‒0.194) |
| | 0.172 ± 0.006 (0.165‒0.185) | 0.169 ± 0.007 (0.162‒0.182) | 0.165 ± 0.007 (0.159‒0.182) |
| | 0.166 ± 0.005 (0.155‒0.177) | 0.165 ± 0.009 (0.147‒0.182) | 0.171 ± 0.012 (0.162‒0.194) |
| | 0.155 ± 0.005 (0.144‒0.166) | 0.154 ± 0.005 (0.146‒0.172) | 0.159 ± 0.012 (0.144‒0.194) |
| | 0.157 ± 0.009 (0.138‒0.176) | 0.157 ± 0.011 (0.130‒0.181) | 0.172 ± 0.017 (0.141‒0.202) |
| | 0.14 ± 0.005 (0.127‒0.160) | 0.153 ± 0.007 (0.129‒0.179) | 0.135 ± 0.01 (0.101‒0.164) |
Values for p-distances are average ± standard deviation, followed by Min–Max in parentheses
PIS: parsimony-informative sites
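
As a rough illustration of the two quantities summarized in the table above, the sketch below computes an uncorrected p-distance (the proportion of differing sites between two aligned sequences) and counts parsimony-informative sites (columns with at least two character states that each occur in at least two sequences). The function names, the toy alignment, and the choice to skip gap or ambiguous positions pairwise are assumptions made for illustration; they are not taken from the study's own pipeline.

```python
from collections import Counter

# Characters treated as missing data (an assumption for this sketch).
MISSING = set("-?N")


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: differing sites / compared sites.

    Positions where either sequence has a gap or ambiguity are skipped
    (pairwise deletion), which is one common convention among several.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def count_pis(alignment):
    """Count parsimony-informative sites in a list of aligned sequences."""
    count = 0
    for column in zip(*alignment):
        states = Counter(c.upper() for c in column if c.upper() not in MISSING)
        # Informative if at least two states each appear in >= 2 sequences.
        if sum(1 for n in states.values() if n >= 2) >= 2:
            count += 1
    return count


# Hypothetical toy alignment, not real 16S data.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACCTACGA", "ACCTTCGT"]
toy_distance = p_distance(toy_alignment[0], toy_alignment[1])
toy_pis = count_pis(toy_alignment)
```

In the toy alignment, the first two sequences differ at one of eight sites, and two columns qualify as parsimony-informative; dividing the PIS count by the alignment length gives the proportion reported in the "Proportion PIS" row.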
Fig. 1 Dendrograms depicting the phylogenetic relationships of nominal Occidozyga species. The species tree is based on the genomic study by [33] (a). Kernel density distributions of Robinson–Foulds distances between bootstrap replicate trees and the maximum likelihood tree for each dataset (b). Boxplots of bootstrap support values from consensus maximum likelihood trees of the Full, Medium, and Short datasets (c)
Fig. 3 An illustration depicting how the three datasets used in this study were generated. The Full dataset comprised 147 full-length 16S sequences. The Medium and Short datasets are subsets of the Full alignment and thus contain the same sequences. Supplementary reference sequences were used to determine the appropriate trimming sites, ensuring that subset alignments were trimmed according to established primer binding sites. Reference sequences were not included in the final datasets as they are not comparable with longer sequences
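
The subsetting described in this caption can be sketched as slicing the same alignment columns from every sequence, so that the shorter datasets keep the same taxa and differ only in length. The function name, the toy sequences, and the column coordinates below are hypothetical placeholders; in the study the trimming sites were determined from reference sequences and primer binding sites, which this sketch does not reproduce.

```python
def trim_alignment(alignment, start, end):
    """Return columns [start, end) of every sequence in an alignment.

    `alignment` maps sequence names to aligned sequences of equal length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    (length,) = lengths
    if not 0 <= start < end <= length:
        raise ValueError("trimming coordinates fall outside the alignment")
    return {name: seq[start:end] for name, seq in alignment.items()}


# Hypothetical toy alignment and coordinates, for illustration only.
full = {"sp1": "AACCGGTTAACC", "sp2": "AACCGATTAACT"}
medium = trim_alignment(full, 2, 10)
short = trim_alignment(full, 4, 8)
```

Because both subsets are cut from the same alignment, any difference between analyses of `full`, `medium`, and `short` reflects fragment length rather than taxon sampling.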
Fig. 2 Left: Maximum likelihood consensus tree inferred from the Full dataset. Highlighted clades represent described (A, C, D, F, G, H) and undescribed (B, E) lineages. Right: Boxplots showing ANOVA p-values and pairwise comparisons of uncorrected p-distances between closely related lineages
Results of the ASAP species delimitation analysis
| Number of species | ASAP score | p-val (rank) | W (rank) | Threshold dist |
|---|---|---|---|---|
| Full | | | | |
| 31 | 2 | 2.00e−05 (2) | 4.56e−04 (2) | 0.077418 |
| 36 | 4.5 | 7.56e−03 (4) | 3.24e−04 (5) | 0.04772 |
| 59 | 8.5 | 1.42e−01 (7) | 2.44e−04 (10) | 0.014968 |
| 32 | 11.5 | 1.04e−02 (5) | 2.12e−04 (18) | 0.069101 |
| 26 | 13.5 | 1.00e−05 (1) | 1.84e−04 (26) | 0.094478 |
| Medium | | | | |
| 37 | 2 | 7.51e−04 (3) | 6.35e−04 (1) | 0.039249 |
| 36 | 2.5 | 1.10e−04 (2) | 4.14e−04 (3) | 0.06046 |
| 39 | 9.5 | 5.08e−02 (8) | 2.24e−04 (11) | 0.034695 |
| 27 | 10 | 1.00e−05 (1) | 1.16e−04 (19) | 0.086903 |
| 39 | 10 | 6.60e−03 (7) | 1.63e−04 (13) | 0.032463 |
| Short | | | | |
| 37 | 9 | 8.57e−04 (2) | 2.26e−04 (16) | 0.033333 |
| 51 | 9.5 | 3.73e−01 (14) | 3.99e−04 (5) | 0.013921 |
| 29 | 11 | 1.87e−03 (3) | 2.18e−04 (19) | 0.072876 |
| 29 | 13.5 | 1.06e−01 (9) | 2.23e−04 (18) | 0.073809 |
| 50 | 14 | 5.77e−01 (19) | 3.38e−04 (9) | 0.014268 |
A lower ASAP score indicates a better species partition. The ASAP score is the average of ranks from the p-val and W parameters combined. p-val: probability of panmixia; W: relative gap width metric. See Puillandre et al. [42] for more details
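
The scoring rule in this footnote can be checked with a short sketch: the ASAP score of a partition is the mean of its p-val rank and its W rank, and partitions are ordered from lowest (best) to highest score. The ranks below are copied from the Full dataset rows of the table; the function and variable names are illustrative and are not part of the ASAP software.

```python
def asap_score(pval_rank, w_rank):
    """ASAP score as described in the footnote: the average of the two ranks."""
    return (pval_rank + w_rank) / 2


# (number of species, p-val rank, W rank) for the Full dataset rows.
full_partitions = [
    (31, 2, 2),
    (36, 4, 5),
    (59, 7, 10),
    (32, 5, 18),
    (26, 1, 26),
]

# Sort partitions by score; a lower score indicates a better partition.
ranked = sorted(
    (asap_score(p_rank, w_rank), n_species)
    for n_species, p_rank, w_rank in full_partitions
)
best_score, best_n_species = ranked[0]
```

For the Full dataset this gives scores of 2, 4.5, 8.5, 11.5, and 13.5, matching the "ASAP score" column, with the 31-species partition ranked best.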