Catarina Pinho, Vera Cardoso, Jody Hey.
Abstract
Organisms sampled for population-level research are typically assigned to species by morphological criteria. However, if those criteria are limited to one sex or life stage, or the organisms come from a complex of closely related forms, the species assignments may misdirect analyses. The impact of such sampling can be assessed from the correspondence of genetic clusters, identified only from patterns of genetic variation, to the species identified using only phenotypic criteria. We undertook this protocol with the rock-dwelling mbuna cichlids of Lake Malawi, for which species within genera are usually identified using adult male coloration patterns. Given high local endemism of male colour patterns, and considerable allele sharing among species, there persists considerable taxonomic uncertainty in these fishes. Over 700 individuals from a single transect were photographed, genotyped and separately assigned: (a) to morphospecies using photographs; and (b) to genetic clusters using five widely used methods. Overall, the correspondence between clustering methods was strong for larger clusters, but methods varied widely in estimated number of clusters. The correspondence between morphospecies and genetic clusters was also strong for larger clusters, as well as some smaller clusters for some methods. These analyses generally affirm (a) adult male-limited sampling and (b) the taxonomic status of Lake Malawi mbuna, as the species in our study largely appear to be well-demarcated genetic entities. More generally, our analyses highlight the challenges for clustering methods when the number of populations is unknown, especially in cases of highly uneven sample sizes.
Keywords: cichlids; hybridization; population genetics; speciation
Year: 2019 PMID: 31012255 PMCID: PMC6764894 DOI: 10.1111/1755-0998.13027
Source DB: PubMed Journal: Mol Ecol Resour ISSN: 1755-098X Impact factor: 7.090
Non‐singleton clusters detected using the various software [Colour figure can be viewed at http://wileyonlinelibrary.com]
Figure 1. Correspondence between the clusters revealed by baps (in red) and structure using the correlated allele frequency model (in green). Clusters are ordered by their number (with larger clusters ranked first; see Table 1 and the main text for a description of each group). Individuals not assigned to any of the clusters (“admixed” individuals) are shown in grey as a last group of individuals for both methods. Correspondence between individuals is shown by means of ribbons linking clusters from the two methods [Colour figure can be viewed at wileyonlinelibrary.com]
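The ribbon widths in a correspondence plot of this kind come down to a cross-tabulation of per-individual assignments from two methods. The following is a minimal sketch of that tabulation, not the authors' code; the function name `cross_tabulate`, the individual IDs and the cluster labels are all hypothetical, and "admixed" is treated as just another label.

```python
from collections import Counter


def cross_tabulate(assign_a, assign_b):
    """Count individuals for each (label in method A, label in method B) pair.

    Both arguments map an individual ID to a cluster label; the label
    "admixed" (an illustrative convention) marks unassigned individuals.
    """
    if assign_a.keys() != assign_b.keys():
        raise ValueError("both methods must cover the same individuals")
    return Counter((assign_a[ind], assign_b[ind]) for ind in assign_a)


# Hypothetical assignments for five individuals under two methods.
method_a = {"f1": 1, "f2": 1, "f3": 2, "f4": 2, "f5": "admixed"}
method_b = {"f1": 1, "f2": 1, "f3": 2, "f4": "admixed", "f5": "admixed"}

table = cross_tabulate(method_a, method_b)
for (label_a, label_b), count in sorted(table.items(), key=str):
    print(f"A={label_a!s:8} B={label_b!s:8} n={count}")
```

Each non-zero count corresponds to one ribbon; a method pair in perfect agreement would produce only one ribbon per cluster.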
Recovered population structure based on simulated data sets of 22 and 100 loci with K = 25 and unbalanced sample sizes
| Method | Data set | Mean K (min–max) | Mean RI (min–max) | Mean Q max (min–max) |
|---|---|---|---|---|
| DAPC | 22 loci | 5.57 (5–7) | 0.958 (0.914–0.982) | 1.000 (0.999–1.000) |
| DAPC | 100 loci | 5.13 (5–6) | 0.956 (0.953–0.967) | 1.000 (1.000–1.000) |
| | 22 loci | 7.16 (6–8) | 0.977 (0.962–0.986) | 1.000 (0.999–1.000) |
| | 100 loci | 7.24 (6–8) | 0.980 (0.967–0.987) | 1.000 (1.000–1.000) |
| IAF model | 22 loci | 8.9 (7–11) | 0.990 (0.983–0.996) | 0.950 (0.937–0.964) |
| IAF model | 100 loci | 16.13 (10–24) | 0.999 (0.991–1.000) | 0.972 (0.956–0.980) |
| CAF model | 22 loci | 21.88 (14–30) | 0.997 (0.959–0.999) | 0.886 (0.850–0.918) |
| CAF model | 100 loci | 23.13 (15–29) | 1.000 (0.999–1.000) | 0.942 (0.883–0.969) |
| | 22 loci | 32.11 (27–39) | 0.998 (0.993–1.000) | 0.994 (0.990–0.998) |
| | 100 loci | 27.31 (25–31) | 1.000 (0.999–1.000) | 0.997 (0.995–0.999) |
K, number of populations; RI, Rand (1971) index; Q max, mean maximum individual probability of assignment to a cluster.
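The two summary statistics defined in this footnote can be computed directly from cluster labels and assignment probabilities. The sketch below is illustrative only and is not taken from the study: it implements the unadjusted Rand index as the fraction of individual pairs on which two partitions agree, and Q max as the mean of each individual's largest assignment probability. The function names and the toy inputs are assumptions.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Unadjusted Rand index: share of pairs treated alike by both partitions."""
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same individuals")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two individuals")
    # A pair agrees when it is together in both partitions or apart in both.
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def mean_q_max(q_matrix):
    """Mean over individuals of the largest cluster-membership probability."""
    return sum(max(row) for row in q_matrix) / len(q_matrix)


# Toy data (hypothetical): true populations vs. inferred clusters.
true_pops = [0, 0, 1, 1]
inferred = [1, 1, 0, 0]  # same partition, different label names
print(rand_index(true_pops, inferred))
print(mean_q_max([[0.9, 0.1], [0.6, 0.4]]))
```

Because the Rand index counts pairs that are correctly kept apart as well as pairs correctly grouped, it can stay close to 1 even when the inferred K is far from the simulated K = 25, which is consistent with the pattern in the table above.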
Comparison between data sets comprising only males and resampled data sets containing the same number of individuals (N = 292)
| | | | | | DAPC |
|---|---|---|---|---|---|
| Mean maximum assignment proportion | | | | | |
| Males only | 0.99 | 0.91 | 0.95 | 1.00 | 1.00 |
| Mean resampled data sets | 1.00 | 0.89 | 0.95 | 1.00 | 1.00 |
| | 0.65 | 0.36 | 0.15 | 0.11 | 0.14 |
| Number of admixed individuals | | | | | |
| Males only | 1 | 5 | 17 | 0 | 0 |
| Mean resampled data sets | 2.1 | 11.28 | 13.84 | 0.09 | 0.11 |
| | 0.39 | 0.09 | 0.84 | 0.91 | 0.89 |
| Number of inferred clusters | | | | | |
| Males only | | 9 | 6 | 3 | 3 |
| Mean resampled data sets | | 10.55 | 6.4 | 3.09 | 3.25 |
| | | 0.59 | 0.4 | 0.09 | 0.25 |
Bold values indicate cases where resampled data sets differ from the "only males" data set, p < 0.05.
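A comparison of this kind can be sketched as drawing equal-sized random subsets (N = 292) from the full sample and asking how unusual the males-only value is among them. The record does not state how the p-values were obtained, so everything below is an assumption: the function names, the two-sided definition of the empirical p-value, the number of replicates, and the placeholder statistic. In a real analysis, `statistic` would rerun a clustering method on each subset.

```python
import random


def resample_statistic(individuals, statistic, n=292, replicates=100, seed=0):
    """Apply `statistic` to random subsets of size n drawn without replacement."""
    rng = random.Random(seed)
    return [statistic(rng.sample(individuals, n)) for _ in range(replicates)]


def empirical_p(observed, resampled_values):
    """Assumed two-sided definition: share of resampled values at least as far
    from the resampled mean as the observed value is."""
    center = sum(resampled_values) / len(resampled_values)
    extreme = sum(
        abs(v - center) >= abs(observed - center) for v in resampled_values
    )
    return extreme / len(resampled_values)


# Placeholder statistic (hypothetical): mean of an arbitrary per-individual score.
pool = list(range(700))
values = resample_statistic(pool, lambda s: sum(s) / len(s), replicates=50)
print(empirical_p(observed=350.0, resampled_values=values))
```

Sampling without replacement keeps each resampled data set the same size as the males-only set, so differences reflect sample composition rather than sample size.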