Joseph D DiBattista, Pablo Saenz-Agudelo, Marek J Piatek, Edgar Fernando Cagua, Brian W Bowen, John Howard Choat, Luiz A Rocha, Michelle R Gaither, Jean-Paul A Hobbs, Tane H Sinclair-Taylor, Jennifer H McIlwain, Mark A Priest, Camrin D Braun, Nigel E Hussey, Steven T Kessel, Michael L Berumen.
Abstract
Genetic structure within marine species may be driven by local adaptation to their environment, or alternatively by historical processes, such as geographic isolation. The gulfs and seas bordering the Arabian Peninsula offer an ideal setting to examine connectivity patterns in coral reef fishes with respect to environmental gradients and vicariance. The Red Sea is characterized by a unique marine fauna, historical periods of desiccation and isolation, as well as environmental gradients in salinity, temperature, and primary productivity that vary both by latitude and by season. The adjacent Arabian Sea is characterized by a sharper environmental gradient, ranging from extensive coral cover and warm temperatures in the southwest, to sparse coral cover, cooler temperatures, and seasonal upwelling in the northeast. Reef fish, however, are not confined to these seas, with some Red Sea fishes extending varying distances into the northern Arabian Sea, while their pelagic larvae are presumably capable of much greater dispersal. These species must therefore cope with a diversity of conditions that invoke the possibility of steep clines in natural selection. Here, we test for genetic structure in two widespread reef fish species (a butterflyfish and surgeonfish) and eight range-restricted butterflyfishes across the Red Sea and Arabian Sea using genome-wide single nucleotide polymorphisms. We performed multiple matrix regression with randomization analyses on genetic distances for all species, as well as reconstructed scenarios for population subdivision in the species with signatures of isolation. We found that (a) widespread species displayed more genetic subdivision than regional endemics and (b) this genetic structure was not correlated with contemporary environmental parameters but instead may reflect historical events. 
We propose that the endemic species may be adapted to a diversity of local conditions, but the widespread species are instead subject to ecological filtering where different combinations of genotypes persist under divergent ecological regimes.
Keywords: Indo‐West Pacific; butterflyfishes; coral reefs; ddRAD; single nucleotide polymorphism; vicariance
Year: 2020 PMID: 32489599 PMCID: PMC7246217 DOI: 10.1002/ece3.6199
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Figure 1. Map indicating collection sites for reef fishes sampled in the Red Sea and Arabian Sea including eight regional endemics (indicated by asterisks) and two widespread species. Colored circles indicate the proportion of samples per species at a site as indicated by the key; circle size is scaled by sample size. Major oceanographic currents and features are represented by arrows. Three putative barriers to larval dispersal are outlined by opaque orange solid lines: b1, 17°N in the Red Sea; b2, Strait of Bab Al Mandab between the Red Sea and Gulf of Aden; and b3, monsoonal upwelling system in the Arabian Sea.
STACKS results and genetic diversity metrics for range‐restricted endemics and widespread reef fish sampled in the Red Sea to Arabian Sea (also see Figure 1)
| Species | Sample | Number of populations (geographic range of sampling) | Species distribution | Number of reads used | Number of polymorphic loci passing filter | | | |
|---|---|---|---|---|---|---|---|---|
| | 78 | 7 (Gulf of Aqaba to South Farasan Banks, Saudi Arabia) | Northern to central Red Sea | 87,565,224 | 10,711 | 0.0021 (0.2270) | 0.0023 (0.2423) | 0.0007 (0.0786) |
| | 89 | 8 (Gulf of Aqaba to Djibouti) | Northern Red Sea to Gulf of Aden | 77,353,808 | 2,650 | 0.0018 (0.2507) | 0.0018 (0.2553) | 0.0004 (0.0525) |
| | 54 | 5 (Thuwal to Moucha & Maskali, Djibouti) | Southern Red Sea to Gulf of Aden | 77,207,403 | 12,393 | 0.0015 (0.2456) | 0.0016 (0.2577) | 0.0005 (0.0735) |
| | 69 | 6 (Djibouti to Muscat, Oman) | Southern Red Sea to Arabian Gulf | 88,380,960 | 4,384 | 0.0019 (0.2260) | 0.0021 (0.2415) | 0.0007 (0.0849) |
| | 40 | 4 (Thuwal to Djibouti) | Southern Red Sea to Gulf of Aden | 49,020,448 | 11,151 | 0.0013 (0.2687) | 0.0014 (0.2764) | 0.0003 (0.0659) |
| | 71 | 6 (Gulf of Aqaba to South Farasan Banks, Saudi Arabia) | Northern to central Red Sea | 89,837,377 | 13,539 | 0.0023 (0.2038) | 0.0025 (0.2239) | 0.0011 (0.0994) |
| | 57 | 5 (Djibouti to Masirah Island, Oman) | Southern Red Sea to Arabian Gulf | 40,198,908 | 4,131 | 0.0022 (0.2253) | 0.0024 (0.2486) | 0.0010 (0.1088) |
| | 93 | 8 (Jazirat Burqan to Djibouti) | Northern Red Sea to Gulf of Aden | 80,036,042 | 2,053 | 0.0012 (0.2753) | 0.0012 (0.2764) | 0.0002 (0.0426) |
| Regional endemic average | | | | 73,700,021 | 7,627 | 0.0018 (0.2403) | 0.0019 (0.2529) | 0.0006 (0.0758) |
| | 102 | 9 (Gulf of Aqaba to Masirah Island, Oman) | Indo‐Pacific | 98,948,301 | 1,271 | 0.0010 (0.2489) | 0.0010 (0.2494) | 0.0002 (0.0420) |
| | 101 | 10 (Gulf of Aqaba to Al Hallaniyats, Oman) | Indo‐Pacific | 110,087,471 | 1,508 | 0.0019 (0.1863) | 0.0021 (0.2072) | 0.0010 (0.0945) |
| Widespread average | | | | 104,517,886 | 1,390 | 0.0015 (0.2176) | 0.0016 (0.2283) | 0.0006 (0.1365) |
Numbers outside and inside parentheses for genetic diversity metrics are based on all single nucleotide polymorphism (SNP) loci versus only variable SNP loci, respectively.
Abbreviations: H_E, expected heterozygosity; H_O, observed heterozygosity; SNP, single nucleotide polymorphism.
In most cases, 12 individuals were sampled per population prior to quality filtering, except for C. trifascialis, where N = 4 were sampled from Mirbat, Al Hallaniyats, and Masirah Island.
Species distribution is based on a regional database curated over 30 years by R. Myers (see Appendix S2 from DiBattista, Roberts, et al., 2016) but was modified to reflect where species are functionally present versus rare records as waifs.
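As an illustration of the two diversity metrics abbreviated in the table notes, the sketch below computes observed and expected heterozygosity for a single biallelic SNP. The genotype coding (0, 1, or 2 copies of the alternate allele) and the function name are assumptions made for this example, and expected heterozygosity is taken in its simple 2pq form without any small-sample correction. It is not the STACKS implementation.

```python
from typing import Sequence, Tuple


def heterozygosity(genotypes: Sequence[int]) -> Tuple[float, float]:
    """Return (H_O, H_E) for one biallelic SNP.

    genotypes: hypothetical coding, one entry per individual giving the
    number of copies of the alternate allele (0, 1, or 2).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotyped individual")
    # Observed heterozygosity: fraction of individuals that are heterozygous.
    h_obs = sum(1 for g in genotypes if g == 1) / n
    # Alternate-allele frequency p, estimated from 2n sampled allele copies.
    p = sum(genotypes) / (2 * n)
    # Expected heterozygosity under Hardy-Weinberg proportions: 2pq.
    h_exp = 2 * p * (1 - p)
    return h_obs, h_exp


# Twelve made-up individuals, matching the typical per-population sample size.
example = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
h_o, h_e = heterozygosity(example)
print(f"H_O = {h_o:.4f}, H_E = {h_e:.4f}")
```

Averaging such per-locus values over all loci, or over variable loci only, would give the two kinds of figures reported outside and inside the parentheses.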
Figure 2. Heat map of environmental data in the Red Sea to Arabian Gulf represented by principal component analysis (PCA) outputs: (a) PC1, (b) PC2, and (c) PC3. (d,e) Biplot of the sites and the loading of the environmental drivers underlying the PCA. Collection sites are indicated by numbers.
Figure 3. (a,b) Summary of the single nucleotide polymorphism (SNP) admixture estimates from STRUCTURE at each sampling site. The shading in each pie indicates the mean level of admixture per sampling site for K = 2. (c,d) Principal component analysis (PCA) scatter plots for RAD‐seq data. Only data sets from the two widespread species Ctenochaetus striatus (a,c) and Chaetodon trifascialis (b,d) are presented here. For the PCA plots, circles represent individual genotypes, and axes show the first two components with the percentage of variance explained in brackets. Three-letter abbreviations in parentheses represent the country of sampling.
Figure 4. (a) Summary of the single nucleotide polymorphism (SNP) admixture estimates from STRUCTURE at each sampling site. The shading in each pie indicates the mean level of admixture per sampling site for K = 2. (b) Principal component analysis (PCA) scatter plots for RAD‐seq data. Only the data set composed of SNPs shared between two closely related species (Chaetodon austriacus and Chaetodon melapterus) is presented here. For the PCA plots, circles represent individual genotypes, and axes show the first two components with the percentage of variance explained in brackets. Three-letter abbreviations in parentheses represent the country of sampling.
Figure 5. Correlation between pairwise genetic distance (F_ST), geographical distance, and environmental distance for Chaetodon trifascialis around the Arabian Peninsula. The top two panels show correlations between genetic and geographic (left) and environmental (right) distances. The bottom two panels show correlations between genetic and combined geographic and environmental distances (left), and the correlation between geographic and environmental distance (right). Blue dots and regression lines correspond to pairwise comparisons among sites on different sides of the Strait of Bab Al Mandab barrier (b2); orange dots and regression lines correspond to pairwise comparisons among sites on the same side of b2.
Figure 6. Correlation between pairwise genetic distance (F_ST), geographical distance, and environmental distance for Ctenochaetus striatus around the Arabian Peninsula. The top two panels show correlations between genetic and geographic (left) and environmental (right) distances. The bottom two panels show correlations between genetic and combined geographic and environmental distances (left), and the correlation between geographic and environmental distance (right). Blue dots and regression lines correspond to pairwise comparisons among sites on different sides of the monsoonal upwelling system barrier in the Arabian Sea (b3); orange dots and regression lines correspond to pairwise comparisons among sites on the same side of b3.
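The abstract reports multiple matrix regression with randomization (MMRR) on genetic distances, and the two figures above involve the same three kinds of pairwise distance. The following is a minimal sketch of the general idea, assuming exactly two predictor matrices and using the magnitude of the standardized regression coefficient as the permutation statistic (a simplification). All function names and the synthetic sites are hypothetical and are not the authors' code or data.

```python
import random
from statistics import mean, stdev


def lower_triangle(m):
    """Unfold the strict lower triangle of a square matrix into a list."""
    return [m[i][j] for i in range(len(m)) for j in range(i)]


def zscore(v):
    """Standardize a list to zero mean and unit standard deviation."""
    mu, sd = mean(v), stdev(v)
    return [(x - mu) / sd for x in v]


def fit_two_predictors(y, x1, x2):
    """OLS slopes of y on x1 and x2 (inputs z-scored, so no intercept)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def mmrr(gen, geo, env, n_perm=999, seed=1):
    """Coefficients and permutation p-values for gen ~ geo + env."""
    rng = random.Random(seed)
    n = len(gen)
    x1, x2 = zscore(lower_triangle(geo)), zscore(lower_triangle(env))
    observed = fit_two_predictors(zscore(lower_triangle(gen)), x1, x2)
    exceed = [1, 1]  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        # Permute rows and columns of the response matrix together, which
        # respects the non-independence of pairwise distances.
        shuffled = [[gen[order[i]][order[j]] for j in range(n)] for i in range(n)]
        coefs = fit_two_predictors(zscore(lower_triangle(shuffled)), x1, x2)
        for k in range(2):
            if abs(coefs[k]) >= abs(observed[k]):
                exceed[k] += 1
    return observed, tuple(e / (n_perm + 1) for e in exceed)


# Synthetic example: genetic distance driven by geography only.
rng = random.Random(42)
n_sites = 10
position = [rng.uniform(0, 1000) for _ in range(n_sites)]  # km along a coast
condition = [rng.gauss(0, 1) for _ in range(n_sites)]      # e.g. a PC score
geo = [[abs(a - b) for b in position] for a in position]
env = [[abs(a - b) for b in condition] for a in condition]
gen = [[0.0] * n_sites for _ in range(n_sites)]
for i in range(n_sites):
    for j in range(i):
        gen[i][j] = gen[j][i] = 1e-4 * geo[i][j] + rng.gauss(0, 0.005)

(b_geo, b_env), (p_geo, p_env) = mmrr(gen, geo, env)
print(f"geo: b={b_geo:.3f}, p={p_geo:.3f}; env: b={b_env:.3f}, p={p_env:.3f}")
```

Permuting whole sites rather than individual pairwise values is the design choice that matters here: each site contributes to several distances, so shuffling the unfolded vector directly would overstate significance.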
Comparison of seven alternative demographic models obtained from ∂a∂i for the Ctenochaetus striatus, Chaetodon trifascialis, and combined Chaetodon austriacus and Chaetodon melapterus data sets using a folded joint site frequency spectrum (JSFS)
| Model | AIC | log lik | theta | N Red Sea | N Indian Ocean | m12 | m21 | me12 | me21 | T_s | T_am / T_sc | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | | |
| AM | 994.581 | −491.291 | 34.118 | 14.860 | 0.239 | 0.000 | 6.138 | | | 9.828 | 0.000 | |
| AM2M | 934.547 | −458.274 | 34.000 | 13.794 | 0.439 | 0.000 | 6.033 | 0.011 | 0.778 | 9.961 | 0.000 | 0.819 |
| IM | 992.578 | −491.289 | 33.636 | 15.076 | 0.242 | 0.000 | 6.042 | | | 9.984 | | |
| IM2M | 932.132 | −458.066 | 34.319 | 13.757 | 0.382 | 0.000 | 7.006 | 0.014 | 0.837 | 9.862 | | 0.821 |
| SC | 920.964 | −454.482 | 36.853 | 9.083 | 0.098 | 1.119 | 17.564 | | | 9.972 | 0.074 | |
| | | | | | | | | | | | | |
| SI | 1,147.322 | −570.661 | 298.115 | 0.030 | 0.010 | | | | | 0.002 | | |
| | | | | | | | | | | | | |
| AM | 724.579 | −356.289 | 357.163 | 17.442 | 0.010 | 0.000 | 2.991 | | | 0.000 | 0.000 | |
| AM2M | 730.536 | −356.268 | 357.235 | 16.036 | 0.025 | 47.853 | 0.037 | 0.000 | 0.000 | 0.001 | 0.000 | 0.785 |
| IM | 722.575 | −356.287 | 357.162 | 18.122 | 0.010 | 0.000 | 0.000 | | | 0.000 | | |
| IM2M | 725.819 | −354.910 | 361.850 | 0.249 | 5.479 | 0.000 | 68.565 | 0.000 | 0.000 | 0.013 | | 0.891 |
| | | | | | | | | | | | | |
| SC2M | 724.290 | −353.145 | 72.397 | 3.653 | 1.856 | 44.695 | 0.000 | 2.744 | 9.635 | 8.784 | 0.584 | 0.192 |
| SI | 718.579 | −356.290 | 357.180 | 10.464 | 0.010 | | | | | 0.000 | | |
| | | | | | | | | | | | | |
| AM | 897.663 | −442.832 | 50.301 | 1.331 | 0.293 | 0.000 | 19.766 | | | 9.799 | 0.000 | |
| AM2M | 837.919 | −409.959 | 105.878 | 5.753 | 0.638 | 2.673 | 14.813 | 0.000 | 0.398 | 4.440 | 0.000 | 0.948 |
| IM | 893.045 | −441.523 | 53.745 | 12.669 | 0.294 | 0.606 | 19.656 | | | 9.344 | | |
| IM2M | 822.972 | −403.486 | 54.638 | 12.863 | 0.191 | 1.282 | 44.642 | 0.000 | 1.862 | 9.474 | | 0.948 |
| SC | 859.133 | −423.567 | 106.055 | 2.118 | 0.413 | 1.478 | 14.776 | | | 9.877 | 0.260 | |
| | | | | | | | | | | | | |
| SI | 1,002.267 | −498.134 | 418.976 | 0.029 | 0.010 | | | | | 0.001 | | |
Results of the best run for each model are provided. AIC: Akaike information criterion; log lik: maximum log-likelihood; theta: 4N_ref µ; N Red Sea and N Indian Ocean: effective population sizes of each population, respectively; m12 and m21: migration rates from the Red Sea to the Indian Ocean and vice versa, respectively; me12 and me21: effective migration rates in the most differentiated regions of the genome (i.e., genomic islands) from the Red Sea to the Indian Ocean and vice versa, respectively; T_s: time of split of the ancestral population into two daughter populations; T_sc: duration of secondary contact episodes (only in SC and SC2M models); T_am: duration of ancestral migration episodes (only in AM and AM2M models); P: proportion of the genome exchanged under neutrality. The model with the lowest AIC is indicated in bold.
Model abbreviations: SI, strict isolation; IM, isolation with migration; AM, ancient migration; SC, secondary contact. For each of IM, AM, and SC, we explored two options: (1) homogeneous migration and (2) heterogeneous migration along the genome (2M).
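The AIC column can be checked against the log-likelihood column using the standard definition AIC = 2k − 2 ln L, where k is the number of free parameters. The sketch below does this for four rows of the first data set in the table. The parameter counts k are an assumption inferred from the parameter list in the table notes (theta is not counted); they are not stated in the source.

```python
def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


# (model, log-likelihood, reported AIC, assumed number of free parameters k)
rows = [
    ("SI", -570.661, 1147.322, 3),  # N1, N2, T_s
    ("IM", -491.289, 992.578, 5),   # N1, N2, m12, m21, T_s
    ("AM", -491.291, 994.581, 6),   # IM parameters plus T_am
    ("SC", -454.482, 920.964, 6),   # IM parameters plus T_sc
]

computed = {name: aic(ll, k) for name, ll, _, k in rows}
for name, ll, reported, k in rows:
    print(f"{name}: computed {computed[name]:.3f}, reported {reported:.3f}")

# Difference from the best of these four models (smaller AIC is better).
best = min(computed.values())
delta = {name: value - best for name, value in computed.items()}
```

Under these assumed counts, the computed values agree with the reported ones to within rounding of the log-likelihoods, and the penalty term 2k explains why a model with more parameters can have a higher AIC despite a slightly better likelihood.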
Figure 7. Results of the diffusion approximation models for the (a) Ctenochaetus striatus, (b) Chaetodon trifascialis, and (c) Chaetodon austriacus and Chaetodon melapterus data sets. For each data set, the observed “data” and the best fitting “model” are displayed. Shading indicates probability matrix as indicated by the embedded legend. Plots of all alternative models tested, for each data set, are provided as Figures S1–S3.