Literature DB >> 34529049

Population Structure, Genetic Connectivity, and Signatures of Local Adaptation of the Giant Black Tiger Shrimp (Penaeus monodon) throughout the Indo-Pacific Region.

Nga T T Vu^1,2, Kyall R Zenger^1,2, Catarina N S Silva², Jarrod L Guppy^1,2, Dean R Jerry^1,2,3.

Abstract

The giant black tiger shrimp (Penaeus monodon) is native to the Indo-Pacific and is the second most farmed penaeid shrimp species globally. Understanding genetic structure, connectivity, and local adaptation among Indo-Pacific black tiger shrimp populations is important for informing sustainable fisheries management and aquaculture breeding programs. Population genetic and outlier detection analyses were undertaken using 10,593 genome-wide single nucleotide polymorphisms (SNPs) from 16 geographically disparate Indo-Pacific P. monodon populations. Levels of genetic diversity were highest for Southeast Asian populations and were lowest for Western Indian Ocean (WIO) populations. Both neutral (n = 9,930) and outlier (n = 663) loci datasets revealed a pattern of strong genetic structure of P. monodon corresponding with broad geographical regions and clear genetic breaks among samples within regions. Neutral loci revealed seven genetic clusters and the separation of Fiji and WIO clusters from all other clusters, whereas outlier loci revealed six genetic clusters and high genetic differentiation among populations. The neutral loci dataset estimated five migration events that indicated migration to Southeast Asia from the WIO, with partial connectivity to populations in both oceans. We also identified 26 putatively adaptive SNPs that exhibited significant Pearson correlation (P < 0.05) between minor allele frequency and maximum or minimum sea surface temperature. Matched transcriptome contig annotations suggest putatively adaptive SNPs involvement in cellular and metabolic processes, pigmentation, immune response, and currently unknown functions. This study provides novel genome-level insights that have direct implications for P. monodon aquaculture and fishery management practices.

Entities: Chemical

Keywords: DArTseq; SNP; aquaculture; gene flow; population genetics; prawn

Mesh：

Year: 2021 PMID： 34529049 PMCID： PMC8495139 DOI： 10.1093/gbe/evab214

Source DB: PubMed Journal: Genome Biol Evol ISSN： 1759-6653 Impact factor: 3.416

Significance Traditional genetic markers (e.g., microsatellite repeats and mitochondrial genomes) can have limited power to detect fine-scale genetic structure and subtle signatures of local adaptation both within and among populations of important aquaculture species. We determined the genetic population structure and potential signatures of local adaptation both within and among 16 geographically disparate Indo-Pacific populations of the black tiger shrimp Penaeus monodon by sequencing 131,929 unique single-nucleotide polymorphisms present within the genome of 532 individual wild-caught shrimp. This study provides novel genome-level insights about P. monodon genetic structure and signatures of local adaptation, which collectively assist in the advancement of global aquaculture breeding programs and fishery management practices.

Introduction

The evolutionary forces of gene flow, genetic drift, mutation, and natural selection, which are all influenced by organism-specific life history, collectively shape the genetic structure of wild populations (Matolweni et al. 2000; Hemmer-Hansen et al. 2007). Population genetics research provides valuable estimates regarding genetic diversity and local adaptation, which are equally important factors for the adaptive capacity of key species under climate change scenarios (Grant et al. 2017; Oleksiak 2019). Facilitation of appropriate fisheries management and aquaculture practices can also be enhanced through population genetics research by providing novel insights into broodstock selection for breeding programs and discernment of hatchery stocks from wild populations (Williams and Hoffman 2009; Gjedrem 2012; Lorenzen et al. 2012; Bernatchez et al. 2017). In order to make accurate recommendations regarding aquaculture programs (e.g., seletive breeding) or fishery management plans for commercially important species, it is important to comprehensively understand the population structure, genetic diversity, and local adaptation within and among broad-scale populations (Lal et al. 2017). The giant black tiger shrimp Penaeus monodon is native to and considered one of the most important aquaculture species throughout the Indo-Pacific region. The life history of P. monodon comprises an offshore planktonic larval phase (approximately 20 days) before migrating to nursery areas, such as in-shore areas and mangrove estuaries (Motoh 1985; Queiroga and Blanton 2004). For example, tagged juveniles and adult P. monodon from Trinity Inlet in north Queensland, Australia moved north between 5 and 100 km over a span of 10–111 days (Gribble et al. 2003). High gene flow has also been reported among P. monodon samples collected in Madagascar, Mozambique, and South Africa despite geographic distances up to 2,000 km (Forbes et al. 1999). Therefore, the capacity for moderate to high-levels of gene flow (i.e., genetic connectivity) exists among geographically disparate P. monodon populations. To date, most studies that assessed population genetic structure and connectivity patterns of P. monodon throughout the Indo-Pacific region have used microsatellites, allozymes, and mitochondrial DNA (mtDNA) markers (Benzie et al. 2002; You et al. 2008; Waqairatu et al. 2012). These genetic markers can often have limited resolution power to detect fine-scale genetic structure and signatures of adaptation within and among populations (Nielsen et al. 2009; Allendorf et al. 2010). Despite limited resolution these previous studies demonstrated low-level of population differentiation (PD) throughout the Indo-Pacific Ocean, as well as lower genetic diversity in Indian Ocean than Pacific Ocean P. monodon populations; however, levels of predicted gene flow were not determined, nor were patterns of genetic connectivity among populations consistent across these studies (You et al. 2008; Waqairatu et al. 2012). The recent development of high-throughput sequencing techniques permits determination of both genome-wide neutral and adaptive genetic variation for nonmodel species, which can provide unique insight into population structure and local adaptation. For example, analysis of neutral genetic loci has been utilized for population structure determination (Batista et al. 2016; Segovia et al. 2017; Woodings et al. 2018), whereas outlier detection methods permit the separation of putatively adaptive (i.e., outlier) and neutral loci into two discrete datasets. Recent studies have demonstrated that outlier loci provide unique and useful insights into the genetic structure and management units of aquatic species that neutral loci often do not show (Gagnaire et al. 2015; Woodings et al. 2018; Anderson et al. 2019; Nowland et al. 2019). Collectively, the high-resolution power of genome-wide single nucleotide polymorphism (SNP) genotyping offers the potential for heightened understanding of P. monodon population genetic structure, connectivity, and local adaptation, which could translate into advancements in aquaculture and fishery practices. Accordingly, the main objective of this study was to utilize both neutral and outlier SNP datasets to determine the population structure, connectivity, and local adaptation of P. monodon populations from throughout the Indo-Pacific region (fig. 1 and supplementary table S1, Supplementary Material online). More specifically, this study aimed to: 1) evaluate genetic diversity and determine the spatial population structure among P. monodon populations using a combined (i.e., neutral and outlier loci) dataset, 2) determine the genetic connectivity and evolutionary relationship among populations, and 3) identify putatively adaptive outlier SNP (i.e., candidate genes).

Fig. 1.

Sampling sites and the known distribution of Penaeus monodon sourced in this study. Orange areas denote the known distribution of P. monodon (modified from Kongkeo [2005–2020]). See Schott et al. (2009) for detailed Indo-Pacific summer and winter monsoon currents maps. FJ: Fiji; Eastern Australia (EA; BB: Bramston Beach, EB: Etty Bay; TSV: Townsville); Northern Australia (NT; GC: Gulf of Carpentaria, JBG: Joseph Bonaparte Gulf, TIW: Tiwi Island); Western Australia (WA); Vietnam (CMVN: Ca Mau, NTVN: Nha Trang); PHI: Philippines; INDO: Indonesia; THL: Thailand; SLK: Sri Lanka; KE: Kenya; SA: South Africa. Map generated using R version 4.0.3.

Sampling sites and the known distribution of Penaeus monodon sourced in this study. Orange areas denote the known distribution of P. monodon (modified from Kongkeo [2005-2020]). See Schott et al. (2009) for detailed Indo-Pacific summer and winter monsoon currents maps. FJ: Fiji; Eastern Australia (EA; BB: Bramston Beach, EB: Etty Bay; TSV: Townsville); Northern Australia (NT; GC: Gulf of Carpentaria, JBG: Joseph Bonaparte Gulf, TIW: Tiwi Island); Western Australia (WA); Vietnam (CMVN: Ca Mau, NTVN: Nha Trang); PHI: Philippines; INDO: Indonesia; THL: Thailand; SLK: Sri Lanka; KE: Kenya; SA: South Africa. Map generated using R version 4.0.3.

Results

Genotyping and SNP Quality Control

A total of 131,929 unique SNPs were obtained for each of 532 P. monodon individuals (fig. 1) using DArTseqTM. Initial quality control assessment of DArTseqTM data using dartQC removed 119,514 SNPs, or 90.6% of loci (supplementary table S2, Supplementary Material online). The remaining 12,415 loci were then tested for linkage disequilibrium (LD) and Hardy–Weinberg equilibrium (HWE), which removed an additional 1,822 SNPs, or 1.4% of loci. Following genotype filtering, 12 individuals (nine from Kenya, one from Philippines, one from Sri Lanka, and one from Western Australia) were removed due to high rates of missing data (>26%). A final set comprised 10,593 SNPs (with missing data per SNP < 21% and per individual < 26%) for each retained P. monodon individual (n = 520) was subjected to further analyses.

Population Genetic Diversity

The dataset of 10,593 SNPs for each P. monodon individual (n = 520) exhibited a mean allelic richness (AR) per location that ranged from 1.10 in SA to 1.62 in INDO (table 1). The mean private allelic richness (APR) per population ranged from 0.002 in BB to 0.014 in SLK. The percentage of polymorphic loci (PPL) ranged from 18% to 76% for KE and TIW populations, respectively. Average observed heterozygosity (HO) across populations ranged from 0.02 to 0.15, whereas average expected heterozygosity (HE) ranged from 0.10 to 0.17. Of note is that HO and HE values across 14 of 16 populations were almost the same (0.01 difference), whereas two Western Indian Ocean (WIO) populations (KE and SA) exhibited a 5-fold difference between their HO and HE values (0.02 and 0.10), respectively (table 1). Samples from SA and KE populations exhibited the lowest values for HE (0.10), average multilocus heterozygosity (Av. MLH = 0.023), and standardized multilocus heterozygosity (sMLH = 0.20), whereas their average internal relatedness (IR) was much higher than within other populations. Inbreeding coefficient (FIS) values among sampling locations varied from −0.06 in WA to 0.80 in KE and 0.78 in SA. Among populations, the four Southeast Asia populations (NTVN, CMVN, INDO, and THL) displayed higher values for observed HE (0.16–0.17), average multilocus heterozygosity (Av. MLH: 0.15), and sMLH (1.19–1.25) than other populations (table 1).

Table 1

Genetic Diversity Indices among 16 Populations of Penaeus monodon Assessed Using All 10,593 High-Quality-Filtered SNP Loci

Population	n	A _R (±SD)	A _PR (±SD)	PPL (Av. MAF)	H _o (±SD)	H _E (±SD)	Av. MLH (±SD)	sMLH (±SD)	IR (±SD)	F _IS (P < 0.05)
Fiji (FJ)	49	1.39 ± 0.45	0.008 ± 0.074	0.48	0.11 ± 0.16	0.12 ± 0.17	0.109 ± 0.005	0.88 ± 0.04	0.04 ± 0.02	0.06
Bramston Beach (BB)	51	1.49 ± 0.43	0.002 ± 0.019	0.64	0.12 ± 0.15	0.13 ± 0.16	0.117 ± 0.005	0.95 ± 0.04	0.003 ± 0.027	0.07
Etty Bay (EB)	31	1.48 ± 0.45	0.002 ± 0.027	0.58	0.12 ± 0.16	0.13 ± 0.16	0.118 ± 0.003	0.95 ± 0.03	−0.003 ± 0.021	0.06
Townsville (TSV)	22	1.49 ± 0.45	0.005 ± 0.049	0.56	0.12 ± 0.16	0.13 ± 0.16	0.117 ± 0.003	0.95 ± 0.03	0.004 ± 0.020	0.07
Gulf of Carpentaria (GC)	30	1.54 ± 0.44	0.003 ± 0.031	0.65	0.13 ± 0.16	0.14 ± 0.16	0.131 ± 0.008	1.06 ± 0.07	−0.03 ± 0.04	0.07
Joseph Bonaparte Gulf (JBG)	34	1.54 ± 0.43	0.003 ± 0.030	0.66	0.13 ± 0.16	0.14 ± 0.16	0.132 ± 0.004	1.07 ± 0.03	−0.04 ± 0.02	0.07
Tiwi Island (TIW)	50	1.55 ± 0.42	0.003 ± 0.024	0.71	0.13 ± 0.15	0.14 ± 0.16	0.131 ± 0.004	1.06 ± 0.03	−0.03 ± 0.02	0.08
Western Australia (WA)	22	1.46 ± 0.47	0.006 ± 0.060	0.5	0.14 ± 0.19	0.13 ± 0.17	0.138 ± 0.03	1.13 ± 0.26	−0.03 ± 0.06	−0.06
Philippines (PHI)	12	1.49 ± 0.50	0.005 ± 0.059	0.49	0.14 ± 0.19	0.14 ± 0.17	0.138 ± 0.004	1.12 ± 0.03	−0.06 ± 0.01	0.01
Nha Trang Vietnam (NTVN)	30	1.58 ± 0.44	0.007 ± 0.065	0.68	0.15 ± 0.17	0.16 ± 0.17	0.148 ± 0.004	1.20 ± 0.03	−0.11 ± 0.02	0.08
Ca Mau Vietnam (CMVN)	39	1.61 ± 0.40	0.004 ± 0.031	0.76	0.15 ± 0.16	0.16 ± 0.17	0.146 ± 0.005	1.19 ± 0.04	−0.10 ± 0.02	0.10
Java Indonesia (INDO)	38	1.62 ± 0.40	0.004 ± 0.035	0.76	0.15 ± 0.16	0.17 ± 0.17	0.150 ± 0.013	1.22 ± 0.11	−0.10 ± 0.03	0.09
Thailand (THL)	46	1.54 ± 0.46	0.010 ± 0.078	0.60	0.15 ± 0.18	0.16 ± 0.17	0.154 ± 0.005	1.25 ± 0.04	−0.13 ± 0.02	0.03
Sri Lanka (SLK)	19	1.49 ± 0.47	0.014 ± 0.090	0.53	0.11 ± 0.16	0.12 ± 0.16	0.113 ± 0.020	0.92 ± 0.16	0.01 ± 0.07	0.09
Kenya (KE)	16	1.11 ± 0.30	0.010 ± 0.076	0.18	0.02 ± 0.08	0.10 ± 0.28	0.023 ± 0.002	0.20 ± 0.02	0.65 ± 0.03	0.80
South Africa (SA)	31	1.10 ± 0.28	0.005 ± 0.055	0.19	0.02 ± 0.08	0.10 ± 0.27	0.024 ± 0.001	0.20 ± 0.01	0.65 ± 0.02	0.78

Note.—n, number of samples; AR, mean allelic richness; APR; private allelic richness; PPL, percentage of polymorphic loci; Av. MAF, average minor allele frequency of polymorphic loci; HO, mean observed heterozygosity; HE, average expected heterozygosity; Av. MLH, average multilocus heterozygosity; sMLH, standardized multilocus heterozygosity; IR, internal relatedness; FIS, significant (P < 0.05) inbreeding coefficient; SD, standard deviation.

Genetic Diversity Indices among 16 Populations of Penaeus monodon Assessed Using All 10,593 High-Quality-Filtered SNP Loci Note.—n, number of samples; AR, mean allelic richness; APR; private allelic richness; PPL, percentage of polymorphic loci; Av. MAF, average minor allele frequency of polymorphic loci; HO, mean observed heterozygosity; HE, average expected heterozygosity; Av. MLH, average multilocus heterozygosity; sMLH, standardized multilocus heterozygosity; IR, internal relatedness; FIS, significant (P < 0.05) inbreeding coefficient; SD, standard deviation.

Detection of Outlier Loci

Outlier loci were determined from the dataset of 10,593 SNPs for each P. monodon individual (n = 520) using two independent approaches: BAYESCAN and PCAdapt (see Materials and Methods). BAYESCAN and PCAdapt analyses (FDR = 0.01 for both) identified 965 SNPs and 1,252 SNPs as being under putative selection (associated with positive alpha values), respectively. Of these SNPs 663 were overlapping and, thus, deemed the outlier loci dataset, whereas the remaining SNPs (n = 9,930) were deemed the neutral loci data set.

Genetic Population Structure

Cross‐validation by ADMIXTURE inferred the lowest cross-validation errors (ADMIXTURE approach) ranged from K = 6 to K = 8 for the genetic structure of neutral and outlier loci (supplementary fig. S1, Supplementary Material online). We visualized clustering patterns for K values 6 through 8 to identify hierarchal clustering patterns as K increases (fig. 2).

Fig. 2.

Population structure results for Penaeus monodon populations (n = 16) using neutral loci (A, C, E) and outlier loci (B, D, F). Plots of individual admixture determined using ADMIXTURE with cross-validation error using three unique K values (K = 6, K = 7, and K = 8) for both neutral (A) and outlier (B) loci. Assignment of Southeast Asia group and SLK to cluster at K = 2 and K = 3 as inferred ADMIXTURE analysis using neutral loci (C) and outlier loci (D). DAPC analysis plots with BIC recommended K value of K = 7 for neutral loci (E) and K = 6 for outlier loci (F). Individuals of the same color belong to the same cluster. Population names are defined in figure 1. At K = 6, for both neutral and outlier datasets, P. monodon from FJ and WIO clusters were separated from the rest of the sampled populations (fig. 2). In addition, at K = 6, based on the neutral loci, all seven Australian and Southeast Asia populations (PHI, CMVN, NTVN, and INDO) exhibited a high degree of admixture, whereas the THL population divided into three distinct groups (includes two single clusters). At K = 7, for neutral loci, the main signals derived from ADMIXTURE approach was: 1) separation of FJ and WIO clusters from the rest of the sampled populations, 2) clustering of Northern Australia and PHI populations (NAP cluster; GC, JBG, TIW, and PHI), and 3) overlapping genetic clusters for five populations (CMVN, NTVN, INDO, THL, and SLK) (figs. 2). At K = 7 and K = 8 for outlier loci, shrimp from five populations (CMVN, NTVN, INDO, PHI, and SLK) exhibited some degree of admixture with six genetic clusters (fig. 2). The admixture between five populations (CMVN, NTVN, INDO, THL, and SLK) was further investigated in a substructure analysis for both neutral and outlier loci datasets (fig. 2, see Materials and Methods), which revealed THL to be an admixture between two clusters: the SLK population and Southeast Asia group (CMVN, NTVN, and INDO). The patterns identified by ADMIXTURE analysis were supported by the discriminant analysis of principal components (DAPC) analysis (fig. 2). The Bayesian Information Criterion (BIC) statistic generated by DAPC indicated that the optimal number of clusters for neutral and outlier loci was K = 7 and K = 6, respectively (supplementary fig. S1, Supplementary Material online). As with the ADMIXTURE results for neutral loci, DAPC confirmed the presence of seven clusters: 1) Fiji (FJ); 2) Eastern Australia cluster (EA; BB, EB, and TSV); 3) NAP cluster (GC, JBG, TIW, and PHI); 4) Western Australia (WA); 5) Southeast Asia cluster (CMVN, NTVN, INDO, and THL); 6) Sri Lanka (SLK); and 7) WIO cluster (KE and SA). The results of DAPC based on the outlier dataset revealed the following six clusters: 1) FJ; 2) EA; 3) NAP and WA cluster (GC, JBG, TIW, PHI, and WA); 4) Southeast Asia; 5) SLK; and 6) WIO (fig. 2). The majority of population-pairwise FST values differed significantly from one another (P < 0.01) for both neutral and outlier loci datasets with the following three exceptions: 1) KE and SA populations in both neutral and outlier loci datasets, 2) JBG and TIW populations in neutral loci dataset, and 3) EA group (BB, EB, and TSV) in outlier loci dataset (supplementary table S3, Supplementary Material online). The highest pairwise FST values occurred between the WIO (KE and SA) and all the Pacific Ocean populations, with FST values ranging from 0.33 to 0.56 and 0.51 to 0.96 for neutral and outlier loci datasets, respectively (P < 0.01). Interestingly, in both neutral and outlier datasets, pairwise FST values between WIO and Fiji, as well as between WIO and Australian populations were higher than FST values observed among WIO and Southeast Asian populations (CMVN, NTVN, INDO, and THL). We also observed that geographically proximal populations exhibited the lowest FST values (e.g., between KE and SA and among EA populations; supplementary table S3, Supplementary Material online). Overall, average FST was higher for outlier loci (FST=0.40 ± 0.29) than neutral loci (FST=0.16 ± 0.16) (supplementary table S3, Supplementary Material online). A series of Mantel tests were conducted for both neutral and outlier datasets in different subsets of sampling sites to assess the correlation between geographic and genetic distance. Significant isolation by distance (IBD) was found when all 16 locations were considered for both neutral and outlier datasets (r2 = 0.865 and r2 = 0.902, P < 0.001, respectively; fig. 3). Similarly, IBD relationships among the seven clusters were significant and slightly decreased for both neutral and outlier datasets (r2 = 0.793 and r2 = 0.824, P < 0.001, respectively; supplementary fig. S2, Supplementary Material online).

Fig. 3.

IBD plots with Mantel correlograms for the relationship between genetic (FST) and geographic (km) distances among Penaeus monodon populations (n = 16) using (A) 9,930 neutral SNPs and (B) 663 outlier SNPs. Analyses of molecular variance (AMOVAs) for the two oceans (Indian and Pacific) using neutral loci showed a weak, but significant population structure (table 2; FSC = 0.075, P < 0.01). More specifically, 22.8% of molecular variation occurred between the Indian and Pacific Oceans, whereas 5.8% of molecular variation occurred among populations within groups (n = 7). AMOVA of both neutral and outlier datasets, which was based on seven clustering populations (see Materials and Methods), revealed a significant genetic divergence within and among all sampled populations (P < 0.01; table 2). For neutral loci, the majority of genetic variability (83.01%) was explained within populations and 13.9% among groups; however, for outlier loci, the genetic variability explained within populations (51.9%) was only slightly higher than the genetic variability explained among groups (46.7%). The patterns observed for both neutral and outlier datasets were supported by pairwise FST comparisons, which were significant at the group level (table 2). AMOVA results also yielded significant FSC values in both neutral and outlier analyses (FSC = 0.01711 and 0.04685, respectively), indicating significant genetic division among these seven groups (P < 0.01).

Table 2

AMOVA among Penaeus monodon Populations (n = 16), among Regional Groups of Populations Identified by ADMIXTURE and DAPC Analyses, and among Individuals within Populations Using Neutral and Outlier SNP Datasets

Variance Partition		Degrees of Freedom	Sum of Squares	Variance Component	Variation (%)	Fixation Indexes	P Value
(A) Between the Indian and Pacific Oceans
Neutral dataset (9,930 SNPs)	Among groups	1	17,168.99	69.54	22.75	FSC: 0.07529	<0.01
	Among populations within groups	14	19,378.48	17.78	5.82	FCT: 0.22748	<0.01
	Among individuals within populations	504	112,215.79	4.27	1.4
	Within individuals	520	111,340.50	214.12	70.04	FIT: 0.29960	<0.01

(B) Among seven regional groups

Neutral dataset (9,930 SNPs)	Among groups	6	32,347.27	35.74	13.86	FSC: 0.01711	<0.01
	Among populations within groups	9	4,200.20	3.80	1.47	FCT: 0.13857	<0.01
	Among individuals within populations	504	112,215.79	4.27	1.65
	Within populations	520	111,340.50	214.12	83.01	FIT: 0.16986	<0.01
Outlier dataset (663 SNPs)	Among groups	6	15,306.66	17.93	46.68	FSC: 0.04685	<0.01
	Among populations within groups	9	727.41	0.96	2.50	FCT: 0.46681	<0.01
	Among individuals within populations	504	9,691.87	−0.29	−0.76
	Within individuals	520	10,304.5	19.82	51.58	FIT: 0.48416	<0.01

TreeMix Analysis

The generated maximum likelihood (ML) phylogenetic tree, which considered entire neutral loci dataset (n = 9,930 SNPs), was rooted to Fiji (FJ) given the geographic separation of this population from all other sampled populations. Five migration events (m = 5) provided the best explanation of sample covariance (fig. 4) out of all migration events tested (m = 0–10; supplementary fig. S3, Supplementary Material online). After adding five migration events (log-likelihood = 1,001.24), the likelihood was not further increased by increasing the number of migration events up to 10 (supplementary fig. S4, Supplementary Material online). Consistent with DAPC and ADMIXTURE analyses (with or without migration events included), TreeMix analysis indicated migration to Southeast Asia from the WIO sites and a major phylogeographic split between WIO and Pacific Ocean populations (fig. 4 and supplementary fig. S3, Supplementary Material online).

Fig. 4.

(A) Map of biogeographic break and the proportion of each of the 16 Penaeus monodon populations assigned to seven clusters identified in the program ADMIXTURE using 9,963 neutral loci. The proportion of individuals assigned to each cluster per site depicted in colored pie charts. (B) Population relationships and migration edges of Penaeus monodon inferred by TreeMix for neutral SNP dataset at five migration events (m = 5). Fiji (FJ) was use as outgroup to root the tree. The migration events are colored according to their weight. (C) and (D) show the minor allele frequency distribution of seven and 20 temperature correlated putatively adaptive SNPs for each Indo-Pacific population, respectively. Population names are the same as defined in figure 1. There were three additional migration events included in the ML tree from the WIO to Southeast Asia with strong migration weights (>0.35; fig. 4). Within these regions, THL shared a branch with Indian Ocean populations (SLK, KE, and SA), which demonstrated the significant relationship between these populations and provided evidence for gene flow from the WIO to SLK and from THL to each of the Southeast Asia populations (NTVN, CMVN, and INDO). The fourth migration event suggested a slightly weighted migration from THL to the Northern and Western Australian populations (TIW, JBG, GC, and WA). The last migration event estimated a strongly weighted migration of NTVN and CMVN to THL (fig. 4). These results indicated either migration to THL from both the Indian Ocean (SA, KE, and SLK) and the Southeast Asia (CMVN and NTVN) populations or bidirectional gene flow within the THL population. To confirm gene flow among P. monodon populations f3-statistics were also calculated, which resulted in 212 significant f3 scores with Z scores ranging from −32.35 to −0.07 (supplementary table S4, Supplementary Material online). These significant f3-statistics provided further support for the five migration events demonstrated by TreeMix analysis in that eight populations (BB, GC, JBG, TIW, CMVN, NTVN, INDO, and THL) each received gene flow from two other populations (supplementary table S4, Supplementary Material online). Sixteen f3-statistics were significantly positive (Z score < −20) when CMVN and INDO were admixed by SLK, PHI, and seven Oceania populations (GC, JBG, TIW, BB, EB, TSV, and FJ). Interestingly, all 16 significantly positive f3-statistics included SLK as one of the two source populations.

Identification of Putatively Adaptive SNPs

The transcriptome BLAST approach (see Materials and Methods) revealed that 20.5% of outlier loci (n = 136) matched coding contigs (n = 1,408 total) with an E-value < 10−5 (supplementary table S5, Supplementary Material online). Of these only 19.5% (n = 27) exhibited 100% pairwise identity to 62 transcriptome contigs (supplementary table S6, Supplementary Material online). Putative functional annotations were classified into 17 functional groups with eight (12.9%) annotated as “Predicted: uncharacterized protein”, 13 (20.9%) annotated as “N/A”, and 41 (66.2%) annotated as cellular and metabolic processes, pigmentation, and immune response genes (supplementary table S6, Supplementary Material online). Three of these putatively adaptive SNPs were missing from WIO populations and thus removed from subsequent correlation analysis. Pearson correlations revealed that the minor allele frequency (MAF) for four and three of these 24 putatively adaptive SNPs was significantly correlated (P < 0.05) with sea surface temperature maximum (SST_max) and minimum (SST_min), respectively (table 3 and supplementary fig. S5, Supplementary Material online). Of these, SNPs 2019PM_4058 (C > G) and 2019PM_6523 (T > G) matched transcriptome contigs annotated as “AChain Apocrustacyanin C1 Crystals” and “C-type lectin,” respectively (supplemental table S6, Supplementary Material online), whereas the other five temperature correlated putatively adaptive SNPs matched contigs that were annotated as “Predicted: uncharacterized protein”, “nucleic acid binding”, or “N/A” (supplemental table S6, Supplementary Material online).

Table 3

Pearson Correlations for Putatively Adaptive SNPs that Exhibited a Significant Relationship between Minor Allele Frequency and Sea Surface Temperature Maximum or Minimum

SNP ID	SST_max		SST_min
SNP ID	R	P value	R	P value
(A) Candidate SNPs identified by the first analysis
2019PM_4058	−0.666	0.005	0.005	0.986
2019PM_4727	−0.667	0.005	−0.162	0.550
2019PM_6523	−0.629	0.009	−0.248	0.354
2019PM_7307	−0.541	0.030	−0.325	0.220
2019PM_3144	−0.209	0.437	−0.504	0.047
2019PM_5152^a	−0.305	0.252	−0.650	0.006
2019PM_6229	−0.249	0.352	−0.763	0.001

(B) Candidate SNPs identified by the second analysis

2019PM_665	−0.580	0.020	−0.080	0.770
2019PM_4130	−0.600	0.010	−0.060	0.840
2019PM_1552	−0.639	0.008	−0.237	0.378
2019PM_2811	−0.655	0.006	−0.251	0.349
2019PM_3848	−0.670	0.005	−0.222	0.409
2019PM_6807	−0.596	0.015	−0.296	0.266
2019PM_7282	−0.588	0.017	−0.177	0.513
2019PM_10077	−0.633	0.008	−0.098	0.719
2019PM_10517	−0.564	0.023	−0.143	0.596
2019PM_1233	−0.313	0.238	−0.708	0.002
2019PM_1287	−0.244	0.362	−0.760	0.001
2019PM_2814	−0.220	0.413	−0.561	0.024
2019PM_3748	−0.471	0.065	−0.635	0.008
2019PM_5152^a	−0.304	0.252	−0.650	0.006
2019PM_6872	−0.209	0.437	−0.685	0.003
2019PM_6979	−0.208	0.440	−0.686	0.003
2019PM_8233	−0.263	0.324	−0.500	0.048
2019PM_9353	−0.289	0.278	−0.653	0.006
2019PM_9861	−0.254	0.343	0.540	0.031
2019PM_10081	−0.315	0.234	−0.632	0.009

Note.—SST_max, sea surface temperature maximum; SST_min, sea surface temperature minimum. Significant values (P < 0.05) are bolded.

Overlapping SNP across both utilized approaches for putatively adaptive SNPs identification.

Pearson Correlations for Putatively Adaptive SNPs that Exhibited a Significant Relationship between Minor Allele Frequency and Sea Surface Temperature Maximum or Minimum Note.—SST_max, sea surface temperature maximum; SST_min, sea surface temperature minimum. Significant values (P < 0.05) are bolded. Overlapping SNP across both utilized approaches for putatively adaptive SNPs identification. The combined environmental association (EnA) and PD approaches (see Materials and Methods) revealed full RDA model support for the role of sea surface temperature in P. monodon SNP genotype distribution (P < 0.001; R2 = 0.058; adjusted R2 = 0.055). Based on these two significantly constrained axes the RDA model identified 1,256 putatively adaptive SNPs. LFMM analysis (FDR = 0.05) identified 424 putatively adaptive SNPs that were significantly associated with SST_max or SST_min of which 66 overlapped with RDA identified putatively adaptive SNPs. Of these, 35 overlapped with the PD determined outlier SNP dataset (see Materials and Methods; supplementary fig. S6, Supplementary Material online). Three of these were missing from WIO populations and thus removed from subsequent correlation analysis. Pearson correlations revealed that the MAF for 11 and 9 of these 35 putatively adaptive SNPs was significantly correlated (P < 0.05) with SST_max and SST_min, respectively (table 3 and supplementary fig. S7, Supplementary Material online). Of these only 2019PM_5152 exhibited 100% pairwise identity with two transcriptome contigs annotated as “transcriptional regulatory–like” and “N/A” (supplementary table S6, Supplementary Material online). MAF distribution for the seven temperature correlated putatively adaptive SNPs identified by the transcriptome BLAST approach varied between 0.02 and 0.5 for all Southeast Asia populations (CMVN, NTVN, INDO, and THL) and between 0.2 and 1 for all other Indo-Pacific populations except for South Africa (SA), Kenya (KE), and Fiji (FJ), which all exhibited a MAF of 1 (fig. 4). Similarly, MAF distribution for the 20 temperature correlated putatively adaptive SNPs identified by the combined EnA and PD approach varied between 0.02 and 0.5 for Southeast Asia (CMVN, NTVN, INDO, and THL) and Sri Lanka (SLK) populations and between 0.2 and 1 for all other Indo-Pacific populations except for South Africa (SA), which exhibited a MAF of 1 (fig. 4).

Discussion

The giant black tiger shrimp is an important aquaculture and fishery species in Indo-Pacific countries and comprehensive understanding of population genetic structure, genetic connectivity, and local adaptation is of value to assist with identifying: 1) source broodstocks, 2) wild stocks with different genetic backgrounds, and 3) appropriate fishery management strategies. Using whole-genome SNP sequencing we provide the most comprehensive analysis of P. monodon population genetic structure within the Indo-Pacific region to date, as well as new insights regarding gene flow from the Indian Ocean to the Pacific Ocean. The results from this study revealed a strong pattern of genetic structure within and among P. monodon Indo-Pacific populations that corresponded with broad geographical regions and clear genetic breaks among samples within regions, which in several cases is associated with geographical barriers, ocean surface currents, and environmental variables. Analyses based on neutral SNPs (n = 9,930) revealed: 1) five migration events that indicated gene flow to Southeast Asia from the WIO sites, 2) gene flow among P. monodon populations in the Pacific region, and 3) partial connectivity among populations native to both oceans. Identification of 26 putatively adaptive SNPs revealed that sea surface temperature may be important factors driving local adaption of P. monodon across the Indo-Pacific. This study highlights the importance of conducting analyses on both neutral and outlier loci given the different requirements for P. monodon aquaculture and fishery management.

Population Genetic Structure and Gene Flow

Major Genetic Break within the Indo-Pacific

Using both neutral and outlier loci, the results from this study revealed a pattern of strong genetic structure for Indo-Pacific P. monodon that corresponded with broad geographical regions and clear genetic breaks among samples within geographically discrete regions. Neutral loci, which exhibited seven discernable genetic clusters, indicated a weak genetic structure across Southeast Asia and the Northwest Indian Ocean, as well as the separation of Fiji and WIO clusters from all other Indo-Pacific populations. Outlier loci, which exhibited six discernable genetic clusters, indicated a stronger genetic structure among Indo-Pacific populations as well as revealed a high degree of admixture (i.e., shared ancestry) among populations within each genetic cluster. Genetic differentiation among Indo-Pacific P. monodon populations included in the study exhibited notable variation (supplementary table S3, Supplementary Material online). Outlier loci revealed an increase in genetic divergence across sample sites, which could be due to directional selection because identified genetic clusters correspond to locally adapted populations (see below). In addition to genetic differences, Mantel tests based on both neutral and outlier loci also showed a significant positive correlation between genetic and geographic distances, which indicates that geographic distance plays an important role in limiting gene flow among P. monodon populations. AMOVA results indicated a weak, but significant (P < 0.01), population structure between the Indian and Pacific Oceans, as well as among the seven identified genetic clusters, which suggests that substantial gene flow may occur between the Indian and Pacific Oceans (table 2). Our results indicated significant genetic differences among P. monodon populations. This finding is supported by similar previous observations for P. monodon (Benzie et al. 2002; You et al. 2008; Waqairatu et al. 2012), giant red shrimp Aristaeomorpha foliacea (Fernández et al. 2013), skunk clownfish Amphiprion akallopisos (Huyghe and Kochzius 2018), and holothurian Stichopus chloronotus (Pirog et al. 2019). Particularly, the three previous P. monodon studies demonstrated that the greatest genetic differentiation among populations existed between Southeast Asia and Australia and between Southeast Asia and southeast Indian Ocean (Benzie et al. 2002; You et al. 2008), or between Indian and Pacific Ocean populations (Waqairatu et al. 2012). There are many factors that contribute to the genetic differentiation of Indo-Pacific P. monodon, including environmental and ecological factors, biogeographic barriers between the Indian and Pacific Oceans, and life-history traits (Vogler et al. 2012; Hui et al. 2016; Amaral et al. 2017; Huyghe and Kochzius 2017). It is possible that biogeographic barriers between the Indian and Pacific Oceans could restrict gene flow and obstruct the physical movement of many tropical marine species between these ocean basins. One such biogeographic barrier is the Indonesian Sunda Shelf, which has been considered to play a role in (if not entirely responsible for) the genetic differences observed among Indo-Pacific P. monodon (Benzie et al. 2002; You et al. 2008; Waqairatu et al. 2012), as well as other Indo-Pacific marine species (Horne et al. 2008; Gaither et al. 2009; Simmonds et al. 2018; Wainwright et al. 2018). Of note is that the influence of this biogeographic barrier on gene flow remains during times of sea level rise following last glacial maximum (Randall 1998; Rocha et al. 2007; Crandall et al. 2019).

Regional Population Structure within the Pacific Ocean

Significant population genetic structure was evident within Pacific Ocean P. monodon; neutral loci indicated five distinct genetic clusters are present, whereas outlier SNPs found evidence for four clusters (fig. 2). The main difference in genetic structure was that the Western Australian (WA) population was a discrete cluster (neutral SNPs) versus part of the Northern Australia and Philippines (NAP) cluster (outlier SNPs). Recently, Vu et al. (2020) demonstrated that P. monodon from a Western Australian population was genetically divergent from multiple Eastern and Northern Australia populations (i.e., discrete cluster) based on outlier loci determined using EnA analyses (n = 89 SNPs). We believe that the difference between the present study and Vu et al. (2020) is most likely due to outlier determination being based on PD versus EnA analyses (n = 663 vs. 89), respectively. Other non-SNP studies also reported P. monodon from Western Australia as being genetically distinct from other Australian P. monodon populations (Brooker et al. 2000; Benzie et al. 2002). Taken together, we suggest that the Western Australia P. monodon population should be managed as a unique genetic stock. Within the Pacific Ocean, the Southeast Asian populations displayed a mixed pattern of differentiation and gene flow with those from Northern Australia (fig. 2). For example, ADMIXTURE analysis of both neutral and outlier loci revealed that populations in the Southeast Asia group shared ancestry with other clusters in the Pacific Ocean; Philippines was grouped with the Northern Australian group. TreeMix analysis also revealed low-level gene flow from Thailand to Northern-Western Australian populations (GC, JBG, TIW, and WA). Additionally, many Southeast Asian countries have long histories of trading in live P. monodon broodstock for aquaculture which could lead to the high admixture between this region and hard to determine genetic relationships of the shrimp examined from this region. Based on a whole SNP dataset (n = 10,593) all Pacific Ocean populations, except Fiji, exhibited a high level of within-population genetic diversity, whereas Southeast Asia populations (e.g., Indonesia, Vietnam, and Thailand) exhibited the highest among-population genetic diversity (table 1). This elevated genetic diversity among Southeast Asia populations could be driven by the migration of genetically divergent Indian Ocean shrimp into Southeast Asia (fig. 4) and/or environmental variables within Southeast Asia that promote genetic diversification through local adaptation to niche habitats. For example, the expansive mangroves found in Southeast Asia account for almost half of the total global above‐ground biomass (Hutchison et al. 2014), which could lead to P. monodon genetic diversification through population expansion given that mangroves are known to provide ideal nursery habitat for P. monodon larvae (i.e., food and protection; Lee 2004; Sheaves et al. 2012). In addition genetic diversification being driven by migration events and/or potential local adaptation, Benzie et al. (2002) demonstrated the maintenance of large P. monodon populations within Southeast Asia as well as the evolution of genetic variants in peripheral populations, which were concluded to be the main factors driving the high level of observed genetic diversity. The observed heterogeneity and reduced genetic diversity within Fiji are most likely driven by geographic isolation, which is consistent with the significant differentiation of Fiji compared with all other Pacific Ocean populations. Waqairatu et al. (2012) demonstrated that Fijian P. monodon are genetically distinct from other Pacific Ocean populations and that this population was derived from a relatively small founder population that has remained isolated due to its geographic remoteness. The reduction in genetic diversity observed by Waqairatu et al. (2012) and this study could also be due to the introduction of P. monodon from the Philippines (Choy 1982) and Eastern Australian (Waqairatu and Pickering 2010) for aquaculture purposes because this could have led to the release or escape of less diverse aquaculture progeny into the wild and interbreeding with the small and diversity-limited founder population. However, our SNP dataset is unable to conclusively determine whether this lack of genetic diversity is due to: 1) a recent population bottleneck, 2) genetic drift from a small number of founder individuals, 3) influence of escaped or released aquaculture individuals, or 4) the various combinations of population bottleneck, genetic drift, and/or aquaculture influence. Future studies aimed at improving Fiji-based P. monodon aquaculture should focus on determining demographic processes and introgression patterns to further understand the root cause(s) and identify the most effective means of counteracting this low genetic diversity.

Regional Population Structure within the Indian Ocean

At a regional scale, P. monodon collected at three locations in the Northwest and West Indian Ocean were clustered into two distinct groups, with Sri Lanka forming one cluster and WIO (Kenya and South Africa) comprising the other cluster (pairwise FST ≥ 0.39). Genetic divergence of Sri Lanka from other WIO populations has also been documented among marine populations of other species and explained by life history characteristics, environmental transitions, large geographical distances, geographical barriers, and/or ocean surface currents (Vogler et al. 2012; Hui et al. 2016; Amaral et al. 2017; Huyghe and Kochzius 2017). For example, Vogler et al. (2012) concluded that geographical barriers and ocean surface currents were drivers of the observed differences in the population genetic structure of Indian Ocean crown-of-thorns starfish Acanthaster planci, which is also a benthic species with a pelagic larvae stage lasting 3–4 weeks. In the present study, TreeMix analysis revealed that there was a very low level of gene flow between WIO populations and Sri Lanka (fig. 4), as well as the lack of any significant negative f3-statistics for three-population tests for Sri Lanka, South Africa, and Kenya (supplementary table S4, Supplementary Material online). Genetic divergence of P. monodon in the east, north, and west Indian ocean could have been driven by the loss of habitat and geographic fragmentation experienced during the last glacial maximum when the sea level dropped by approximately 120 m given the reported recolonization (by shelf areas or populations further north/east) of lost habitat following sea level rise following last glacial maximum (Forbes et al. 1999; Benzie et al. 2002; Silva et al. 2009). Moreover, the Indian Ocean has a complex current system with the South Equatorial Current flowing westward across the Indian Ocean and splitting into the Southeast and Northeast Madagascar Current, which can facilitate long-distance larval dispersal as well as generate migration barriers (Schott et al. 2009; Otwoma and Kochzius 2016; Huyghe and Kochzius 2018). Considering the offshore planktonic larval phase (approximately 20 days) of P. monodon (Motoh 1985), a stepping-stone model could explain larval dispersal throughout the entire Indian Ocean over several generations because the period of seasonal monsoons (northeast winds in December/January and southwest winds in July/August) corresponds with the peak P. monodon spawning seasons in the Indian Ocean (Sk et al. 2015; Kaka et al. 2019). Although we do not know the exact divergence time between the ancestral (WIO populations) and the Indo-Polynesian lineage (Sri Lanka, Southeast Asia, Australia, and Fiji), it could have occurred during the late Pleistocene given the lineage divergence timing of other species in this region (Vogler et al. 2012; Hui et al. 2016; Farhadi et al. 2017; Huyghe and Kochzius 2017). Future studies should further elucidate the degree of and potential reasons for genetic divergence among populations within the Indian Ocean by undertaking extensive sampling within this region of P. monodon’s distribution. At a local scale, genetic population structure analysis using both outlier and neutral loci also failed to determine the presence of genetic structure among WIO P. monodon populations. Strong gene flow between Kenya and South Africa could be due to the southward Mozambique Current, which is a branch that splits off from the Northeast Madagascar Current at the northern tip of Madagascar and creates a number of eddies on its way through the Mozambique Channel to the southward WIO (Schott et al. 2009; Huyghe and Kochzius 2018). Population genetic structure and gene flow analyses (DAPC, ADMIXTURE, and TreeMix) revealed that the WIO cluster was genetically divergent from all Indo-Polynesian populations (Sri Lanka, Southeast Asia, Australia, and Fiji). More specifically, significantly elevated and positive FIS values were observed in WIO populations (FIS = 0.80 and 0.78 for Kenya and South Africa, respectively), which suggests heterozygosity deficiency (table 1). The significantly elevated FIS values and reduced heterozygosity observed for these WIO populations could, therefore, be due to a genuine reduction in genome-wide heterozygosity (e.g., Wahlund effect; De Meeûs 2018) and/or the presence of null alleles (i.e., technical artifacts of the RAD-seq genotyping approaches; Davey et al. 2013) because selection pressure is not likely the driving factor (i.e., neutral loci exhibited elevated FIS values; supplementary table S7, Supplementary Material online). A reanalysis of data that reduced the entire SNP dataset from 10,593 to 2,155 loci for all individuals (n = 520) by retaining only the most informative SNPs (> 95% concurrence in technical replicates with missing data per SNP < 2%) reduced the FIS values of Kenya and South Africa populations to 0.03 and 0.02 (from 0.80 and 0.78; table 1), respectively, whereas HO and HE only differed by 0.01–0.02 across all 16 populations (supplementary table S8, Supplementary Material online). Interestingly, this subset of SNPs had no effect on the diversity metrics of any other populations (e.g., significant Pearson correlations across all 16 populations were identified between HE, HO, and AR based on the whole 10,593 SNPs dataset as well as the 2,155 SNPs subset; R = 0.9–1, P < 0.00001; data not shown), which confirms that, the observed elevation in FIS values were was specific to WIO populations. Taken together, it is unclear whether the elevated FIS values observed in WIO populations are the result of null alleles (i.e., artifacts) or/and the Wahlund effect (i.e., real); however, given that elevated FIS values were exclusively observed for WIO populations, null alleles are more likely the driving factor than the Wahlund effect (Waples 2018). Regardless, both factors have been suggested as the two most common causes for the elevated FIS values observed in Pacific whiteleg shrimp Litopenaeus vannamei (Valles-Jimenez et al. 2004), and kuruma shrimp Penaeus japonicus (Tsoi et al. 2007).

Signatures of Local Adaptation

Vu et al. (2020) recently demonstrated the role of environmental factors (e.g., sea surface temperature) in shaping the genetic structure and local adaptation of Australian P. monodon populations. Based on the results of Vu et al. (2020) we hypothesized that P. monodon may also exhibit signatures of local adaption to sea surface temperature on a larger scale (e.g., across entire Indo-Pacific region). Using two discrete analytical approaches, we identified 26 temperature correlated putatively adaptive SNPs that exhibited a significant correlation with SST_max or SST_min, which suggests that sea surface temperature could be contributing to genetic variation and/or local adaptation within and among Indo-Pacific P. monodon populations. Of note is that more or different putatively adaptive SNPs might have been identified by the transcriptome BLAST approach if different criteria were used to determine the outlier loci dataset; however, the use of both transcriptome BLAST (outlier SNPs only) and combined EnA and PD (all SNPs) approaches ensures that all putatively adaptive SNPs associated with temperature were identified. Our findings are consistent with the observed role of sea surface temperature on hierarchical genetic structure and genetic differences within and among populations of American lobster Homarus americanu, giant California sea cucumber Parastichopus californicus, sea scallop Placopecten magellanicus, and European green crab Carcinus maenas (Benestan et al. 2016; Jeffery et al. 2018; Van Wyngaarden et al. 2018; Xuereb et al. 2018). One of the identified temperatures correlated putatively adaptive SNPs was 2019PM_4058, which matched transcriptome contigs annotated as “AChain Apocrustacyanin C1 Crystals” (table 3 and supplementary table S6, Supplementary Material online). In P. monodon crustacyanin protein is known to be involved in temperature sensitive hypodermis pigmentation (Wade et al. 2014, 2015; Budd et al. 2017; Fan et al. 2021) and substrate color adaptation (Wade et al. 2012), which have important implications for P. monodon, P. japonicus, and L. vannamei aquaculture (Bernal Rodríguez et al. 2017). Another identified temperature correlated putatively adaptive SNP was 2019PM_6523, which matched transcriptome contigs annotated as “C-type lectin” (table 3 and supplementary fig. S5 and table S6, Supplementary Material online). C-type lectins are known to be involved in the immune response to pathogen infection via the hemocyte prophenoloxidase system and might also enhance immune system recognition of certain bacteria and viruses in L. vannamei (Wei et al. 2012). The putatively adaptive SNPs that did not exhibit a significant correlation with sea surface temperature might still be useful candidate genes that correlate with other untested environmental factors (Milano et al. 2014). For example, 2019PM_3043 matched one transcriptome contig annotated as one of the P. monodon chitinase genes (PmChi-5). PmChi-5 may be involved in molting, larval metamorphosis, immune defense against pathogen infections, and ammonia nitrate stress response in P. monodon (Zhou et al. 2018) as well as regulation of both humoral and cellular immune responses in L. vannamei (Niu et al. 2018). Although these putatively adaptive SNP annotations suggest potential roles in P. monodon, local adaptation no firm conclusions be made given the likelihood that all identified putatively adaptive SNPs are involved in a diversity of functions and/or pathways (Pavlidis et al. 2012).

Implications for Aquaculture and Fishery Management

Within the Indo-Pacific, P. monodon is considered as one of the most important aquaculture species (Waqairatu et al. 2012). Genome-wide SNP genotyping (divided into neutral and outlier loci datasets) provides an essential tool for clarifying genetic population structure, which is of paramount importance for effective aquaculture and fishery management practices (Benestan 2019). The results of the present genome-wide SNP study may help inform P. monodon Indo-Pacific aquaculture and fishery management in several ways. Firstly, the presence of seven genetically distinct P. monodon stocks (Fiji, Eastern Australia, Northern Australia and Philippines, Western Australia, Southeast Asia, Sri Lanka, and the West Indian Ocean) could be considered as a single management unit at the regional level. However, more refined management of P. monodon wild stocks should be considered in Southeast Asia and Sri Lanka because these populations exhibited high admixture in both neutral and outlier loci analyses. In the case of Southeast Asia, where wild stocks encompass ocean territories of several countries, we recommend a multijurisdictional approach to management of these wild stocks as opposed to independent management by each country. Lastly, future research aimed at the development of selective breeding programs based on thermal adaptation, establishment of pathogen resistant, and/or disease-free aquaculture stocks should focus on outlier loci for candidate gene discovery (Lebeuf-Taylor et al. 2019).

Materials and Methods

Sample Collection and DNA Extraction

The giant black tiger shrimp P. monodon is widely distributed throughout the Indo-Pacific region, spanning from Fiji in the central Pacific, throughout Australia, Southeast Asia, and India, to the south western coast of the African continent. In order to study the genetic relationships across the entire Indo-Pacific range we collected wild adult of P. monodon (n = 532) from 16 discrete Indo-Pacific locations between 2015 and 2019 (fig. 1 and supplementary table S1, Supplementary Material online). Pleopod tissue samples were excised from each wild adult P. monodon sample with scissors, preserved in ethanol and stored at −20 °C until DNA extraction. Genomic DNA (gDNA) was extracted from each pleopod sample following a modified cetyl-trimethyl ammonium bromide protocol (Adamkewicz and Harasewych 1996; Vu et al. 2020). All gDNA extractions were subsequently cleaned using Sephadex G-50 (GE HLS 2007) and gDNA integrity was confirmed using 0.8% agarose gel containing GelGreen (ThermoFisher Scientific Pty Ltd, Australia). All gDNA samples were normalized to 50 ng/µl (20 μl final volume) before submission to Diversity Arrays Technology (DArTseq; Canberra Australia) for library preparation and DArTseqTM sequencing.

SNP Genotyping and Quality Control

All shrimp samples were sequenced and genotyped using DArTseqTM technology, as previously described (Kilian et al. 2012; Anderson et al. 2019; Vu et al. 2020). Briefly, DArTSeqTM technology relies on a complexity reduction method in order to obtain genome sequences using next-generation sequencing technology. DArTSeqTM was optimized for P. monodon by selecting the most appropriate complexity reduction method, which used a combination of PstI and SphI methylation-sensitive restriction enzymes. Only “mixed fragments” (PstI-SphI) were effectively amplified in 30 rounds of PCR using the following reaction conditions: 94 °C for 1 min, then 30 cycles of 94 °C for 20 s, 58 °C for 30 s, 72 °C for 45 s, and 72 °C for 7 min. After PCR amplification, equimolar amplicons from each sample were pooled and subjected to c-Bot (Illumina) bridge PCR before 77 cycles of single read sequencing on Illumina Hiseq2500. Raw SNP sequence data were obtained from DArT after initial SNP calling, but without filtering by proprietary analysis pipelines (DArTsoft). SNP data were then initially filtered using custom dartQC python scripts (available from https://github.com/esteinig/dartqc). More specifically, SNPs matching the following stringent criteria were excluded from the dataset: 1) low read depth (–read_counts <7), 2) average repeatability <90%, 3) call rate <80%, 4) duplicated SNPs base on similar sequence clusters (– cluster >0.95), and 5) MAF <0.02. Furthermore, all remaining loci and individuals were tested for LD (r2 > 0.2) using PLINK 1.9 (Purcell et al. 2007). For SNP pairs in LD, the SNP with the lowest call rate value was removed. Each SNP was also assessed for deviation from HWE at a population level using Arlequin version 3.5 (Excoffier et al. 2005) to run an exact test with 10,000 steps in the Markov Chain and 100,000 dememorizations. If a SNP significantly deviated from HWE (P < 0.0009) in all populations, it was removed from further analyses. This filtering pipeline left n = 10,593 high-quality SNP remaining for population genetic analyses. Population genetic diversity statistics were calculated on all available SNPs (n = 10,593). Observed heterozygosity (HO), expected heterozygosity (HE), and inbreeding coefficient (FIS) were computed using the divBasic function of R package diveRsity version 1.9.90 (Keenan et al. 2013), with 1,000 bootstrap replicates. HP-RARE version 1.1 (Kalinowski 2005) was used to calculate allelic richness (AR), private allelic richness (ARA), and PPL using rarefaction to avoid sampling size bias. Average multilocus heterozygosity (Av. MLH) and sMLH were calculated for all individuals using R package inbreedR version 0.3.2 (Stoffel et al. 2016), whereas IR was calculated for each population using the R package Rhh version 1.0.1 (Alho et al. 2010).

Determination of Outlier Loci

Loci under putative selection were identified using two independent PD approaches: BayeScan version 2.1 (Foll and Gaggiotti 2008; Foll 2012) and R package PCAdapt (Luu et al. 2017). The two approaches were based on underlying hypotheses that take into account different genome‐wide signatures of local adaptation as both showed promise for accurately identifying outliers under a variety of complex demographic scenarios (Lotterhos and Whitlock 2015; Luu et al. 2017). BayeScan outlier detection was based on a logistic regression model that decomposed locus-population FST coefficients into population-specific and locus‐specific components. More specifically, BayeScan analysis was conducted with 20 pilot runs of 5,000 iterations each followed by 100,000 iterations and 50,000 additional burn-in iterations followed by outlier identification using Bayescan 2.01 function plot_R.r (Benjamini and Hochberg 1995). To decrease false positives due to multiple testing, we applied the false discovery rate (FDR) correction of q values at 0.01. Outlier detection using R package PCAdapt was based on principal components analysis (PCA) and identified loci that excessively related to population structure as being under putative selection. More specifically, PCAdapt was run using two principal components (K = 2) as they explained genetic structure of populations and a min.maf = 0.05; candidate loci then were determined at FDR of 0.01using R package qvalue (Storey et al. 2019).

Population Genetic Structure: Neutral versus Outlier Divergence

Population differentiation and genetic structure were calculated separately for neutral (n = 9,930) and outlier (n = 663) SNP datasets. Genetic differences among populations and individuals were estimated using R package StAMPP version 1.5.1 (Pembleton et al. 2013). Pairwise genetic differentiation values (FST) and their significance between populations, were calculated according to Wright (1949) and Weir and Cockerham (1984). Population genetic structure was determined using two approaches: 1) DAPC implemented in the R package adegenet version 2.1.1 (Jombart 2008), and 2) ADMIXTURE version 1.3.0 (Alexander et al. 2009). First, DAPC was used on individual genotypes in a multivariate analysis to determine the best number of genetic clusters (K) by running ten iterations of the “find.clusters” function and the visualize BIC values for all ten iterations along with estimated SD for each K value (K values of 1–15) before subsequently identification the optimal number of retained principal components (PC) using the optim.a.score function. Second, ADMIXTURE estimated the ancestry and genetic structure of each individual based on ML using K values that ranged from 1 to 15 (as ancestral modes) to determine the most appropriate number of distinct genetic clusters. A substructure analysis for both neutral and outlier loci datasets to determine the admixture between five populations (CMVN, NTVN, INDO, THL, and SLK) was investigated using ADMIXTURE. AMOVA was used to test genetic diversity structure between populations and geographic regions. AMOVA was run with ARLEQUIN version 3.5.2.2 (Excoffier and Lischer 2010) and statistical significance of each variance component was assessed using 10,000 permutations for each of the following hierarchical comparisons: 1) among groups, 2) among populations within groups, 3) among individuals within populations, and 4) within populations. Two independent AMOVAs were carried out based on oceans and regions. Firstly, in order to better understand the genetic structure of P. monodon between the Indian and Pacific Oceans, one hierarchical AMOVA was conducted for neutral loci based on two groups: 1) the Indian Ocean (SLK, KE, and SA) and 2) the Pacific Ocean (BB, EB, TSV, JBG, GC, TIW, WA, FJ, PHI, NTVN, CMVN, and THL). Secondly, we conducted independent analysis for both neutral and outlier loci datasets, all P. monodon individuals (n = 520) were pooled into seven groups based on clustering of individuals by ADMIXTURE and DAPC analysis: 1) BB, EB, TSV; 2) GC, JBG, TIW, PHI; 3) WA; 4) FJ; 5) CMVN, NTVN, INDO, THL; 6) SLK; 7) KE, SA. Isolation by distance was determined by discrete Mantel tests for each dataset (neutral and outlier) using the mantel.randtest function with 999 permutations in R package ade4 (Dray and Dufour 2007). Four IBD analyses were estimated separately using both neutral and outlier loci for: 1) all 16 sampled sites and 2) seven sampled groups as described in AMOVA. Contemporary least‐cost oceanographic distances between each pair of sampling sites were estimated using R package marmap (Pante and Simon-Bouhet 2013). To quantify historical connectivity across the Indo-Pacific region, a ML tree of the entire neutral dataset (n = 9,930 SNPs) was constructed using TreeMix version 1.12 (Pickrell and Pritchard 2012). An ML tree without migration (m = 0) was initially constructed and then migration events (i.e., edges) were added (m = 1–10) to improve model fitness. Turn off sample size correction (-noss), bootstrap replicates (-bootstrap), and -root flag (Fiji) were added to evaluate the confidence in the ML tree and the weight of migration events. The most likely number of migration events was selected based on the log-likelihood of the event and the plotted residuals, with each migration event analysis repeated ten times to evaluate consistency. Of note is that two root populations were assessed (Fiji and South Africa) with log-likelihood being more consistent when rooted to Fiji than South Africa. All trees and migration events were plotted in R using plotting_funcs.R script within TreeMix version 1.12. Lastly, to confirm the admixture of P. monodon among all populations, the f3-statistics were calculated using the three-pop approach (Reich et al. 2009) within TreeMix version 1.12. This approach calculated f3-statistics for all three possible populations in the form of (X; A, B) following a bifurcating tree. A significantly negative f3‐statistic value means that X is the product of admixture between A and B (Reich et al. 2009). Each outlier SNP under putative selection was assessed against the assembled P. monodon transcriptome (Huerlimann et al. 2018) using Geneious Prime version 2019.1.3 (Kearse et al. 2012) with a maximum E-value of 10−5. Contigs within the assembled P. monodon transcriptome that exhibited 100% pairwise identity with each queried SNP sequence (69 bp for each allele) were compiled and reported along with contig annotations provided by Huerlimann et al. (2018). To ensure comprehensiveness we undertook two discrete analytical approaches to identify putatively adaptive SNPs present within all 16 Indo-Pacific P. monodon populations: 1) transcriptome BLAST of entire outlier dataset and 2) combined EnA and PD analyses as per Vu et al. (2020). Putatively adaptive SNPs identified by both approaches were then independently assessed for correlation between MAF and location-specific sea surface temperature maximum (SST_max) and minimum (SST_min) using a linear regression (Pearson correlation method) following Dalongeville et al. (2018). For the transcriptome BLAST approach each outlier SNP was assessed against the assembled P. monodon transcriptome (Huerlimann et al. 2018) using Geneious Prime version 2019.1.3 (Kearse et al. 2012) with a maximum E-value of 10−5. Contigs within the assembled P. monodon transcriptome that exhibited 100% pairwise identity with each queried SNP sequence (69 bp for each allele) were compiled and reported along with contig annotations provided by Huerlimann et al. (2018). Each identified putatively adaptive SNP was then assessed for correlation between MAF and location-specific sea surface temperature maximum (SST_max) and minimum (SST_min) using R package ggpubr (Kassambara A and Kassambara MA 2020). For the Vu et al. (2020) approach the whole 10,593 SNPs dataset was subjected to two discrete EnA analyses to identify putatively adaptive loci associated with SST_max or SST_min: 1) redundancy analysis (RDA) in the R package vegan (Oksanen et al. 2013, 2018) and 2) latent factor mixed models (LFMM) in the R package LEA (Frichot and François 2015). LFMM analysis used seven latent factors because this matched the number of population clusters identified in DAPC and ADMIXTURE (fig. 2). Each identified putatively adaptive SNP was then: 1) assessed against the assembled P. monodon transcriptome (Huerlimann et al. 2018) using Geneious Prime version 2019.1.3 (Kearse et al. 2012) with 100% pairwise identity and E-value < 10−5 cut-offs and 2) assessed for correlation between MAF and SST_max and SST_min using R package ggpubr (Kassambara A and Kassambara MA 2020). SST_max and SST_min data (supplementary table S9, Supplementary Material online) for each Indo-Pacific sampling location were obtained from the Bio-ORACLE database (http://www.bio-oracle.org/downloads-to-email.php) (Tyberghein et al. 2012; Assis et al. 2018) as described in Vu et al. (2020).

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online. Click here for additional data file.

56 in total

Review 1. Genomics and the future of conservation genetics.

Authors: Fred W Allendorf; Paul A Hohenlohe; Gordon Luikart
Journal: Nat Rev Genet Date: 2010-10 Impact factor: 53.242

Review 2. Harnessing the Power of Genomics to Secure the Future of Seafood.

Authors: Louis Bernatchez; Maren Wellenreuther; Cristián Araneda; David T Ashton; Julia M I Barth; Terry D Beacham; Gregory E Maes; Jann T Martinsohn; Kristina M Miller; Kerry A Naish; Jennifer R Ovenden; Craig R Primmer; Ho Young Suk; Nina O Therkildsen; Ruth E Withler
Journal: Trends Ecol Evol Date: 2017-08-14 Impact factor: 17.712

3. Null Alleles and FIS × FST Correlations.

Authors: Robin S Waples
Journal: J Hered Date: 2018-05-11 Impact factor: 2.645

4. Mechanisms of colour adaptation in the prawn Penaeus monodon.

Authors: Nicholas M Wade; Mike Anderson; Melony J Sellars; Ron K Tume; Nigel P Preston; Brett D Glencross
Journal: J Exp Biol Date: 2012-01-15 Impact factor: 3.312

5. Mitochondrial DNA variation in Indo-Pacific populations of the giant tiger prawn, Penaeus monodon.

Authors: J A H Benzie; E Ballment; A T Forbes; N T Demetriades; K Sugama; S Moria
Journal: Mol Ecol Date: 2002-12 Impact factor: 6.185

6. Characterization and expression analysis of a chitinase gene (PmChi-5) from black tiger shrimp (Penaeus monodon) under pathogens infection and ambient ammonia-N stress.

Authors: Falin Zhou; Kaimin Zhou; Jianhua Huang; Qibin Yang; Song Jiang; Lihua Qiu; Lishi Yang; Shigui Jiang
Journal: Fish Shellfish Immunol Date: 2017-10-31 Impact factor: 4.581

7. Mechanisms of peripheral phylogeographic divergence in the indo-Pacific: lessons from the spiny lobster Panulirus homarus.

Authors: Ahmad Farhadi; Andrew G Jeffs; Hamid Farahmand; Thankappan Sarasam Rejiniemon; Greg Smith; Shane D Lavery
Journal: BMC Evol Biol Date: 2017-08-18 Impact factor: 3.260

8. Lineage divergence, local adaptation across a biogeographic break, and artificial transport, shape the genetic structure in the ascidian Pyura chilensis.

Authors: Nicolás I Segovia; Cristian Gallardo-Escárate; Elie Poulin; Pilar A Haye
Journal: Sci Rep Date: 2017-03-16 Impact factor: 4.379

Review 9. Using neutral, selected, and hitchhiker loci to assess connectivity of marine populations in the genomic era.

Authors: Pierre-Alexandre Gagnaire; Thomas Broquet; Didier Aurelle; Frédérique Viard; Ahmed Souissi; François Bonhomme; Sophie Arnaud-Haond; Nicolas Bierne
Journal: Evol Appl Date: 2015-07-28 Impact factor: 5.183

10. Combining six genome scan methods to detect candidate genes to salinity in the Mediterranean striped red mullet (Mullus surmuletus).

Authors: Alicia Dalongeville; Laura Benestan; David Mouillot; Stephane Lobreaux; Stéphanie Manel
Journal: BMC Genomics Date: 2018-03-27 Impact factor: 3.969

1 in total

1. Genome assembly of the Australian black tiger shrimp (Penaeus monodon) reveals a novel fragmented IHHNV EVE sequence.

Authors: Roger Huerlimann; Jeff A Cowley; Nicholas M Wade; Yinan Wang; Naga Kasinadhuni; Chon-Kit Kenneth Chan; Jafar S Jabbari; Kirby Siemering; Lavinia Gordon; Matthew Tinning; Juan D Montenegro; Gregory E Maes; Melony J Sellars; Greg J Coman; Sean McWilliam; Kyall R Zenger; Mehar S Khatkar; Herman W Raadsma; Dallas Donovan; Gopala Krishna; Dean R Jerry
Journal: G3 (Bethesda) Date: 2022-04-04 Impact factor: 3.154

1 in total