Andrew R Whiteley, Anuradha Bhat, Emilia P Martins, Richard L Mayden, M Arunachalam, Silva Uusi-Heikkilä, A T A Ahmed, Jiwan Shrestha, Matthew Clark, Derek Stemple, Louis Bernatchez.
Abstract
Understanding a wider range of genotype-phenotype associations can be achieved through ecological and evolutionary studies of traditional laboratory models. Here, we conducted the first large-scale geographic analysis of genetic variation within and among wild zebrafish (Danio rerio) populations occurring in Nepal, India, and Bangladesh, and we genetically compared wild populations to several commonly used lab strains. We examined genetic variation at 1832 polymorphic EST-based single nucleotide polymorphisms (SNPs) and the cytb mitochondrial gene in 13 wild populations and three lab strains. Natural populations were subdivided into three major mitochondrial DNA clades with an average among-clade sequence divergence of 5.8%. SNPs revealed five major evolutionarily and genetically distinct groups with an overall FST of 0.170 (95% CI 0.105-0.254). These genetic groups corresponded to discrete geographic regions and appear to reflect isolation in refugia during past climate cycles. We detected 71 significantly divergent outlier loci (3.4%) and nine loci (0.5%) with significantly low FST values. Valleys of reduced heterozygosity, consistent with selective sweeps, surrounded six of the 71 outliers (8.5%). The lab strains formed two additional groups that were genetically distinct from all wild populations. An additional subset of outlier loci was consistent with domestication selection within lab strains. Substantial genetic variation that exists in zebrafish as a whole is missing from lab strains that we analysed. A combination of laboratory and field studies that incorporates genetic variation from divergent wild populations along with the wealth of molecular information available for this model organism provides an opportunity to advance our understanding of genetic influences on phenotypic variation for a vertebrate species.
Keywords: genetic subdivision; genomics; outlier analysis; single nucleotide polymorphisms; zebrafish
Year: 2011 PMID: 21923777 PMCID: PMC3627301 DOI: 10.1111/j.1365-294X.2011.05272.x
Source DB: PubMed Journal: Mol Ecol ISSN: 0962-1083 Impact factor: 6.185
Table 1. Sample locations, abbreviations, geographic coordinates and sample sizes for mitochondrial DNA (mtDNA; N mtDNA) and single nucleotide polymorphisms (SNPs; N SNP). SHK could not be examined with SNPs, and mtDNA sequence was not examined for DHO. SRN and WYD were not used for most SNP analyses because of small sample sizes (see Results).
| Location | Abbreviation (ID) | Latitude | Longitude | N (mtDNA) | N (SNP) |
|---|---|---|---|---|---|
| Paruwa Sota River, Western Nepal | PAR | 28.125° | 81.799° | 15 | 19 |
| Khair Khola, Central Nepal | KHA | 27.618° | 84.533° | 15 | 19 |
| Bering River, Eastern Nepal | BER | 26.642° | 87.937° | 15 | 19 |
| Shikarpur, near Coochibihar, West Bengal, India | SHK | 26.321° | 89.463° | 10 | — |
| Dharola, India | DHO | 26.282° | 89.237° | — | 15 |
| Jorai, India | JOR | 26.497° | 89.821° | 15 | 17 |
| Panigram, India | PGM | 26.436° | 89.163° | 13 | 19 |
| N. Parganas, India | PNS | 22.879° | 88.767° | 15 | 20 |
| Uttarbhag, India | UTR | 22.361° | 88.506° | 15 | 19 |
| Rice paddy between Dhaka and Chittagong, Bangladesh | RCH | 23.518° | 90.851° | 14 | 14 |
| Chittagong, Bangladesh | CHT | 22.474° | 91.783° | 15 | 18 |
| Sringeri, Thunga R., Karnataka, India | SRN | 13.417° | 75.251° | 3 | 3 |
| Wayanad, Karampuzha Dam, Kerala, India | WYD | 11.619° | 76.174° | 2 | 2 |
| AB lab strain | AB | — | — | 10 | 15 |
| SJA lab strain | SJA | — | — | 10 | 15 |
| TM1 lab strain | TM1 | — | — | 10 | 5 |
Fig. 1 Map of study area (India, Nepal, Bangladesh) with sampling locations (black circles) and corresponding abbreviations from Table 1 for wild population samples.
Table 2. Genetic diversity summary statistics for zebrafish from India, Nepal, Bangladesh and three lab strains.
| ID | S (mtDNA) | h (mtDNA) | π (mtDNA) | Haplotypes observed (mtDNA) | HS (SNPs) |
|---|---|---|---|---|---|
| PAR | 25 | 0.87 (0.05) | 0.88 (0.48) | 2–8 | 0.154 |
| KHA | 13 | 0.92 (0.05) | 0.24 (0.15) | 49–57 | 0.060 |
| BER | 24 | 0.90 (0.07) | 0.40 (0.23) | 20, 40–48 | 0.223 |
| SHK | 16 | 0.89 (0.08) | 0.53 (0.31) | 16, 17, 20, 30, 58, 59 | — |
| DHO | — | — | — | — | 0.226 |
| JOR | 13 | 0.64 (0.13) | 0.26 (0.16) | 16, 17, 18, 19, 20 | 0.224 |
| PGM | 17 | 0.92 (0.05) | 0.55 (0.31) | 16, 17, 20, 21, 22, 23, 24, 25 | 0.253 |
| PNS | 20 | 0.83 (0.08) | 0.55 (0.31) | 20, 26, 27, 28, 29, 30, 31 | 0.272 |
| UTR | 20 | 0.90 (0.05) | 0.62 (0.35) | 9, 20, 26, 61, 62, 63, 64, 65 | 0.219 |
| RCH | 67 | 0.89 (0.06) | 1.30 (0.70) | 32–39 | 0.215 |
| CHT | 19 | 0.57 (0.15) | 0.49 (0.28) | 10–15 | 0.068 |
| SRN | 0 | — | — | 60 | — |
| WYD | 1 | — | — | 66, 67 | — |
| AB | 0 | 0 | 0 | 9 | 0.142 |
| SJA | 0 | 0 | 0 | 9 | 0.027 |
| TM1 | 0 | 0 | 0 | 20 | 0.235 |
Mitochondrial DNA (mtDNA) diversity is represented by S (number of segregating sites), h (haplotype diversity) and π (nucleotide diversity). Standard deviations are in parentheses. Numbers in the 'haplotypes observed' column correspond to the haplotype numbers in Fig. 2. SNP diversity is summarized by HS, the mean unbiased expected heterozygosity within populations or lab strains. SHK was not examined with single nucleotide polymorphisms (SNPs), and mtDNA sequence was not examined for DHO. Genetic diversity summary statistics are not presented for samples SRN and WYD because of small sample size.
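The two frequency-based statistics in the table note, h and HS, can be illustrated with a short sketch. The source does not state which estimator was used; the code below assumes the common sample-size-corrected form n/(n − 1) × (1 − Σ p²), applied to haplotype counts for h and to allele copies (two per diploid individual) for HS. All function names and the toy genotypes are hypothetical.

```python
from collections import Counter


def unbiased_diversity(items):
    """Return n/(n-1) * (1 - sum of squared frequencies) for sampled items.

    Assumed estimator (not confirmed by the source).
    """
    n = len(items)
    if n < 2:
        raise ValueError("need at least two sampled copies")
    sum_sq = sum((count / n) ** 2 for count in Counter(items).values())
    return n / (n - 1) * (1.0 - sum_sq)


def expected_heterozygosity(genotypes):
    """Unbiased expected heterozygosity at one SNP.

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    """
    allele_copies = [allele for genotype in genotypes for allele in genotype]
    return unbiased_diversity(allele_copies)


def mean_hs(loci):
    """Average the per-locus values across loci (HS in the table note)."""
    return sum(expected_heterozygosity(g) for g in loci) / len(loci)


def haplotype_diversity(haplotypes):
    """Haplotype diversity h from one haplotype label per individual."""
    return unbiased_diversity(haplotypes)


if __name__ == "__main__":
    toy_locus = [("A", "A"), ("A", "G"), ("G", "G")]  # hypothetical genotypes
    print(expected_heterozygosity(toy_locus))
    print(haplotype_diversity([9] * 10))  # one shared haplotype
```

Under this estimator, a sample in which every individual carries the same haplotype returns h = 0, which is the pattern the table reports for the three lab strains.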
Fig. 2 Bayesian mitochondrial DNA (mtDNA) phylogenetic analysis of zebrafish haplotypes from wild populations and lab strains. Numbers at branch tips are haplotypes referred to in Table 2. Three genetic groups were defined by mtDNA (perpendicular lines) and labelled according to sampling locations. Haplotypes are colour coded according to these groups. The scale shows mean expected number of substitutions per site. Numbers along branches show posterior probabilities of nodes.
Fig. 3 Map of India with sampling locations (black circles), sample abbreviations and colour-coded genetic clusters of wild populations from (a) mitochondrial DNA (mtDNA) analysis and (b) single nucleotide polymorphism (SNP) analysis with STRUCTURE.
Fig. 4 Proportion of the genome (Q) of each individual assigned by STRUCTURE to each population sample based on single nucleotide polymorphism (SNP) genotypes. Results correspond to models with (a) K = 5, (b) K = 6 and (c) K = 7. Each column corresponds to an individual, and sample locations are separated by vertical bars. Each of the seven clusters was given a separate colour that corresponds to Fig. 3b.
Fig. 5 Hierarchical outlier locus analysis of (a) wild populations without lab strains and (b) wild populations with lab strains included. Black dotted lines show the 1% and 99% quantiles. Black filled circles represent loci significant at P ≤ 0.01 and with scaled heterozygosity >0.20. Heterozygosity on the x-axis is scaled by (1−FST).
Fig. 6 FST as a function of chromosome position for the outlier locus analysis that included (a) wild populations without lab strains and (b) wild populations with lab strains. Asterisks are shown for significant (P ≤ 0.01) high and low outlier loci. Red asterisks show outliers surrounded by a window of significantly reduced heterozygosity. Blue asterisks represent outliers surrounded by a window of significantly elevated LD.
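The x-axis scaling and the plotting rule in the Fig. 5 caption can be sketched as follows. This assumes that "scaled by (1−FST)" means division, i.e. H / (1 − FST); the caption does not spell this out. The function names and the example values are hypothetical; only the thresholds (P ≤ 0.01, scaled heterozygosity > 0.20) come from the caption.

```python
def scaled_heterozygosity(h_within, fst):
    """Return h_within / (1 - fst); division is an assumed reading of the caption."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("fst must lie in [0, 1)")
    return h_within / (1.0 - fst)


def is_highlighted(p_value, h_scaled, alpha=0.01, min_h=0.20):
    """Caption rule for filled circles: P <= 0.01 and scaled heterozygosity > 0.20."""
    return p_value <= alpha and h_scaled > min_h


if __name__ == "__main__":
    # Hypothetical locus: within-population heterozygosity 0.18, locus FST 0.40
    x = scaled_heterozygosity(0.18, 0.40)
    print(x, is_highlighted(0.004, x))
```

Because the divisor shrinks as FST grows, a strongly differentiated locus with modest within-population heterozygosity can still clear the 0.20 cut-off under this reading.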
Fig. 7 Decay plots of LD (r²) estimates for (a) a representative wild population (pooled individuals from BER, DHO, JOR and PGM) and (b) a representative lab strain (AB). Grey circles are pairwise r². Black circles are the average r² for each 1 Mb distance group for which a logarithmic trend line was fitted (solid lines; (a) y = −0.00082 ln(x) + 0.018, (b) y = −0.065 ln(x) + 0.306). In (b), mean r² values were truncated at 65 Mb because of bias introduced by the small number of data points in each 1 Mb interval beyond this point.
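A logarithmic trend line of the form y = a ln(x) + b, as reported in the Fig. 7 caption, can be fitted by ordinary least squares on ln(x). The sketch below is illustrative only: it assumes x is the distance between loci in Mb (the caption does not state the unit used in the equations), and the helper names are hypothetical. The coefficients in the example are the ones given in the caption; the fitting routine is a generic least-squares fit, not necessarily the authors' procedure.

```python
import math


def log_trend(x, a, b):
    """Evaluate y = a * ln(x) + b."""
    return a * math.log(x) + b


def fit_log_trend(distances, mean_r2):
    """Least-squares fit of mean r^2 against ln(distance); returns (a, b)."""
    xs = [math.log(d) for d in distances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(mean_r2) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, mean_r2))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


if __name__ == "__main__":
    wild = (-0.00082, 0.018)  # coefficients from the caption, panel (a)
    lab = (-0.065, 0.306)     # coefficients from the caption, panel (b)
    for distance in (1, 10, 50):  # assumed to be in Mb
        print(distance, log_trend(distance, *wild), log_trend(distance, *lab))
```

Since ln(1) = 0, the intercepts are the fitted r² at x = 1: 0.018 for the pooled wild sample and 0.306 for AB. The slope for AB is also much steeper in absolute terms, so the two curves differ in both starting level and rate of decay.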