Melissa H Pespeni, Thomas A Oliver, Mollie K Manier, Stephen R Palumbi.
Abstract
High-throughput genotype data can be used to identify genes important for local adaptation in wild populations, phenotypes in lab stocks, or disease-related traits in human medicine. Here we advance microarray-based genotyping for population genomics with Restriction Site Tiling Analysis. The approach simultaneously discovers polymorphisms and provides quantitative genotype data at tens of thousands of loci. It is highly accurate and free from ascertainment bias. We apply the approach to uncover genomic differentiation in the purple sea urchin.
Year: 2010 PMID: 20403197 PMCID: PMC2884547 DOI: 10.1186/gb-2010-11-4-r44
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Restriction site tiling analysis identifies polymorphisms and genotypes individuals by hybridization to a custom microarray. Fifty base pair tiles (white circles) are designed to be centered on restriction enzyme cut sites. DNA from an individual is extracted and randomly sheared by sonication. The sample is then divided in half: one part is treated with the restriction enzyme and labeled with green fluorescent dye (Cy3), the other part is treated as a control (without restriction enzyme) and labeled with red fluorescent dye (Cy5). The two parts are mixed and hybridized to the array. This DNA processing and hybridization result in different fluorescent signals reflecting the three possible genotypes for a polymorphic locus: when an individual is homozygous for the cut site (blue triangle), the digested DNA is cut and does not hybridize to the tile, resulting in a high red-to-green ratio (log2 Cy5/Cy3, left panel); however, if an individual is homozygous for a mutation in the cut site (yellow star), then the DNA remains intact and hybridizes to the tile, resulting in high green signal intensity or a low red-to-green ratio (right panel). Heterozygous individuals yield an intermediate red-to-green ratio. Polymorphic loci are identified based on the bi- or trimodal distribution of log ratios across sampled individuals. Individuals can be genotyped based on their log ratio.
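The genotype-calling logic described in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the intensity values are invented, and the function names and the gap-based rule for placing cutoffs between clusters (`cutoffs_from_gaps`) are assumptions chosen for brevity.

```python
import math

def log_ratio(cy5, cy3):
    """log2 of control (Cy5, red) over digested (Cy3, green) intensity."""
    return math.log2(cy5 / cy3)

def cutoffs_from_gaps(ratios, n_clusters=3):
    """Place cutoffs at the midpoints of the largest gaps between sorted
    log ratios (a hypothetical stand-in for proper cluster analysis)."""
    s = sorted(ratios)
    gaps = sorted(((s[i + 1] - s[i], i) for i in range(len(s) - 1)),
                  reverse=True)[:n_clusters - 1]
    return sorted((s[i] + s[i + 1]) / 2 for _, i in gaps)

def call_genotype(ratio, low_cut, high_cut):
    """High ratio: digested DNA was cut and did not bind the tile.
    Low ratio: cut site is mutated, so digested DNA stayed intact."""
    if ratio >= high_cut:
        return "cut/cut"
    if ratio <= low_cut:
        return "uncut/uncut"
    return "cut/uncut"

# Invented (Cy5, Cy3) intensities for one tile across six individuals.
intensities = {
    "ind1": (8000, 500), "ind2": (8200, 4100), "ind3": (7900, 7600),
    "ind4": (8100, 450), "ind5": (8000, 4000), "ind6": (8050, 8000),
}
ratios = {name: log_ratio(cy5, cy3) for name, (cy5, cy3) in intensities.items()}
low_cut, high_cut = cutoffs_from_gaps(ratios.values())
calls = {name: call_genotype(r, low_cut, high_cut) for name, r in ratios.items()}
```

With these made-up values the log ratios fall into three groups (near 0, near 1, and near 4), so the locus would be treated as trimodal and each individual assigned to one of the three genotype classes.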
Figure 2. Frequency histograms of signal intensities for experimental and control tiles. (a) Digested DNA (green, labeled with Cy3) and non-digested DNA (red, Cy5) binding to restriction cut site centered tiles. (b) Cy5 signal intensities for negative control tiles (blue, randomly generated tiles that did not match anywhere in the genome according to BLASTN) and positive control tiles (magenta, matching multi-copy ribosomal DNA).
Figure 3. Polymorphic restriction cut site in pyruvate kinase muscle isozyme across 20 individuals. (a) RSTA array log ratio data separate genotypes of individuals sampled. Cool colored circles represent individuals from Boiler Bay, Oregon; warm colored triangles represent individuals from San Diego, California. The data for each individual are in triplicate. (b) Individual genotypes confirmed by restriction digest gels. Lane 1 is an undigested PCR fragment for size reference, while lanes 2 to 10 are treated with the restriction enzyme; lanes 2, 3, 5, 6, 9, and 10 are from heterozygous individuals; lane 4 is from an individual homozygous for the cut site; lanes 7 and 8 are from individuals homozygous for a mutation in the cut site. (c) Genotype data resulting from RSTA can be used to look for differences across populations.
Figure 4. Principal Components Analysis using RSTA array log ratio data shows a signal of population differentiation in a high gene flow species. Symbols represent individuals from Oregon (blue circles) and San Diego (red triangles). (a) All polymorphic coding loci, 6,859; (b) polymorphic coding loci excluding top FST loci, 6,555; and (c) top FST polymorphic coding loci, 304. Patterns were similar for other tiles in non-coding regions.
Figure 5. Genome-wide distribution of FST. Open bars show the observed distribution for 12,431 polymorphisms. Solid bars show the mean of 10,000 random permutations. Error bars represent the standard deviation of the permuted distributions. Numbers in boxes show the excess of observed loci over the permuted mean.
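The comparison in Figure 5 between observed and permuted differentiation can be illustrated for a single locus. The sketch below is an assumption-laden example rather than the authors' analysis: it uses a simple heterozygosity-based FST, (HT - HS) / HT, for two populations, the genotype vectors are invented, and the function names are hypothetical. Genotypes are coded as the number of copies of the cut-site allele (0, 1, or 2).

```python
import random

def allele_freq(genotypes):
    """Frequency of the cut-site allele from 0/1/2 genotype codes."""
    return sum(genotypes) / (2 * len(genotypes))

def fst(pop_a, pop_b):
    """Simple two-population FST = (HT - HS) / HT (assumed estimator)."""
    p_a, p_b = allele_freq(pop_a), allele_freq(pop_b)
    n_a, n_b = len(pop_a), len(pop_b)
    p_bar = (n_a * p_a + n_b * p_b) / (n_a + n_b)
    h_t = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0                          # locus is monomorphic
    h_s = (n_a * 2 * p_a * (1 - p_a) +
           n_b * 2 * p_b * (1 - p_b)) / (n_a + n_b)  # within-population mean
    return (h_t - h_s) / h_t

def permuted_fst(pop_a, pop_b, n_perm, seed=0):
    """Null distribution: shuffle individuals between the two populations."""
    rng = random.Random(seed)
    pooled = list(pop_a) + list(pop_b)
    values = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        values.append(fst(pooled[:len(pop_a)], pooled[len(pop_a):]))
    return values

# Invented genotypes for ten individuals per population.
oregon = [2, 2, 2, 1, 2, 2, 1, 2, 2, 2]
san_diego = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]

observed = fst(oregon, san_diego)
null = permuted_fst(oregon, san_diego, n_perm=1000)
p_value = sum(v >= observed for v in null) / len(null)
```

Repeating this across all loci and binning the observed and permuted FST values would give the kind of paired histogram the caption describes; the caption reports 10,000 permutations, while this example uses 1,000 only to keep it fast.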
Comparison of four high-throughput polymorphism detection approaches
| Parameter | SFP | RAD tagging | RAD sequencing | RSTA |
|---|---|---|---|---|
| Marker type | SNPs and indels | Restriction cut site polymorphisms | Sequence data: SNPs next to restriction cut sites | Restriction cut site polymorphisms: distinguishes SNPs and indels |
| Number of loci surveyed | 92,924 | 19,200 (elements on an enriched RAD-tag microarray designed from stickleback) | 26 nucleotides at 41,622 RAD tags | 50,935 |
| Number of polymorphisms identified (informative marker rate) | 3,806 (4% at a 5% false discovery rate cutoff) | 1,990 (10% at a two-fold signal difference cutoff) | Approximately 13,000 (31%) | 12,431 (24%) |
| False discovery rate | 3% (117 out of 121 confirmed correct by sequencing) | 9% (20 out of 22 confirmed correct by sequencing) | Not reported | <1% (113 out of 114 confirmed correct by sequencing) |
| Platform | Custom high-density oligonucleotide array (Affymetrix), 25 bp oligo | cDNA or genomic tiling array (in house synthesis) | Illumina sequencing | Custom high-density oligonucleotide array (Agilent), 50 bp oligo |
| Prior information required | EST, 454 or genome sequence | EST or RAD-tag library for array synthesis | EST or genome sequence to map short sequence reads | EST, 454 or genome sequence |
| Polymorphism identification | Hybridization signal difference among study individuals | Hybridization signal difference between two study individuals | Custom Perl scripts for sequence alignment | Genotype clusters across all study individuals |
| Individual genotype data | No | No | No | Yes |
| Organisms studied | Yeast, | | | Purple sea urchin |
Numbers are from studies that describe each method: SFP [26]; RAD tagging [25]; RAD sequencing [50]. a: See Gupta et al. [23] for a review of high-throughput applications in crop plants.
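The percentages in the table follow directly from the counts it reports, which the short check below recomputes. The dictionary and variable names are illustrative; the counts are copied from the table, with the RAD sequencing polymorphism count taken as the approximate figure of 13,000, and the false discovery rate is computed here as the fraction of sequenced markers not confirmed.

```python
# Counts copied from the comparison table.
surveyed = {"SFP": 92924, "RAD tagging": 19200,
            "RAD sequencing": 41622, "RSTA": 50935}
polymorphic = {"SFP": 3806, "RAD tagging": 1990,
               "RAD sequencing": 13000,  # "approximately 13,000" in the table
               "RSTA": 12431}
# (confirmed correct, total checked by sequencing); not reported for RAD sequencing.
confirmed = {"SFP": (117, 121), "RAD tagging": (20, 22), "RSTA": (113, 114)}

# Informative marker rate: polymorphic loci as a percentage of loci surveyed.
marker_rate = {m: 100 * polymorphic[m] / surveyed[m] for m in surveyed}

# False discovery rate: percentage of checked markers that were not confirmed.
fdr = {m: 100 * (total - ok) / total for m, (ok, total) in confirmed.items()}

for method in surveyed:
    rate = f"{marker_rate[method]:.1f}%"
    error = f"{fdr[method]:.1f}%" if method in fdr else "not reported"
    print(f"{method}: informative marker rate {rate}, FDR {error}")
```

Rounded to whole percentages, the recomputed values agree with the table: 4%, 10%, 31%, and 24% for the marker rates, and 3%, 9%, and below 1% for the false discovery rates.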