Robert J Toonen, Jonathan B Puritz, Zac H Forsman, Jonathan L Whitney, Iria Fernandez-Silva, Kimberly R Andrews, Christopher E Bird.
Abstract
Here, we introduce ezRAD, a novel strategy for restriction site-associated DNA (RAD) that requires little technical expertise or investment in laboratory equipment, and demonstrate its utility for ten non-model organisms across a wide taxonomic range. ezRAD differs from other RAD methods primarily through its use of standard Illumina TruSeq library preparation kits, which makes it possible for any laboratory to send out to a commercial genomic core facility for library preparation and next-generation sequencing with virtually no additional investment beyond the cost of the service itself. This simplification opens RADseq to any lab with the ability to extract DNA and perform a restriction digest. ezRAD also differs from others in its flexibility to use any restriction enzyme (or combination of enzymes) that cuts frequently enough to generate fragments of the desired size range, without requiring the purchase of separate adapters for each enzyme or a sonication step, which can further decrease the cost involved in choosing optimal enzymes for particular species and research questions. We apply this method across a wide taxonomic diversity of non-model organisms to demonstrate the utility and flexibility of our approach. The simplicity of ezRAD makes it particularly useful for the discovery of single nucleotide polymorphisms and targeted amplicon sequencing in natural populations of non-model organisms that have been historically understudied because of lack of genomic information.
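The core idea in the abstract, a frequent-cutting enzyme followed by size selection of the resulting fragments, can be illustrated with a small in silico digest. This is only a sketch under stated assumptions: the recognition site `GATC` (the site recognized by MboI, not named in the abstract), the toy sequence, the size window, and the function names `digest` and `size_select` are all illustrative and do not come from the paper.

```python
import re


def digest(sequence: str, site: str = "GATC") -> list[str]:
    """Cut `sequence` immediately 5' of every occurrence of `site`.

    The default site is an assumption made for illustration only.
    """
    # A lookahead finds every site start, including overlapping ones.
    cuts = [m.start() for m in re.finditer(f"(?={re.escape(site)})", sequence)]
    bounds = [0] + cuts + [len(sequence)]
    # Skip the empty fragment produced when the sequence starts with a site.
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments: list[str], min_len: int, max_len: int) -> list[str]:
    """Keep fragments whose length lies in the inclusive size window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Hypothetical toy sequence with three GATC sites.
toy_genome = "AA" + "GATC" + "CCCC" + "GATC" + "T" * 8 + "GATC" + "A"
fragments = digest(toy_genome)
selected = size_select(fragments, min_len=5, max_len=9)
print(fragments)
print(selected)
```

Real libraries are size-selected over hundreds of base pairs; the tiny window here only keeps the example readable.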
Keywords: Genotype-by-sequencing; NGS; Next-generation sequencing; RAD tag; RAD-seq; RADseq; Restriction site associated DNA (RAD)
Year: 2013 PMID: 24282669 PMCID: PMC3840413 DOI: 10.7717/peerj.203
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Summary of ezRAD results from 2 lanes of Illumina GAIIx sequencing across a range of taxonomic diversity.
Organisms run with ezRAD include: the limpet Cellana talcosa; sea stars Cryptasterina hystera, C. pentagona & Patiria miniata; reef fish Paracirrhites arcatus; corals Porites lobata, P. compressa & Pocillopora damicornis; and the spinner dolphin Stenella longirostris. Lane use indicates the proportion of a single lane of PE100bp sequencing on the GAIIx flow cell. Library prep specifies what those libraries contained. Paired reads are the number of reads in each index-parsed file returned after the initial quality control (QC) filter from the sequencer. Reads and % Pass QC are the number and percentage of sequence reads remaining after excluding all sequences that had Phred scores <20, contained adapter sequences, or were shorter than 20 bp after adapters were cut. Mapped reads and High quality mapped reads are the number of reads overall and the number of reads that passed quality control, respectively, that were assembled de novo into contigs. Contig statistics and polymorphic SNP counts reported here come from the bash script pipeline described in File S2.
| Species | Lane use | Library prep | Paired reads | Reads pass QC | % Pass QC | No. of contigs | Mapped reads | High quality mapped reads | Variable sites | Shared SNPs | >10× Shared SNPs | >30× Shared SNPs | SNPs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1/3 | 2 pools of | 8,109,327 | 4,472,365 | 55.15 | 189,444 | 8,389,669 | 3,483,833 | 127,609 | 73,014 | 49,761 | 9,997 | 0.26 |
| | 1/12 | 1 pool of | 6,441,832 | 3,880,080 | 60.23 | 27,007 | 2,242,589 | 655,509 | 36,666 | 36,666 | 31,127 | 12,827 | 1.15 |
| | 1/12 | 1 pool of | 1,804,165 | 325,137 | 18.02 | 4,354 | 1,452,171 | 148,580 | 9,501 | 9,501 | 8,058 | 3,538 | 1.85 |
| Patiria miniata | 3/4 | 4 pools of | 15,632,982 | 13,132,267 | 84.00 | 635,376 | 47,718,931 | 14,987,372 | 1,167,981 | 187,597 | 143,254 | 21,914 | 0.23 |
| | 2/3 | 8 tagged | 13,586,625 | 5,261,386 | 38.72 | 205,360 | 22,892,817 | 3,088,806 | 171,712 | 2,705 | 2,447 | 366 | 0.01 |
| | 1/6 | 2 pools of | 2,733,965 | 995,543 | 36.41 | 13,340 | 2,980,976 | 324,331 | 20,512 | 10,221 | 7,249 | 2,082 | 0.54 |
| | 1/6 | 2 tagged | 3,289,007 | 2,929,932 | 89.08 | 164,553 | 5,710,298 | 2,956,701 | 232,734 | 110,454 | 80,221 | 21,658 | 0.49 |
| | 1/4 | 3 tagged | 5,512,212 | 3,852,694 | 69.89 | 123,874 | 7,011,801 | 2,338,160 | 297,648 | 77,346 | 60,149 | 13,040 | 0.49 |
| | 1/4 | 3 tagged | 7,131,427 | 3,926,250 | 55.06 | 95,887 | 7,587,534 | 2,689,310 | 275,769 | 65,731 | 47,617 | 9,419 | 0.50 |
| | 1/3 | 4 pools of | 7,660,053 | 2,563,007 | 33.46 | 43,427 | 9,910,871 | 1,022,661 | 69,208 | 7,502 | 7,280 | 3,828 | 0.17 |
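The derived columns of this table can be cross-checked from the raw counts. The sketch below recomputes the percentage of reads passing QC as reads passing QC divided by paired reads, and tests a guess about the final column, whose header is truncated in this record: its values appear to equal the >10× SNP count divided by the count in the seventh column, which is assumed here to be the number of contigs because the 635,376 row matches the contig count reported for Patiria miniata in the validation table. Both the column interpretation and the helper names are assumptions, not statements from the paper.

```python
# Values copied from the table, in row order:
# (paired reads, reads passing QC, % pass QC,
#  assumed contig count, >10x SNP count, final column)
ROWS = [
    (8_109_327, 4_472_365, 55.15, 189_444, 49_761, 0.26),
    (6_441_832, 3_880_080, 60.23, 27_007, 31_127, 1.15),
    (1_804_165, 325_137, 18.02, 4_354, 8_058, 1.85),
    (15_632_982, 13_132_267, 84.00, 635_376, 143_254, 0.23),
    (13_586_625, 5_261_386, 38.72, 205_360, 2_447, 0.01),
    (2_733_965, 995_543, 36.41, 13_340, 7_249, 0.54),
    (3_289_007, 2_929_932, 89.08, 164_553, 80_221, 0.49),
    (5_512_212, 3_852_694, 69.89, 123_874, 60_149, 0.49),
    (7_131_427, 3_926_250, 55.06, 95_887, 47_617, 0.50),
    (7_660_053, 2_563_007, 33.46, 43_427, 7_280, 0.17),
]

TOLERANCE = 0.006  # allows for two-decimal rounding in the published values


def pct_pass_qc(paired: int, passed: int) -> float:
    """Percentage of paired reads that survive the QC filter."""
    return 100.0 * passed / paired


def snps_per_contig(snps_10x: int, contigs: int) -> float:
    """Assumed meaning of the final column: >10x SNPs per contig."""
    return snps_10x / contigs


# Collect the index of any row whose reported values disagree with the recomputation.
mismatches = [
    i
    for i, (paired, passed, pct, contigs, snps_10x, last) in enumerate(ROWS)
    if abs(pct_pass_qc(paired, passed) - pct) > TOLERANCE
    or abs(snps_per_contig(snps_10x, contigs) - last) > TOLERANCE
]
print("rows that disagree with the recomputation:", mismatches)
```

If the assumed column meanings are right, the list of mismatching rows is empty, which is what the numbers in this record give.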
Validation of ezRAD data against genomic contigs.
Comparison of ezRAD results using two different sets of reference contigs, the original ezRAD analysis pipeline contigs and published genomic contigs for the seastar Patiria miniata.
| Reference type | ezRAD | Genomic contigs |
|---|---|---|
| Number of contigs | 635,376 | 179,756 |
| Mapped reads | 47,718,931 | 26,130,869 |
| High quality mapped | 14,987,372 | 21,997,385 |
| Variable sites | 1,167,981 | 1,156,633 |
| Shared SNPs | 187,597 | 151,742 |
| >10× Shared SNPs | 143,254 | 114,620 |
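The two reference types are easier to compare as proportions than as raw counts. The sketch below derives two ratios from the table: the fraction of mapped reads that are high quality, and the fraction of shared SNPs that remain at more than 10× coverage. The dictionary layout and function names are illustrative choices, and these ratios are not reported in the source.

```python
# Counts copied from the validation table for Patiria miniata.
REFERENCES = {
    "ezRAD": {
        "mapped": 47_718_931,
        "high_quality": 14_987_372,
        "shared_snps": 187_597,
        "shared_snps_10x": 143_254,
    },
    "genomic": {
        "mapped": 26_130_869,
        "high_quality": 21_997_385,
        "shared_snps": 151_742,
        "shared_snps_10x": 114_620,
    },
}


def high_quality_fraction(stats: dict) -> float:
    """Share of mapped reads that are also high quality mapped."""
    return stats["high_quality"] / stats["mapped"]


def retained_at_10x(stats: dict) -> float:
    """Share of shared SNPs that remain at more than 10x coverage."""
    return stats["shared_snps_10x"] / stats["shared_snps"]


for name, stats in REFERENCES.items():
    print(
        f"{name}: {high_quality_fraction(stats):.1%} of mapped reads are high quality, "
        f"{retained_at_10x(stats):.1%} of shared SNPs remain at >10x"
    )
```

On these numbers, a much larger share of reads map with high quality against the genomic contigs, while the share of shared SNPs retained at >10× is similar for both references.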
Comparison of most commonly used RAD sequencing methodologies and associated costs.
| Method | No. of enzymes | Cut frequency | Shearing | Size selection | Library prep | Initial | Subsequent | Scalability to |
|---|---|---|---|---|---|---|---|---|
| ezRAD | 1 or more | Frequent | No | Yes | Low | Very Low | Moderate | Low |
| RAD tags | 1 | Rare | Yes | Yes | High | High | Low | Low |
| GBS | 1 | Rare or | No | No | Moderate | High | Moderate to | Low |
| 2-enzyme | 2 | Rare + | No | No | Moderate | High | Moderate to | Low |
| ddRAD | 2 | Frequent | No | Yes | Moderate | High | Very low | Moderate |
| 2b-RAD | 1 | Frequent | No | No | Moderate | High | Low | Moderate |
Figure 1. Bar graph comparing pooled and unpooled libraries of Paracirrhites arcatus.
Relative proportion of high quality mapped reads, total SNPs, shared SNPs with greater than 10× coverage, and cost when employing one of two strategies: (1) preparing one library for every individual (8 individuals here), or (2) preparing two libraries of four pooled individuals. For all categories except Cost, taller bars represent better performance.