Claudia Brandariz-Fontes1, Miguel Camacho-Sanchez2, Carles Vilà2, José Luis Vega-Pla3, Ciro Rico4, Jennifer A Leonard2.
Abstract
Library preparation protocols for high-throughput DNA sequencing (HTS) include amplification steps in which errors can build up. In order to have confidence in the sequencing data, it is important to understand the effects of different Taq polymerases and PCR amplification protocols on the DNA molecules sequenced. We compared thirteen enzymes in three different marker systems: simple, single copy nuclear gene and complex multi-gene family. We also tested a modified PCR protocol, which has been suggested to reduce errors associated with amplification steps. We find that enzyme choice has a large impact on the proportion of correct sequences recovered. The most complex marker systems yielded fewer correct reads, and the proportion of correct reads was greatly affected by the enzyme used. Modified cycling conditions did reduce the number of incorrect sequences obtained in some cases, but enzyme had a much greater impact on the number of correct reads. Thus, the coverage required for the safe identification of genotypes using one of the low quality enzymes could be seven times larger than with more efficient enzymes in a biallelic system with equal amplification of the two alleles. Consequently, enzyme selection for downstream HTS has important consequences, especially in complex genetic systems.
Year: 2015 PMID: 25623996 PMCID: PMC4306961 DOI: 10.1038/srep08056
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Coverage necessary to reach a 99.9% probability of recovering three copies of the correct sequence for all alleles (based on the proportion of correct reads). Since not all alleles in a PCR amplify equally well, we calculated the coverage needed when two alleles amplify at the same rate (equal amplification), and when one allele yields twice as many products as the other (unequal amplification). Enzymes that did not amplify are marked n.a. and those which amplified but for which there was insufficient data to calculate coverage are labeled i.d. Abbreviations are those used in Figures 1 and 2 and the text
| Enzyme | Abbreviation | Test 1 | Test 2 equal amplification | Test 2 unequal amplification |
|---|---|---|---|---|
| Phusion® High Fidelity DNA Polymerase (Finnzymes) | Phusion | 7 | 42 | 87 |
| KAPA HiFi™ (Kapa Biosystems) | KapHF | 7 | i.d. | i.d. |
| Pwo® DNA Polymerase (Roche) | Pwo | 7 | n.a. | n.a. |
| AmpliTaq Gold® (Applied Biosystems) | Gold | 9 | 48 | 88 |
| i-Max™ II DNA Polymerase (iNtRON Biotechnology) | iMax | 11 | 57 | 99 |
| Taq DNA Polymerase (Roche) | Roche Taq | 11 | 120 | 185 |
| Velocity DNA Polymerase (Bioline) | Velocity | 12 | n.a. | n.a. |
| HotStarTaq® DNA Polymerase (Qiagen) | HotStar | 14 | 97 | 152 |
| FastStart® High Fidelity PCR System (Roche) | FastStart | 14 | 45 | 86 |
| Biotaq® (Bioline) | Biotaq | 16 | 271 | 395 |
| OneTaq™ DNA Polymerase (New England Biolabs) | OneTaq | i.d. | n.a. | n.a. |
| Vent® DNA Polymerase (New England Biolabs) | Vent | n.a. | n.a. | n.a. |
| Deep Vent® DNA Polymerase (New England Biolabs) | DeepVent | n.a. | n.a. | n.a. |
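The coverage figures above follow from the proportion of correct reads per enzyme. The source does not spell out the formula, so the following is only a minimal sketch of one way such a calculation could be done, assuming reads are independent draws and that each read is a correct copy of a given allele with probability equal to the proportion of correct reads times that allele's share of the amplification product. The function names, the `shares` parameter and the `max_reads` cap are hypothetical, and the sketch is not guaranteed to reproduce the table values.

```python
from math import comb


def prob_at_least_k(n, q, k=3):
    """P(X >= k) for X ~ Binomial(n, q)."""
    return 1.0 - sum(comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(k))


def prob_all_alleles(n, p_correct, shares=(1,), k=3):
    """Probability that every allele is seen in at least k correct reads out of n.

    p_correct: proportion of correct reads (assumed constant per enzyme).
    shares: relative amplification of each allele, e.g. (1,) for one allele,
            (1, 1) for equal and (2, 1) for unequal amplification.
    Only one or two alleles are handled in this sketch.
    """
    total = sum(shares)
    q = [p_correct * s / total for s in shares]
    if len(q) == 1:
        return prob_at_least_k(n, q[0], k)
    if len(q) != 2:
        raise ValueError("this sketch handles one or two alleles only")
    rest = max(0.0, 1.0 - q[0] - q[1])  # probability of an incorrect read
    # Joint probability that BOTH alleles fall short of k (multinomial terms).
    both_low = 0.0
    for i in range(k):
        for j in range(k):
            if i + j <= n:
                both_low += (
                    comb(n, i) * comb(n - i, j)
                    * q[0] ** i * q[1] ** j * rest ** (n - i - j)
                )
    low_0 = 1.0 - prob_at_least_k(n, q[0], k)
    low_1 = 1.0 - prob_at_least_k(n, q[1], k)
    # Inclusion-exclusion: 1 - P(allele 0 short or allele 1 short).
    return 1.0 - low_0 - low_1 + both_low


def min_coverage(p_correct, shares=(1,), k=3, target=0.999, max_reads=100000):
    """Smallest number of reads reaching the target probability, or None."""
    for n in range(k * len(shares), max_reads + 1):
        if prob_all_alleles(n, p_correct, shares, k) >= target:
            return n
    return None


if __name__ == "__main__":
    # Illustrative proportions only; these are not values from the study.
    for p in (0.95, 0.75, 0.50):
        print(p, min_coverage(p), min_coverage(p, (1, 1)), min_coverage(p, (2, 1)))
```

Under these assumptions, the required coverage rises as the proportion of correct reads falls and as amplification becomes more unequal, which is the qualitative pattern in the table.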
Figure 1. Proportion of correct reads for the three genetic systems (simple: a single allele per individual, squares; medium: two alleles, circles; and complex: multiple alleles, triangles) using standard PCR conditions (open) and modified PCR conditions to reduce chimera formation (gray).
The size of the shape is indicative of the number of reads (see legend). All enzymes yielded at least 50% correct reads in the simplest system, mitochondrial DNA (Test 1; open squares). Some enzymes only worked for a given set of conditions (cycling conditions/genetic system). A group of enzymes consisting of Phusion, Gold and FastStart yielded a high proportion of correct reads consistently across all conditions. Others, such as Roche Taq, HotStar and Biotaq, yielded a low percentage of correct reads for the more complex systems (MHC class I and MHC class II). Abbreviations as defined in Table 1.
Figure 2. Probability of obtaining 3 or more correct sequences for a given number of reads based on the proportion of correct reads observed for each enzyme and genetic system.
(A). For the simplest genetic system, with only one allele per individual. (B). For a locus with two alleles that amplify equally well (3 or more correct sequences for each of the two alleles). (C). For a locus with two alleles where one amplifies twice as well as the other. Note that the scale on the X-axis in panel A is different from that in B and C.
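One plausible reading of panel A, assuming each of the $n$ reads is independently correct with probability $p$ (the observed proportion of correct reads), is a binomial tail. The symbols $n$, $p$ and $X$ are introduced here for illustration; the caption itself does not give a formula.

$$
P(X \ge 3) \;=\; 1 - \sum_{i=0}^{2} \binom{n}{i}\, p^{i} (1-p)^{\,n-i}, \qquad X \sim \mathrm{Binomial}(n, p)
$$

Under the same assumption, panels B and C would replace $p$ for each allele by $p$ times that allele's share of the product ($\tfrac{1}{2}$ and $\tfrac{1}{2}$ for equal amplification, $\tfrac{2}{3}$ and $\tfrac{1}{3}$ when one allele amplifies twice as well) and require both alleles to reach three correct reads.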
Primers used in first and second round reactions for all tests. We used published primers (references in text) upon which an M13 tail was added (shown in lower case). MIDs 1–9625 were used in both the forward and reverse primers
| Test | Primer | Sequence 5′ – 3′ |
|---|---|---|
| Test 1 | Thr-L-t | gttttcccagtcacgacGAATTCCCCGGTCTTGTAAACC |
| Test 1 | ddl5-t | aacagctatgaccatgCATTAATGCACGACGTACATAGG |
| Test 2 | PpLAa2U270-t | gttttcccagtcacgacGCTTCTCATCCTAGTTCCCTT |
| Test 2 | Ppa2L542-t | aacagctatgaccatgGCCTAGGAGTGCAGCAGA |
| Test 3 | Be3-t | gttttcccagtcacgacGGGTCTCACACCYKCCAG |
| Test 3 | Be4-t | aacagctatgaccatgGMGCWGCAGSGTCTCYTT |
| Second round | forward | CGTATCGCCTCCCTCGCGCCATCAG[ |
| Second round | reverse | CTATGCGCCTTGCCAGCCCGCTCAG[ |