Virtual terminator nucleotides for next-generation DNA sequencing.

Jayson Bowers, Judith Mitchell, Eric Beer, Philip R Buzby, Marie Causey, J William Efcavitch, Mirna Jarosz, Edyta Krzymanska-Olejnik, Li Kung, Doron Lipson, Geoffrey M Lowman, Subramanian Marappan, Peter McInerney, Adam Platt, Atanu Roy, Suhaib M Siddiqi, Kathleen Steinmann, John F Thompson.

Abstract

We synthesized reversible terminators with tethered inhibitors for next-generation sequencing. These were efficiently incorporated with high fidelity while preventing incorporation of additional nucleotides, and we used them to sequence canine bacterial artificial chromosomes in a single-molecule system that provided even coverage for over 99% of the region sequenced. This single-molecule approach generated high-quality sequence data without the need for target amplification and thus avoided concomitant biases.

Year:  2009        PMID: 19620973      PMCID: PMC2719685          DOI: 10.1038/nmeth.1354

Source DB:  PubMed          Journal:  Nat Methods        ISSN: 1548-7091            Impact factor:   28.547


Highly parallel sequencing technologies have revolutionized biology by providing orders of magnitude more DNA sequence data than previously possible1. Most of these technologies require synthesizing DNA from an existing template using either a polymerase or a ligase2. The first commercialized technologies required amplification of the template DNA prior to sequencing, but amplification can introduce a host of biases arising from many factors3, making even representation or accurate quantitation of samples difficult. Single-molecule sequencing4,5 can eliminate biases introduced by amplification. Any method employing polymerase-based sequencing-by-synthesis encounters the problem of how to count bases within homopolymers (runs of the same base). One strategy relies on nucleotide analogs that can be incorporated once but block subsequent additions. If inhibition is efficient and reversible, the nucleotides can be used to step through homopolymer regions one base at a time. Nucleotides modified on the base with a fluorescent label and on the 3′ hydroxyl with a blocker that prevents extension have been described6,7. These molecules allow stepwise addition through a homopolymer repeat but require removal of both modifications. Here, we report a different strategy, creating four analogs modified at only a single position. Each contains three features: i) a free 3′-OH, maintaining natural interactions at the polymerase active site, ii) a base modified with a propargylamine connected to a cleavable linker, and iii) a fluorescent dye tethered to an inhibitor, attached via the cleavable linker. We call these “Virtual Terminator” nucleotides since they are efficiently incorporated yet block incorporation of a second nucleotide on a homopolymer template, despite possessing a free 3′ hydroxyl. A similar approach, but with just a single nucleotide, has been described8.
In a previous study demonstrating the feasibility of directly sequencing many single DNA molecules bound to a surface5, M13 bacteriophage was sequenced except for homopolymers longer than three. This first generation of nucleotides (Supplementary Figure 1) was modified such that the Cy5 dye was attached to the base with a linker containing a disulfide bond (Cy5-12ss-dNTP analogs). Cy5 served as the fluorescent marker during the imaging phase of sequencing and could then be removed by cleavage of the disulfide bond prior to the next incorporation event. Despite the large size of the linker and fluorescent tags, the DNA polymerase was able to incorporate them efficiently, albeit at a slower rate than natural nucleotides. These nucleotides maintained a low misincorporation rate but were not homopolymer-competent.
The Cy5-containing nucleotides described herein were tested for their ability to incorporate efficiently while preventing a second round of synthesis. These were all modified via the same linker attachment site present in the commercially-available nucleotides. All of our nucleotides were synthesized as described in the Supplementary Note according to reaction schemes shown in Supplementary Figures 2 and 3.
To determine the biochemical properties of our nucleotide analogs (Supplementary Table 1), we analyzed incorporation into primer-template DNA. One measure of the efficiency with which a nucleotide can be incorporated is the polymerization rate constant divided by the nucleotide dissociation constant (kpol/KD)9. This ratio reflects the likelihood that a bound nucleotide proceeds to the next step and is incorporated rather than dissociating from the active site. KD and kpol values are provided for selected analogs (Table 1). The novel analogs did not incorporate as fast as Cy5-12ss-dNTPs, but the rate was still sufficiently fast for sequencing.
Table 1

Incorporation rates for selected analogs

For selected analogs listed in Supplementary Table 1, the KD, kpol, and the ratio kpol/KD are provided. Measurements were carried out as described in Online Methods. Compound structures are shown in Supplementary Table 1 and Supplementary Figure 1. The “type” column is a shorthand nomenclature which indicates the incorporated nucleotide connected via the tether (*) to the inhibitory component.

Analog  Type       KD (μM)          kpol (s-1)       kpol/KD
17      U*pU       4.93 (+/-0.9)    0.86 (+/-0.07)   0.17
18      U*U        1.9              0.87             0.46
19      G*pCp      3.98 (+/-1.0)    0.7 (+/-0.07)    0.18
20      A*pCp      13.5 (+/-0.9)    0.99 (+/-0.12)   0.07
21      U*pCp      12.1 (+/-0.7)    1.04 (+/-0.03)   0.09
23      C*pC       4.14 (+/-0.7)    0.77 (+/-0.06)   0.19
25      12ss-dUTP  4.9 (+/-0.6)     2.4 (+/-0.11)    0.49
29      A*pU       2.7 (+/-1.0)     0.57 (+/-0.06)   0.21
30      U**pU      4.4 (+/-1.3)     0.92 (+/-0.12)   0.21
31      G***pU     2.0 (+/-0.4)     0.98 (+/-0.27)   0.49
32      C****pC    3.6 (+/-0.9)     0.67 (+/-0.06)   0.19
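As a worked check of the efficiency measure used in Table 1, the following minimal sketch recomputes kpol/KD from the tabulated KD and kpol values and compares each result with the reported ratio. The central values are transcribed from Table 1 (uncertainties omitted); the names `TABLE_1`, `incorporation_efficiency`, and `recompute_ratios` are illustrative and do not come from the original work.

```python
# Central values transcribed from Table 1:
# analog number -> (type, KD in uM, kpol in 1/s, reported kpol/KD)
TABLE_1 = {
    17: ("U*pU", 4.93, 0.86, 0.17),
    18: ("U*U", 1.9, 0.87, 0.46),
    19: ("G*pCp", 3.98, 0.7, 0.18),
    20: ("A*pCp", 13.5, 0.99, 0.07),
    21: ("U*pCp", 12.1, 1.04, 0.09),
    23: ("C*pC", 4.14, 0.77, 0.19),
    25: ("12ss-dUTP", 4.9, 2.4, 0.49),
    29: ("A*pU", 2.7, 0.57, 0.21),
    30: ("U**pU", 4.4, 0.92, 0.21),
    31: ("G***pU", 2.0, 0.98, 0.49),
    32: ("C****pC", 3.6, 0.67, 0.19),
}


def incorporation_efficiency(kpol, kd):
    """Return kpol/KD (units: uM^-1 s^-1) for one analog."""
    if kd <= 0:
        raise ValueError("KD must be positive")
    return kpol / kd


def recompute_ratios(table):
    """Map each analog number to kpol/KD rounded to two decimals."""
    return {
        analog: round(incorporation_efficiency(kpol, kd), 2)
        for analog, (_type, kd, kpol, _reported) in table.items()
    }


if __name__ == "__main__":
    for analog, ratio in recompute_ratios(TABLE_1).items():
        analog_type, _kd, _kpol, reported = TABLE_1[analog]
        print(f"{analog:>2} {analog_type:<10} computed={ratio:.2f} reported={reported:.2f}")
```

On this measure the first-generation 12ss-dUTP (analog 25) and analog 31 have the highest ratio in the table, while the bisphosphate analogs 20 and 21 have the lowest.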
Nucleotides must be incorporated at the correct position with very little misincorporation. Fidelity was assessed by determining the pre-steady-state rate of incorporation into primer-templates that coded for incorrect additions. Select analogs were tested against three misincorporation templates, one for each possible mispair. Despite being incorporated at lower efficiency, these analogs were added with fidelity similar to Cy5-12ss-dNTPs (Supplementary Table 2) and natural nucleotides10.
To determine whether the novel analogs can function as reversible terminators, we performed experiments similar to those described above except that the template encoded two-base homopolymers. The rates of both first and second base addition were measured and the first was then divided by the second (k1/k2, Supplementary Table 3). This ratio describes the effectiveness of the analog as a homopolymer run-through inhibitor, normalized to incorporation efficiency at the first base. We observed a striking correlation between the number of phosphates on the inhibitory base and its effectiveness as a reversible terminator. Analogs lacking phosphates on the inhibitory base gave low k1/k2 values, indicating limited usefulness as a homopolymer inhibitor. In contrast, analogs with a monophosphate on the inhibitory moiety had higher k1/k2 values, while the bisphosphate analogs showed even greater effectiveness as terminators. These bisphosphate analogs provide the right combination: incorporation at correct positions, and little incorporation at incorrect positions or at a second position within a homopolymer.
Critical to the utility of reversible terminators is the ability to reverse the inhibition prior to subsequent base additions. We used a template with five consecutive Cs and performed base addition cycles followed by removal of the inhibitor-dye. Five cycles of addition-cleavage on such a template resulted in an almost perfectly synchronous walk through the homopolymer (Figure 1). Thus, these analogs were highly effective reversible terminators.
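To illustrate why a large k1/k2 matters, the sketch below computes the ratio and, under an assumed simple model of two sequential irreversible first-order steps (unextended primer, then one addition, then two additions), the fraction of primers in each state after a given incubation time. This kinetic model, the function names, and the rate values in the usage note are illustrative assumptions; they are not taken from the paper or from Supplementary Table 3.

```python
import math


def termination_ratio(k1, k2):
    """Ratio of first-base to second-base addition rates (k1/k2)."""
    if k2 <= 0:
        raise ValueError("k2 must be positive")
    return k1 / k2


def extension_fractions(k1, k2, t):
    """Fractions of primers with 0, exactly 1, and 2 additions at time t.

    Assumes two sequential irreversible first-order steps with rate
    constants k1 and k2 (in 1/s); this model is an illustrative assumption.
    """
    unextended = math.exp(-k1 * t)
    if math.isclose(k1, k2):
        # Limiting form of the two-step solution when both rates are equal.
        single = k1 * t * math.exp(-k1 * t)
    else:
        single = k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    run_through = 1.0 - unextended - single
    return unextended, single, run_through


if __name__ == "__main__":
    # Hypothetical rates chosen only to show the trend.
    for k2 in (0.5, 0.01, 0.001):
        ratio = termination_ratio(1.0, k2)
        _none, single, run = extension_fractions(1.0, k2, t=10.0)
        print(f"k1/k2={ratio:8.1f}  one base={single:.3f}  run-through={run:.3f}")
```

With these hypothetical numbers, raising k1/k2 moves nearly all primers into the single-addition state within the incubation window, which is the behavior a reversible terminator needs.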
Figure 1

Virtual Terminator nucleotide base-by-base incorporation in a G5 homopolymer

The substrate used for testing homopolymer sequencing is shown along with successive cycles of addition of compound 22 in a solution phase reaction. Removal of the inhibitor-dye was accomplished by cleavage of the disulfide using TCEP, a reducing agent, followed by treatment with iodoacetamide to cap the free thiol. After each cycle, an aliquot of the reaction is run on an ABI3730 sequencing machine to achieve single base resolution of DNA. Length markers are shown in orange and the DNA being synthesized is shown in blue.

To test the Virtual Terminator nucleotides with mammalian DNA, canine BAC AC187329 was resequenced. This previously-sequenced BAC contains 194 kb of complex mammalian sequence. Two-pass single-molecule sequencing, in which each molecule is sequenced twice, was performed as described previously5 and in Online Methods, using a low-capacity prototype sequencing instrument. Image analysis software allowed matching of reads from the same DNA molecule in the two passes. Comparison of the two sequences from each molecule yielded a high-confidence consensus sequence for that molecule, which could then be combined with other reads to generate a final sequence. It was also possible to use just single-pass sequence data to generate a high-quality consensus sequence, with the higher error rate of the individual reads offset by the higher coverage obtainable via single pass. Prior to alignment of sequence reads to the BAC reference, raw reads were filtered to eliminate artifacts and non-informative strands, as described in Online Methods. After filtering, high-quality reads that met criteria for both passes were obtained from 123,418 DNA strands. An even larger number of DNA strands had high-quality reads that met criteria for just one pass. Alignment of the two-pass sequences to the reference yielded a median coverage of 15 (Figure 2). Even when only uniquely aligned sequences were included, coverage of nearly the entire sequence was even. The only low-coverage positions corresponded to repetitive regions, which cannot yield unique alignments. Taken together, all uncovered regions amounted to only 0.2% of the entire sequence.
Figure 2

Sequence coverage of BAC

Depth of coverage for the BAC is shown across its length. When all reads (non-unique) were mapped to the sequence (red), repeat regions received higher coverage due to multiply aligning reads. When only uniquely aligning sequences were included (blue), the repeat regions were under-represented. When a fractional correction was made to multiply aligning sequences (black), even coverage was obtained across the length of the BAC. When only unique alignments were used, the longest uncovered stretch of DNA was 279 bp, corresponding to a repeat region. If all alignments were used, the longest uncovered region was a highly AT rich 103 bp segment that included 28 consecutive AT nucleotides and many other shorter AT runs.
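The three coverage tracks described in this caption can be sketched as follows. The caption does not specify how the fractional correction was computed, so this example assumes that a read with n candidate alignments contributes 1/n to each of them; that weighting, the function name `coverage_depth`, and the input layout are illustrative assumptions rather than the authors' implementation.

```python
def coverage_depth(ref_length, reads, mode="fractional"):
    """Per-position depth of coverage over a reference of ref_length bases.

    reads: one entry per read; each entry is a list of candidate
           alignments given as (start, aligned_length), 0-based.
    mode:  "all"        every alignment counts 1 (red track)
           "unique"     only reads with exactly one alignment (blue track)
           "fractional" each of a read's n alignments counts 1/n
                        (black track; the 1/n weighting is an assumption)
    """
    if mode not in ("all", "unique", "fractional"):
        raise ValueError(f"unknown mode: {mode}")
    depth = [0.0] * ref_length
    for alignments in reads:
        n = len(alignments)
        if n == 0:
            continue  # unaligned read
        if mode == "unique" and n != 1:
            continue  # skip multiply aligning reads
        weight = 1.0 / n if mode == "fractional" else 1.0
        for start, aligned_length in alignments:
            first = max(start, 0)
            last = min(start + aligned_length, ref_length)
            for pos in range(first, last):
                depth[pos] += weight
    return depth


if __name__ == "__main__":
    # Toy input: one uniquely aligning read and one read with two placements.
    toy_reads = [[(0, 3)], [(2, 3), (6, 3)]]
    for mode in ("all", "unique", "fractional"):
        print(mode, coverage_depth(10, toy_reads, mode))
```

In the toy input, the multiply aligning read adds full depth at both placements in "all" mode, nothing in "unique" mode, and half depth at each placement in "fractional" mode, mirroring the over-representation, under-representation, and evening-out of repeats described for the red, blue, and black tracks.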

The average per-nucleotide error rate for all strands ≥ 15 in length was 0.58%, with little variation as a function of read length (Supplementary Figure 4). The length-independence of the error rate is a natural property of single-molecule sequencing, in which it is impossible to dephase the sequence since each molecule is read individually. If incorporation is missed during one cycle, it can occur in the next with no loss of information. Most errors were deletions, likely caused by incorporation without detection. Such deletions could result from either chemical or optical imperfections.
Because these analogs were designed to overcome issues with homopolymer sequencing, it was of special interest to determine how well those regions sequenced. Over 98% of all 38,353 homopolymers in this BAC had coverage of ≥ 10 reads and could be called for length. Of those called, over 99.99% were called correctly. Only when the homopolymer length reached ten were fewer than half of homopolymers covered with a sufficient number of reads for a call. Increased sequencing depth, as is generated by the commercial instrument, would alleviate this problem.
Single-molecule sequencing provides tremendous advantages in experimental design relative to the more classical sequencing of ensembles of molecules generated by amplification. Turning this potential into a practical method for sequencing DNA has required several advances, as described here. Previous work demonstrated the feasibility of this approach, but homopolymers were difficult to call accurately. These novel nucleotides have been modified to include a tethered inhibitor that takes advantage of interactions in the active site so that additional nucleotides are prevented from entering and being incorporated. Highly efficient, fast cleavage of the dye and tethered inhibitor at the proper time (and not before) using mild conditions was also achieved.
The analogs presented here include all these properties and thus enabled single-molecule sequencing of complex mammalian DNA at a scale not previously possible. The orders-of-magnitude higher throughput of single-molecule sequencers, compared to next-generation sequencers that require target amplification, promises a scalable method for achieving the $1000 genome. This work indicates that only technical optimization, not new technology, is required to achieve that end.
Supplementary Figure 1: Cy5-12ss-dNTP analogs
Supplementary Figure 2: Scheme 1 for nucleotide analog synthesis
Supplementary Figure 3: Scheme 2 for nucleotide analog synthesis
Supplementary Figure 4: Error rate
Supplementary Table 1: Nucleotide analogs
Supplementary Table 2: Misincorporation rates for two analogs
Supplementary Table 3: Termination ability of analogs to prevent second base incorporation
  10 in total

1.  Sequence information can be obtained from single DNA molecules.

Authors:  Ido Braslavsky; Benedict Hebert; Emil Kartalov; Stephen R Quake
Journal:  Proc Natl Acad Sci U S A       Date:  2003-03-21       Impact factor: 11.205

Review 2.  The impact of next-generation sequencing technology on genetics.

Authors:  Elaine R Mardis
Journal:  Trends Genet       Date:  2008-02-11       Impact factor: 11.639

3.  Single-molecule DNA sequencing of a viral genome.

Authors:  Timothy D Harris; Phillip R Buzby; Hazen Babcock; Eric Beer; Jayson Bowers; Ido Braslavsky; Marie Causey; Jennifer Colonell; James Dimeo; J William Efcavitch; Eldar Giladi; Jaime Gill; John Healy; Mirna Jarosz; Dan Lapen; Keith Moulton; Stephen R Quake; Kathleen Steinmann; Edward Thayer; Anastasia Tyurina; Rebecca Ward; Howard Weiss; Zheng Xie
Journal:  Science       Date:  2008-04-04       Impact factor: 47.728

4.  The fidelity of DNA synthesis catalyzed by derivatives of Escherichia coli DNA polymerase I.

Authors:  K Bebenek; C M Joyce; M P Fitzgerald; T A Kunkel
Journal:  J Biol Chem       Date:  1990-08-15       Impact factor: 5.157

5.  Four-color DNA sequencing by synthesis using cleavable fluorescent nucleotide reversible terminators.

Authors:  Jingyue Ju; Dae Hyun Kim; Lanrong Bi; Qinglin Meng; Xiaopeng Bai; Zengmin Li; Xiaoxu Li; Mong Sano Marma; Shundi Shi; Jian Wu; John R Edwards; Aireen Romu; Nicholas J Turro
Journal:  Proc Natl Acad Sci U S A       Date:  2006-12-14       Impact factor: 11.205

6.  Kinetic mechanism of DNA polymerase I (Klenow).

Authors:  R D Kuchta; V Mizrahi; P A Benkovic; K A Johnson; S J Benkovic
Journal:  Biochemistry       Date:  1987-12-15       Impact factor: 3.162

7.  What would you do if you could sequence everything?

Authors:  Avak Kahvejian; John Quackenbush; John F Thompson
Journal:  Nat Biotechnol       Date:  2008-10       Impact factor: 54.908

8.  Termination of DNA synthesis by N6-alkylated, not 3'-O-alkylated, photocleavable 2'-deoxyadenosine triphosphates.

Authors:  Weidong Wu; Brian P Stupi; Vladislav A Litosh; Dena Mansouri; Demetra Farley; Sidney Morris; Sherry Metzker; Michael L Metzker
Journal:  Nucleic Acids Res       Date:  2007-09-18       Impact factor: 16.971

9.  Substantial biases in ultra-short read data sets from high-throughput DNA sequencing.

Authors:  Juliane C Dohm; Claudio Lottaz; Tatiana Borodina; Heinz Himmelbauer
Journal:  Nucleic Acids Res       Date:  2008-07-26       Impact factor: 16.971

10.  Accurate whole human genome sequencing using reversible terminator chemistry.

Authors:  David R Bentley; Shankar Balasubramanian; Harold P Swerdlow; Geoffrey P Smith; John Milton; Clive G Brown; Kevin P Hall; Dirk J Evers; Colin L Barnes; Helen R Bignell; Jonathan M Boutell; Jason Bryant; Richard J Carter; R Keira Cheetham; Anthony J Cox; Darren J Ellis; Michael R Flatbush; Niall A Gormley; Sean J Humphray; Leslie J Irving; Mirian S Karbelashvili; Scott M Kirk; Heng Li; Xiaohai Liu; Klaus S Maisinger; Lisa J Murray; Bojan Obradovic; Tobias Ost; Michael L Parkinson; Mark R Pratt; Isabelle M J Rasolonjatovo; Mark T Reed; Roberto Rigatti; Chiara Rodighiero; Mark T Ross; Andrea Sabot; Subramanian V Sankar; Aylwyn Scally; Gary P Schroth; Mark E Smith; Vincent P Smith; Anastassia Spiridou; Peta E Torrance; Svilen S Tzonev; Eric H Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D Alam; Carole Anastasi; Ify C Aniebo; David M D Bailey; Iain R Bancarz; Saibal Banerjee; Selena G Barbour; Primo A Baybayan; Vincent A Benoit; Kevin F Benson; Claire Bevis; Phillip J Black; Asha Boodhun; Joe S Brennan; John A Bridgham; Rob C Brown; Andrew A Brown; Dale H Buermann; Abass A Bundu; James C Burrows; Nigel P Carter; Nestor Castillo; Maria Chiara E Catenazzi; Simon Chang; R Neil Cooley; Natasha R Crake; Olubunmi O Dada; Konstantinos D Diakoumakos; Belen Dominguez-Fernandez; David J Earnshaw; Ugonna C Egbujor; David W Elmore; Sergey S Etchin; Mark R Ewan; Milan Fedurco; Louise J Fraser; Karin V Fuentes Fajardo; W Scott Furey; David George; Kimberley J Gietzen; Colin P Goddard; George S Golda; Philip A Granieri; David E Green; David L Gustafson; Nancy F Hansen; Kevin Harnish; Christian D Haudenschild; Narinder I Heyer; Matthew M Hims; Johnny T Ho; Adrian M Horgan; Katya Hoschler; Steve Hurwitz; Denis V Ivanov; Maria Q Johnson; Terena James; T A Huw Jones; Gyoung-Dong Kang; Tzvetana H Kerelska; Alan D Kersey; Irina Khrebtukova; Alex P Kindwall; Zoya Kingsbury; Paula I Kokko-Gonzales; Anil Kumar; Marc A Laurent; Cynthia T Lawley; Sarah E Lee; Xavier Lee; 
Arnold K Liao; Jennifer A Loch; Mitch Lok; Shujun Luo; Radhika M Mammen; John W Martin; Patrick G McCauley; Paul McNitt; Parul Mehta; Keith W Moon; Joe W Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M Novo; Michael J O'Neill; Mark A Osborne; Andrew Osnowski; Omead Ostadan; Lambros L Paraschos; Lea Pickering; Andrew C Pike; Alger C Pike; D Chris Pinkard; Daniel P Pliskin; Joe Podhasky; Victor J Quijano; Come Raczy; Vicki H Rae; Stephen R Rawlings; Ana Chiva Rodriguez; Phyllida M Roe; John Rogers; Maria C Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K Roth; Natalie J Rourke; Silke T Ruediger; Eli Rusman; Raquel M Sanches-Kuiper; Martin R Schenker; Josefina M Seoane; Richard J Shaw; Mitch K Shiver; Steven W Short; Ning L Sizto; Johannes P Sluis; Melanie A Smith; Jean Ernest Sohna Sohna; Eric J Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L Tregidgo; Gerardo Turcatti; Stephanie Vandevondele; Yuli Verhovsky; Selene M Virk; Suzanne Wakelin; Gregory C Walcott; Jingwen Wang; Graham J Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C Mullikin; Matthew E Hurles; Nick J McCooke; John S West; Frank L Oaks; Peter L Lundberg; David Klenerman; Richard Durbin; Anthony J Smith
Journal:  Nature       Date:  2008-11-06       Impact factor: 49.962

