
Improved coverage and accuracy with strand-conserving sequence enrichment.

Abstract

Targeted next-generation sequencing is becoming a common tool in the molecular diagnostic laboratory. However, currently available methods to enrich for regions of interest in the DNA sequence suffer from drawbacks such as high cost, complex protocols, lack of clinical-level accuracy and uneven target coverage. A target-enrichment approach using complementary long padlock probes described in a recent article significantly improves on previous methods in most of these areas. SEE RELATED RESEARCH: http://genomemedicine.com/content/5/5/50.

Year: 2013 PMID: 23731654 PMCID: PMC3706846 DOI: 10.1186/gm450

Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117

From whole-genome sequencing to target capture

In the almost 13 years since the first whole human genome was sequenced and published [1,2], tremendous advances in technology have enabled the sequencing of human genomes at a fraction of the original cost and time. However, although the cost of sequencing has dropped considerably, large-scale whole-genome sequencing remains challenging, particularly in the clinical arena. This is due to the still significant cost of sequencing an entire human genome, and to the challenge of analyzing enormous amounts of data with tools that are not standardized to a level acceptable for routine diagnostic use. Consequently, targeted sequencing approaches may be more suitable for clinically actionable genes. Cheap and high-quality targeted sequencing is key for a number of clinical research applications, including large-scale variant screening in disease genes and follow-up of genetic markers identified as significant in genome-wide association studies. Various methods have been developed to enable whole-exome sequencing and targeted-region sequencing. Early on, solid-state capture arrays were used, but these were expensive and had relatively complex protocols [3]. In-solution capture and PCR-based enrichment methods have reduced the cost and complexity of protocols considerably [4]. These improvements led to a wider adoption of next-generation sequencing and, in the past 12 months particularly, an increase in the use of targeted resequencing as a diagnostic tool [5]. Nevertheless, current methods are far from perfect. For example, PCR-based methods require highly multiplexed oligonucleotide pairs targeted to heterogeneous sequences with a range of melting temperatures and GC content to generate hundreds or thousands of amplicons in a single tube. This leads to differences in amplicon representation and uneven sequence coverage.
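To make the point about heterogeneous melting temperatures and GC content concrete, the following is a minimal Python sketch. It uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a crude approximation for short oligonucleotides and is not the method used by any of the cited studies. The function names and the three primer sequences are invented for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 18-mers, chosen only to span a wide GC range.
primers = {
    "at_rich": "ATTATAATTTAGCATTAA",
    "balanced": "ATGCATGCATGCATGCAT",
    "gc_rich": "GCGCCGGCGCGGCCGCGG",
}

tms = {name: wallace_tm(seq) for name, seq in primers.items()}
# Spread between the most and least stable primer in the pool.
tm_spread = max(tms.values()) - min(tms.values())

for name, seq in primers.items():
    print(f"{name}: GC={gc_content(seq):.2f}, Tm~{tms[name]} C")
print(f"Tm spread across the pool: {tm_spread} C")
```

A spread of this size within one multiplexed reaction means that no single annealing temperature suits every primer, which is one way amplicons end up unevenly represented.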
Hybridization-based methods exhibit significantly more off-target capture than other enrichment methods, do not capture repetitive sequences, and poorly cover GC- and AT-rich regions. Methods employing 'capture by circularization' (Figure 1), such as connector inversion probes (CIPs), also have problems. These methods use single-stranded DNA molecules with gene-specific targeting regions at the 5' and 3' ends that are complementary to the targeted genomic DNA [6]. After hybridization of the targeting ends of the CIP to the genomic DNA, a single-stranded DNA circle is formed and closed by gap filling and ligation. The single-stranded DNA circle is then linearized by restriction digest, and the target region is enriched by PCR and finally sequenced. CIPs require a large backbone for the probes to capture targets efficiently, which makes them expensive and difficult to manufacture [7].
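The capture-by-circularization steps described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the probe design procedure of reference [6] or [8]: sequences are written 5' to 3', the template strand is modeled as upstream flank + target + downstream flank, and all function names and sequences (including the four-base backbone) are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_probe(upstream: str, downstream: str, backbone: str) -> str:
    """Linear probe, 5'->3': the 5' arm pairs with the upstream flank,
    the 3' arm pairs with the downstream flank, the backbone links them."""
    return revcomp(upstream) + backbone + revcomp(downstream)


def circularized_product(upstream: str, target: str,
                         downstream: str, backbone: str) -> str:
    """Circle after gap filling and ligation, written linearly from the
    backbone: the probe's 3' arm is extended across the target (copying
    its complement) until it meets the 5' arm, where the nick is ligated."""
    return backbone + revcomp(downstream) + revcomp(target) + revcomp(upstream)


# Toy template strand: AAAC | GGG | TTCA
probe = design_probe("AAAC", "TTCA", backbone="ACAC")
circle = circularized_product("AAAC", "GGG", "TTCA", backbone="ACAC")
print(probe)
print(circle)
```

The circle contains the complement of the target between the two arms, so one circle carries information from only one strand of the genomic DNA.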

Figure 1

Depiction of the cLPP and CIP methods. cLPP captures both strands of the targeted genomic DNA, generating two complementary single-stranded DNA circles. Each of the strands is then sequenced in the forward and reverse direction to yield four unique reads. CIP captures only one strand of the target genomic DNA region and generates a single-stranded DNA circle. The target region is then enriched by PCR and sequencing performed.

The size of a target region is limited to a few megabases, which restricts the number of genes/exons that can be included in a clinical sequencing panel. In addition, all current capture methods use only one strand of genomic DNA, missing out on an additional level of possible accuracy.

Overcoming current limitations in target enrichment

In contrast to standard capture methods, the complementary long padlock probe (cLPP) approach presented by Shen et al. in a recent article [8] captures both strands of the target region, effectively doubling the target sequence information obtained compared with other capture methods. This is achieved by generating double-stranded CIPs, which are denatured at high temperature into single strands and then hybridized to the sense and antisense strands of genomic DNA, forming two complementary single-stranded DNA circles. In addition, cLPP enables the sequencing of both strands in both the forward and reverse direction (Shen et al. call this reciprocal paired-end sequencing), resulting in a total of four unique sequence reads per template. This redundancy reduces the uneven coverage caused by differences in the amplification efficiencies of the target regions, and increases coverage and accuracy. This should lead to increased confidence in variant calls in the downstream bioinformatics analysis, and might allow for a reduced average depth of sequence coverage, resulting in less sequencing per sample and thus lower cost. Shen et al. also demonstrate that copy number variation (CNV) detection can be improved with this enrichment method owing to its significantly better discrimination between targets with high and low coverage. An additional interesting potential application for cLPP is the targeted resequencing of problematic DNA samples derived from formalin-fixed paraffin-embedded (FFPE) tissues. DNA extracted from FFPE samples frequently contains lesions such as abasic sites that lead to a significant increase in sequencing errors when using traditional single-strand sequence capture methods [9]. Owing to the ability of cLPP to capture both strands, it could become a compelling option for targeted resequencing of these sample types.
Although cLPP appears to be better suited than traditional CIPs for clinical use, both methods require a large number of samples to be economical because of the initial cost of assay development. Furthermore, to our knowledge, reagents based on cLPP are not yet commercially available, which poses a challenge to its widespread adoption.
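One way to see why reads from both strands can raise confidence in a variant call is a simple strand-aware consensus at a single position. The sketch below is an illustration only and is not the analysis pipeline of Shen et al.: it assumes that base calls from the sense and antisense circles have already been expressed in the same reference orientation, and the function names and status labels are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Tuple


def majority(calls: List[str]) -> Optional[str]:
    """Most frequent base among the calls; None if there are no calls or a tie."""
    if not calls:
        return None
    counts = Counter(calls).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def call_base(sense: List[str], antisense: List[str]) -> Tuple[Optional[str], str]:
    """Combine per-strand consensus calls at one position.

    Returns (base, status), where status is 'concordant', 'discordant',
    'single-strand' or 'no-call'.
    """
    s, a = majority(sense), majority(antisense)
    if s is None and a is None:
        return None, "no-call"
    if s is None or a is None:
        # Only one strand gives a usable consensus.
        return (s or a), "single-strand"
    if s == a:
        return s, "concordant"
    # The strands disagree, as expected when damage affects one strand only.
    return None, "discordant"


# Forward and reverse reads from each strand: four calls per position.
print(call_base(["A", "A"], ["A", "A"]))
print(call_base(["T", "T"], ["C", "C"]))
print(call_base(["G", "G"], []))
```

With single-strand capture, the second case would be indistinguishable from a true variant; with both strands available, the disagreement itself flags the position for review.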

Conclusion

cLPP is an innovative approach to high-throughput target enrichment for next-generation sequencing. It addresses several shortcomings of current targeted sequencing methods, including limited accuracy, poor CNV detection and high cost. Most compelling is its ability to preserve strand information and to sequence the sense and antisense strands separately. Beyond the resulting improvement in the fidelity of variant detection, other applications that rely on double-strand targeting could benefit. These include the analysis of problematic DNA samples, in which damage to one DNA strand makes redundancy important for retrieving as much information as possible.

List of abbreviations

CIPs: connector inversion probes; cLPP: complementary long padlock probes; CNV: copy number variation; FFPE: formalin-fixed paraffin-embedded.

Competing interests

The authors declare that they have no competing interests.

References

1. Initial sequencing and analysis of the human genome.

Authors: E S Lander; L M Linton; B Birren; C Nusbaum; M C Zody; J Baldwin; K Devon; K Dewar; M Doyle; W FitzHugh; R Funke; D Gage; K Harris; A Heaford; J Howland; L Kann; J Lehoczky; R LeVine; P McEwan; K McKernan; J Meldrim; J P Mesirov; C Miranda; W Morris; J Naylor; C Raymond; M Rosetti; R Santos; A Sheridan; C Sougnez; Y Stange-Thomann; N Stojanovic; A Subramanian; D Wyman; J Rogers; J Sulston; R Ainscough; S Beck; D Bentley; J Burton; C Clee; N Carter; A Coulson; R Deadman; P Deloukas; A Dunham; I Dunham; R Durbin; L French; D Grafham; S Gregory; T Hubbard; S Humphray; A Hunt; M Jones; C Lloyd; A McMurray; L Matthews; S Mercer; S Milne; J C Mullikin; A Mungall; R Plumb; M Ross; R Shownkeen; S Sims; R H Waterston; R K Wilson; L W Hillier; J D McPherson; M A Marra; E R Mardis; L A Fulton; A T Chinwalla; K H Pepin; W R Gish; S L Chissoe; M C Wendl; K D Delehaunty; T L Miner; A Delehaunty; J B Kramer; L L Cook; R S Fulton; D L Johnson; P J Minx; S W Clifton; T Hawkins; E Branscomb; P Predki; P Richardson; S Wenning; T Slezak; N Doggett; J F Cheng; A Olsen; S Lucas; C Elkin; E Uberbacher; M Frazier; R A Gibbs; D M Muzny; S E Scherer; J B Bouck; E J Sodergren; K C Worley; C M Rives; J H Gorrell; M L Metzker; S L Naylor; R S Kucherlapati; D L Nelson; G M Weinstock; Y Sakaki; A Fujiyama; M Hattori; T Yada; A Toyoda; T Itoh; C Kawagoe; H Watanabe; Y Totoki; T Taylor; J Weissenbach; R Heilig; W Saurin; F Artiguenave; P Brottier; T Bruls; E Pelletier; C Robert; P Wincker; D R Smith; L Doucette-Stamm; M Rubenfield; K Weinstock; H M Lee; J Dubois; A Rosenthal; M Platzer; G Nyakatura; S Taudien; A Rump; H Yang; J Yu; J Wang; G Huang; J Gu; L Hood; L Rowen; A Madan; S Qin; R W Davis; N A Federspiel; A P Abola; M J Proctor; R M Myers; J Schmutz; M Dickson; J Grimwood; D R Cox; M V Olson; R Kaul; C Raymond; N Shimizu; K Kawasaki; S Minoshima; G A Evans; M Athanasiou; R Schultz; B A Roe; F Chen; H Pan; J Ramser; H Lehrach; R Reinhardt; W R McCombie; M de la Bastide; N Dedhia; H Blöcker; K Hornischer; G Nordsiek; R Agarwala; L Aravind; J A Bailey; A Bateman; S Batzoglou; E Birney; P Bork; D G Brown; C B Burge; L Cerutti; H C Chen; D Church; M Clamp; R R Copley; T Doerks; S R Eddy; E E Eichler; T S Furey; J Galagan; J G Gilbert; C Harmon; Y Hayashizaki; D Haussler; H Hermjakob; K Hokamp; W Jang; L S Johnson; T A Jones; S Kasif; A Kaspryzk; S Kennedy; W J Kent; P Kitts; E V Koonin; I Korf; D Kulp; D Lancet; T M Lowe; A McLysaght; T Mikkelsen; J V Moran; N Mulder; V J Pollara; C P Ponting; G Schuler; J Schultz; G Slater; A F Smit; E Stupka; J Szustakowki; D Thierry-Mieg; J Thierry-Mieg; L Wagner; J Wallis; R Wheeler; A Williams; Y I Wolf; K H Wolfe; S P Yang; R F Yeh; F Collins; M S Guyer; J Peterson; A Felsenfeld; K A Wetterstrand; A Patrinos; M J Morgan; P de Jong; J J Catanese; K Osoegawa; H Shizuya; S Choi; Y J Chen; J Szustakowki
Journal: Nature Date: 2001-02-15 Impact factor: 49.962

2. A comprehensive assay for targeted multiplex amplification of human DNA sequences.

Authors: Sujatha Krishnakumar; Jianbiao Zheng; Julie Wilhelmy; Malek Faham; Michael Mindrinos; Ronald Davis
Journal: Proc Natl Acad Sci U S A Date: 2008-07-02 Impact factor: 11.205

3. Diagnostic applications of high-throughput DNA sequencing.

Authors: Scott D Boyd
Journal: Annu Rev Pathol Date: 2012-11-01 Impact factor: 23.472

4. The sequence of the human genome.

Authors: J C Venter; M D Adams; E W Myers; P W Li; R J Mural; G G Sutton; H O Smith; M Yandell; C A Evans; R A Holt; J D Gocayne; P Amanatides; R M Ballew; D H Huson; J R Wortman; Q Zhang; C D Kodira; X H Zheng; L Chen; M Skupski; G Subramanian; P D Thomas; J Zhang; G L Gabor Miklos; C Nelson; S Broder; A G Clark; J Nadeau; V A McKusick; N Zinder; A J Levine; R J Roberts; M Simon; C Slayman; M Hunkapiller; R Bolanos; A Delcher; I Dew; D Fasulo; M Flanigan; L Florea; A Halpern; S Hannenhalli; S Kravitz; S Levy; C Mobarry; K Reinert; K Remington; J Abu-Threideh; E Beasley; K Biddick; V Bonazzi; R Brandon; M Cargill; I Chandramouliswaran; R Charlab; K Chaturvedi; Z Deng; V Di Francesco; P Dunn; K Eilbeck; C Evangelista; A E Gabrielian; W Gan; W Ge; F Gong; Z Gu; P Guan; T J Heiman; M E Higgins; R R Ji; Z Ke; K A Ketchum; Z Lai; Y Lei; Z Li; J Li; Y Liang; X Lin; F Lu; G V Merkulov; N Milshina; H M Moore; A K Naik; V A Narayan; B Neelam; D Nusskern; D B Rusch; S Salzberg; W Shao; B Shue; J Sun; Z Wang; A Wang; X Wang; J Wang; M Wei; R Wides; C Xiao; C Yan; A Yao; J Ye; M Zhan; W Zhang; H Zhang; Q Zhao; L Zheng; F Zhong; W Zhong; S Zhu; S Zhao; D Gilbert; S Baumhueter; G Spier; C Carter; A Cravchik; T Woodage; F Ali; H An; A Awe; D Baldwin; H Baden; M Barnstead; I Barrow; K Beeson; D Busam; A Carver; A Center; M L Cheng; L Curry; S Danaher; L Davenport; R Desilets; S Dietz; K Dodson; L Doup; S Ferriera; N Garg; A Gluecksmann; B Hart; J Haynes; C Haynes; C Heiner; S Hladun; D Hostin; J Houck; T Howland; C Ibegwam; J Johnson; F Kalush; L Kline; S Koduru; A Love; F Mann; D May; S McCawley; T McIntosh; I McMullen; M Moy; L Moy; B Murphy; K Nelson; C Pfannkoch; E Pratts; V Puri; H Qureshi; M Reardon; R Rodriguez; Y H Rogers; D Romblad; B Ruhfel; R Scott; C Sitter; M Smallwood; E Stewart; R Strong; E Suh; R Thomas; N N Tint; S Tse; C Vech; G Wang; J Wetter; S Williams; M Williams; S Windsor; E Winn-Deen; K Wolfe; J Zaveri; K Zaveri; J F Abril; R Guigó; M J Campbell; K V Sjolander; B Karlak; A Kejariwal; H Mi; B Lazareva; T Hatton; A Narechania; K Diemer; A Muruganujan; N Guo; S Sato; V Bafna; S Istrail; R Lippert; R Schwartz; B Walenz; S Yooseph; D Allen; A Basu; J Baxendale; L Blick; M Caminha; J Carnes-Stine; P Caulk; Y H Chiang; M Coyne; C Dahlke; A Deslattes Mays; M Dombroski; M Donnelly; D Ely; S Esparham; C Fosler; H Gire; S Glanowski; K Glasser; A Glodek; M Gorokhov; K Graham; B Gropman; M Harris; J Heil; S Henderson; J Hoover; D Jennings; C Jordan; J Jordan; J Kasha; L Kagan; C Kraft; A Levitsky; M Lewis; X Liu; J Lopez; D Ma; W Majoros; J McDaniel; S Murphy; M Newman; T Nguyen; N Nguyen; M Nodell; S Pan; J Peck; M Peterson; W Rowe; R Sanders; J Scott; M Simpson; T Smith; A Sprague; T Stockwell; R Turner; E Venter; M Wang; M Wen; D Wu; M Wu; A Xia; A Zandieh; X Zhu
Journal: Science Date: 2001-02-16 Impact factor: 47.728

5. Comparison of three targeted enrichment strategies on the SOLiD sequencing platform.

Authors: Dale J Hedges; Toumy Guettouche; Shan Yang; Guney Bademci; Ashley Diaz; Ashley Andersen; William F Hulme; Sara Linker; Arpit Mehta; Yvonne J K Edwards; Gary W Beecham; Eden R Martin; Margaret A Pericak-Vance; Stephan Zuchner; Jeffery M Vance; John R Gilbert
Journal: PLoS One Date: 2011-04-29 Impact factor: 3.240

6. Experiences with array-based sequence capture; toward clinical applications.

Authors: Rowida Almomani; Jaap van der Heijden; Yavuz Ariyurek; Yuching Lai; Egbert Bakker; Michiel van Galen; Martijn H Breuning; Johan T den Dunnen
Journal: Eur J Hum Genet Date: 2010-11-24 Impact factor: 4.246

7. Targeted high throughput sequencing in clinical cancer settings: formaldehyde fixed-paraffin embedded (FFPE) tumor tissues, input amount and tumor heterogeneity.

Authors: Martin Kerick; Melanie Isau; Bernd Timmermann; Holger Sültmann; Ralf Herwig; Sylvia Krobitsch; Georg Schaefer; Irmgard Verdorfer; Georg Bartsch; Helmut Klocker; Hans Lehrach; Michal R Schweiger
Journal: BMC Med Genomics Date: 2011-09-29 Impact factor: 3.063

8. Connector inversion probe technology: a powerful one-primer multiplex DNA amplification system for numerous scientific applications.

Authors: Michael S Akhras; Magnus Unemo; Sreedevi Thiyagarajan; Pål Nyrén; Ronald W Davis; Andrew Z Fire; Nader Pourmand
Journal: PLoS One Date: 2007-09-19 Impact factor: 3.240

9. Multiplex target capture with double-stranded DNA probes.

Authors: Peidong Shen; Wenyi Wang; Aung-Kyaw Chi; Yu Fan; Ronald W Davis; Curt Scharfe
Journal: Genome Med Date: 2013-05-29 Impact factor: 11.117
