Georges Natsoulis, John M Bell, Hua Xu, Jason D Buenrostro, Heather Ordonez, Susan Grimes, Daniel Newburger, Michael Jensen, Jacob M Zahn, Nancy Zhang, Hanlee P Ji.
Abstract
We have developed an integrated strategy for targeted resequencing and variant analysis of gene subsets from the human exome. Our capture technology is geared towards resequencing gene subsets substantially larger than can be handled efficiently with simplex or multiplex PCR but smaller in scale than exome sequencing. We describe all the steps from the initial capture assay to single nucleotide variant (SNV) discovery. The capture methodology uses in-solution 80-mer oligonucleotides. To provide optimal flexibility in choosing human gene targets, we designed an in silico set of oligonucleotides, the Human OligoExome, that covers the gene exons annotated by the Consensus Coding Sequence (CCDS) project. This resource is openly available as an Internet-accessible database from which one can download capture oligonucleotide sequences for any CCDS gene and design custom capture assays. Using this resource, we demonstrated the flexibility of this assay by custom designing capture assays ranging from 10 to over 100 gene targets, with total capture sizes from just over 100 kilobases to nearly one megabase. We established a method to reduce capture variability and incorporated indexing schemes to increase sample throughput. Our approach has multiple applications that include, but are not limited to, population-targeted resequencing studies of specific gene subsets, validation of variants discovered in whole genome sequencing surveys, and possible diagnostic analysis of disease gene subsets. We also present a cost analysis demonstrating its cost-effectiveness for large population studies.
Year: 2011 PMID: 21738606 PMCID: PMC3127857 DOI: 10.1371/journal.pone.0021088
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
OligoExome design summary.
| Parameter | HumanOligoExome |
|
| 17,049 |
|
| 157,624 |
|
| 9.25 |
|
| 44.6 |
|
| 784,783 |
|
| 98.3% |
|
| 46 |
|
| 5 |
The region of interest (ROI) is defined as, at minimum, the exon plus adjacent intronic sequence extending up to 50 bases from each exon flank.
OligoExome design and flag summary.
| Restriction enzyme |
|
|
|
| Total per category |
|
| 190,900 | 191,315 | 186,011 | 216,557 | 784,783 |
|
| 169,776 | 171,320 | 161,813 | 184,754 | 687,663 |
|
| 20,992 | 19,882 | 24,021 | 31,670 | 96,565 |
|
| 4,915 | 4,983 | 4,854 | 6,179 | 20,931 |
|
| 290 | 183 | 58 | 460 | 991 |
|
| 379 | 379 | 578 | 456 | 1,792 |
|
| 21,124 | 19,995 | 24,198 | 31,803 | 97,120 |
Description of assays.
| Genomic Target Specifications | Capture assay 1 | Capture assay 2 | Capture assay 3 |
|
| 10 | 96 | 106 |
|
| 179 | 1776 | 2021 |
|
| 30.446 | 325.303 | 362.309 |
|
| 48.141 | 501.489 | 562.974 |
|
| |||
|
| 360 | 4,211 | 4,792 |
|
| 47.554 | 454.166 | 512.556 |
|
| 54.934 | 367.984 | 430.453 |
|
| 102.488 | 822.15 | 943.009 |
|
| 98.8% | 90.6% | 91.0% |
|
| 1.0% | 9.4% | 9.0% |
|
| |||
|
| 0.468 | 18.078 | 19.684 |
Figure 1. Adjustment of capture oligonucleotide performance.
Pre- and post-adjustment capture oligonucleotide performance is shown for capture assays 1 and 2. The target size of capture assay 1 was 102.48 Kb, and this intermediate version of capture assay 2 covered 616 Kb. The Y axis shows the proportion of target bases whose fold-coverage (FC) falls into each order of magnitude, before and after capture adjustment. Nominally, we considered a sequencing depth between 100 and 1,000 to be an adequate representation. In both assays, the proportion of bases with FC below 100 drops significantly after adjustment; in capture assay 2, the number of bases with excessively high FC drops significantly as well.
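The binning behind Figure 1 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the `pre` and `post` depth lists are hypothetical toy values chosen only to show how per-base fold-coverage is sorted into orders of magnitude and turned into proportions.

```python
from collections import Counter


def magnitude_bin(depth: int) -> str:
    """Label the order-of-magnitude bin for an integer per-base depth."""
    if depth <= 0:
        return "0"  # uncovered bases get their own bin
    exponent = len(str(depth)) - 1  # number of digits minus one
    return f"{10 ** exponent}-{10 ** (exponent + 1) - 1}"


def coverage_proportions(depths):
    """Proportion of bases falling into each order-of-magnitude bin."""
    counts = Counter(magnitude_bin(d) for d in depths)
    total = len(depths)
    return {label: n / total for label, n in counts.items()}


# Hypothetical per-base depths before and after capture adjustment.
pre = [0, 5, 42, 80, 150, 900, 2500, 12000]
post = [60, 120, 250, 300, 450, 700, 950, 1100]

for name, depths in (("pre", pre), ("post", post)):
    print(name, coverage_proportions(depths))
```

With these toy values, the share of bases in the nominal 100 to 999 range rises from one quarter to three quarters after adjustment, which is the kind of shift the figure summarizes.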
Variant description and assay performance summary.
| Genomic targets | Capture assay 1 | Capture assay 2 | Capture assay 3 | |||
|
| 10 | 10 | 10 | 96 | 106 | 106 |
| Replicate 1 | Replicate 2 | |||||
|
| ||||||
|
| NA07037 (CEU) | NA07435 (CEU) | NA06995 (CEU) | NA07037 (CEU) | NA18507 (YRI) | NA18507 (YRI) |
|
| 90.007 | 40.907 | 87.770 | 1550.928 | 1279.549 | 1081.001 |
|
| 3.6% | 4.4% | 4.0% | 7.3% | 9.1% | 9.1% |
|
| 63% | 64% | 64% | 62% | 60% | 60% |
|
| 89.4% | 85.4% | 89.9% | 86.7% | 85.0% | 84.7% |
|
| 380 | 151 | 348 | 446 | 367 | 304 |
|
| ||||||
|
| 18/18 (100.0%) | 13/14 (92.9%) | 16/16 (100.0%) | 135/137 (98.5%) | 160/171 (93.5%) | 155/168 (92.2%) |
|
| 42/42 (100.0%) | 45/45 (100.0%) | 46/46 (100.0%) | 344/345 (99.7%) | 932/946 (98.5%) | 924/936 (98.7%) |
|
| Infinite | 5,318 | Infinite | 20,104 | 952 | 903 |
|
| ||||||
|
| 13 | 11 | 9 | 156 | 162 | 160 |
|
| 10 | 12 | 11 | 84 | 59 | 58 |
Figure 2. Evaluation of Alu sequence in non-specific capture.
Ten thousand consensus-length (297-base) Alu sequences were randomly selected and aligned. The percentage of Alu sequences containing MseI, BfaI, SauIIIA and CviQI sites at each position of the multiple alignment is shown. The four most prevalent restriction sites are all SauIIIA sites, and the two most frequent of these are present in 50 to 75% of the Alu sequences. Note that the alignment is much longer than many individual Alu sequences because of insertions and deletions.
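A per-position site tally of the kind plotted in Figure 2 might be computed roughly as below. This is a sketch under stated assumptions: the recognition sequences (TTAA, CTAG, GATC, GTAC) are the standard ones for these enzymes and are not given in the text, gaps are assumed to be written as `-`, and `toy_alignment` is an invented four-sequence example rather than real Alu data.

```python
# Standard recognition sequences (an assumption; not stated in the source).
SITES = {"MseI": "TTAA", "BfaI": "CTAG", "SauIIIA": "GATC", "CviQI": "GTAC"}


def site_columns(aligned_seq, site):
    """Alignment columns where `site` starts, after skipping gap characters."""
    columns = [i for i, base in enumerate(aligned_seq) if base != "-"]
    ungapped = "".join(aligned_seq[i] for i in columns).upper()
    hits = set()
    start = ungapped.find(site)
    while start != -1:
        hits.add(columns[start])  # map back to the alignment column
        start = ungapped.find(site, start + 1)
    return hits


def site_frequency(alignment, site):
    """Percentage of sequences carrying `site` at each alignment column."""
    counts = [0] * len(alignment[0])
    for seq in alignment:
        for col in site_columns(seq, site):
            counts[col] += 1
    return [100.0 * c / len(alignment) for c in counts]


# Hypothetical gapped alignment, all rows the same width.
toy_alignment = [
    "GGATC-ACTTAA",
    "GGATCCAC-TAA",
    "GG-TCCACTTAA",
    "GGATC-ACGTAC",
]

for enzyme, site in SITES.items():
    print(enzyme, site_frequency(toy_alignment, site))
```

Sites are located on the ungapped sequence and then mapped back to alignment columns, so a site interrupted by a gap character in the alignment is still counted for the sequence that carries it.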
Figure 3. Comparison of targeted resequencing of independent samples.
We show an example of a 1,049-base captured region, located between coordinates 11096583 and 11097631 of chromosome 1. The fold-coverage of each of the three samples was normalized by dividing the fold-coverage at each position by the median depth for the sample and then taking the log10 of that ratio. Purple lines indicate the target of each capture oligonucleotide, and blue lines indicate the exons. Vertical lines, extending from the beginning and end of each captured amplicon, show that the discontinuities in depth coincide with the ends of captured targets.
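The normalization described for Figure 3 amounts to computing log10(depth / median depth) at each position. A minimal sketch follows; the function name, the handling of zero-depth positions, and the example depths are assumptions made for illustration.

```python
import math
from statistics import median


def normalize_coverage(depths, sample_median=None):
    """Return log10(depth / median depth) for each position.

    `sample_median` should be the median depth of the whole sample; if it is
    omitted, the median of `depths` itself is used as a stand-in.
    Positions with zero depth map to -inf (an illustrative choice).
    """
    if sample_median is None:
        sample_median = median(depths)
    return [
        math.log10(d / sample_median) if d > 0 else float("-inf")
        for d in depths
    ]


# Hypothetical per-position depths for one sample.
example = [10, 100, 1000]
print(normalize_coverage(example))
```

On this scale a value of 0 marks a position covered at the sample's median depth, and each unit up or down is a tenfold difference, which makes samples sequenced to different depths directly comparable.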
Tumor-specific mutations in a matched normal-tumor pair.
| Gene | Chr | Gene location | Genomic position | cDNA position |
|
| Normal colon (2950N) |
|
| |||||||
|
| 1 | Exon 20 | g.ch:1:74677879G>A | c.2047G>A |
| 31.7% | 0.7% |
|
| 5 | Exon 15 | g.ch:5:112201816C>T | c.2646C>T |
| 36.4% | 1.9% |
|
| 6 | Intron 14–15 | g.ch:6:69842382G>A | NA |
| 32.8% | 0.3% |
|
| 7 | Intron 17–18 | g.ch:7:140081065A>G | IVS18+25A>G |
| 25.8% | 0.0% |
|
| 12 | Exon 1 | g.ch:12:25289551C>T | c.226C>T |
| 33.3% | 0.9% |
|
| 12 | Intron 9–10 | g.ch:12:76939845C>A | NA |
| 25.9% | 5.0% |