Hane Lee, Brian D O'Connor, Barry Merriman, Vincent A Funari, Nils Homer, Zugen Chen, Daniel H Cohn, Stanley F Nelson.
Abstract
BACKGROUND: The emergence of next-generation sequencing technology presents tremendous opportunities to accelerate the discovery of rare variants or mutations that underlie human genetic disorders. Although complete sequencing of affected individuals' genomes would be the most powerful approach to finding such variants, its cost makes it impractical for routine use in disease gene research. In cases where candidate genes or loci can be defined by linkage, association, or phenotypic studies, the practical sequencing target can be made much smaller than the whole genome, and it becomes critical to have capture methods that purify the desired portion of the genome for shotgun short-read sequencing without biasing allelic representation or coverage. One major approach is array-based capture, which relies on a custom in-situ synthesized oligonucleotide microarray used as a collection of hybridization capture probes. Our group and others use this approach routinely, and we are continuing to improve its performance.
Year: 2009 PMID: 20043857 PMCID: PMC2808330 DOI: 10.1186/1471-2164-10-646
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of the results.
| Protocol | Primers | Single strand separation | Double Hyb | Specificity | | |
|---|---|---|---|---|---|---|
| Baseline | - | - | - | 35% | 98% | 75% |
| Single Strand | - | Yes | - | 54% | 99% | 92% |
| Primer Block | Yes | - | - | 62% (63%†) | 99% (99%†) | 88% (94%†) |
| Double Hybridization | Yes | - | Yes | 90% (82%†) | 98% (98%†) | 84% (89%†) |
* Not considering flanking regions
† Results from replication using the tumor genomic DNA.
4 μg of genomic library was used for all experiments. The hybridization mix contained 50 μg of Human Cot-1, 52 μl of Agilent 10× Blocking agent and 260 μl of Agilent 2× hybridization buffer. Specificity is calculated for each run by dividing the number of sequences within the targeted region by the total number of sequences that map uniquely to the human genome. The targeted region is defined as the targeted exon ± 100 bp.
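The specificity definition above reduces to a single ratio. The following is a minimal sketch of that arithmetic, not the authors' pipeline; the function names and the read counts in the usage lines are hypothetical and chosen only for illustration.

```python
def in_target(read_pos: int, exon_start: int, exon_end: int, flank: int = 100) -> bool:
    """True if a read position lies within the exon extended by `flank` bp on each side."""
    return exon_start - flank <= read_pos <= exon_end + flank


def specificity(on_target_reads: int, uniquely_mapped_reads: int) -> float:
    """Fraction of uniquely mapped reads that fall within the targeted region."""
    if uniquely_mapped_reads <= 0:
        raise ValueError("uniquely_mapped_reads must be positive")
    return on_target_reads / uniquely_mapped_reads


# Hypothetical counts, used only to show the calculation.
value = specificity(on_target_reads=540_000, uniquely_mapped_reads=1_000_000)
print(f"specificity = {value:.0%}")
```

Reads that map to multiple locations are excluded from the denominator, so the ratio depends on how "uniquely mappable" is decided by the aligner.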
Figure 1. Mapping of sequences relative to probe position in the genome. a) Sequence coverage distribution averaged across all targeted regions captured by the basal capture protocol and b) sequence coverage distribution averaged across all targeted regions captured by the double hybridization (modified) protocol. Both show that the sequence reads are tightly limited to the targeted regions. Here, a targeted region is not necessarily a targeted exon but a probeset composed of multiple probes that are < 200 bp apart from each other. The y axis plots the relative abundance and the x axis is the base position relative to the probe positions.
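The probeset definition in the Figure 1 caption (probes less than 200 bp apart belong to one targeted region) can be sketched as an interval-merging step. This is an illustrative sketch only: the `group_probes` helper and the coordinates are hypothetical, and it assumes all probes lie on one chromosome and are given as `(start, end)` pairs.

```python
def group_probes(probes, max_gap=200):
    """Merge (start, end) probe intervals into probesets.

    A probe joins the current probeset when the gap between its start and
    the end of that probeset is strictly less than `max_gap` bp.
    """
    probesets = []
    for start, end in sorted(probes):
        if probesets and start - probesets[-1][1] < max_gap:
            # Close enough to the previous probe: extend the current probeset.
            probesets[-1][1] = max(probesets[-1][1], end)
        else:
            probesets.append([start, end])
    return [tuple(p) for p in probesets]


# Hypothetical probe coordinates on a single chromosome.
print(group_probes([(100, 160), (300, 360), (1000, 1060)]))
```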
Figure 2. Copy number fold differences between the normal and tumor tissues per chromosome, using the single hybridization capture protocol with blockers. The cancer specimen used in these experiments was known to have a chromosome 7 copy number gain and a chromosome 10 deletion. The normalized counts per chromosome are plotted for all chromosomes and are markedly different for the two chromosomes with altered copy number.
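A per-chromosome comparison of this kind can be sketched as follows. The caption does not state how the counts were normalized, so scaling each sample by its total read count is an assumption here, and the function name and example counts are hypothetical.

```python
import math


def chromosome_log2_ratios(tumor_counts, normal_counts):
    """log2(tumor / normal) per chromosome, after scaling each sample to its total.

    Assumption: normalization by total read count per sample; the source
    does not specify the normalization that was actually used.
    """
    tumor_total = sum(tumor_counts.values())
    normal_total = sum(normal_counts.values())
    ratios = {}
    for chrom, normal in normal_counts.items():
        tumor = tumor_counts.get(chrom, 0)
        if normal == 0 or tumor == 0:
            continue  # no reads in one sample: ratio is undefined, so skip
        ratios[chrom] = math.log2((tumor / tumor_total) / (normal / normal_total))
    return ratios


# Hypothetical read counts for three chromosomes.
tumor = {"chr1": 100, "chr7": 150, "chr10": 50}
normal = {"chr1": 100, "chr7": 100, "chr10": 100}
for chrom, ratio in chromosome_log2_ratios(tumor, normal).items():
    print(chrom, round(ratio, 2))
```

With this convention a gain appears as a positive value and a loss as a negative value, while chromosomes at unchanged copy number stay near zero.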
Genomic intervals with regional aberration detected by SNP array were examined by capture array and the results were compared.
| Genomic Interval | SNP array | Capture array |
|---|---|---|
| chr1:141510591-142763403 | Gain | None captured |
| chr1:92652000-93425291 | Gain | Gain |
| chr2:42528749-43214954 | Loss | None captured |
| chr3:52074481-55997844 | Gain | None captured |
| chr6:144171325-144331154 | Gain (4 copies) | Gain (25 copies) |
| chr6:164267664-164772727 | Loss | Loss |
| chr6:169375842-169748900 | Loss | None captured |
| chr6:56774326-56953414 | Loss | Loss |
| chr7:54464822-55373694 | Gain | None captured |
| chr9:14995594-24802191 | Gain | None captured |
| chr14:68301617-68906063 | Loss | None captured |
| chr15:38336909-38504594 | Gain | Loss |
| chr16:2652029-2764985 | Loss | None captured |
| chr17:5541586-5968563 | Gain | No Change |
| chr20:39176664-39815654 | Gain | None captured |
| chr20:61246745-61325513 | Gain | None captured |
| chrX:13343420-151922021 | Gain | None captured |
| chrX:1674881-1838306 | Loss | None captured |
Figure 3. Moving averages of coverage, computed over 200 kb windows and plotted by genomic position, for a) the interval flanking the known EGFR amplification event and b) for reference, another interval on chromosome 7 around the FOXP2 gene, which shows more typical coverage. The EGFR region is amplified 25× on average compared to the region outside of EGFR.
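The smoothing and fold comparison described in the Figure 3 caption can be sketched as below. This is a simplified illustration under stated assumptions: coverage is supplied as a list of per-bin values, the window is expressed in bins rather than base pairs, and both function names and the example numbers are hypothetical.

```python
def moving_average(values, window):
    """Sliding-window mean; returns len(values) - window + 1 averages."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    averages = [total / window]
    for i in range(window, len(values)):
        # Slide the window by one bin: add the new value, drop the oldest.
        total += values[i] - values[i - window]
        averages.append(total / window)
    return averages


def fold_change(inside, outside):
    """Ratio of mean coverage inside a region to mean coverage outside it."""
    return (sum(inside) / len(inside)) / (sum(outside) / len(outside))


# Hypothetical binned coverage values.
print(moving_average([1, 2, 3, 4, 5], window=2))
print(fold_change(inside=[50, 50], outside=[2, 2]))
```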
Troubleshooting Guide
| Problem | Possible Reason | Solution |
|---|---|---|
| Genome is not fragmented after sonication. | The buffer conditions are not suitable for sonication. | Purify the DNA. We used the QIAGEN PCR Purification Kit and eluted in EB, after which sonication worked. |
| Nothing is visible on the gel after 1 hr of electrophoresis during library generation. | When the starting amount of DNA is small, or a significant amount of DNA is lost during the process, the DNA may be smeared over such a wide range after an hour of electrophoresis that it is not visible on the gel. | Check the gel after a ~10 min run, before the DNA has smeared over a wide range, to see whether DNA is present. |
| Cannot collect ~400 μl after the stripping step. | The gasket slide was re-used. | Do not re-use the gasket slide. |
| Not enough DNA amplified after the first stripping. | Stripping was not efficient. | Repeat the stripping process to check whether genomic fragments were left hybridized to the probes. Contaminants in the stripped solution do not matter as long as they do not have adapters ligated at their ends, so stripping can be repeated until no more products are amplified. |
Figure 4. Percentage of targeted bases sequenced at various minimum coverage levels for different mean coverages. The x-axis represents the per-base coverage level and the y-axis represents the percentage of targeted bases covered at or above that level. The legend describes each line shown.
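The quantity plotted in Figure 4 is a cumulative coverage distribution. A minimal sketch of how such a curve could be computed from per-base depths follows; the function name and the depth values are hypothetical.

```python
def percent_covered_at_least(depths, thresholds):
    """For each threshold k, the percentage of targeted bases with depth >= k."""
    n = len(depths)
    if n == 0:
        raise ValueError("depths must not be empty")
    return {k: 100.0 * sum(d >= k for d in depths) / n for k in thresholds}


# Hypothetical per-base depths over four targeted bases.
print(percent_covered_at_least([0, 5, 10, 20], thresholds=[1, 10, 30]))
```

Each point on a curve answers "what share of the target is usable if at least k reads per base are required", which is why the curves fall as the threshold rises.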
PCR primer sequences used to design primers for blocking the adapters in the hybridization mix.
| Primer | Sequence |
|---|---|
| Primer 1.1 | 5' AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT |
| Primer 2.1 | 5' CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT |
Oligonucleotide sequences © 2006 Illumina, Inc. All rights reserved.
Masks that define each index for sequence alignment.
| Index | Masks |
|---|---|
| 1 | 111111111111111111 |
| 2 | 11110100110111101010101111 |
| 3 | 11111111111111001111 |
| 4 | 1111011101100101001111111 |
| 5 | 11110111000101010000010101110111 |
| 6 | 1011001101011110100110010010111 |
| 7 | 1110110010100001000101100111001111 |
| 8 | 1111011111111111111 |
| 9 | 11011111100010110111101101 |
| 10 | 111010001110001110100011011111 |
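
The table does not say how the masks are interpreted. Assuming the usual spaced-seed convention, in which a `1` marks a read position that contributes to the index key and a `0` marks a position that is ignored, applying a mask could be sketched as follows. The `apply_mask` and `mask_weight` helpers are hypothetical and are not taken from any aligner's API.

```python
def apply_mask(sequence: str, mask: str) -> str:
    """Build an index key from the bases aligned with '1' positions of the mask.

    Assumption: '1' = position is used in the key, '0' = position is ignored.
    """
    if len(sequence) < len(mask):
        raise ValueError("sequence is shorter than the mask")
    return "".join(base for base, bit in zip(sequence, mask) if bit == "1")


def mask_weight(mask: str) -> int:
    """Number of positions that contribute to the key."""
    return mask.count("1")


# Two hypothetical reads that differ only at a masked-out ('0') position
# produce the same key, so a mismatch there does not prevent a lookup hit.
print(apply_mask("ACGTACGT", "1101"), apply_mask("ACTTACGT", "1101"))
```

Under this reading, using several masks with zeros in different places would let a read with a few mismatches still match through at least one index.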