Stefan Taudien, Karol Szafranski, Marius Felder, Marco Groth, Klaus Huse, Francesca Raffaelli, Andreas Petzold, Xinmin Zhang, Philip Rosenstiel, Jochen Hampe, Stefan Schreiber, Matthias Platzer.
Abstract
BACKGROUND: In highly copy number variable (CNV) regions such as the human defensin gene locus, comprehensive assessment of sequence variations is challenging. PCR approaches are in practice restricted to tiny fractions of such regions, and next-generation sequencing (NGS) of whole individual genomes, e.g. by the 1000 Genomes Project, is limited by the affordable sequence depth. Combining target enrichment with NGS may represent a feasible approach.
Year: 2011 PMID: 21592371 PMCID: PMC3118217 DOI: 10.1186/1471-2164-12-243
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Workflow for the identification of SNVs from 454 sequences generated after target enrichment by NimbleGen sequence capture of individual DNAs.
Comparison of high confidence differences (HCDiffs) obtained from the CTRL and DEFA regions by sequence capture (SeqCap) with HapMap SNP genotypes.
| | NA12716 | NA12760 | | NA12716 | NA12760 |
|---|---|---|---|---|---|
| HapMap: heterozygous | 244 | 263 | SeqCap: heterozygous | 711 | 646 |
| SeqCap: | | | HapMap: | | |
| homozygous for reference, no HCDiff | 0 | 5 | heterozygous | 237 | 248 |
| homozygous for reference | 0 | 2 | homozygous for variant | 5 | 4 |
| heterozygous | 237 | 248 | homozygous for reference | 10 | 4 |
| homozygous for variant | 3 | 0 | | | |
| not or poorly covered/alignment problem | 4 | 8 | | | |
| HapMap: homozygous for variant | 254 | 217 | SeqCap: homozygous for variant | 526 | 468 |
| SeqCap: | | | HapMap: | | |
| homozygous for reference, no HCDiff | 0 | 0 | homozygous for variant | 248 | 212 |
| homozygous for reference | 0 | 0 | homozygous for reference | 0 | 0 |
| heterozygous | 5 | 4 | heterozygous | 3 | 0 |
| homozygous for variant | 248 | 212 | | | |
| not or poorly covered/alignment problem | 1 | 1 | | | |
| HapMap: homozygous for reference | 839 | 842 | SeqCap: homozygous for reference | 151 | 137 |
| SeqCap: | | | HapMap: | | |
| homozygous for reference, no HCDiff | 827 | 835 | homozygous for variant | 0 | 0 |
| homozygous for reference | 1 | 1 | homozygous for reference | 1 | 1 |
| heterozygous | 10 | 4 | heterozygous | 0 | 2 |
| homozygous for variant | 0 | 0 | | | |
| not or poorly covered/alignment problem | 1 | 2 | | | |
| Sensitivity (heterozygous) | 97.1% | 94.3% | Specificity (heterozygous) | 94.0% | 96.9% |
| Sensitivity (homozygous) | 98.4% | 99.0% | Specificity (homozygous) | 98.4% | 98.6% |
The SeqCap HCDiffs were categorized according to the variant's allele frequency (VAF) with the human genome sequence (hg18) as reference allele: homozygous for reference (VAF 10-24%); heterozygous (VAF 25-75%); homozygous for variant (VAF 76-100%).
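The VAF binning described in the note above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' pipeline: the function names `classify_vaf` and `sensitivity_percent` are hypothetical, the handling of VAFs below 10% (returned as "no HCDiff") and of non-integer values between the listed bins is an assumption, and the sensitivity helper only shows that the reported heterozygous sensitivity is numerically consistent with dividing the concordant heterozygous calls by the HapMap heterozygous total (237 of 244 for NA12716).

```python
def classify_vaf(vaf_percent: float) -> str:
    """Map a variant allele frequency (in percent) to a genotype class.

    Bins follow the table note: 10-24% homozygous for reference,
    25-75% heterozygous, 76-100% homozygous for variant.
    """
    if not 0 <= vaf_percent <= 100:
        raise ValueError("VAF must be between 0 and 100 percent")
    if vaf_percent < 10:
        # Assumption: values below the lowest listed bin are not HCDiffs.
        return "no HCDiff"
    if vaf_percent < 25:
        return "homozygous for reference"
    if vaf_percent <= 75:
        return "heterozygous"
    # Assumption: non-integer values between 75 and 76 fall into this bin.
    return "homozygous for variant"


def sensitivity_percent(concordant: int, hapmap_total: int) -> float:
    """Share of HapMap genotypes recovered by sequence capture, in percent."""
    return 100.0 * concordant / hapmap_total


if __name__ == "__main__":
    for vaf in (12, 50, 90):
        print(vaf, classify_vaf(vaf))
    # Heterozygous HapMap SNPs of NA12716: 237 concordant out of 244.
    print(f"{sensitivity_percent(237, 244):.1f}%")
```

The homozygous sensitivity and specificity values in the table are not reproduced by this simple ratio, so the sketch should not be read as the authors' exact definition.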
Figure 2. Distribution of variant allele frequencies (VAFs) before (HCDiffs) and after filtering (SNVs).
Known and putative novel (putnov) SNVs and SNV densities in CTRL, DEFA and DEFB of NA12716 and NA12760 after successive filtering of HCDiffs
| Sample | Total SNVs | Total SNVs per kb | CTRL known | CTRL putnov | CTRL SNVs per kb | DEFA known | DEFA putnov | DEFA SNVs per kb | DEFB known | DEFB putnov | DEFB SNVs per kb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| NA12716 | 1,919 | 2.3 | 574 | 41 | 1.5 | 604 | 27 | 3.2 | 560 | 113 | 4.1 |
| NA12760 | 2,265 | 2.7 | 595 | 48 | 1.5 | 428 | 37 | 2.4 | 979 | 178 | 7.0 |
| unique SNVs | 2,886 | | 771 | 77 | | 701 | 61 | | 1,056 | 220 | |
| unique SNVs (known + putnov) | | | 848 | | | 762 | | | 1,276 | | |
Figure 3. Known and putative novel SNVs identified in the different target regions of NA12716 and NA12760.
DEFB cluster copy numbers (CN) per diploid genome of NA12760 estimated by the ratio of reads per haplotype
| region/gene | reads | read numbers with different haplotypes | read ratio | CN |
|---|---|---|---|---|
| | 51 | 17:34 | 1:2 (3n) | 6 |
| | 89 | 16:62:11 | 1:4:1 | 6 |
| down | 45 | 10:26:9 | 1:3:1 | 5 |
| stream | 35 | 11:14:4:6 | 2:3:1:1 | 7 |
| | 102 | 25:12:19:13:12:21 | 2:1:1:1:1:2 | 8 |
| | 33 | 8:6:11:8 | 1:1:2:1 | 5 |
| | 64 | 9:8:47 | 1:1:4 | 6 |
| | 38 | 15:23 | 2:3 | 5 |
| | 60 | 29:16:15 | 2:1:1 | 4 |
| | 100 | 50:16:34 | 3:1:2 | 6 |
| | 42 | 7:12:15:8 | 1:2:2:1 | 6 |
| | 63 | 22:11:12:18 | 2:1:1:2 | 6 |
| | 54 | 20:27:7 | 3:4:1 | 8 |
| | 37 | 20:10:7 | 3:2:1 | 6 |
| | 35 | 19:16 | 1:1 (2n) | 6 |
| | 48 | 14:21:13 | 2:3:2 | 7 |
| | 30 | 4:21:5 | 1:5:1 | 7 |
| | 32 | 8:14:10 | 2:3:2 | 7 |
| | 48 | 7:22:19 | 1:3:3 | 7 |
| | 48 | 9:20:6:13 | 1:2:1:1 | 5 |
| | 62 | 13:18:13:7:11 | 2:2:2:1:1 | 8 |
| | 34 | 6:13:15 | 1:2:3 | 6 |
| | 33 | 12:21 | 1:2 (3n) | 6 |
| | 40 | 6:8:26 | 1:1:4 | 6 |
| | 33 | 3:4:21:5 | 1:1:4:1 | 7 |
| | 53 | 12:33:8 | 1:3:1 | 5 |
| | 36 | 17:7:12 | 3:1:2 | 6 |
| | 37 | 14:16:7 | 2:2:1 | 5 |
| | 53 | 10:17:17:9 | 1:2:2:1 | 6 |
| | 32 | 20:12 | 2:1 (3n) | 6 |
| | 30 | 13:10:7 | 3:2:1 | 6 |
| | 30 | 14:6:6:4 | 3:1:1:1 | 6 |
| | 86 | 63:26 | 3:1 | 4 |
| | 62 | 6:8:21:27 | 1:1:3:3 | 8 |
| | 65 | 13:23:15:14 | 1:2:1:1 | 5 |
| | 31 | 6:6:6:5:8 | 1:1:1:1:2 | 6 |
| | 38 | 10:10:5:3:10 | 2:2:1:1:2 | 8 |
| | 29 | 6:7:16 | 1:1:3 | 5 |
| | 31 | 9:14:8 | 2:3:2 | 7 |
| | 33 | 6:27 | 1:5 | 6 |
| | 100 | 32:12:18:38 | 2:1:1:2 | 6 |
| | 77 | 21:13:31:12 | 2:1:3:1 | 7 |
| | 30 | 8:22 | 1:3 (4n) | 8 |
| | 63 | 20:8:22:13 | 2:1:2:1 | 6 |
| | 61 | 17:6:24:14 | 2:1:3:2 | 8 |
| | 39 | 7:5:27 | 1:1:4 | 6 |
| | 28 | 5:5:5:13 | 1:1:1:3 | 6 |
| | 97 | 78:19 | 4:1 | 5 |
| total number of reads | 2,397 | | | |
| total number of CN estimations | 48 | | | |
| CN per diploid genome - average | 6.17 | | | |
| - max | 8 | | | |
| - min | 4 | | | |
| - STDEV | 1.06 | | | |
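For rows without a ploidy annotation, the CN column in the table above equals the sum of the terms of the read ratio (for example, 1:4:1 gives 6 and 2:3:1:1 gives 7). The Python sketch below illustrates only that observed relationship. The helper name `cn_from_ratio` is hypothetical, and the sketch does not cover how the ratio itself is derived from the read counts, nor the rows annotated with (2n), (3n) or (4n), where the listed CN is a multiple of the ratio sum.

```python
def cn_from_ratio(ratio: str) -> int:
    """Sum the integer terms of a haplotype read ratio such as '1:4:1'.

    Assumption: each term is the copy number of one haplotype, so the
    sum is the cluster copy number per diploid genome. Annotated ratios
    such as '1:2 (3n)' are not handled and raise ValueError.
    """
    terms = [int(term) for term in ratio.split(":")]
    if any(term <= 0 for term in terms):
        raise ValueError("ratio terms must be positive integers")
    return sum(terms)


if __name__ == "__main__":
    # Ratios and CN values taken from unannotated rows of the table.
    for ratio, listed_cn in [("1:4:1", 6), ("2:3:1:1", 7), ("3:1", 4)]:
        print(ratio, cn_from_ratio(ratio), listed_cn)
```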
Figure 4. SNVs in the putative DEFB4 promoter region ~1.9 kb upstream of the transcription start site, inferred from the NA12716 and NA12760 sequence alignments, with variant allele frequencies (VAF) in comparison to the reference NCBI build 36.1 (hg18) and to variations identified by Groth et al. [25]. Dash: homozygous for the hg18 allele. NA12760 HTs: haplotypes from SNV combinations represented by >30 sequences. (+): confirmation of polymorphisms identified in [25]; for SNV details see additional file 18. SNVs 252 and 260 (blue) were found to be polymorphic in NA12716 and/or NA12760 by the present study but not in [25].