Yu Xu, Hui Jiang, Chris Tyler-Smith, Yali Xue, Tao Jiang, Jiawei Wang, Mingzhi Wu, Xiao Liu, Geng Tian, Jun Wang, Jian Wang, Huangming Yang, Xiuqing Zhang.
Abstract
BACKGROUND: Exome sequencing, which allows the global analysis of protein-coding sequences in the human genome, has become an effective and affordable approach to detecting causative genetic mutations in diseases. Currently, there are several commercial human exome capture platforms; however, the relative performances of these have not been characterized sufficiently to know which is best for a particular study.
Year: 2011 PMID: 21955857 PMCID: PMC3308058 DOI: 10.1186/gb-2011-12-9-r95
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Table 1. Capture specificity of the three human exome capture platforms
| Replicate^a | Filtered reads (M) | Read length | Filtered bases (Mb) | Mapped to genome: percent of reads | Mapped to genome: bases (Mb) | Uniquely mapped to genome: percent of reads | Uniquely mapped to genome: bases (Mb) | Uniquely mapped to TR: percent of reads | Uniquely mapped to TR: bases (Mb) | Uniquely mapped to TFR: percent of reads | Uniquely mapped to TFR: bases (Mb) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| NA_r1 | 37 | PE90 | 3,352 | 85.2 | 2,793 | 81.4 | 2,682 | 53.5 | 1,437 | 63.1 | 2,064 |
| NA_r2 | 58 | PE90 | 5,210 | 79.3 | 4,115 | 76.0 | 3,944 | 56.4 | 2,370 | 67.9 | 3,481 |
| NS_r1 | 31 | PE90 | 2,781 | 86.4 | 2,402 | 82.3 | 2,287 | 54.2 | 1,192 | 67.7 | 1,860 |
| NS_r2 | 80 | PE90 | 7,230 | 85.9 | 6,163 | 82.8 | 5,964 | 55.0 | 3,175 | 67.6 | 4,787 |
| AS_r1 | 26 | PE90 | 2,220 | 84.4 | 1,868 | 74.4 | 1,645 | 58.1 | 1,146 | 60.1 | 1,332 |
| AS_r2 | 66 | PE90 | 5,720 | 78.8 | 4,496 | 69.2 | 3,950 | 54.6 | 2,776 | 56.4 | 3,225 |
^a AS, Agilent solution; NA, NimbleGen array; NS, NimbleGen solution; r1 and r2 are two replicate experiments for each platform. TFR, targeted and flanking regions; TR, targeted regions.
Table 2. Uniformity of depth by three human exome capture platforms
| Replicate^a | Filtered reads (M) | Filtered bases (Mb) | Mean coverage on TR (×) | Mean coverage on FR (×) | Percent of TR bases at 0× | Percent of TR bases at 1× to 10× | Percent of TR bases at 10× to 50× | Percent of TR bases at >50× | Percent of FR bases at 0× | Percent of FR bases at 1× to 10× | Percent of FR bases at 10× to 50× | Percent of FR bases at >50× |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NA-r1 | 25.5 | 2,280 | 30.8 | 9.3 | 1.3 | 12.3 | 69.8 | 16.6 | 6.8 | 56.2 | 36.8 | 0.2 |
| NA-r2 | 27.1 | 2,421 | 30.0 | 8.5 | 1.0 | 11.2 | 73.8 | 14.1 | 7.4 | 53.1 | 33.1 | 0.1 |
| NS-r1 | 26.1 | 2,338 | 30.1 | 10.8 | 1.7 | 13.0 | 69.7 | 15.6 | 6.1 | 53.1 | 39.5 | 1.3 |
| NS-r2 | 25.5 | 2,289 | 30.3 | 9.9 | 1.5 | 13.6 | 68.5 | 16.3 | 6.7 | 56.4 | 35.8 | 1.0 |
| AS-r1 | 25.7 | 2,222 | 32.7 | 3.5 | 2.1 | 18.6 | 59.5 | 19.8 | 46.8 | 42.7 | 10.1 | 0.4 |
| AS-r2 | 25.1 | 2,175 | 32.5 | 3.4 | 2.2 | 19.0 | 59.0 | 19.8 | 47.3 | 42.4 | 9.9 | 0.4 |
For the analyses, a set of data that has 30-fold coverage on targeted regions was randomly selected for each of the six replicates. ^a AS, Agilent solution; NA, NimbleGen array; NS, NimbleGen solution; r1 and r2 are two replicate experiments for each platform. FR, flanking region; TR, targeted region.
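The coverage-depth columns of this table can be illustrated with a minimal Python sketch that bins per-base depths and reports the percent of bases in each bin together with the mean coverage. The function name `depth_bins`, the toy depth values, and the bin edges (0, 1 to 10, 11 to 50, above 50) are assumptions made for illustration; the source does not say which bin a depth of exactly 10 or 50 belongs to.

```python
from collections import Counter

BIN_LABELS = ("0x", "1x-10x", "10x-50x", ">50x")


def depth_bins(depths):
    """Return (percent of bases per depth bin, mean depth) for a list of per-base depths.

    Bin edges are an assumption: 0, 1..10, 11..50, >50.
    """
    if not depths:
        raise ValueError("depths must be non-empty")
    counts = Counter()
    for d in depths:
        if d == 0:
            counts["0x"] += 1
        elif d <= 10:
            counts["1x-10x"] += 1
        elif d <= 50:
            counts["10x-50x"] += 1
        else:
            counts[">50x"] += 1
    n = len(depths)
    percents = {label: 100.0 * counts[label] / n for label in BIN_LABELS}
    return percents, sum(depths) / n


# Hypothetical per-base depths over a tiny target region (not data from the study).
toy_depths = [0, 3, 12, 30, 30, 45, 60, 8, 25, 40]
percents, mean_depth = depth_bins(toy_depths)
print(percents, mean_depth)
```

In practice the per-base depths would come from an alignment tool's depth report restricted to the targeted or flanking regions; that step is outside this sketch.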
Figure 1. Normalized per-base sequencing-depth distribution on targets. For the purpose of comparison among the three platforms, we selected a set of reads with an average coverage of approximately 30-fold from each replicate. The depth and the frequency (the fraction of bases in the total sequencing data that are covered at a given depth) were normalized by the average coverage depth of each replicate on targets. NA-r1 and NA-r2, NS-r1 and NS-r2, and AS-r1 and AS-r2 represent each of two replicates for NimbleGen Sequence Capture Arrays, NimbleGen SeqCap EZ and Agilent SureSelect, respectively.
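One possible reading of the normalization described in this caption is sketched below in Python: each depth is divided by the replicate's mean depth, and each frequency is multiplied by the mean so that the rescaled curve still integrates to 1. The caption does not spell out how the frequency was normalized, so the scaling used here, the function name `normalized_depth_distribution`, and the toy depths are all assumptions.

```python
from collections import Counter


def normalized_depth_distribution(depths):
    """Return sorted (depth / mean, frequency * mean) points for per-base depths.

    Scaling the frequency by the mean is an assumed convention that keeps the
    area under the rescaled curve equal to 1.
    """
    if not depths:
        raise ValueError("depths must be non-empty")
    n = len(depths)
    mean = sum(depths) / n
    if mean == 0:
        raise ValueError("mean depth is zero; nothing to normalize")
    counts = Counter(depths)
    return [(d / mean, (c / n) * mean) for d, c in sorted(counts.items())]


# Hypothetical per-base depths (not data from the study).
toy_depths = [10, 20, 20, 30]
points = normalized_depth_distribution(toy_depths)
for x, y in points:
    print(f"normalized depth {x:.2f}: normalized frequency {y:.2f}")
```

Curves built this way from replicates with slightly different mean coverage share a common x-axis, where 1.0 corresponds to the mean depth of each replicate.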
Figure 2. Genotype sensitivity. (a) Genotype sensitivity of six replicates at 30× sequencing depth. (b) Genotype sensitivity as a function of sequencing depth. For the analyses, subsets of reads from two combined replicate datasets for each platform were randomly extracted at different average depths. NA, NS and AS represent NimbleGen Sequence Capture Arrays, NimbleGen SeqCap EZ and Agilent SureSelect, respectively, while r1 and r2 are two replicate experiments for each platform.
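The random extraction of read subsets at different average depths can be sketched as follows, assuming each read (or read pair, treated as one unit) is kept independently with probability equal to the target depth divided by the current depth. The source does not describe the sampling procedure; the function name `downsample`, the seed, and the integer stand-ins for reads are hypothetical.

```python
import random


def downsample(reads, current_mean_depth, target_mean_depth, seed=0):
    """Keep each read with probability target/current (assumed sampling scheme)."""
    if current_mean_depth <= 0:
        raise ValueError("current_mean_depth must be positive")
    if not 0 < target_mean_depth <= current_mean_depth:
        raise ValueError("target depth must be positive and not exceed current depth")
    keep_fraction = target_mean_depth / current_mean_depth
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return [read for read in reads if rng.random() < keep_fraction]


# Hypothetical example: 10,000 reads giving 60x mean depth, thinned to about 30x.
reads = list(range(10_000))
subset = downsample(reads, current_mean_depth=60.0, target_mean_depth=30.0)
print(len(subset))
```

Repeating the call over a list of target depths gives one subset per depth, on which genotype calling and sensitivity could then be evaluated.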
Figure 3. Correlation of sequencing depth and coverage rate on consensus targeted CCDSs. The graph shows pair-wise Pearson correlation coefficients for both sequencing depth (top-left triangle) and coverage rate (bottom-right triangle) based on the 182,259 CCDSs targeted by both Agilent and NimbleGen. NA, NS and AS represent NimbleGen Sequence Capture Arrays, NimbleGen SeqCap EZ and Agilent SureSelect, respectively, while r1 and r2 are two replicate experiments for each platform.
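The pair-wise Pearson coefficients in this figure can be illustrated with a small, dependency-free Python sketch. The per-CCDS depth values below are hypothetical and the helper names `pearson` and `pairwise_pearson` are introduced only for illustration; the study's own computation is not described in the caption.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def pairwise_pearson(samples):
    """Map each unordered pair of sample names to its Pearson coefficient."""
    return {
        (a, b): pearson(samples[a], samples[b])
        for a, b in combinations(sorted(samples), 2)
    }


# Hypothetical mean depth per CCDS for three replicates (not data from the study).
toy_depth = {
    "NA-r1": [30.0, 12.0, 45.0, 8.0],
    "NS-r1": [28.0, 15.0, 40.0, 10.0],
    "AS-r1": [35.0, 5.0, 50.0, 20.0],
}
correlations = pairwise_pearson(toy_depth)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

The same function applies unchanged to per-CCDS coverage rates, which would fill the other triangle of the matrix.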
Table 3. Concordance of genotypes and SNPs
| Replicate | All genotypes (vs 1 M chip) | SNPs in 1 M chip | SNPs in exome capture (vs 1 M chip) | Homs in 1 M chip | Homs in exome capture (vs 1 M chip) | Hets in 1 M chip | Hets in exome capture (vs 1 M chip) | All genotypes (vs WGSS) | SNPs in WGSS | SNPs in exome capture (vs WGSS) | Homs in WGSS | Homs in exome capture (vs WGSS) | Hets in WGSS | Hets in exome capture (vs WGSS) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NA-r1 | 99.846 | 99.641 | 99.826 | 99.649 | 99.987 | 99.633 | 99.687 | 99.999 | 99.216 | 98.636 | 99.951 | 99.868 | 98.683 | 97.750 |
| NA-r2 | 99.854 | 99.670 | 99.835 | 99.708 | 99.975 | 99.637 | 99.714 | 99.999 | 99.264 | 98.616 | 99.943 | 99.850 | 98.768 | 97.724 |
| NS-r1 | 99.854 | 99.679 | 99.819 | 99.682 | 99.951 | 99.676 | 99.707 | 99.998 | 99.211 | 98.396 | 99.974 | 99.747 | 98.657 | 97.426 |
| NS-r2 | 99.849 | 99.660 | 99.841 | 99.684 | 99.987 | 99.640 | 99.716 | 99.999 | 99.197 | 98.706 | 99.979 | 99.752 | 98.629 | 97.949 |
| AS-r1 | 99.816 | 99.526 | 99.823 | 99.571 | 99.948 | 99.486 | 99.712 | 99.998 | 98.783 | 98.021 | 99.917 | 99.824 | 97.945 | 96.703 |
| AS-r2 | 99.815 | 99.514 | 99.805 | 99.556 | 99.880 | 99.477 | 99.738 | 99.998 | 98.762 | 97.972 | 99.927 | 99.771 | 97.893 | 96.645 |
For each replicate, the 30-fold data set used for Table 2 analyses was also used for these analyses. AS, Agilent solution; NA, NimbleGen array; NS, NimbleGen solution; r1 and r2 are two replicate experiments for each platform. Hets, heterozygotes; Homs, homozygotes.
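A concordance figure of the kind tabulated above can be sketched as the percent of sites, called in both data sets, at which the two genotypes agree. In this Python sketch genotypes are encoded as the number of non-reference alleles (0, 1 or 2), which is an assumed encoding; the function name `concordance`, the site identifiers, and the toy calls are hypothetical, and the exact site definitions behind each column of the table are not given in this excerpt.

```python
def concordance(calls, truth, keep=None):
    """Percent of shared sites where `calls` agrees with `truth`.

    `keep` optionally restricts the comparison to sites whose truth genotype
    satisfies a predicate (for example, heterozygous sites only).
    Returns None when no site qualifies.
    """
    shared = [
        site for site in calls
        if site in truth and (keep is None or keep(truth[site]))
    ]
    if not shared:
        return None
    agree = sum(1 for site in shared if calls[site] == truth[site])
    return 100.0 * agree / len(shared)


# Hypothetical genotypes: 0 = homozygous reference, 1 = heterozygous, 2 = homozygous variant.
chip = {"s1": 0, "s2": 1, "s3": 2, "s4": 1, "s5": 0}
exome = {"s1": 0, "s2": 1, "s3": 2, "s4": 2, "s6": 1}

all_sites = concordance(exome, chip)
snp_sites = concordance(exome, chip, keep=lambda g: g > 0)
het_sites = concordance(exome, chip, keep=lambda g: g == 1)
print(all_sites, snp_sites, het_sites)
```

Swapping the roles of the two arguments restricts the comparison by the exome-capture genotype instead, which is one way to obtain the "in exome capture" style of column.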
Table 4. Power for identifying disease-causing rare mutations
| | NA-r1 | NA-r2 | NS-r1 | NS-r2 | AS-r1 | AS-r2 |
|---|---|---|---|---|---|---|
| High quality genotype assigned sites | 32,139 | 32,674 | 31,750 | 31,923 | 31,685 | 31,353 |
| Reference genotypes | 32,124 | 32,658 | 31,732 | 31,909 | 31,666 | 31,335 |
| SNPs | 15 | 16 | 18 | 14 | 19 | 18 |
| Low quality genotype assigned sites | 6,064 | 5,529 | 6,453 | 6,280 | 7,349 | 7,681 |
| Uncovered | 1,703 | 1,703 | 1,703 | 1,703 | 872 | 872 |
For each replicate, the 30-fold data set used for Tables 2 and 3 analyses was also used for this analysis. AS, Agilent solution; NA, NimbleGen array; NS, NimbleGen solution; r1 and r2 are two replicate experiments for each platform.
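The counts in this table can be turned into percentages with a short Python sketch. The numbers are copied from the table; treating the percent of sites with a high-quality genotype as a summary of detection power is an interpretation made here, not a definition taken from the source, and the names `table4` and `site_fractions` are hypothetical.

```python
# Site counts copied from the table (high quality, low quality, uncovered).
table4 = {
    "NA-r1": {"high_quality": 32139, "low_quality": 6064, "uncovered": 1703},
    "NA-r2": {"high_quality": 32674, "low_quality": 5529, "uncovered": 1703},
    "NS-r1": {"high_quality": 31750, "low_quality": 6453, "uncovered": 1703},
    "NS-r2": {"high_quality": 31923, "low_quality": 6280, "uncovered": 1703},
    "AS-r1": {"high_quality": 31685, "low_quality": 7349, "uncovered": 872},
    "AS-r2": {"high_quality": 31353, "low_quality": 7681, "uncovered": 872},
}


def site_fractions(row):
    """Return (total sites, percent of sites per category) for one replicate."""
    total = sum(row.values())
    return total, {category: 100.0 * n / total for category, n in row.items()}


summary = {name: site_fractions(row) for name, row in table4.items()}
for name, (total, pct) in summary.items():
    print(f"{name}: {total} sites, {pct['high_quality']:.1f}% high quality")
```

The three categories sum to 39,906 sites for every replicate, so the percentages are directly comparable across platforms.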