Darren T Houniet, Thahira J Rahman, Saeed Al Turki, Matthew E Hurles, Yaobo Xu, Judith Goodship, Bernard Keavney, Mauro Santibanez Koref.
Abstract
MOTIVATION: During the past 4 years, whole-exome sequencing has become a standard tool for finding rare variants causing Mendelian disorders. In that time, there has also been a proliferation of both sequencing platforms and approaches to analyse their output. This requires approaches to assess the performance of different methods. Traditionally, criteria such as comparison with microarray data or a number of known polymorphic sites have been used. Here we expand such approaches, developing a maximum likelihood framework and using it to estimate the sensitivity and specificity of whole-exome sequencing data.
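The abstract does not spell out the likelihood, so the following is only a minimal sketch of how such an estimate could be set up, not the authors' model. It assumes that, at a known polymorphic site with non-reference allele frequency `p`, an individual carries the variant with probability `1 - (1 - p)^2` (Hardy-Weinberg), and that the pipeline reports a variant with probability `sens` for carriers and `1 - spec` for non-carriers. The function names, the grid search, and the simulated counts are all hypothetical.

```python
import math
import random


def carrier_prob(p):
    """P(individual carries at least one non-reference allele), assuming Hardy-Weinberg."""
    return 1.0 - (1.0 - p) ** 2


def log_likelihood(sens, spec, counts):
    """Binomial log-likelihood (up to a constant) of variant calls.

    counts maps allele frequency -> (number of sites called variant, number of sites).
    """
    ll = 0.0
    for p, (n_called, n_total) in counts.items():
        q = carrier_prob(p)
        # Probability that the pipeline reports a variant at such a site (assumed model).
        r = sens * q + (1.0 - spec) * (1.0 - q)
        r = min(max(r, 1e-12), 1.0 - 1e-12)  # keep log() finite
        ll += n_called * math.log(r) + (n_total - n_called) * math.log(1.0 - r)
    return ll


def fit_grid(counts, steps=101):
    """Maximise the likelihood over a coarse grid of (sens, spec) in [0.5, 1]."""
    grid = [0.5 + 0.5 * i / (steps - 1) for i in range(steps)]
    return max(
        ((s, c) for s in grid for c in grid),
        key=lambda sc: log_likelihood(sc[0], sc[1], counts),
    )


def simulate(true_sens, true_spec, freqs, n_per_freq, seed=0):
    """Generate made-up call counts under the assumed model (illustration only)."""
    rng = random.Random(seed)
    counts = {}
    for p in freqs:
        q = carrier_prob(p)
        n_called = 0
        for _ in range(n_per_freq):
            carrier = rng.random() < q
            call_prob = true_sens if carrier else 1.0 - true_spec
            n_called += rng.random() < call_prob
        counts[p] = (n_called, n_per_freq)
    return counts


counts = simulate(0.96, 0.99, [0.05, 0.1, 0.2, 0.3, 0.5], 4000)
est_sens, est_spec = fit_grid(counts)
print(f"estimated sensitivity={est_sens:.3f}, specificity={est_spec:.3f}")
```

Both parameters are identifiable here only because the sites span a range of allele frequencies; with a single frequency the call rate would constrain just one combination of sensitivity and specificity. A grid search is used purely to keep the sketch dependency-free.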
Year: 2014 PMID: 25236458 PMCID: PMC4271148 DOI: 10.1093/bioinformatics/btu606
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Comparison of estimated specificity and sensitivity for different pipelines
Fig. 2. Exploring the effect of parameter choice. Represented is the effect of minimum base quality threshold on estimated sensitivity (Panel A) and specificity (Panel B). Each point on the graph represents the result for a single sample analysed using the minimum base quality threshold given on the abscissa
Fig. 3. Effect of average coverage. Each point represents the sensitivity (log scaled) achieved for a sample at a given average coverage
Mean sensitivity and specificity estimates
| | Estimated from CEU frequencies (a) | Estimated from sample frequencies (b) | Estimated from microarray (c) |
|---|---|---|---|
| Sensitivity | 0.962 | 0.979 | 0.984 |
| (95% CI) (d) | (0.945–0.970) | (0.962–0.986) | (0.982–0.986) |
| Specificity | 0.998 | 0.999 | 0.999 |
| (95% CI) (d) | (0.997–0.998) | (0.999–0.999) | (0.999–1.00) |
Notes: Represented are the estimates for the specificity and sensitivity of the NovoAlign/Samtools pipeline.
(a) Allele frequencies for the HapMap CEU population.
(b) Allele frequencies calculated from the genotyping results for the 19 samples.
(c) Genotypes determined using the Illumina 660W chip.
(d) 95% confidence interval for the mean, determined by resampling.
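The table notes say only that the confidence intervals for the mean were "determined by resampling". One common way to do this is a percentile bootstrap over the per-sample estimates, sketched below. The exact resampling scheme used for the table is not stated here, so the procedure, the function name `bootstrap_mean_ci`, and the 19 per-sample values are illustrative assumptions rather than the study's data or method.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample the samples with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1.0 - level) / 2.0 * n_resamples)
    hi_idx = int((1.0 + level) / 2.0 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical per-sample sensitivity estimates for 19 samples (made-up numbers).
sensitivities = [
    0.981, 0.975, 0.962, 0.990, 0.968, 0.979, 0.985, 0.971, 0.959, 0.983,
    0.977, 0.966, 0.988, 0.973, 0.980, 0.964, 0.986, 0.970, 0.978,
]

mean_sens = statistics.fmean(sensitivities)
ci_low, ci_high = bootstrap_mean_ci(sensitivities)
print(f"mean={mean_sens:.3f}, 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

With only 19 samples the percentile interval can be somewhat narrow; other bootstrap variants exist, and nothing here indicates which one was used for the table.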
Fig. 4. Correlation between sensitivity estimates from microarray data and using CEU population frequencies. Each point represents the sensitivity value obtained for one individual and analysis pipeline
Fig. 5. Effect of reference population misspecification. Values represent the average sensitivity across all individuals in our study. CEU: Utah residents with northern and western European ancestry from the CEPH collection; TSI: Tuscan in Italy; MEX: Mexican ancestry in Los Angeles, California; GIH: Gujarati Indians in Houston, Texas; ASW: African ancestry in southwest USA; MKK: Maasai in Kinyawa, Kenya; CHB: Han Chinese in Beijing, China; JPT: Japanese in Tokyo, Japan; CHD: Chinese in Metropolitan Denver, Colorado; LWK: Luhya in Webuye, Kenya; YRI: Yoruba in Ibadan, Nigeria
Fig. 6. Sensitivity of different indel detection pipelines