Katherine R Smith, Catherine J Bromhead, Michael S Hildebrand, A Eliot Shearer, Paul J Lockhart, Hossein Najmabadi, Richard J Leventer, George McGillivray, David J Amor, Richard J Smith, Melanie Bahlo.
Abstract
Many exome sequencing studies of Mendelian disorders fail to optimally exploit family information. Classical genetic linkage analysis is an effective method for eliminating a large fraction of the candidate causal variants discovered, even in small families that lack a unique linkage peak. We demonstrate that accurate genetic linkage mapping can be performed using SNP genotypes extracted from exome data, removing the need for separate array-based genotyping. We provide software to facilitate such analyses.
Year: 2011 PMID: 21917141 PMCID: PMC3308048 DOI: 10.1186/gb-2011-12-9-r85
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Partial pedigrees for families A, T and M.
Number of HapMap Phase II SNPs covered by ≥ 5 reads, by distance to targeted base
| Distance to targeted base | M-3, SNPs (%) | M-4, SNPs (%) | A-7, SNPs (%) | T-1, SNPs (%) | HapMap Phase II (N) |
|---|---|---|---|---|---|
| 0 bp | 56,648 (91.9) | 56,835 (92.2) | 57,027 (92.5) | 58,142 (94.3) | 61,647 |
| 1 to 200 bp | 50,077 (56.7) | 50,805 (57.5) | 46,144 (52.2) | 57,923 (65.6) | 88,349 |
| > 200 bp | 13,683 (0.4) | 13,565 (0.4) | 13,987 (0.4) | 17,007 (0.5) | 3,119,167 |
| Total | 120,408 (3.7) | 121,205 (3.7) | 117,158 (3.6) | 133,072 (4.1) | 3,269,163 |
The denominator for percentages is the total number of HapMap Phase II SNPs in that distance category.
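The percentage convention described above can be checked with a few lines of arithmetic. The sketch below recomputes the M-3 column from the counts in the table; the dictionary layout and the function name `percent_covered` are illustrative choices, not part of the published software.

```python
# Counts copied from the table above (exome M-3 and the HapMap Phase II totals).
# The data layout and function name are illustrative assumptions.
hapmap_totals = {"0 bp": 61_647, "1 to 200 bp": 88_349, "> 200 bp": 3_119_167}
m3_covered = {"0 bp": 56_648, "1 to 200 bp": 50_077, "> 200 bp": 13_683}


def percent_covered(covered, totals):
    """Percentage of HapMap SNPs covered in each distance category and overall."""
    result = {
        category: round(100 * covered[category] / totals[category], 1)
        for category in totals
    }
    # The "Total" row divides summed covered SNPs by summed HapMap SNPs.
    result["Total"] = round(100 * sum(covered.values()) / sum(totals.values()), 1)
    return result


m3_percent = percent_covered(m3_covered, hapmap_totals)
```

Each category is divided by its own HapMap count, which is why the "> 200 bp" row shows a small percentage despite contributing more than 13,000 SNPs.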
Intermarker distances for the two genotyping arrays and for exome genotypes covered by ≥ 5 reads
| | Median | 1st quartile | 3rd quartile |
|---|---|---|---|
| Illumina OmniExpress | 2,233 | 814 | 5,125 |
| Illumina 610 | 2,744 | 1,019 | 6,027 |
| M-3 | 1,853 | 236 | 11,390 |
| M-4 | 1,830 | 235 | 11,260 |
| A-7 | 1,943 | 240 | 12,000 |
| T-1 | 1,647 | 227 | 10,210 |
Intermarker distances are in base pairs.
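As a rough illustration of how such a summary could be produced, the following sketch computes distances between adjacent markers within each chromosome and then takes quartiles. The input format (a mapping from chromosome to base-pair positions), the function names, and the choice of the "inclusive" quantile method are all assumptions; the source does not state how its quartiles were calculated.

```python
from statistics import quantiles


def intermarker_distances(positions_by_chrom):
    """Distances (bp) between adjacent markers, never spanning two chromosomes."""
    distances = []
    for positions in positions_by_chrom.values():
        ordered = sorted(positions)
        distances.extend(b - a for a, b in zip(ordered, ordered[1:]))
    return distances


def quartile_summary(distances):
    """First quartile, median and third quartile of the distances."""
    q1, q2, q3 = quantiles(distances, n=4, method="inclusive")
    return {"1st quartile": q1, "median": q2, "3rd quartile": q3}


# Toy positions, invented purely for illustration.
toy_positions = {"1": [100, 300, 1300, 1400], "2": [50, 550]}
toy_summary = quartile_summary(intermarker_distances(toy_positions))
```

Keeping the calculation per chromosome avoids counting the gap between the last marker on one chromosome and the first marker on the next.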
Increasing the prior heterozygous probability modestly improves concordance between exome and array genotypes
| Prior heterozygous probability (t) | M-3 (N = 52,617) | M-4 (N = 52,892) | A-7 (N = 29,459) | T-1 (N = 32,763) |
|---|---|---|---|---|
| 0.00001 | 0.9737 | 0.9734 | 0.9698 | 0.9741 |
| 0.001 (default) | 0.9882 | 0.9874 | 0.9865 | 0.9885 |
| 0.01 | 0.9927 | 0.9926 | 0.9918 | 0.9925 |
| 0.05 | 0.9951 | 0.9950 | 0.9942 | 0.9945 |
| 0.1 | 0.9958 | 0.9958 | 0.9950 | 0.9952 |
| 0.2 | 0.9968 | 0.9965 | 0.9958 | 0.9961 |
| 0.3 | 0.9971 | 0.9968 | 0.9961 | 0.9964 |
| 0.4 | 0.9973 | 0.9971 | 0.9964 | 0.9968 |
| 0.5 | 0.9974 | 0.9973 | 0.9965 | 0.9969 |
Proportion of SNPs where WES and genotyping array genotypes are concordant for the four exomes, for varying values of t (prior probability of a heterozygous genotype). Conditional on coverage with ≥ 5 reads.
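The concordance measure described above can be sketched as follows. This is a minimal illustration, assuming exome calls are stored as `{snp_id: (genotype, read_depth)}` and array calls as `{snp_id: genotype}`; these structures and the function name are hypothetical and do not reflect the authors' software.

```python
def concordance(exome_calls, array_calls, min_depth=5):
    """Proportion of shared SNPs with identical genotypes.

    Only SNPs genotyped on the array and covered by at least `min_depth`
    reads in the exome are compared. Genotypes are unordered allele pairs,
    so "AG" and "GA" count as the same call.
    """
    compared = matched = 0
    for snp_id, (genotype, depth) in exome_calls.items():
        array_genotype = array_calls.get(snp_id)
        if array_genotype is None or depth < min_depth:
            continue
        compared += 1
        if sorted(genotype) == sorted(array_genotype):
            matched += 1
    return matched / compared if compared else float("nan")


# Toy calls, invented purely for illustration.
toy_exome = {
    "rs1": ("AG", 12),
    "rs2": ("CC", 8),
    "rs3": ("TT", 3),   # excluded: fewer than 5 reads
    "rs4": ("GA", 20),
    "rs5": ("CT", 9),   # excluded: not on the array
}
toy_array = {"rs1": "GA", "rs2": "CT", "rs3": "TT", "rs4": "AG", "rs6": "AA"}
toy_concordance = concordance(toy_exome, toy_array)
```

In the toy data three SNPs are compared and two agree; the mismatch at `rs2` is a heterozygous array call reported as homozygous in the exome, the kind of error that a higher prior heterozygous probability would be expected to reduce.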
Number and average heterozygosity of array and WES SNPs selected for linkage analysis
| | M-3 and M-4 (WES) | M-3 and M-4 (Array) | A-7 (WES) | A-7 (Array) | T-1 (WES) | T-1 (Array) |
|---|---|---|---|---|---|---|
| SNPs available | 114,681 | 677,144 | 117,158 | 593,638 | 133,071 | 587,680 |
| SNPs selected | 8,016 | 12,173 | 8,135 | 12,243 | 8,402 | 12,194 |
| Average heterozygosity | 0.40 | 0.49 | 0.40 | 0.48 | 0.41 | 0.48 |
Average heterozygosity refers to the HapMap CEU population and not to the individual being studied. For M-3 and M-4, 'SNPs available' is the number of SNPs covered ≥ 5 in both individuals.
Figure 2. Genome-wide comparison of LOD scores using array-based and WES-derived genotypes for families A, T and M.
Distribution of LOD score differences (WES - array) at linkage peaks
| Family | Median | 2.5th centile | 97.5th centile |
|---|---|---|---|
| A | -0.0005 | -0.572 | 0.092 |
| T | -0.002 | -0.390 | 0.035 |
| M | -0.0003 | -0.117 | 0.0034 |
Summary of differences at analysis positions where either the WES or the array LOD scores reach their genome-wide maximum.
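One way such a summary might be computed is sketched below, assuming the two LOD score series are evaluated at the same analysis positions. The function names, the tolerance used to decide that a score "reaches" the maximum, and the "inclusive" quantile method are assumptions made for illustration.

```python
from statistics import median, quantiles


def peak_lod_differences(wes_lod, array_lod, tol=1e-9):
    """WES minus array LOD at positions where either series is at its maximum."""
    wes_max, array_max = max(wes_lod), max(array_lod)
    return [
        w - a
        for w, a in zip(wes_lod, array_lod)
        if wes_max - w <= tol or array_max - a <= tol
    ]


def summarise(differences):
    """Median plus 2.5th and 97.5th centiles of the differences."""
    # n=40 gives cut points every 2.5%, so the first and last are the
    # 2.5th and 97.5th centiles.
    cuts = quantiles(differences, n=40, method="inclusive")
    return {
        "median": median(differences),
        "2.5th centile": cuts[0],
        "97.5th centile": cuts[-1],
    }


# Toy LOD scores, invented purely for illustration.
toy_wes = [1.2, 1.2, 0.9, -2.0, 1.1]
toy_array = [1.2, 1.0, 1.2, -1.5, 1.2]
toy_differences = peak_lod_differences(toy_wes, toy_array)
toy_lod_summary = summarise(toy_differences)
```

Negative differences indicate positions where the WES-derived LOD score falls below the array-based score.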
Efficacy of variant elimination due to linkage peak filtering
| Family | Model | Consanguinity | Number of linkage peaks | Max LOD | Number of nonsynonymous exonic variants | Number (%) of nonsynonymous exonic variants in linkage regions |
|---|---|---|---|---|---|---|
| A | Recessive | First cousin offspring | 15 | 1.2 | 10,982 | 604 (5.50) |
| T | Recessive | First cousins once removed offspring | 5 | 1.51 | 11,353 | 65 (0.57) |
| M | Dominant | None | 41 | 0.3 | 13,186 | 2,478 (18.79) |
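The filtering step summarised in this table amounts to keeping only variants that fall inside a linkage region. A minimal sketch follows, assuming regions are given as inclusive `(start, end)` base-pair intervals per chromosome and variants as `(chromosome, position)` pairs; these formats and the function names are hypothetical.

```python
def in_linkage_region(chrom, pos, regions):
    """True if the position lies inside any linkage interval on its chromosome."""
    return any(start <= pos <= end for start, end in regions.get(chrom, []))


def filter_by_linkage(variants, regions):
    """Keep only the variants located in a linkage region."""
    return [(c, p) for c, p in variants if in_linkage_region(c, p, regions)]


# Toy regions and variants, invented purely for illustration.
toy_regions = {"1": [(1_000, 5_000)], "7": [(200, 900), (10_000, 20_000)]}
toy_variants = [
    ("1", 999), ("1", 1_000), ("1", 5_000),
    ("7", 950), ("7", 15_000), ("X", 100),
]
toy_kept = filter_by_linkage(toy_variants, toy_regions)
toy_percent_kept = round(100 * len(toy_kept) / len(toy_variants), 2)
```

The percentage retained corresponds to the last column of the table: for family T, 65 of 11,353 variants remain, about 0.57%.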