Atsuko Imai, Akihiro Nakaya, Somayyeh Fahiminiya, Martine Tétreault, Jacek Majewski, Yasushi Sakata, Seiji Takashima, Mark Lathrop, Jurg Ott.
Abstract
A major challenge in current exome sequencing in autosomal recessive (AR) families is the lack of an effective method to prioritize single-nucleotide variants (SNVs). AR families are generally too small for linkage analysis, and the length of homozygous regions is unreliable for identifying causative variants. Various common filtering steps usually result in a list of candidate variants that cannot be narrowed down further or ranked. To prioritize shortlisted SNVs, we consider each homozygous candidate variant together with a set of SNVs flanking it. We compare the resulting array of genotypes between an affected family member and a number of control individuals and argue that, in a family, differences between the family member and controls should be larger for a pathogenic variant and the SNVs flanking it than for a random variant. We assess differences between the arrays of two individuals by the Hamming distance and develop a suitable test statistic, which is expected to be large for a causative variant and its flanking SNVs. We prioritize candidate variants based on this statistic. Applying our approach to six patients with known pathogenic variants, we found these variants to be in the top 2 to 10 percentiles of ranks.
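The comparison of genotype arrays described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the test statistic, so the mean Hamming distance to controls stands in for it here purely as an assumption. The genotype coding (0, 1, 2 alternate alleles), the fixed `flank` width, and all function names are likewise hypothetical.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length genotype arrays differ."""
    if len(a) != len(b):
        raise ValueError("genotype arrays must have equal length")
    return sum(x != y for x, y in zip(a, b))


def window(genotypes: Sequence[int], center: int, flank: int) -> Sequence[int]:
    """Candidate variant at index `center` plus up to `flank` SNVs on each side."""
    lo = max(0, center - flank)
    hi = min(len(genotypes), center + flank + 1)
    return genotypes[lo:hi]


def mean_distance_to_controls(
    case: Sequence[int],
    controls: Sequence[Sequence[int]],
    center: int,
    flank: int,
) -> float:
    """ASSUMED stand-in statistic: average Hamming distance between the
    affected individual's window and the same window in each control."""
    case_win = window(case, center, flank)
    dists = [hamming(case_win, window(c, center, flank)) for c in controls]
    return sum(dists) / len(dists)


def rank_candidates(
    case: Sequence[int],
    controls: Sequence[Sequence[int]],
    candidates: Sequence[int],
    flank: int,
) -> List[int]:
    """Order candidate indices from largest to smallest stand-in statistic."""
    return sorted(
        candidates,
        key=lambda i: mean_distance_to_controls(case, controls, i, flank),
        reverse=True,
    )
```

With toy data in which the affected individual carries a run of homozygous alternate genotypes around one candidate, that candidate ranks first because its window differs most from the controls. In real data the window width and the exact statistic would need to follow the paper's definitions.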
Year: 2015 PMID: 26143870 PMCID: PMC5155624 DOI: 10.1038/srep12028
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Pedigree drawings for families F1, F4, F6, OI, and L1.
For families F1, F4, and F6, genotypes are marked for individuals with DNA available and tested; the following abbreviations are used: +, normal (wild-type) variant; Δ, rare mutant variant. Family OI: The affected individual (P1) is indicated with a solid symbol, heterozygotes are shown with half-solid symbols.
Results of our family-control analysis for prioritizing m final candidate variants in an affected individual from five families.
| Family | Gene | Disease | Rank | m | % | p |
| --- | --- | --- | --- | --- | --- | --- |
| F1 | TTC7A | Multiple intestinal atresia | 1 | 10 | 10.0 | 0.0645 |
| F4 | TTC7A | Multiple intestinal atresia | 1 | 14 | 7.1 | 0.0645 |
| F6 | TTC7A | Multiple intestinal atresia | 1 | 18 | 5.6 | 0.0645 |
| OI | BMP1 | Osteogenesis imperfecta | 1 | 14 | 7.1 | 0.0303 |
| L1a | POLR3B | Leukodystrophy | 1 | 50 | 2.0 | 0.0645 |
| L1b | POLR3B | Leukodystrophy | 4 | 44 | 9.1 | 0.0645 |
Rank = order of test statistic (largest tmax ranked 1) for pathogenic variant among the m candidate variants; % = top percentile rank, 100 × rank/m; p = empirical significance level. L1a and L1b refer to two affected individuals in family L1.
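The percentile column follows directly from the definition 100 × rank/m given above. A small Python sketch, with the hypothetical helper name `top_percentile`, reproduces the tabulated values from the rank and m columns:

```python
def top_percentile(rank: int, m: int) -> float:
    """Top percentile rank, 100 * rank / m, of the pathogenic variant
    among m candidate variants (rank 1 = largest test statistic)."""
    if not 1 <= rank <= m:
        raise ValueError("rank must lie between 1 and m")
    return 100.0 * rank / m


# (family label, rank, m) triples taken from the table above.
rows = [
    ("F1", 1, 10),
    ("F4", 1, 14),
    ("F6", 1, 18),
    ("OI", 1, 14),
    ("L1a", 1, 50),
    ("L1b", 4, 44),
]

for label, rank, m in rows:
    print(f"{label}: {top_percentile(rank, m):.1f}")
```

Rounded to one decimal place, the results match the % column, ranging from 2.0 (L1a) to 10.0 (F1).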