Christopher M Watson, Laura A Crinnion, Juliana Gurgel-Gianetti, Sally M Harrison, Catherine Daly, Agne Antanavicuite, Carolina Lascelles, Alexander F Markham, Sergio D J Pena, David T Bonthron, Ian M Carr.
Abstract
Autozygosity mapping is a powerful technique for the identification of rare, autosomal recessive, disease-causing genes. The ease with which this category of disease gene can be identified has greatly increased through the availability of genome-wide SNP genotyping microarrays and subsequently of exome sequencing. Although these methods have simplified the generation of experimental data, the analysis of those data, particularly when disparate data types must be integrated, remains time-consuming. Moreover, the huge volume of sequence variant data generated from next-generation sequencing experiments opens up the possibility of using these data instead of microarray genotype data to identify disease loci. To allow these two types of data to be used in an integrated fashion, we have developed AgileVCFMapper, a program that performs both the mapping of disease loci by SNP genotyping and the analysis of potentially deleterious variants using exome sequence variant data, in a single step. This method does not require microarray SNP genotype data, although analysis with a combination of microarray and exome genotype data enables more precise delineation of disease loci, due to the superior marker density and distribution of the microarray data.
Keywords: autozygosity mapping; exome; next generation sequencing; software
Year: 2015 PMID: 26037133 PMCID: PMC4744743 DOI: 10.1002/humu.22818
Source DB: PubMed Journal: Hum Mutat ISSN: 1059-7794 Impact factor: 4.878
Figure 1. Mapping of the CFTR locus in a nonconsanguineous CEPH family (Pedigree 3, see the section Methods), using variant data derived either (A) from exome sequencing or (B) from microarray SNP genotyping (Affymetrix SNP 6.0). Display of Chromosome 7 haplotypes was performed using Phaser [Carr et al., 2012]. In this display, blue and pink vertical bars denote the haplotypes inherited by the first affected offspring from the father and mother, respectively. The disease locus must be located in a region where all affected individuals share the same combination of maternal (pink) and paternal (blue) haplotypes as this first individual. This is denoted by a purple color in the central vertical bar (region of overlap between the affected paternal and maternal haplotypes). Unaffected offspring should be discordant with their affected siblings at the disease locus (i.e., no purple bar). In this example, the actual position of the CFTR gene is marked by a horizontal red line. It can be seen that the candidate region containing CFTR is considerably smaller in (B), due to the superior resolution and distribution of the microarray SNPs.
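The sharing rule stated in the Figure 1 caption can be illustrated with a minimal Python sketch. It assumes that phasing has already been done and that each sibling's inherited haplotypes are encoded per marker as a pair of labels; the function name `candidate_markers` and this encoding are hypothetical and are not taken from Phaser or AgileVCFMapper.

```python
from typing import List, Tuple

# Hypothetical encoding: at each marker, a sibling carries
# (paternal haplotype label, maternal haplotype label), each 0 or 1.
Hap = Tuple[int, int]

def candidate_markers(affected: List[List[Hap]],
                      unaffected: List[List[Hap]]) -> List[int]:
    """Return indices of markers compatible with a recessive disease locus.

    A marker is kept when every affected sibling carries the same
    paternal/maternal haplotype pair as the first affected sibling and
    every unaffected sibling carries a different pair.
    """
    reference = affected[0]
    keep = []
    for i, ref_pair in enumerate(reference):
        affected_match = all(sib[i] == ref_pair for sib in affected[1:])
        unaffected_differ = all(sib[i] != ref_pair for sib in unaffected)
        if affected_match and unaffected_differ:
            keep.append(i)
    return keep

# Toy data: two affected siblings, one unaffected sibling, three markers.
affected = [[(0, 0), (0, 0), (0, 1)],
            [(0, 0), (0, 0), (1, 1)]]
unaffected = [[(0, 1), (0, 0), (1, 0)]]
result = candidate_markers(affected, unaffected)
```

In this toy input only the first marker survives: the second is shared by the unaffected sibling, and at the third the affected siblings differ from each other.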
Figure 2. Autozygosity mapping using exome variant data. The blue regions indicate the extent of autozygous regions deduced by AgileVCFMapper. The black and yellow vertical lines indicate the genotypes of variants as described in the section “Identifying concordant and nonconcordant autozygous regions.” A: The option “autozygous regions” has been selected; B: The option “common regions” has been chosen; consequently, in (B) some variants that are homozygous but nonconcordant between the two individuals under study have changed from black to yellow. The red horizontal bar indicates a region of overlapping but nonconcordant autozygosity.
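The two ideas in the Figure 2 caption, runs of homozygous variants and variants that are homozygous but nonconcordant between two individuals, can be sketched in Python as follows. This is an illustration only and not the algorithm used by AgileVCFMapper: the function names, the `(position, genotype)` input format, and the `min_markers` threshold are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Call = Tuple[int, str]  # (position, VCF-style genotype such as "0/1" or "1/1")

def is_homozygous(gt: str) -> bool:
    """True for called genotypes with two identical alleles."""
    alleles = gt.replace("|", "/").split("/")
    return len(alleles) == 2 and alleles[0] == alleles[1] and alleles[0] != "."

def homozygous_runs(calls: List[Call], min_markers: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) positions of runs of consecutive homozygous calls.

    min_markers is a hypothetical threshold; missing calls ("./.") are skipped
    and a heterozygous call ends the current run.
    """
    runs, start, last, count = [], None, None, 0
    for pos, gt in sorted(calls):
        if "." in gt:
            continue  # uncalled genotype: ignore
        if is_homozygous(gt):
            if start is None:
                start = pos
            last, count = pos, count + 1
        else:
            if start is not None and count >= min_markers:
                runs.append((start, last))
            start, last, count = None, None, 0
    if start is not None and count >= min_markers:
        runs.append((start, last))
    return runs

def nonconcordant_positions(calls_a: List[Call], calls_b: List[Call]) -> List[int]:
    """Positions where both individuals are homozygous but for different alleles."""
    by_pos_b: Dict[int, str] = dict(calls_b)
    out = []
    for pos, gt_a in sorted(calls_a):
        gt_b = by_pos_b.get(pos)
        if gt_b and is_homozygous(gt_a) and is_homozygous(gt_b):
            if gt_a.replace("|", "/") != gt_b.replace("|", "/"):
                out.append(pos)
    return out

calls_a = [(100, "1/1"), (200, "1/1"), (300, "0/0"),
           (400, "0/1"), (500, "1/1"), (600, "1/1")]
calls_b = [(100, "1/1"), (200, "0/0"), (300, "0/0"),
           (400, "0/0"), (500, "1/1")]
runs_a = homozygous_runs(calls_a)
discordant = nonconcordant_positions(calls_a, calls_b)
```

With the toy data, individual A has one run of three homozygous variants (positions 100 to 300), and position 200 is homozygous in both individuals but for different alleles, which corresponds to the kind of variant that changes color under the “common regions” option.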
Figure 3. Visualization of Chromosome 8 exome (upper two rows) and microarray genotype (lower three rows) data for five siblings from Pedigree 2. The blue and pink rectangles indicate the positions of autozygous regions in affected and unaffected individuals, respectively. A: The black and yellow vertical lines identify homozygous and heterozygous exome positions, respectively. In contrast, in (B) more restrictive display settings are selected; black vertical lines identify exome variants that are homozygous in both affected individuals for which exome data are available and have no associated rs number in the original VCF variant data files. The variant at approximately 133.5 Mb, lying within the region of shared autozygosity, is a frameshift mutation in the LRRC6 gene and is thought to be the disease-causing variant.
Figure 4. A: Visualization of a simulated heterozygous de novo mutation in FGFR3, introduced in silico into the VCF data file of one patient (NA07383) of Pedigree 3 (NIGMS CF1038). This individual is marked “A” to the left, the other three siblings as “U.” The parents are designated “F” and “M.” The heterozygous de novo variant appears as a black vertical bar; the homozygous reference genotype in all other individuals is indicated in blue. B: The displayed meta-information for the variant describes its location in the gene, its effect on the protein's sequence (if any), and the read depth information for each individual in the analysis.