Vagheesh Narasimhan, Petr Danecek, Aylwyn Scally, Yali Xue, Chris Tyler-Smith, Richard Durbin.
Abstract
Runs of homozygosity (RoHs) are genomic stretches of a diploid genome that show identical alleles on both chromosomes. Longer RoHs are unlikely to have arisen by chance but are likely to denote autozygosity, whereby both copies of the genome descend from the same recent ancestor. Early tools to detect RoH used genotype array data, but substantially more information is available from sequencing data. Here, we present and evaluate BCFtools/RoH, an extension to the BCFtools software package, that detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model. By applying it to simulated data and real data from the 1000 Genomes Project we estimate its accuracy and show that it has higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity.
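The idea of detecting autozygosity with a hidden Markov model can be illustrated with a minimal two-state Viterbi sketch in Python. This is not the BCFtools/RoH implementation: the abstract does not describe the tool's emission or transition models, so the function name `viterbi_roh`, the per-site encoding (0 = homozygous, 1 = heterozygous) and all three probabilities below are assumptions chosen purely for illustration.

```python
import math

def viterbi_roh(genotypes, p_het_auto=0.001, p_het_non=0.3, p_switch=0.01):
    """Label each site as autozygous (1) or non-autozygous (0).

    genotypes: list with 0 for a homozygous call, 1 for a heterozygous call.
    All probabilities are illustrative assumptions, not BCFtools/RoH values.
    """
    if not genotypes:
        return []
    # Emission log-probabilities, indexed as emit[state][observation].
    emit = [
        [math.log(1 - p_het_non), math.log(p_het_non)],    # non-autozygous
        [math.log(1 - p_het_auto), math.log(p_het_auto)],  # autozygous
    ]
    stay, switch = math.log(1 - p_switch), math.log(p_switch)

    # Initialise with a uniform prior over the two states.
    score = [math.log(0.5) + emit[s][genotypes[0]] for s in (0, 1)]
    back = []  # back[i][s] = best previous state for state s at site i + 1
    for g in genotypes[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            cand = [score[t] + (stay if t == s else switch) for t in (0, 1)]
            best = 0 if cand[0] >= cand[1] else 1
            ptr.append(best)
            new_score.append(cand[best] + emit[s][g])
        score = new_score
        back.append(ptr)

    # Trace back the most likely state path from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

With these assumed parameters, a long uninterrupted homozygous run is labelled autozygous, while a short run surrounded by heterozygous sites is not, because the evidence gained per homozygous site does not outweigh the cost of switching state twice.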
Year: 2016 PMID: 26826718 PMCID: PMC4892413 DOI: 10.1093/bioinformatics/btw044
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Comparison of error rates of BCFtools/RoH and other existing methods, as well as performance on real data. (A) Performance on simulated data. False-positive and false-negative rates (FPR and FNR) in data simulated with varying levels of autozygosity and SNP calling error, analyzed using three different detection methods. (B) Performance on real data. We compare the inbreeding coefficient F, estimated either by our method as the percentage of the genome that is autozygous, or as the deviation from Hardy-Weinberg equilibrium (HWE) estimated across all sites, for 31 CEU individuals. (C) Example autozygous segments in simulated data (green) and detected by our method (red). For each chromosome, the y-axis shows the normalized density of heterozygous sites in bins of 0.1 Mb. The x-axis shows the position along the chromosome (in units of 1e8 bp). The overlapping red and green sections show that the regions identified as autozygous using our HMM approach accurately reflect the true length and location of autozygous sections in the simulated data.
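The two estimates of F compared in panel (B) can be sketched as follows. The caption does not give the formulas, so both functions are assumptions: `f_from_roh` takes F as the autozygous fraction of the genome, and `f_from_hwe` uses the standard heterozygote-deficit form, F = 1 - H_obs / H_exp, with expected heterozygosity 2p(1 - p) per site. The function names, the half-open segment coordinates and the input layout are hypothetical.

```python
def f_from_roh(segments, genome_length):
    """F as the fraction of the genome covered by autozygous segments.

    segments: non-overlapping (start, end) pairs in bp, end exclusive.
    """
    covered = sum(end - start for start, end in segments)
    return covered / genome_length

def f_from_hwe(is_het, alt_freqs):
    """F as the deficit of heterozygous sites relative to HWE expectation.

    is_het: 0/1 flag per site for one individual.
    alt_freqs: population alternate-allele frequency p at the same sites.
    """
    observed = sum(is_het)
    expected = sum(2 * p * (1 - p) for p in alt_freqs)
    return 1 - observed / expected

# Two 5 Mb autozygous segments in an assumed 100 Mb genome.
f_roh = f_from_roh([(0, 5_000_000), (20_000_000, 25_000_000)], 100_000_000)
# One heterozygous call out of four sites, each with p = 0.5.
f_hwe = f_from_hwe([1, 0, 0, 0], [0.5, 0.5, 0.5, 0.5])
```

In an outbred individual both values should be close to zero. The segment-based estimate depends on how reliably autozygous regions are called, whereas the HWE-based estimate depends on the accuracy of the allele frequencies used.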