FILTUS: a desktop GUI for fast and efficient detection of disease-causing variants, including a novel autozygosity detector.

Magnus D Vigeland¹, Kristina S Gjøtterud², Kaja K Selmer¹.

Abstract

FILTUS is a stand-alone tool for working with annotated variant files, e.g. when searching for variants causing Mendelian disease. Very flexible in terms of input file formats, FILTUS offers efficient filtering and a range of downstream utilities, including statistical analysis of gene sharing patterns, detection of de novo mutations in trios, quality control plots and autozygosity mapping. The autozygosity mapping is based on a hidden Markov model and enables accurate detection of autozygous regions directly from exome-scale variant files.
AVAILABILITY AND IMPLEMENTATION: FILTUS is written in Python and runs on Windows, Mac and Linux. Binaries and source code are freely available at http://folk.uio.no/magnusv/filtus.html and on GitHub (https://github.com/magnusdv/filtus). Automatic installation is available via PyPI (e.g. pip install filtus).
CONTACT: magnusdv@medisin.uio.no
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year: 2016 PMID: 26819469 PMCID: PMC4866527 DOI: 10.1093/bioinformatics/btw046

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 Introduction

Recent years have seen a revolution in Mendelian disease gene identification thanks to high-throughput sequencing (HTS) methods, in particular whole-exome sequencing (WES). Most of the released software for downstream analysis is aimed at bioinformatically trained users, which poses a challenge for the many medical researchers who want to do hands-on analysis of HTS data. To address this, we introduce a program (FILTUS) offering advanced tools for identifying disease-causing variants in an easy-to-use graphical environment. Several programs for manipulating variant files have been published, including VarSifter (Teer et al.) and the web-based EVA (Coutant et al.). In addition to efficient browsing and filtering, FILTUS offers several specialized analysis tools aimed at Mendelian disease projects. These include statistical evaluation of variant sharing among patients, detection of de novo mutations in trios and autozygosity mapping. FILTUS accepts virtually any variant file, in contrast to most existing programs, which are limited to Variant Call Format (VCF) or other specific input formats. Typical examples of non-standard formats are VCF files with additional columns, and files produced by Annovar (Wang et al.). Although FILTUS is primarily intended for WES-scale data, whole-genome data can be analyzed by using the built-in prefiltering functionality. Some countries have strict regulations requiring offline handling of all human sequencing data, which makes it impossible to use web-based tools or to download information during analysis. FILTUS is well suited to such working conditions: it requires no installation, is completely self-contained and works offline.
In summary, the main features of FILTUS include:
- Stand-alone, offline desktop tool with user-friendly GUI
- Very flexible in terms of input file formats
- Simultaneous analysis of up to several hundred exomes
- Fast, versatile filtering, including summary table
- Column summaries and quality control (QC) plots
- Export to MERLIN-format (for linkage analysis)
- Creating and manipulating in-house variant databases
- Statistical gene prioritizing, detection of de novo mutations in trios and autozygosity mapping

2 Methods and results

2.1 Statistical evaluation of gene sharing

A common strategy for Mendelian disease gene identification is to compare WES data from unrelated patients with the same phenotype. The basic idea is to apply strict filters, leaving only potentially disease-causing variants compatible with the inheritance model, and then look for genes where these variants are enriched among the patients. In many cases this method produces a long list of genes with no obvious ranking. FILTUS implements a statistical model (Zhi and Chen, 2012) for evaluating the significance of each gene in the output (Supplementary Material S1).
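
To make the basic sharing strategy concrete, the following Python sketch counts, for each gene, how many patients carry at least one variant that survived filtering. It is only an illustration of the counting step: the function name `gene_sharing` and the input layout are assumptions made here, and the sketch does not implement the statistical model of Zhi and Chen (2012) that FILTUS uses to assign significance.

```python
from collections import defaultdict

def gene_sharing(filtered_genes_by_sample):
    """Rank genes by the number of samples carrying a filtered variant.

    filtered_genes_by_sample: dict mapping a sample name to an iterable of
    gene names, one per variant that passed the filters (hypothetical layout).
    Returns a list of (gene, number_of_carriers), most shared genes first.
    """
    carriers = defaultdict(set)
    for sample, genes in filtered_genes_by_sample.items():
        for gene in genes:
            carriers[gene].add(sample)  # a set, so each sample counts once per gene
    # Sort by descending carrier count, then alphabetically for stable output
    return sorted(((gene, len(samples)) for gene, samples in carriers.items()),
                  key=lambda item: (-item[1], item[0]))

# Toy data: gene names per patient after filtering (made up for illustration)
patients = {
    "patient1": ["GENE_A", "GENE_B", "GENE_B"],
    "patient2": ["GENE_B", "GENE_C"],
    "patient3": ["GENE_A", "GENE_B"],
}
ranking = gene_sharing(patients)
```

In this toy input the count alone separates the genes, but with real exomes many genes typically tie at the same count, which is why a statistical evaluation of each gene is useful.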

2.2 Detection of de novo mutations

There is an emerging understanding that de novo mutations are a major cause of Mendelian disorders. As a result, trio sequencing has become a popular design when faced with an isolated patient with healthy parents (Chong et al.). To overcome the practical challenge of false positives and negatives, Bayesian methods are typically used, as in DeNovoGear (Ramu et al.). FILTUS implements a similar approach, computing posterior de novo probabilities from the genotype likelihoods provided by the variant caller (Supplementary Material S2). We compared FILTUS with DeNovoGear by applying both to a publicly available trio data set. The results were very similar, particularly when using filters typical in clinical settings (Supplementary Material S3). In addition to posterior probabilities, FILTUS computes ALT allele percentages for each trio member, facilitating ranking and filtering.
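
The general Bayesian idea can be sketched as follows. This is a simplified illustration, not the model of Supplementary Material S2: it assumes a single biallelic site, phred-scaled genotype likelihoods (as in the VCF `PL` field), Hardy-Weinberg priors for the parents, and a per-allele mutation rate `mu`. The function names and the default values of `alt_freq` and `mu` are hypothetical.

```python
from itertools import product

def pl_to_likelihoods(pl):
    """Convert phred-scaled likelihoods (0/0, 0/1, 1/1) to relative likelihoods."""
    return [10 ** (-p / 10) for p in pl]

def transmit_alt(parent_genotype, mu):
    """Probability that a parent with 0, 1 or 2 ALT alleles transmits ALT.

    Assumption: only REF -> ALT mutation, at rate mu per transmitted allele.
    """
    base = parent_genotype / 2
    return base + (1 - base) * mu

def child_given_parents(child, father, mother, mu):
    """P(child genotype | parental genotypes) under Mendelian transmission."""
    tf, tm = transmit_alt(father, mu), transmit_alt(mother, mu)
    return [(1 - tf) * (1 - tm), tf * (1 - tm) + (1 - tf) * tm, tf * tm][child]

def denovo_posterior(pl_child, pl_father, pl_mother, alt_freq=0.001, mu=1e-8):
    """Posterior probability of 'parents 0/0, child 0/1' given the trio's PLs."""
    lc, lf, lm = (pl_to_likelihoods(pl) for pl in (pl_child, pl_father, pl_mother))
    q = alt_freq
    prior = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]  # Hardy-Weinberg prior
    total = denovo = 0.0
    for f, m, c in product(range(3), repeat=3):  # all 27 genotype combinations
        weight = (prior[f] * prior[m] * child_given_parents(c, f, m, mu)
                  * lf[f] * lm[m] * lc[c])
        total += weight
        if (f, m, c) == (0, 0, 1):
            denovo += weight
    return denovo / total

# Confident heterozygous child with confidently homozygous-reference parents
p_clear = denovo_posterior([100, 0, 100], [0, 100, 200], [0, 100, 200])
# Same child, but the father is confidently heterozygous (inherited variant)
p_inherited = denovo_posterior([100, 0, 100], [100, 0, 100], [0, 100, 200])
```

The sketch shows why the parental genotype likelihoods matter: because the prior probability of a new mutation is so small, even moderate uncertainty in a parental call is enough to make inheritance the more probable explanation.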

2.3 Autozygosity mapping: the AutEx algorithm

Autozygosity (or homozygosity) mapping (Lander and Botstein, 1987) is a powerful method for mapping recessive disorders. Traditional sliding-window approaches as offered by PLINK (Purcell et al.) are designed for dense, evenly distributed SNPs and are not optimal for exome data. Better methods have recently been proposed, e.g. H3M2 (Magi et al.), and the -roh command of BCFtools (Li et al.), but these require skillful bioinformatic handling of sequence data. As an alternative, we introduce the AutEx algorithm for detecting autozygous regions directly from variant files. Our approach is based on a hidden Markov model (Leutenegger et al.) described in Supplementary Material S4. The user specifies an approximate parental relationship and a column with allele frequencies (if available). The output provides details of each estimated autozygous segment, and can be directly used for filtering. Zoomable plots show the detected regions with surrounding variants. An example of a disease gene identification aided by AutEx is given in Supplementary Material S5. To validate its performance, we compared AutEx with traditional homozygosity mapping, using data from a child of first cousin parents for whom both dense SNP genotypes and WES data were available. Taking the SNP-based homozygous segments (as detected by PLINK) as the true segments, AutEx applied to the WES variants achieved true positive and true negative rates both exceeding 95% (Supplementary Material S6).
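
As a rough illustration of how a hidden Markov model can flag autozygous regions, the sketch below runs the forward-backward algorithm on a two-state chain (non-autozygous, autozygous) and returns the posterior autozygosity probability at each variant. It is not the AutEx implementation of Supplementary Material S4. The transition form follows the general style of inbreeding HMMs, and the function names, the error rate `err`, the defaults `f=1/16` (first cousin parents) and `a=6.0` per Morgan, and the use of genetic positions in centimorgans are all assumptions made for this example.

```python
import math

def emission(genotype, q, autozygous, err=0.01):
    """P(genotype | state); genotype is the number of ALT alleles (0, 1 or 2)."""
    q = min(max(q, 1e-4), 1 - 1e-4)  # keep frequencies away from 0 and 1
    if autozygous:
        # Two identical-by-descent alleles; heterozygotes only through error
        probs = [(1 - err) * (1 - q), err, (1 - err) * q]
    else:
        probs = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]  # Hardy-Weinberg
    return probs[genotype]

def transition(dist_cm, f, a):
    """2x2 transition matrix over a gap of dist_cm centimorgans."""
    stay = math.exp(-a * dist_cm / 100)  # probability of no state redraw
    return [[stay + (1 - stay) * (1 - f), (1 - stay) * f],
            [(1 - stay) * (1 - f), stay + (1 - stay) * f]]

def autozygosity_posteriors(genotypes, positions_cm, freqs, f=1 / 16, a=6.0):
    """Posterior probability of the autozygous state at each variant."""
    n = len(genotypes)
    emis = [[emission(g, q, s) for s in (False, True)]
            for g, q in zip(genotypes, freqs)]
    trans = [transition(positions_cm[k + 1] - positions_cm[k], f, a)
             for k in range(n - 1)]

    def normalise(v):
        total = sum(v)
        return [x / total for x in v]

    # Forward pass (rescaled at each step to avoid underflow)
    fwd = [normalise([(1 - f) * emis[0][0], f * emis[0][1]])]
    for k in range(1, n):
        t = trans[k - 1]
        fwd.append(normalise([emis[k][s] * sum(fwd[-1][r] * t[r][s] for r in (0, 1))
                              for s in (0, 1)]))
    # Backward pass
    bwd = [[1.0, 1.0] for _ in range(n)]
    for k in range(n - 2, -1, -1):
        t = trans[k]
        bwd[k] = normalise([sum(t[s][r] * emis[k + 1][r] * bwd[k + 1][r]
                                for r in (0, 1)) for s in (0, 1)])
    # Combine into per-variant posteriors
    post = []
    for k in range(n):
        w0, w1 = fwd[k][0] * bwd[k][0], fwd[k][1] * bwd[k][1]
        post.append(w1 / (w0 + w1))
    return post

# Toy example: 20 homozygous ALT variants flanked by heterozygous variants
genotypes = [1] * 5 + [2] * 20 + [1] * 5
positions = [0.5 * i for i in range(30)]  # one variant every 0.5 cM
freqs = [0.3] * 30
posteriors = autozygosity_posteriors(genotypes, positions, freqs)
```

Segments could then be reported as maximal runs of variants whose posterior exceeds a chosen cutoff. Because the transitions depend on the distance between neighbouring variants, this kind of model copes with the uneven spacing of exome variants, unlike a fixed sliding window.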

2.4 Visualizations

FILTUS offers various plots to aid QC of the variant files: gender estimation (based on X-chromosomal heterozygosity levels), private variants (compared with the other samples) and autosomal heterozygosity level (examples given in Supplementary Material S7). In addition, histograms and scatter plots can be made from any numerical columns.
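
The principle behind gender estimation from X-chromosomal heterozygosity can be sketched in a few lines: males carry a single X chromosome, so outside the pseudoautosomal regions their X-linked variants should almost never be called heterozygous. The function names and the cutoff of 0.15 below are hypothetical choices for illustration and are not taken from FILTUS.

```python
def x_heterozygosity(genotypes):
    """Fraction of heterozygous calls among non-missing chrX genotypes.

    genotypes: iterable of VCF-style genotype strings such as "0/1" or "1|1".
    Calls containing a missing allele (".") are skipped.
    """
    het = total = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            continue
        total += 1
        het += len(set(alleles)) > 1  # True counts as 1
    return het / total if total else float("nan")

def guess_gender(genotypes, cutoff=0.15):
    """Classify by X heterozygosity; the cutoff is a hypothetical value."""
    h = x_heterozygosity(genotypes)
    if h != h:  # NaN: no usable genotypes
        return "unknown"
    return "female" if h > cutoff else "male"

male_like = guess_gender(["1/1"] * 20 + ["0/1"])
female_like = guess_gender(["0/1"] * 10 + ["1/1"] * 10)
```

In practice a plot of this fraction across all loaded samples is more informative than a fixed cutoff, since the two groups usually separate clearly.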

3 Discussion

FILTUS has been used in many successful disease gene identifications, some of which are published (Baroy et al.; Fjaer et al.; Hansen et al.; Pedurupillay et al.), while others are currently in preparation. FILTUS runs on Windows, Mac and Linux (see Supplementary Material S8 for supported versions) and is actively maintained and developed. We believe it to be a valuable contribution to the toolset of researchers and clinicians working with HTS variant data, especially those without access to specialized bioinformatics resources.

15 in total

1. Estimation of the inbreeding coefficient through use of genomic data.

Authors: Anne-Louise Leutenegger; Bernard Prum; Emmanuelle Génin; Christophe Verny; Arnaud Lemainque; Françoise Clerget-Darpoux; Elizabeth A Thompson
Journal: Am J Hum Genet Date: 2003-07-29 Impact factor: 11.025

2. VarSifter: visualizing and analyzing exome-scale sequence variation data on a desktop computer.

Authors: Jamie K Teer; Eric D Green; James C Mullikin; Leslie G Biesecker
Journal: Bioinformatics Date: 2011-12-30 Impact factor: 6.937

3. PLINK: a tool set for whole-genome association and population-based linkage analyses.

Authors: Shaun Purcell; Benjamin Neale; Kathe Todd-Brown; Lori Thomas; Manuel A R Ferreira; David Bender; Julian Maller; Pamela Sklar; Paul I W de Bakker; Mark J Daly; Pak C Sham
Journal: Am J Hum Genet Date: 2007-07-25 Impact factor: 11.025

4. Kaufman oculocerebrofacial syndrome in sisters with novel compound heterozygous mutation in UBE3B.

Authors: Christeen Ramane J Pedurupillay; Tuva Barøy; Asbjørn Holmgren; Anne Blomhoff; Magnus D Vigeland; Ying Sheng; Eirik Frengen; Petter Strømme; Doriana Misceo
Journal: Am J Med Genet A Date: 2015-03 Impact factor: 2.802

5. H3M2: detection of runs of homozygosity from whole-exome sequencing data.

Authors: Alberto Magi; Lorenzo Tattini; Flavia Palombo; Matteo Benelli; Alessandro Gialluisi; Betti Giusti; Rosanna Abbate; Marco Seri; Gian Franco Gensini; Giovanni Romeo; Tommaso Pippucci
Journal: Bioinformatics Date: 2014-06-24 Impact factor: 6.937

6. The Sequence Alignment/Map format and SAMtools.

Authors: Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal: Bioinformatics Date: 2009-06-08 Impact factor: 6.937

7. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data.

Authors: Kai Wang; Mingyao Li; Hakon Hakonarson
Journal: Nucleic Acids Res Date: 2010-07-03 Impact factor: 16.971

8. DeNovoGear: de novo indel and point mutation discovery and phasing.

Authors: Avinash Ramu; Michiel J Noordam; Rachel S Schwartz; Arthur Wuster; Matthew E Hurles; Reed A Cartwright; Donald F Conrad
Journal: Nat Methods Date: 2013-08-25 Impact factor: 28.547

9. Statistical guidance for experimental design and data analysis of mutation detection in rare monogenic mendelian diseases by exome sequencing.

Authors: Degui Zhi; Rui Chen
Journal: PLoS One Date: 2012-02-10 Impact factor: 3.240

10. EVA: Exome Variation Analyzer, an efficient and versatile tool for filtering strategies in medical genomics.

Authors: Sophie Coutant; Chloé Cabot; Arnaud Lefebvre; Martine Léonard; Elise Prieur-Gaston; Dominique Campion; Thierry Lecroq; Hélène Dauchel
Journal: BMC Bioinformatics Date: 2012-09-07 Impact factor: 3.169
