Stefan H Lelieveld, Joris A Veltman, Christian Gilissen.
Abstract
With the widespread adoption of next-generation sequencing technologies by the genetics community and the rapid decrease in cost per base, exome sequencing has become a standard within the repertoire of genetic experiments for both research and diagnostics. Although bioinformatics now offers standard solutions for the analysis of exome sequencing data, many challenges remain. In particular, the increasing scale at which exome data are now being generated has given rise to novel challenges in how to efficiently store, analyze and interpret data of this magnitude. In this review we discuss some of the recent developments in bioinformatics for exome sequencing and the directions in which they are taking the field. With these developments, exome sequencing is paving the way for the next big challenge: the application of whole genome sequencing.
Year: 2016 PMID: 27075447 PMCID: PMC4883269 DOI: 10.1007/s00439-016-1658-6
Source DB: PubMed Journal: Hum Genet ISSN: 0340-6717 Impact factor: 4.132
Overview of some of the novel bioinformatics tools related to the storage, analysis or interpretation of exome sequencing data
| Name | Description | Website |
|---|---|---|
| Data compression | | |
| CRAMtools | Framework to compress BAM files into CRAM format | |
| Scramble | C implementation of CRAM that compresses BAM files into CRAM format with faster encoding | |
| TABIX | Tool to index and query bgzip-compressed VCF files, available via SAMtools | |
| Genotype query tools | Toolset to compress and query VCF files; designed for large-scale cohorts | |
| Cloud tools | | |
| CloudBurst | Cloud-based parallel read-mapping algorithm that maps sequence reads to a reference | |
| Cloud aligner | Cloud-based Hadoop MapReduce approach to mapping sequence reads | |
| Crossbow | Cloud-computing software tool that combines read mapping and SNP genotyping | |
| VAT | Variant Annotation Tool (VAT), a cloud-based platform to functionally annotate variants | |
| Mercury | Whole exome sequencing analysis workflow deployed in the Amazon Web Services (AWS) cloud | |
| Variant prioritization tools | | |
| CADD | Combines 63 annotations into one meta-score (C score) for the entire genome, based on an SVM | |
| Eigen | Spectral approach to the functional annotation of genetic variants in coding and non-coding regions | |
| DANN | Uses the same feature set and training data as CADD to train a deep neural network (DNN) | |
| FitCons | Predictions of pathogenicity for the entire genome based on evolutionary conservation and functional data | |
| SPANR/SPIDEX | Model optimized for the prioritization of splice site variants, trained with a deep learning approach | |
| HAL | Prioritization of splice site variants based on their effect on (alternative) RNA splicing | |
| PHIVE | Analysis of exome variants by computing phenotype similarity between human disease phenotypes and phenotype information from knockout experiments in model organisms | |
| RVIS | The Residual Variation Intolerance Score (RVIS) is a gene-based score that prioritizes disease genes based on their intolerance to genetic variation | |
| CNV detection | | |
| CoNIFER | Detects rare CNVs in exome data based on sequence read-depth | |
| XHMM | Uses principal-component analysis (PCA) to normalize exome read-depth and a hidden Markov model (HMM) to detect CNVs | |
| Codex | Normalization and CNV calling procedure for whole exome sequencing data | |
| Data sharing | | |
| ExAC | 60,706 unrelated individuals sequenced as part of various disease-specific and population genetic studies | |
| DECIPHER | Database containing data from 18,533 patients who have given consent for broad data-sharing | |
| Café variome | Platform to share genetic variant and phenotype data on a global scale | |
| GeneMatcher | Online platform designed to connect clinicians and researchers from around the world who share an interest in the same gene or genes | |
| RD-connect | Platform that links up data used in rare disease research into a central resource for researchers worldwide | |
| PhenomeCentral | Repository for secure data-sharing targeted to clinicians and scientists working in the rare disorder community | |
| MatchMaker Exchange | Platform enabling matching of cases with similar phenotypic and genotypic profiles through a number of databases | |
| Phenotypes | | |
| Phenotips | A software tool for collecting and analyzing phenotypic information for patients with genetic disorders | |
| PhenoDB | A software tool to store and analyze standardized phenotypic information | |
| Phenominer | A tool to extract structured phenotypes from text | |
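
The XHMM entry in the table describes normalizing exome read-depth with PCA before calling CNVs. As a rough illustration of that normalization idea only, the following is a minimal sketch and not XHMM's actual implementation: the function name `pca_normalize`, the parameter `n_remove`, and the synthetic depth matrix are all assumptions made for this example, and the HMM calling step is not shown.

```python
import numpy as np

def pca_normalize(depth, n_remove):
    """Hypothetical helper: strip the top principal components from a
    samples x targets read-depth matrix and return per-sample z-scores."""
    depth = np.asarray(depth, dtype=float)
    # Center each target (column) across samples.
    centered = depth - depth.mean(axis=0, keepdims=True)
    # SVD of the centered matrix; rows of vt are the principal axes.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Signal explained by the first n_remove components (assumed systematic noise).
    systematic = (u[:, :n_remove] * s[:n_remove]) @ vt[:n_remove]
    residual = centered - systematic
    # z-score each sample (row) so deviations are comparable between samples.
    mean = residual.mean(axis=1, keepdims=True)
    std = residual.std(axis=1, keepdims=True)
    std[std == 0] = 1.0
    return (residual - mean) / std

# Synthetic data (assumed): 20 samples x 60 targets, mean depth 100.
rng = np.random.default_rng(0)
sample_factor = np.linspace(-1.0, 1.0, 20)   # per-sample batch strength
target_factor = np.cos(np.arange(60))        # per-target susceptibility
depth = 100 + 15 * np.outer(sample_factor, target_factor)
depth += rng.normal(0.0, 2.0, size=depth.shape)
# Simulate a heterozygous deletion: sample 3 loses half its depth on targets 10-12.
depth[3, 10:13] *= 0.5

z = pca_normalize(depth, n_remove=1)
print(np.round(z[3, 8:15], 2))
```

In this toy setup the deleted targets of sample 3 should stand out as strongly negative z-scores after the batch component is removed. Choosing how many components to remove is a judgment call that this sketch leaves to the caller.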