Pawel Suwinski, ChuangKee Ong, Maurice H T Ling, Yang Ming Poh, Asif M Khan, Hui San Ong.
Abstract
There is growing attention toward personalized medicine. This is led by a fundamental shift from the 'one size fits all' paradigm for treatment of patients with conditions or predisposition to diseases, to one that embraces novel approaches, such as tailored targeted therapies, to achieve the best possible outcomes. Driven by this shift, several national and international genome projects have been initiated to reap the benefits of personalized medicine. Exome and targeted sequencing provide a balance between cost and benefit, in contrast to whole genome sequencing (WGS). Whole exome sequencing (WES) targets approximately 3% of the whole genome, the portion that forms the basis for protein-coding genes. Nonetheless, it has the characteristics of big data when deployed at large scale. Herein, the application of WES and its relevance in advancing personalized medicine are reviewed. WES is mapped to the Big Data "10 Vs" and the resulting challenges are discussed. Applications of existing biological databases and bioinformatics tools to address the bottleneck in data processing and analysis are presented, including the need for new-generation big data analytics for the multi-omics challenges of personalized medicine. This includes the incorporation of artificial intelligence (AI) in the clinical utility landscape of genomic information, and future considerations to create a new frontier toward advancing the field of personalized medicine.
Keywords: analytics; big data; exome; personalized medicine; precision; sequencing
Year: 2019 PMID: 30809243 PMCID: PMC6379253 DOI: 10.3389/fgene.2019.00049
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
List of biological databases and bioinformatics tools relevant for data-warehousing, alignment, processing or analysis of sequence reads.
| Category | Bioinformatics tools | Reference |
|---|---|---|
| Read alignment | BWA | |
| | Bowtie | |
| Annotation | Annovar (Qiagen) | |
| | Variant Effect Predictor (Ensembl) | |
| | SNPsift and SNPeffect | |
| | Variant Annotation Integrator (UCSC) | |
| | NCBI Variant Annotation | |
| | Sift4G | |
| | WGS annotator (runnable on the Amazon Compute Cloud) | |
| Visualization | NCBI Variant Viewer | |
| | UCSC Genome Browser | |
| | ENSEMBL Genome Browser | |
| | ExAC browser | |
| | Integrative Genomics Viewer (IGV) | |
| | Personal Genome Browser (PGB) | |
| | 3D Genome Browser | |
| Data-warehousing | ClinVar (clinical significance) | |
| | dbSNP (NCBI main variant annotation database) | |
| | dbNSFP (variants damage prediction using many | |
| | COSMIC (Catalogue of Somatic Mutations in Cancer) | |
| | GWAS Catalog | |
| | GWAS Central | |
| | Cancer Atlas | |
| | RefSeq | |
| | PANTHER | |
| | TCGA (The Cancer Genome Atlas) | |
| | ICGC (International Cancer Genome Consortium) | |
| Analytics | Genome Analysis Toolkit (GATK) | |
| | MuTect | |
| | OTG-snpcaller | |
| | ASEQ | |
| | Halvade-RNA | |
| | GT-WGS | |
| | EXCAVATOR2 | |
| | KaryoScan | |
| AI-based analytics | Exomiser | |
| | DeepVariant | |
| | Deep Genomics | |
| | Qiagen (Ingenuity Variant Analysis and Ingenuity Pathway Analysis) | |
| | Golden Helix (VarSeq, VSClinical) | |
| | Advaita (iVariant/iPathway/iBio Guides) | |
| | Lifemap Sciences | |
FIGURE 1. The 10 Vs big data characteristics of whole exome sequencing.
Comparison of various NGS techniques and primary analysis tools.
| NGS techniques | Study aim(s) | Data size per sample | Tool(s) used | Reference |
|---|---|---|---|---|
| WGS | | ~90 GB | Velvet, SOAPdenovo | |
| WES | Protein-coding variant identification | ~5–6 GB | Edico DRAGEN, GATK, Samtools | |
| RNA-seq | Gene expression, novel isoform discovery | ~3–4 GB | DESeq, Cufflinks | |
| ChIP-seq | Protein–DNA interaction study, i.e., identification of histone marks and transcription factor binding sites | ~1–2 GB | QuEST, MACS | |
| Bisulfite-seq | DNA methylation sites identification | ~1–2 GB | BS Seeker | |
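
To make the per-sample sizes in the table above concrete, the following is a minimal sketch of how raw storage might scale with cohort size. The per-sample figures come from the table; the function name `cohort_storage_tb`, the cohort size of 10,000 samples, and the 1 TB = 1000 GB convention are illustrative assumptions, not values from the review. The estimate covers raw data only and ignores intermediate files, compression, and replication.

```python
# Per-sample data sizes in GB, taken from the comparison table above.
# Ranges are stored as (low, high); WGS is given as a single approximate value.
SIZE_GB = {
    "WGS": (90.0, 90.0),
    "WES": (5.0, 6.0),
    "RNA-seq": (3.0, 4.0),
    "ChIP-seq": (1.0, 2.0),
    "Bisulfite-seq": (1.0, 2.0),
}


def cohort_storage_tb(technique: str, n_samples: int) -> tuple[float, float]:
    """Return a (low, high) raw-storage estimate in TB (assuming 1 TB = 1000 GB)."""
    low_gb, high_gb = SIZE_GB[technique]
    return (low_gb * n_samples / 1000.0, high_gb * n_samples / 1000.0)


if __name__ == "__main__":
    n_samples = 10_000  # hypothetical cohort size, chosen only for illustration
    for technique in SIZE_GB:
        low_tb, high_tb = cohort_storage_tb(technique, n_samples)
        print(f"{technique}: {low_tb:.0f}-{high_tb:.0f} TB for {n_samples} samples")
```

Under these assumptions, WES stays more than an order of magnitude below WGS per sample, yet a cohort of this size still reaches tens of terabytes of raw data, in line with the abstract's point that WES takes on big data characteristics at scale.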
FIGURE 2. The changing paradigms of personalized medicine. (A) Notable timelines in Genomics and Personalized Medicine, including the data storage size for the four big data domains by 2025, with genomics either on par or the most demanding of the domains (Stephens et al., 2015). (B) The intersection of big data analytics and WES for advancement of personalized medicine. The drawings are not to scale.