Emma Graham, Jessica Lee, Magda Price, Maja Tarailo-Graovac, Allison Matthews, Udo Engelke, Jeffrey Tang, Leo A J Kluijtmans, Ron A Wevers, Wyeth W Wasserman, Clara D M van Karnebeek, Sara Mostafavi.
Abstract
Many inborn errors of metabolism (IEMs) are amenable to treatment; therefore, early diagnosis and treatment are imperative. Despite recent advances, the genetic basis of many metabolic phenotypes remains unknown. For discovery purposes, whole exome sequencing (WES) variant prioritization coupled with clinical and bioinformatics expertise is the primary method used to identify novel disease-causing variants; however, causation is often difficult to establish due to the number of plausible variants. Integrated analysis of untargeted metabolomics (UM) and WES or whole genome sequencing (WGS) data is a promising systematic approach for identifying disease-causing variants. In this review, we provide a literature-based overview of UM methods utilizing liquid chromatography mass spectrometry (LC-MS), and assess approaches to integrating WES/WGS and LC-MS UM data for the discovery and prioritization of variants causing IEMs. To embed this integrated -omics approach in the clinic, expansion of gene-metabolite annotations and metabolomic feature-to-metabolite mapping methods are needed.
Keywords: Genomics; Inborn errors of metabolism; Metabolomics; Omic integration; Variant prioritization
Year: 2018 PMID: 29721916 PMCID: PMC5959954 DOI: 10.1007/s10545-018-0139-6
Source DB: PubMed Journal: J Inherit Metab Dis ISSN: 0141-8955 Impact factor: 4.982
Fig. 1 WES rare variant analysis pipeline for the detection of inborn errors of metabolism causing neurometabolic disorders, as used in Tarailo-Graovac et al. 2016. Given raw sequencing reads for each patient, this pipeline identifies a conservative list of candidate variants (MAF ≤ 0.01). First, raw reads (FASTQ files) are aligned to the human genome (hg19 or equivalent). Second, variants are annotated using published software programs such as ANNOVAR. Third, variants that do not map to protein-coding regions, or that do not pass QC steps, are removed. Fourth, variants that do not agree with multiple inheritance models and that would not agree with the observed phenotypic effect are removed. Finally, rare variants are selected by removing variants with annotated minor allele frequencies (MAF) greater than 0.01.
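The filtering steps in the Fig. 1 caption (steps three to five) can be illustrated with a minimal Python sketch. It is not the pipeline used in Tarailo-Graovac et al. 2016. The `Variant` record, the `CODING_EFFECTS` labels, and the simplified recessive-only inheritance check (homozygous, or at least two heterozygous hits in the same gene) are all assumptions made for illustration. Only the MAF cutoff of 0.01 comes from the caption.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

MAF_CUTOFF = 0.01  # rare-variant threshold stated in the Fig. 1 caption

# Hypothetical consequence labels; real annotators use their own vocabularies.
CODING_EFFECTS = {"missense", "nonsense", "frameshift", "splice_site"}


@dataclass
class Variant:
    gene: str
    effect: str          # annotated consequence (hypothetical labels above)
    passed_qc: bool
    genotype: str        # "het" or "hom" in the patient
    maf: Optional[float]  # None when the variant is absent from the population database


def prioritize(variants):
    """Apply the caption's filters in order: coding/QC, inheritance, then MAF."""
    # Step 3: keep protein-coding variants that passed QC.
    kept = [v for v in variants if v.effect in CODING_EFFECTS and v.passed_qc]

    # Step 4 (assumed, recessive models only): keep homozygous variants and
    # genes carrying at least two heterozygous variants (possible compound het).
    het_counts = Counter(v.gene for v in kept if v.genotype == "het")
    kept = [v for v in kept if v.genotype == "hom" or het_counts[v.gene] >= 2]

    # Step 5: remove variants with annotated MAF greater than the cutoff.
    return [v for v in kept if v.maf is None or v.maf <= MAF_CUTOFF]


variants = [
    Variant("G1", "missense", True, "hom", 0.001),
    Variant("G2", "intronic", True, "hom", 0.001),
    Variant("G3", "missense", True, "het", None),
    Variant("G4", "nonsense", True, "het", 0.002),
    Variant("G4", "missense", True, "het", 0.005),
    Variant("G5", "missense", True, "hom", 0.2),
    Variant("G6", "frameshift", False, "hom", 0.0),
]
candidates = prioritize(variants)
```

Here `candidates` keeps the homozygous `G1` variant and both heterozygous `G4` variants. A real pipeline would also need parental genotypes to test dominant or de novo models, which this sketch omits.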
Fig. 2 Sample LC-MS metabolomics analysis pipeline. Briefly, raw metabolomics data can be processed using freely available processing software (e.g., XCMS), annotated (e.g., CAMERA), normalized (e.g., through use of internal standards), and filtered. Differentially abundant metabolites can be isolated using univariate or multivariate tests. Biological interpretation such as pathway analysis can be performed using published metabolomic databases (e.g., HMDB, BioCyc, METLIN).
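The normalization and univariate steps named in the Fig. 2 caption could look roughly like the sketch below. It assumes one internal standard per sample (the key `"IS"`), strictly positive intensities, and a per-feature z-score of one patient against a set of controls as the univariate test. The feature names and the cutoff of 3 are hypothetical, and the sketch does not reproduce XCMS or CAMERA output formats.

```python
import math
from statistics import mean, stdev


def normalize(sample, standard="IS"):
    """Divide each feature intensity by the internal standard, then log2-transform."""
    ref = sample[standard]
    return {f: math.log2(x / ref) for f, x in sample.items() if f != standard}


def aberrant_features(patient, controls, z_cutoff=3.0):
    """Return {feature: z} for features whose |z| against controls meets the cutoff."""
    p = normalize(patient)
    ctrl = [normalize(c) for c in controls]
    flagged = {}
    for feature, value in p.items():
        ref_values = [c[feature] for c in ctrl]
        sd = stdev(ref_values)
        if sd == 0:
            continue  # no spread among controls, so the z-score is undefined
        z = (value - mean(ref_values)) / sd
        if abs(z) >= z_cutoff:
            flagged[feature] = z
    return flagged


# Hypothetical intensities for two features plus the internal standard.
controls = [
    {"IS": 100.0, "feat_1": 100.0, "feat_2": 100.0},
    {"IS": 100.0, "feat_1": 200.0, "feat_2": 200.0},
    {"IS": 100.0, "feat_1": 400.0, "feat_2": 400.0},
]
patient = {"IS": 100.0, "feat_1": 3200.0, "feat_2": 200.0}
flagged = aberrant_features(patient, controls)
```

With these made-up numbers only `feat_1` is flagged. A real analysis would use many more controls and correct for testing thousands of features at once.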
Fig. 3 Untargeted metabolomics pre-processing pipeline. A combination of automated and manual steps is used to prepare metabolomics data for downstream analysis. The algorithms listed are only examples of tools that could be used in each step.
Identified IEM genes, their functions, and number of associated metabolites (as listed in the Human Metabolome Database)
| Gene | Function | Number of annotated metabolites |
|---|---|---|
|  | Catalyzes the transfer of the acyl group of long-chain fatty acid-CoA conjugates onto carnitine, an essential step for the mitochondrial uptake of long-chain fatty acids and their subsequent beta-oxidation in the mitochondrion | 13,757 |
|  | Produces phosphorylated and unphosphorylated forms of N-acetylneuraminic acid (Neu5Ac) and 2-keto-3-deoxy-D-glycero-D-galacto-nononic acid (KDN) | 9 |
|  | Phosphotransferase | 2 |
|  | Sodium ion membrane transporter | 5 |
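Gene-metabolite annotations of the kind counted in this table are what allow WES/WGS candidates to be cross-checked against metabolomics results. The sketch below shows one simple way this could work: rank candidate genes by how many of their annotated metabolites appear among a patient's aberrant metabolites. It is not a method from the review. The gene names, metabolite names, and overlap score are hypothetical placeholders.

```python
def rank_candidates(candidate_genes, gene_to_metabolites, aberrant):
    """Rank genes by the number of annotated metabolites that are aberrant."""
    scored = []
    for gene in candidate_genes:
        annotated = gene_to_metabolites.get(gene, set())  # unannotated genes score 0
        hits = annotated & aberrant
        scored.append((gene, len(hits), sorted(hits)))
    # Stable sort: genes with equal scores keep their input order.
    return sorted(scored, key=lambda row: row[1], reverse=True)


# Hypothetical annotations; a real lookup would come from a database such as HMDB.
gene_to_metabolites = {
    "GENE_A": {"carnitine", "palmitoylcarnitine"},
    "GENE_B": {"N-acetylneuraminic acid"},
}
aberrant = {"carnitine", "palmitoylcarnitine", "lactate"}
ranking = rank_candidates(["GENE_C", "GENE_B", "GENE_A"], gene_to_metabolites, aberrant)
```

A raw overlap count favors heavily annotated genes. The counts in the table range from 2 to 13,757, so a practical score would likely need to account for annotation size.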