De novo pattern discovery enables robust assessment of functional consequences of non-coding variants.

Hai Yang, Rui Chen, Quan Wang, Qiang Wei, Ying Ji, Guangze Zheng, Xue Zhong, Nancy J Cox, Bingshan Li.

Abstract

MOTIVATION: Given the complexity of genomic regions, prioritizing the functional effects of non-coding variants remains a challenge. Although several frameworks have been proposed for evaluating the functionality of non-coding variants, most rely on 'black box' methods that reduce the task to a pathogenic/benign classification problem, which ignores the distinct regulatory mechanisms of variants and leads to less desirable performance. In this study, we developed DVAR, an unsupervised framework that leverages various biochemical and evolutionary evidence to distinguish the gene regulatory categories of variants and assess their comprehensive functional impact simultaneously.
RESULTS: DVAR performed de novo pattern discovery in high-dimensional data and identified five regulatory clusters of non-coding variants. Leveraging these new insights into the multiple functional patterns, it measures both the between-class and the within-class functional implications of the variants to achieve accurate prioritization. Compared with other two-class learning methods, it showed improved performance in identifying clinically significant variants, fine-mapped GWAS variants, eQTLs and expression-modulating variants. Moreover, it showed superior performance on disease-causal variants verified by genome editing (such as CRISPR-Cas9), which could provide a pre-selection strategy for genome-editing technologies across the whole genome. Finally, evaluated in BioVU and UK Biobank, two large-scale DNA biobanks linked to complete electronic health records, DVAR demonstrated its effectiveness in prioritizing non-coding variants associated with medical phenotypes.
AVAILABILITY AND IMPLEMENTATION: The C++ and Python source code, the pre-computed DVAR-cluster labels and DVAR-scores across the whole genome are available at https://www.vumc.org/cgg/dvar.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Year:  2019        PMID: 30256891      PMCID: PMC6499232          DOI: 10.1093/bioinformatics/bty826

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  2 in total

Review 1.  Methods for the Analysis and Interpretation for Rare Variants Associated with Complex Traits.

Authors:  J Dylan Weissenkampen; Yu Jiang; Scott Eckert; Bibo Jiang; Bingshan Li; Dajiang J Liu
Journal:  Curr Protoc Hum Genet       Date:  2019-03-08

2.  TVAR: assessing tissue-specific functional effects of non-coding variants with deep learning.

Authors:  Hai Yang; Rui Chen; Quan Wang; Qiang Wei; Ying Ji; Xue Zhong; Bingshan Li
Journal:  Bioinformatics       Date:  2022-10-14       Impact factor: 6.931

