Zhen Cao, Yanting Huang, Ran Duan, Peng Jin, Zhaohui S Qin, Shihua Zhang.
Abstract
Understanding the impact of non-coding sequence variants on complex diseases is an essential problem. We present a novel ensemble learning framework, CASAVA, to score genomic loci by disease category-specific risk. Using disease-associated variants identified by GWAS as training data, and diverse sequencing-based genomics and epigenomics profiles as features, CASAVA provides risk prediction for 24 major categories of diseases throughout the human genome. Our studies showed that CASAVA scores at a genomic locus provide a reasonable prediction of the disease-specific and disease category-specific risk of non-coding variants located within the locus. Taking MHC2TA and immune system diseases as an example, we demonstrate the potential of CASAVA in revealing variant-disease associations. A website (http://zhanglabtools.org/CASAVA) has been built to facilitate easy access to CASAVA scores.
Keywords: complex disease; disease category; ensemble learning; functional annotation; non-coding variant
Year: 2022 PMID: 34643213 DOI: 10.1093/bib/bbab438
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622