Mulin Jun Li1, Zhicheng Pan2, Zipeng Liu3, Jiexing Wu4, Panwen Wang5, Yun Zhu6, Feng Xu5, Zhengyuan Xia7, Pak Chung Sham2, Jean-Pierre A Kocher8, Miaoxin Li9, Jun S Liu10, Junwen Wang11.
1. Department of Statistics, Harvard University, Cambridge, Boston, 02138-2901 MA, USA, Centre for Genomic Sciences.
2. Centre for Genomic Sciences, Department of Psychiatry.
3. Centre for Genomic Sciences, Department of Anaesthesiology.
4. Department of Statistics, Harvard University, Cambridge, Boston, 02138-2901 MA, USA.
5. Centre for Genomic Sciences.
6. Centre for Genomic Sciences, School of Biomedical Sciences.
7. Department of Anaesthesiology.
8. Department of Health Sciences Research, Center for Individualized Medicine, Mayo Clinic, Scottsdale, AZ 85259, USA.
9. Centre for Genomic Sciences, Department of Psychiatry, Centre for Reproduction, Development and Growth, LKS Faculty of Medicine, the University of Hong Kong, Hong Kong SAR, China.
10. Department of Statistics, Harvard University, Cambridge, Boston, 02138-2901 MA, USA, Center for Statistical Science, Tsinghua University, Beijing 100084, China.
11. Centre for Genomic Sciences, Department of Health Sciences Research, Center for Individualized Medicine, Mayo Clinic, Scottsdale, AZ 85259, USA and Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA.
Abstract
MOTIVATION: Prediction and prioritization of human non-coding regulatory variants are critical for understanding the regulatory mechanisms of disease pathogenesis and for promoting personalized medicine. Existing tools use functional genomics data and evolutionary information to evaluate the pathogenicity or regulatory function of non-coding variants. However, different algorithms produce inconsistent and even conflicting predictions. Combining multiple methods may increase accuracy in regulatory variant prediction.
RESULTS: Here, we compiled an integrative resource of predictions from eight different tools for the functional annotation of non-coding variants. We further developed a composite strategy that integrates multiple predictions and computes the composite likelihood of a given variant being a regulatory variant. Benchmarking against multiple independent causal variant datasets, we demonstrated that our composite model significantly improves prediction performance.
AVAILABILITY AND IMPLEMENTATION: We implemented our model and scoring procedure as a tool, named PRVCS, which is freely available for academic and non-profit use at http://jjwanglab.org/PRVCS.
CONTACT: wang.junwen@mayo.edu, jliu@stat.harvard.edu, or limx54@gmail.com
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
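The abstract does not spell out how the composite likelihood is computed. As a rough illustration only, one common way to integrate several predictors is to convert each tool's output into a likelihood ratio and combine the ratios under a conditional-independence assumption. The sketch below shows that generic idea; it is not the PRVCS implementation, and the function name `composite_probability`, the `prior` parameter, and the example ratios are all hypothetical.

```python
import math


def composite_probability(likelihood_ratios, prior=0.01):
    """Combine per-tool likelihood ratios into one posterior probability.

    ASSUMPTION (illustrative, not taken from the paper): each tool's score has
    already been converted to a likelihood ratio
    P(score | regulatory) / P(score | non-regulatory), and the tools are
    treated as conditionally independent, so the ratios multiply.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")

    # Work in log-odds space so that products become sums.
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(ratio)

    # Map the combined log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical ratios from three tools for a single variant.
example = composite_probability([4.0, 2.5, 0.8], prior=0.01)
```

Summing in log-odds space keeps the combination numerically stable when many tools are included. Predictors that share input data are correlated in practice, so the independence assumption here is a simplification.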