A semi-supervised model to predict regulatory effects of genetic variants at single nucleotide resolution using massively parallel reporter assays.

Zikun Yang1, Chen Wang1, Stephanie Erjavec2, Lynn Petukhova3,4, Angela Christiano2,4, Iuliana Ionita-Laza1.   

Abstract

MOTIVATION: Predicting regulatory effects of genetic variants is a challenging but important problem in functional genomics. Given the relatively low sensitivity of functional assays and the pervasiveness of class imbalance in functional genomic data, popular statistical prediction models can sharply underestimate the probability of a regulatory effect. We describe here the presence-only model (PO-EN), a type of semi-supervised model, which predicts regulatory effects of genetic variants at sequence-level resolution in a context of interest by integrating a large number of epigenetic features with data from massively parallel reporter assays (MPRAs).
RESULTS: Using experimental data from a variety of MPRAs, we show that the presence-only model produces better-calibrated predicted probabilities and has increased accuracy relative to state-of-the-art prediction models. Furthermore, we show that predictions based on pre-trained PO-EN models are useful for prioritizing functional variants among candidate eQTLs and significant SNPs at GWAS loci. In particular, for the costimulatory locus, which is associated with multiple autoimmune diseases, we show evidence of a regulatory variant residing in an enhancer 24.4 kb downstream of CTLA4, with evidence from capture Hi-C of interaction with CTLA4. Furthermore, the risk allele of the regulatory variant is on the same risk-increasing haplotype as a functional coding variant in exon 1 of CTLA4, suggesting that the regulatory variant acts jointly with the coding variant to increase disease risk.
AVAILABILITY: The presence-only model is implemented in the R package 'PO.EN', freely available on CRAN. A vignette giving a detailed demonstration of the proposed PO-EN model is available on GitHub at https://github.com/Iuliana-Ionita-Laza/PO.EN/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) (2021). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
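
The MOTIVATION paragraph's claim that low assay sensitivity leads to underestimated probabilities can be illustrated with a small arithmetic sketch. This is not the PO-EN estimator. It is the generic presence-only argument under the simplifying assumption that each truly regulatory variant is detected with the same probability, independent of its features. The function names and the numbers 0.30 and 0.40 are hypothetical and chosen only for illustration.

```python
def labeled_probability(true_prob: float, sensitivity: float) -> float:
    """Probability that a variant is *labeled* positive by the assay.

    Assumption (not from the abstract): every truly regulatory variant is
    detected with the same probability `sensitivity`, and there are no
    false positives. A model that treats unlabeled variants as negatives
    is then estimating this attenuated quantity, not the true probability.
    """
    return true_prob * sensitivity


def corrected_probability(labeled_prob: float, sensitivity: float) -> float:
    """Undo the attenuation by dividing by the assumed sensitivity."""
    if not 0.0 < sensitivity <= 1.0:
        raise ValueError("sensitivity must be in (0, 1]")
    # Clip at 1.0 so the result remains a valid probability.
    return min(1.0, labeled_prob / sensitivity)


# Hypothetical values for illustration only.
true_prob = 0.30     # true probability of a regulatory effect
sensitivity = 0.40   # fraction of true effects the assay detects

naive = labeled_probability(true_prob, sensitivity)
recovered = corrected_probability(naive, sensitivity)

print(f"naive estimate:     {naive:.2f}")
print(f"corrected estimate: {recovered:.2f}")
```

Under these assumptions the naive estimate is the true probability scaled down by the sensitivity, which is why a model trained on assay labels alone may be poorly calibrated. In practice the sensitivity is unknown and would have to be estimated or modeled.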

Year:  2021        PMID: 33515242      PMCID: PMC8337004          DOI: 10.1093/bioinformatics/btab040

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937
