Mohith Manjunath (1,2), Jialu Yan (1,2), Yeoan Youn (3), Kristen L Drucker (4), Thomas M Kollmeyer (4), Andrew M McKinney (5), Valter Zazubovich (6), Yi Zhang (2,7,8), Joseph F Costello (5), Jeanette Eckel-Passow (9), Paul R Selvin (1,3), Robert B Jenkins (4), Jun S Song (1,2).
1. Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.
2. Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.
3. Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.
4. Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota, USA.
5. Department of Neurological Surgery, University of California San Francisco, San Francisco, California, USA.
6. Department of Physics, Concordia University, Montreal, Québec, Canada.
7. Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.
8. Department of Data Sciences, Dana-Farber Cancer Institute, Boston, Massachusetts, USA.
9. Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA.
Abstract
BACKGROUND: Large-scale genome-wide association studies (GWAS) have implicated thousands of germline genetic variants in modulating individuals' risk of various diseases, including cancer. At least 25 risk loci have been identified for low-grade gliomas (LGGs), but their molecular functions remain largely unknown.

METHODS: We hypothesized that GWAS loci contain causal single nucleotide polymorphisms (SNPs) that reside in accessible open chromatin regions and modulate the expression of target genes by perturbing the binding affinity of transcription factors (TFs). We performed an integrative analysis of genomic and epigenomic data from The Cancer Genome Atlas and other public repositories to identify candidate causal SNPs within linkage disequilibrium blocks of LGG GWAS loci. We assessed their potential regulatory role via in silico TF binding sequence perturbations, a convolutional neural network trained on TF binding data, and simulated annealing-based interpretation methods.

RESULTS: We built an interactive website (http://education.knoweng.org/alg3/) summarizing the functional footprinting of 280 variants in 25 LGG GWAS regions, providing rich information for further computational and experimental scrutiny. As case studies, we identified PHLDB1 and SLC25A26 as candidate target genes of rs12803321 and rs11706832, respectively, and predicted the GWAS variant rs648044 to be the causal SNP modulating ZBTB16, a known tumor suppressor in multiple cancers. We showed that rs648044 likely perturbs the binding affinity of the TF MAFF, as supported by RNA interference and in vitro MAFF binding experiments.

CONCLUSIONS: The identified candidate (causal SNP, target gene, TF) triplets and the accompanying resource will help accelerate our understanding of the molecular mechanisms underlying genetic risk factors for gliomas.
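The idea of an in silico TF binding sequence perturbation can be illustrated with a minimal sketch. The abstract does not specify the scoring model used, so the code below assumes a generic position weight matrix (PWM) log-odds score, a uniform background, and forward-strand scanning only; the function names, the toy count matrix, and the example sequence are all hypothetical and do not represent the MAFF motif or the authors' pipeline.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores (uniform background assumed)."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def best_window_score(seq, snp_index, pwm):
    """Best PWM score over all forward-strand windows that overlap the SNP position."""
    width = len(pwm)
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(seq) - width)
    best = -math.inf
    for start in range(first, last + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(width))
        best = max(best, score)
    return best

def allele_delta(seq, snp_index, alt, pwm):
    """Return (reference score, alternate score, alternate minus reference)."""
    ref_score = best_window_score(seq, snp_index, pwm)
    alt_seq = seq[:snp_index] + alt + seq[snp_index + 1:]
    alt_score = best_window_score(alt_seq, snp_index, pwm)
    return ref_score, alt_score, alt_score - ref_score

# Hypothetical 4-bp motif counts (a toy "TGAC"-like motif, not a real TF motif).
TOY_COUNTS = [
    {"A": 1, "C": 1, "G": 1, "T": 17},
    {"A": 1, "C": 1, "G": 17, "T": 1},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 17, "G": 1, "T": 1},
]

if __name__ == "__main__":
    pwm = log_odds_matrix(TOY_COUNTS)
    # Hypothetical reference sequence with the SNP at index 4 (A > G).
    ref, alt, delta = allele_delta("CCTGACCC", 4, "G", pwm)
    print(f"ref={ref:.2f} alt={alt:.2f} delta={delta:.2f}")
```

A negative difference suggests that the alternate allele weakens the predicted binding site, and a positive one suggests that it strengthens it. A fuller analysis would presumably also scan the reverse complement and use a curated motif for the TF of interest.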