Farren B S Briggs (1), Corriene Sept (2)
(1) Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, 2103 Cornell Rd, Cleveland, OH 44106, USA.
(2) Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.
Abstract
(1) Background: Complex genetic relationships, including gene-gene (G × G; epistasis), gene^n, and gene-environment (G × E) interactions, explain a substantial portion of the heritability in multiple sclerosis (MS). Machine learning and data mining methods are promising approaches for uncovering higher-order genetic relationships, but their use in MS has been limited. (2) Methods: Association rule mining (ARM), a combinatorial rule-based machine learning algorithm, was applied to genetic data for non-Latinx MS cases (n = 207) and controls (n = 179). The objective was to identify patterns (rules) amongst the known MS risk variants, including HLA-DRB1*15:01 presence, HLA-A*02:01 absence, and 194 of the 200 common autosomal variants. Probabilistic measures (confidence and support) were used to mine rules. (3) Results: 114 rules met the minimum requirements of 80% confidence and 5% support. The top-ranking rule by confidence consisted of HLA-DRB1*15:01, SLC30A7-rs56678847, and AC093277.1-rs6880809; carriers of these variants had a significantly greater risk for MS (odds ratio = 20.2, 95% CI: 8.5, 37.5; p = 4 × 10⁻⁹). Several variants were shared across rules; the most common was INTS8-rs78727559, which appeared in 32.5% of rules. (4) Conclusions: By applying a robust analytical framework to a modestly sized study population, we provide evidence that specific combinations of MS risk variants disproportionately confer elevated risk.
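To make the rule-mining step concrete, the following is a minimal Python sketch of how support and confidence thresholds can filter candidate rules whose consequent is case status. It is an illustration only, not the authors' pipeline: the variant labels (`V1`, `V2`, `V3`), the toy records, and the helper `mine_rules` are hypothetical, and the sketch assumes the common convention that support is the fraction of all individuals carrying both the variant combination and the outcome, while confidence is the fraction of carriers of the combination who have the outcome.

```python
from itertools import combinations


def mine_rules(records, outcome, min_support=0.05, min_confidence=0.80, max_size=3):
    """Brute-force search for rules of the form {variants} -> outcome.

    records: list of sets, one per individual, holding the items present.
    Assumed definitions (not taken from the paper):
      support    = P(antecedent and outcome)
      confidence = P(outcome | antecedent)
    """
    n = len(records)
    items = sorted({item for rec in records for item in rec if item != outcome})
    rules = []
    for size in range(1, max_size + 1):
        for antecedent in combinations(items, size):
            carried = set(antecedent)
            n_antecedent = sum(1 for rec in records if carried <= rec)
            if n_antecedent == 0:
                continue  # nobody carries this combination
            n_both = sum(1 for rec in records if carried <= rec and outcome in rec)
            support = n_both / n
            confidence = n_both / n_antecedent
            if support >= min_support and confidence >= min_confidence:
                rules.append((antecedent, support, confidence))
    # Rank by confidence, then by support, highest first.
    rules.sort(key=lambda rule: (-rule[2], -rule[1]))
    return rules


# Hypothetical toy data: 10 individuals, "MS" marks a case.
records = [
    {"V1", "V2", "MS"},
    {"V1", "V2", "MS"},
    {"V1", "V2", "MS"},
    {"V1", "V2", "MS"},
    {"V1", "V2"},
    {"V1", "MS"},
    {"V1"},
    {"V2"},
    {"V3"},
    {"V3", "MS"},
]

rules = mine_rules(records, outcome="MS")
for antecedent, support, confidence in rules:
    print(antecedent, f"support={support:.2f}", f"confidence={confidence:.2f}")
```

In this toy example only the pair `V1` + `V2` passes both thresholds, even though neither variant alone reaches 80% confidence, which is the kind of combination-specific pattern the rule thresholds are meant to surface. Exhaustive enumeration is used here for readability; dedicated ARM implementations prune the search space instead.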
Keywords: association rule mining; epistasis; genetic interactions; multiple sclerosis