
Prioritization of disease genes from GWAS using ensemble-based positive-unlabeled learning.

Nikita Kolosov, Mark J Daly, Mykyta Artomov.

Abstract

A primary challenge in understanding disease biology from genome-wide association studies (GWAS) is that causal genes cannot be implicated directly from association data. Integrating multi-omics data sources can provide important functional links between associated variants and candidate genes. Machine learning is well positioned to exploit such varied data and to offer a solution for prioritizing disease genes. Yet classical positive-negative classifiers impose strong limitations on gene prioritization, such as the lack of reliable non-causal genes for training. Here, we developed a novel gene prioritization tool, Gene Prioritizer (GPrior). It is an ensemble of five positive-unlabeled bagging classifiers (Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, Adaptive Boosting) that treats all genes of unknown relevance as an unlabeled set. GPrior selects an optimal composition of algorithms to tune the model for each specific phenotype. Altogether, GPrior fills an important niche among methods for post-processing GWAS data, significantly improving the ability to pinpoint disease genes compared with existing solutions.
© 2021. The Author(s), under exclusive licence to European Society of Human Genetics.
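
The positive-unlabeled bagging idea named in the abstract can be illustrated with a minimal sketch. This is not GPrior's code or API: the function names, the toy feature vectors, and the gene labels (P1, U1, ...) are hypothetical, and a simple nearest-centroid scorer stands in for the five classifiers listed above so that the example needs only the Python standard library. In each round a bootstrap sample of unlabeled genes is treated as provisional negatives, a base learner is trained against the known positives, and only the unlabeled genes left out of that sample are scored; the scores are then averaged per gene.

```python
import random
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_centroid_scorer(pos_rows, neg_rows):
    """Toy base learner (an assumption, standing in for e.g. logistic regression).

    Returns a function mapping a vector to a score in [0, 1];
    higher means closer to the positive centroid than to the negative one.
    """
    c_pos, c_neg = centroid(pos_rows), centroid(neg_rows)

    def score(x):
        d_pos, d_neg = sq_dist(x, c_pos), sq_dist(x, c_neg)
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total

    return score


def pu_bagging_scores(features, positives, n_rounds=200, seed=0):
    """Average out-of-bag score for every unlabeled gene.

    features:  dict mapping gene name -> feature vector
    positives: set of gene names known to be disease genes
    """
    rng = random.Random(seed)
    unlabeled = sorted(g for g in features if g not in positives)
    pos_rows = [features[g] for g in sorted(positives)]
    if not pos_rows or len(unlabeled) < 2:
        raise ValueError("need at least 1 positive and 2 unlabeled genes")
    # Bag size: match the positives, but always leave at least one gene out of bag.
    k = min(len(pos_rows), len(unlabeled) - 1)
    sums = {g: 0.0 for g in unlabeled}
    counts = {g: 0 for g in unlabeled}
    for _ in range(n_rounds):
        # Bootstrap (with replacement) a bag of unlabeled genes as provisional negatives.
        bag = set(rng.choices(unlabeled, k=k))
        scorer = fit_centroid_scorer(pos_rows, [features[g] for g in bag])
        # Score only the genes that were not used as negatives in this round.
        for g in unlabeled:
            if g not in bag:
                sums[g] += scorer(features[g])
                counts[g] += 1
    return {g: sums[g] / counts[g] for g in unlabeled if counts[g] > 0}


# Hypothetical toy data: U1 resembles the known positives, U2..U5 do not.
features = {
    "P1": [1.0, 1.0], "P2": [0.9, 1.1],
    "U1": [1.0, 0.9],
    "U2": [0.0, 0.1], "U3": [0.1, 0.0], "U4": [-0.1, 0.0], "U5": [0.0, -0.1],
}
scores = pu_bagging_scores(features, positives={"P1", "P2"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In a fuller version, one such bagging model would be built per base classifier and the models combined into an ensemble; how GPrior selects and weights that composition is not specified in the abstract and is not modeled here.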

Year: 2021; PMID: 34276057; PMCID: PMC8484264; DOI: 10.1038/s41431-021-00930-w

Source DB: PubMed; Journal: Eur J Hum Genet; ISSN: 1018-4813; Impact factor: 5.351


