Cedric Simillion [1,2], Robin Liechti [3], Heidi E L Lischer [4,5], Vassilios Ioannidis [3,6], Rémy Bruggmann [7].
1. Interfaculty Bioinformatics Unit and SIB Swiss Institute of Bioinformatics, University of Bern, Baltzerstrasse 6, 3012, Berne, Switzerland. cedric.simillion@dkf.unibe.ch.
2. Department of Clinical Research, University of Bern, Murtenstrasse 35, 3008, Berne, Switzerland. cedric.simillion@dkf.unibe.ch.
3. Vital-IT, SIB Swiss Institute of Bioinformatics, Quartier Sorge - Batiment Genopode, 1015, Lausanne, Switzerland.
4. Interfaculty Bioinformatics Unit and SIB Swiss Institute of Bioinformatics, University of Bern, Baltzerstrasse 6, 3012, Berne, Switzerland.
5. Present Address: URPP Evolution in Action; Institute of Evolutionary Biology and Environmental Studies (IEU), University of Zurich, Winterthurerstrasse 190, 8057, Zurich, Switzerland.
6. SIB Technology, SIB Swiss Institute of Bioinformatics, Quartier Sorge - Batiment Genopode, 1015, Lausanne, Switzerland.
7. Interfaculty Bioinformatics Unit and SIB Swiss Institute of Bioinformatics, University of Bern, Baltzerstrasse 6, 3012, Berne, Switzerland. remy.bruggmann@bioinformatics.unibe.ch.
Abstract
BACKGROUND: The purpose of gene set enrichment analysis (GSEA) is to find general trends in the large lists of genes or proteins generated by many functional genomics techniques and bioinformatics analyses.
RESULTS: Here we present SetRank, an advanced GSEA algorithm that can eliminate many false-positive hits. The key principle of the algorithm is that it discards gene sets that were initially flagged as significant if their significance is due only to their overlap with another gene set. The algorithm is explained in detail, and its performance is compared to that of other methods using objective benchmarking criteria. Furthermore, we explore how sample source bias can affect the results of a GSEA.
CONCLUSIONS: The benchmarking results show that SetRank is a highly specific tool for GSEA. Furthermore, we show that the reliability of results can be improved by taking sample source bias into account. SetRank is available as an R package and through an online web interface.
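The key principle stated in the abstract (drop a flagged gene set when its significance comes only from its overlap with another set) can be illustrated with a minimal sketch. This is not the SetRank implementation: the function names, the one-sided hypergeometric test, the fixed `alpha` cutoff, and the toy gene identifiers are all assumptions chosen for illustration, and the real algorithm may handle the gene universe and mutually dependent sets differently.

```python
from math import comb


def hypergeom_tail(universe_size, set_size, n_significant, n_hits):
    """P(X >= n_hits) when n_significant genes are drawn from the universe."""
    total = comb(universe_size, n_significant)
    upper = min(set_size, n_significant)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, n_significant - k)
        for k in range(n_hits, upper + 1)
    )
    return tail / total


def enrichment_pvalue(gene_set, significant, universe):
    """One-sided enrichment p-value of gene_set among the significant genes."""
    gene_set = gene_set & universe
    significant = significant & universe
    hits = gene_set & significant
    return hypergeom_tail(len(universe), len(gene_set), len(significant), len(hits))


def filter_overlap_driven(gene_sets, significant, universe, alpha=0.05):
    """Keep flagged sets whose significance survives removal of overlapping sets.

    Assumption of this sketch: set A is dropped because of set B only if
    A without B's genes is no longer significant while B without A's genes
    still is, so that two mutually dependent sets do not eliminate each other.
    """
    flagged = {
        name for name, genes in gene_sets.items()
        if enrichment_pvalue(genes, significant, universe) <= alpha
    }
    kept = set(flagged)
    for a in flagged:
        for b in flagged:
            if a == b or not (gene_sets[a] & gene_sets[b]):
                continue  # same set, or no shared genes
            p_a_rest = enrichment_pvalue(gene_sets[a] - gene_sets[b], significant, universe)
            p_b_rest = enrichment_pvalue(gene_sets[b] - gene_sets[a], significant, universe)
            if p_a_rest > alpha and p_b_rest <= alpha:
                kept.discard(a)
    return kept


# Toy data (hypothetical gene identifiers g0..g99).
universe = {f"g{i}" for i in range(100)}
significant = {f"g{i}" for i in range(10)}
gene_sets = {
    "B": {f"g{i}" for i in range(15)},                                  # 10 hits
    "A": {f"g{i}" for i in range(5, 10)} | {f"g{i}" for i in range(20, 30)},  # 5 hits, all shared with B
    "D": {f"g{i}" for i in range(40, 60)},                              # no hits
}
kept = filter_overlap_driven(gene_sets, significant, universe)
print(sorted(kept))
```

In this toy data, set A is initially flagged, but all of its hits lie in its overlap with set B, so it is discarded; set B stays significant without A's genes and is kept, and set D is never flagged.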
Keywords:
Algorithm; Functional genomics; GSEA; Gene set enrichment analysis; Pathway analysis; R package; Sample source bias; Web interface