Kuo Yang (1), Ning Wang (1), Guangming Liu (1), Ruyu Wang (1), Jian Yu (1), Runshun Zhang (2), Jianxin Chen (3), Xuezhong Zhou (1,4)
1. School of Computer and Information Technology and Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China.
2. Guanganmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China.
3. Beijing University of Chinese Medicine, Beijing, China.
4. Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, China.
Abstract
Objective: Investigating the molecular mechanisms of symptoms is a vital task in precision medicine, both to refine disease taxonomy and to improve the personalized management of chronic diseases. Although abundant experimental studies and computational efforts have sought the candidate genes of diseases, the identification of symptom genes is rarely addressed. We curated a high-quality benchmark dataset of symptom-gene associations and proposed a heterogeneous network embedding method for identifying symptom genes.
Methods: We proposed a heterogeneous network embedding representation algorithm. It constructs a heterogeneous symptom-related network that integrates symptom-related associations, then applies an embedding representation algorithm to obtain low-dimensional vector representations of the nodes. By measuring the relevance between symptoms and genes through the similarity of their vectors, the candidate genes of a given symptom can be obtained.
Results: A benchmark dataset of 18 270 symptom-gene associations between 505 symptoms and 4549 genes was curated. We compared our method to baseline algorithms (FSGER and PRINCE). The experimental results indicated that our algorithm achieved a significant improvement over the state-of-the-art method: for TOP@3, precision improved by 66.80% (0.844 vs 0.506) and recall by 53.96% (0.311 vs 0.202), and association precision improved by 37.71% (0.723 vs 0.525) over PRINCE.
Conclusions: The experimental validation of the algorithms and the literature validation of typical symptoms indicated that our method achieved excellent performance. On that basis, we curated a prediction dataset of 17 479 symptom-candidate genes. The benchmark and prediction datasets have the potential to promote investigations of the molecular mechanisms of symptoms and to provide candidate genes for validation in experimental settings.
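The ranking and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the similarity measure, so cosine similarity is assumed here. The two-dimensional vectors, the gene labels G1 to G4, and all function names are hypothetical. The last helper reproduces the relative improvements quoted in the Results from the reported scores.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_genes(symptom_vec, gene_vecs, k=3):
    """Return the k gene names whose vectors are most similar to the symptom."""
    ranked = sorted(gene_vecs,
                    key=lambda g: cosine(symptom_vec, gene_vecs[g]),
                    reverse=True)
    return ranked[:k]

def precision_recall(predicted, known):
    """Precision and recall of a predicted top-k list against known genes."""
    hits = len(set(predicted) & set(known))
    return hits / len(predicted), hits / len(known)

def relative_improvement(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100

# Toy embeddings: hypothetical values, not taken from the paper.
symptom = [1.0, 0.0]
genes = {
    "G1": [0.9, 0.1],
    "G2": [0.0, 1.0],
    "G3": [0.7, 0.7],
    "G4": [-1.0, 0.0],
}

top2 = rank_genes(symptom, genes, k=2)
precision, recall = precision_recall(top2, known={"G1", "G2"})

# Relative improvements recomputed from the scores reported in the Results.
reported = {
    "precision, TOP@3": relative_improvement(0.844, 0.506),
    "recall, TOP@3": relative_improvement(0.311, 0.202),
    "association precision vs PRINCE": relative_improvement(0.723, 0.525),
}
```

On the toy data the two top-ranked genes are G1 and G3, so precision and recall against the known set {G1, G2} are both 0.5. The recomputed improvements round to 66.80, 53.96 and 37.71, matching the percentages stated in the abstract.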