Javier Lopez-Ibañez1, Florencio Pazos1, Monica Chagoyen2. 1. Computational Systems Biology Group, National Center for Biotechnology (CNB-CSIC), Darwin 3, 28049, Madrid, Spain. 2. Computational Systems Biology Group, National Center for Biotechnology (CNB-CSIC), Darwin 3, 28049, Madrid, Spain. monica.chagoyen@cnb.csic.es.
Abstract
BACKGROUND: Assigning chemical compounds to biological pathways is a crucial step in understanding the relationship between the chemical repertoire of an organism and its biology. Protein sequence profiles are very successful at capturing the main structural and functional features of a protein family, and can be used to assign new members to that family by matching their sequences against the profiles. In this work, we extend this idea to chemical compounds, constructing a profile-inspired model for a set of related metabolites (those in the same biological pathway), based on a fragment-based vectorial representation of their chemical structures. RESULTS: We use this representation to predict the biological pathway of a chemical compound with good overall accuracy (AUC 0.74-0.90 depending on the database tested), and analyze some factors that affect performance. The approach, which we compare with equivalent methods, can also detect the molecular fragments characteristic of a pathway. CONCLUSIONS: The method is available as a graphical interactive web server at http://csbg.cnb.csic.es/iFragMent.
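The profile idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring scheme, so everything below is an assumption made for illustration: each compound is reduced to a set of fragment labels (the labels and the two toy pathways are hypothetical), a pathway profile is a vector of smoothed log-odds weights comparing how often a fragment occurs inside the pathway versus in all compounds, and a compound is scored against a profile by summing the weights of its fragments. This is not presented as the authors' implementation; in practice the fragments would be derived from chemical structures with a cheminformatics toolkit.

```python
import math
from collections import Counter


def fragment_counts(compounds):
    """Count how many compounds contain each fragment (presence, not multiplicity)."""
    counts = Counter()
    for fragments in compounds:
        counts.update(set(fragments))
    return counts


def build_profiles(pathways, pseudocount=1.0):
    """Map each pathway to {fragment: log-odds weight}.

    Assumed scheme (not from the source): weight = log(p_in / p_bg), where p_in is
    the smoothed fraction of pathway members containing the fragment and p_bg is
    the same fraction over all compounds.
    """
    all_compounds = [c for members in pathways.values() for c in members]
    background = fragment_counts(all_compounds)
    n_all = len(all_compounds)
    profiles = {}
    for name, members in pathways.items():
        inside = fragment_counts(members)
        n_in = len(members)
        weights = {}
        for frag, bg_count in background.items():
            p_in = (inside[frag] + pseudocount) / (n_in + 2 * pseudocount)
            p_bg = (bg_count + pseudocount) / (n_all + 2 * pseudocount)
            weights[frag] = math.log(p_in / p_bg)
        profiles[name] = weights
    return profiles


def score(profile, fragments):
    """Sum the profile weights of the fragments present in a compound."""
    return sum(profile.get(f, 0.0) for f in set(fragments))


def predict(profiles, fragments):
    """Return the pathway whose profile gives the compound the highest score."""
    return max(profiles, key=lambda name: score(profiles[name], fragments))


# Hypothetical toy data: fragment labels are placeholders, not real fingerprints.
toy_pathways = {
    "sugar": [{"OH", "C-O-C", "CH2OH"}, {"OH", "CH2OH", "C=O"}],
    "amino": [{"NH2", "COOH", "CH3"}, {"NH2", "COOH", "OH"}],
}
toy_profiles = build_profiles(toy_pathways)
query = {"NH2", "COOH"}
print(predict(toy_profiles, query))
```

Sorting a profile's weights in descending order gives a rough view of which fragments are most characteristic of that pathway under this assumed scheme, which mirrors the fragment-detection use mentioned in the abstract.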