L J Jensen (1), R Gupta, H-H Staerfeldt, S Brunak.
(1) Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, The Technical University of Denmark, DK-2800 Lyngby, Denmark. ljj@cbs.dtu.dk
Abstract
MOTIVATION: The human genome project has led to the discovery of many previously unknown human protein-coding genes. As a large fraction of these are functionally uncharacterized, it is of interest to develop methods for predicting their molecular function from sequence.
RESULTS: We have developed a method for predicting protein function for a subset of classes from the Gene Ontology classification scheme. This subset includes several pharmaceutically interesting categories: transcription factors, receptors, ion channels, stress and immune response proteins, hormones and growth factors can all be predicted. Although the method relies on protein sequences as the sole input, it does not rely on sequence similarity, but instead on sequence-derived protein features such as predicted post-translational modifications (PTMs), protein sorting signals and physical/chemical properties calculated from the amino acid composition. This allows the function of orphan proteins, for which no homologs can be found, to be predicted. Using this method we propose two novel receptors in the human genome, and further demonstrate chromosomal clustering of related proteins.
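To illustrate what a sequence-derived feature calculated from the amino acid composition can look like, the following is a minimal sketch only. The abstract does not say which physical/chemical properties were used, so the choice of features here (mean Kyte-Doolittle hydropathy and the fractions of charged residues) and the function names `composition` and `feature_vector` are assumptions made for illustration, not the authors' feature set.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (illustrative choice of property;
# the abstract does not name the properties actually used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Return the fraction of each standard amino acid in seq.

    Non-standard characters (e.g. X, gaps) are ignored.
    """
    counts = Counter(ch for ch in seq.upper() if ch in KYTE_DOOLITTLE)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def feature_vector(seq):
    """Hypothetical feature set derived from composition alone."""
    comp = composition(seq)
    return {
        # Composition-weighted mean hydropathy.
        "mean_hydropathy": sum(comp[aa] * KYTE_DOOLITTLE[aa] for aa in AMINO_ACIDS),
        # Fractions of positively and negatively charged residues.
        "frac_positive": comp["K"] + comp["R"],
        "frac_negative": comp["D"] + comp["E"],
    }
```

Because features of this kind depend only on the sequence itself, they can be computed for an orphan protein with no detectable homologs; a classifier trained on such features would be a separate step that is not shown here.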