Andreu Alibés, Andrés Cañada, Ramón Díaz-Uriarte.
Abstract
Many biological experiments and their subsequent analysis yield lists of genes or proteins that can potentially be important to the prognosis or diagnosis of certain diseases (e.g. cancer). Nowadays, information about the function of those genes or proteins may already be gathered in some databases, but it is essential to understand whether some of the members of those lists have a function in common or belong to the same metabolic pathway. To help researchers filter those genes or proteins that have such information in common, we have developed PaLS (pathway and literature strainer, http://pals.bioinfo.cnio.es). PaLS takes a list or a set of lists of gene or protein identifiers and shows which ones share certain descriptors. Four publicly available databases have been used for this purpose: PubMed, which links genes with those articles that make reference to them; Gene Ontology, an annotated ontology of terms related to the cellular component, biological process or molecular function where those genes or proteins are involved; KEGG pathways; and Reactome pathways. The descriptors from these four sources of information that are shared by the most members of the list (or lists) are highlighted by PaLS.
Year: 2008 PMID: 18467422 PMCID: PMC2447779 DOI: 10.1093/nar/gkn251
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. PaLS processing steps. Starting from a list or lists of protein or gene identifiers (A), PaLS looks for all their descriptors in the same database of ID conversions pregenerated for IDconverter (14) (B). Finally, it ranks the descriptors that appear most often in the lists, so the user can get an idea of the relevance of their lists (C). This example is done with a list of cancer-related genes available in the Help section of the web server.
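The three steps in Figure 1 can be illustrated with a minimal Python sketch. It is not the PaLS implementation: the function name `rank_shared_descriptors`, the `min_members` threshold, and the identifiers and descriptors in `EXAMPLE_ANNOTATIONS` are all hypothetical placeholders, and the real server looks descriptors up in its pregenerated ID-conversion database rather than in an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical annotation table (identifier -> set of descriptors).
# The names are placeholders, not real gene IDs or database terms.
EXAMPLE_ANNOTATIONS = {
    "gene_1": {"term_a", "term_b"},
    "gene_2": {"term_a", "term_b", "term_c"},
    "gene_3": {"term_a"},
}


def rank_shared_descriptors(identifiers, annotations, min_members=2):
    """Return (descriptor, members) pairs, most widely shared first.

    identifiers -- the input list of gene or protein identifiers (step A)
    annotations -- mapping from identifier to its descriptors (step B)
    min_members -- assumed cutoff: keep descriptors shared by at least
                   this many members of the list
    """
    members = defaultdict(set)
    for identifier in identifiers:
        # Identifiers without annotations simply contribute nothing.
        for descriptor in annotations.get(identifier, ()):
            members[descriptor].add(identifier)

    shared = [
        (descriptor, sorted(genes))
        for descriptor, genes in members.items()
        if len(genes) >= min_members
    ]
    # Step C: sort by number of members (descending), then by name
    # so that ties come out in a stable order.
    shared.sort(key=lambda item: (-len(item[1]), item[0]))
    return shared


if __name__ == "__main__":
    ids = ["gene_1", "gene_2", "gene_3"]
    for descriptor, genes in rank_shared_descriptors(ids, EXAMPLE_ANNOTATIONS):
        print(descriptor, len(genes), genes)
```

Inverting the mapping (descriptor to members) makes the ranking a single sort; the cutoff stands in for whatever filtering threshold a user might choose.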
Figure 2. Example of a graph plot produced by PaLS (generated with the NetworkX package). The graph shows, for the list from the 7th cross-validation run, those RefSeq_RNAs that are connected through common Gene Ontology terms. A central group of genes shares more terms (they are drawn closer to each other), whereas one gene, NM_006623, on the right side, is connected to only one other gene of the list.
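The structure behind a plot like Figure 2 can be approximated by linking every pair of identifiers that share at least one term. The sketch below uses only the standard library and is an assumption about how such a graph could be built, not the code PaLS runs; `shared_term_edges`, `degrees`, and the placeholder data are hypothetical. The resulting edges could then be loaded into NetworkX, for example with `Graph.add_edge(a, b, weight=n)`, for layout and drawing.

```python
from itertools import combinations

# Hypothetical annotations (identifier -> set of terms); placeholder names.
EXAMPLE_TERMS = {
    "rna_1": {"term_1", "term_2"},
    "rna_2": {"term_1", "term_2"},
    "rna_3": {"term_2", "term_3"},
    "rna_4": {"term_3"},
}


def shared_term_edges(annotations):
    """Map each pair of identifiers to the number of terms they share.

    Pairs with no term in common are left out, so they would appear
    unconnected in the graph.
    """
    edges = {}
    for a, b in combinations(sorted(annotations), 2):
        common = annotations[a] & annotations[b]
        if common:
            edges[(a, b)] = len(common)
    return edges


def degrees(edges):
    """Count how many neighbours each identifier has in the edge map."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


if __name__ == "__main__":
    edges = shared_term_edges(EXAMPLE_TERMS)
    for (a, b), weight in edges.items():
        print(a, b, weight)
    print(degrees(edges))
```

In this toy data `rna_4` plays the role of a peripheral node: it shares a term with only one other identifier, while the remaining three form a tightly connected group.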