Ivy F L Tsui1, Raj Chari, Timon P H Buys, Wan L Lam.
Abstract
The study of pathway disruption is key to understanding cancer biology. Advances in high-throughput technologies have led to the rapid accumulation of genomic data. This explosion in available data has created opportunities to investigate concerted changes that disrupt biological functions, which in turn has created a need for computational tools for pathway analysis. In this review, we discuss approaches to the analysis of genomic data and describe the publicly available resources for studying biological pathways.
Year: 2007 PMID: 19455256 PMCID: PMC2410087
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Example of EGFR-mediated signaling changes, a commonly disrupted pathway in lung cancer. The EGFR pathway can be disrupted by increased expression of growth factor ligands. Targeting EGFR with tyrosine kinase inhibitors (TKIs) and monoclonal antibodies (MAbs) can eliminate EGFR activity. However, a downstream component (e.g. the MAPK signaling pathway) may also be activated to disrupt the pathway, making TKIs ineffective. Pathway data were obtained and selected from the Cancer Cell Map database and drawn using Cytoscape.
Figure 2. Graphical output display of heatmap, hierarchical clustering, and principal component analysis. A: An example heatmap of 30 simulated profiles, which lets the user easily visualize four groups of samples along the x-axis with distinct characteristic expression patterns for 300 genes. A heatmap facilitates the grouping of altered genes and sample clusters, but does not convey any spatial relationship between clustered samples. B: An example of a dendrogram generated from hierarchical clustering of the simulated data represented in Figure 2A. A dendrogram is a tree diagram consisting of many U-shaped lines connecting objects to represent hierarchical clusters. In this dendrogram, four clusters of samples are formed based on distinct expression signatures. C: A two-dimensional graphical visualization of principal component analysis (PCA) based on the simulated data shown in Figure 2A. Samples are color-coded based on the four clusters observed by hierarchical clustering in Figure 2B.
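The kind of analysis summarized in Figure 2 can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce the figure: the group sizes, the block of 75 marker genes per group, the effect size, the random seed, and the choice of average linkage with Euclidean distance are all assumptions made for the example. It assumes NumPy and SciPy are installed.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n_genes = 300
group_sizes = [8, 8, 7, 7]  # assumed split of the 30 profiles into four groups
true_groups = np.repeat(np.arange(4), group_sizes)
n_samples = true_groups.size

# Genes x samples matrix of baseline noise.
data = rng.normal(0.0, 1.0, size=(n_genes, n_samples))

# Give each group its own block of 75 over-expressed genes (assumed effect size).
for g in range(4):
    rows = slice(g * 75, (g + 1) * 75)
    data[rows, true_groups == g] += 3.0

# Hierarchical clustering of the samples (rows of data.T), cut into four clusters.
tree = linkage(data.T, method="average", metric="euclidean")
clusters = fcluster(tree, t=4, criterion="maxclust")

# PCA via SVD of the column-centered sample matrix; keep the first two components.
centered = data.T - data.T.mean(axis=0)
u, s, _ = np.linalg.svd(centered, full_matrices=False)
pc_scores = u[:, :2] * s[:2]                 # sample coordinates for a 2-D plot
explained = (s ** 2) / np.sum(s ** 2)        # fraction of variance per component
```

The `data` matrix is what a heatmap would display, `tree` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree, and `pc_scores` colored by `clusters` gives the two-dimensional PCA view.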
Figure 3. Biological knowledgebases contain a myriad of specific information on each gene/protein. Sequence databases are the basis for gene and protein information. Gene and protein information is further extracted and their inter-relationships are experimentally identified, building molecular interaction databases. All of this information is the foundation of pathway databases.
Gene and protein databases.
| NCBI GenBank | NIH genetic sequence database | An international DNA sequence database | [ | |
| EMBL Nucleotide Sequence Database/EMBL-Bank | European Molecular Biology Laboratory Nucleotide Sequence Database | Collection of DNA and RNA sequences in Europe, synchronized with GenBank at NCBI and DDBJ in Japan. | [ | |
| DDBJ | DNA Data Bank of Japan | Nucleotide sequence database in Japan, maintained in collaboration with EMBL and NCBI GenBank. | [ | |
| Entrez Gene | - | NCBI database that focuses on gene-to-sequence relationship and provides gene-specific information. | [ | |
| RefSeq | NCBI Reference Sequences | NCBI collection of a non-redundant set of DNA, RNA, and protein sequences. | [ | |
| UniGene | NCBI UniGene | Partitions GenBank sequences into sets of transcript sequences that are likely to represent distinct genes. | [ | |
| Ensembl | - | A source for comparative chordate genome sequences and gene annotation at EBI/Sanger. | [ | |
| UCSC Genome Browser Database | University of California Santa Cruz Genome Browser Database | Human genome assembly and customizable track browsers at UCSC. | genome.ucsc.edu/bestlinks.html | [ |
| UniProtKB/TrEMBL | Universal Protein Resource Knowledgebase/Translated European Molecular Biology Laboratories | Computer-curated protein sequence database containing translations of all coding sequences in EMBL/GenBank/DDBJ and also other protein sequences from the literature. | [ | |
| UniProtKB/Swiss-Prot Protein Knowledgebase | Universal Protein Resource Knowledgebase | Manually-curated protein sequence database providing publicly available information about protein sequences. | [ |
Gene and protein information databases.
| GO | Gene Ontology | Provides a controlled vocabulary to describe gene and gene product attributes in many organisms. | [ | |
| Entrez Gene | - | NCBI database that focuses on gene-to-sequence relationship and provides gene-specific information. | [ | |
| OMIM | Online Mendelian Inheritance in Man | Collection of information on human genes and genetic disorders. | [ | |
| HomoloGene | - | Homolog detection among annotated genes of several eukaryotic genomes. | - | |
| iHOP | Information Hyperlinked Over Proteins | Converts PubMed literature into a navigable resource. | [ | |
| SCOP | Structural Classification of Proteins | Classifies proteins of known structure based on their evolutionary and structural relationships. | [ | |
| RCSB PDB | Research Collaboratory for Structural Bioinformatics Protein Data Bank | Resource for studying biomacromolecular structures and their relationships to sequence, function, and disease. | [ | |
| PIR | Protein Information Resource | A resource to identify and interpret protein sequence information. | [ | |
| IntEnz | Integrated relational Enzyme database | Contains enzyme data curated and approved by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology | [ | |
| ENZYME nomenclature database | - | Database of information related to enzyme nomenclature. | [ | |
| BRENDA | BRaunschweig ENzyme DAtabase | Collection of enzyme functional data. | [ | |
| Module Map | - | Collection and tools for the analysis of microarray data in 22 tumor types. | [ | |
| Cancer Gene Census | - | Catalogue of cancer-related genes. | - | |
| Cancer Gene Data Curation Project | - | Catalogue of gene-disease and gene-drug relationships in cancer. | - | |
| Tumor Gene Database | - | Database of tumor genes with a standard set of information. | - | |
| GEO | Gene Expression Omnibus | A public archive for data submission that provides mining tools to query and download data. | [ | |
| CGED | Cancer Gene Expression Database | Database with graphical display of gene expression and clinical data on different tumor types. | [ | |
| Cancer Genes Resequencing Resource | - | Searchable database of cancer genes. | - | |
| SMD | Stanford Microarray Database | Database for storage and tools for processing and analyzing microarray data. | [ | |
| Progenetix | - | A public database that collects information about chromosomal alterations in cancer. | [ | |
| ArrayExpress | - | Public repository for microarray data. | [ | |
| CGAP | Cancer Genome Anatomy Project | Database which relates chromosomal alterations to tumor characteristics. | [ |
Molecular interaction databases.
| IntAct | EBI protein interaction database | Protein interaction database by literature curation or user submissions. | HierarchView | [ | |
| DIP | Database of Interacting Proteins | Curated both manually and automatically to combine experimentally determined protein-protein interactions. | Y | dip.doe-mbi.ucla.edu | [ |
| MINT | Molecular INTeractions Database | Curated manually, experimentally verified protein interactions from literature. | MINT Viewer | mint.bio.uniroma2.it/mint/Welcome.do | [ |
| HPRD | Human Protein Reference Database | Manually curated based on experimental evidence and contains information on domain architecture, post-translational modifications, interaction networks and disease association. | GenMAPP | [ | |
| HomoMINT | - | Molecular interactions discovered in model organisms are mapped to orthologs in Homo sapiens. | MINT Viewer | mint.bio.uniroma2.it/HomoMINT | [ |
| Domino | Domain peptide interactions database | Protein interactions of domain peptides. | MINT Viewer | mint.bio.uniroma2.it/domino | [ |
| PDZBase | - | Experimentally determined protein-protein interactions involving the PDZ-domains. | N | icb.med.cornell.edu/services/pdz/start | [ |
| BOND | Biomolecular Object Network Databank | An interaction database that includes high-throughput data submissions and manually curated information from literature. | Cytoscape | bond.unleashedinformatics.com | [ |
| BioGRID | General Repository for Interaction Datasets | A repository for protein and genetic interactions contributed by the community. | Osprey | [ | |
| OPHID | Online Predicted Human Interaction Database | Database with known protein-protein interactions from human and predicted protein-protein interactions from model organisms. | Y | ophid.utoronto.ca/ophid | [ |
| PIP | Potential Interactions of Proteins | Predicted protein-protein interactions derived from homology with experimentally known interactions from other species. | Y | bmm.cancerresearchuk.org/~pip | [ |
| MPPI | MIPS mammalian protein-protein interaction database | Published experimental protein interaction data in mammals | Y | mips.gsf.de/proj/ppi | [ |
| HPID | Human Protein Interaction Database | Human protein interaction information; infers interactions between submitted proteins. | WebInterViewer | [ | |
| InterDom | Database of Interacting Domains | Putative protein domain interactions information. | N | interdom.lit.org.sg | [ |
| STRING | Search Tool for the Retrieval of Interacting Proteins | Database of known and predicted protein-protein interactions. | Y | string.embl.de | [ |
Figure 4. An approach to building pathway databases. Biological knowledgebases are represented as rectangles with squared edges. Computational tools for text-mining and language control are represented as ellipses. Molecular interaction and pathway databases are represented by rectangles with rounded edges.
Pathway databases.
| KEGG pathway | Kyoto Encyclopedia of Genes and Genomes Pathway | Manually drawn pathway maps with different organisms. | Free | Y | [ | |
| The Cancer Cell Map | - | Ten human cancer-related signaling pathways. | Free | Cytoscape | cancer.cellmap.org/cellmap | N/A |
| Reactome | - | Biological pathways that include experimentally confirmed, manually inferred, and electronically inferred reactions. | Free | Skypainter | [ | |
| HPRD | Human Protein Reference Database | Ten human cancer signaling pathways and 10 immune system signaling pathways. | Free | GenMAPP | [ | |
| BioCarta | Charting Pathways of Life | Graphical display of known and suggested pathways. | Free | Y | N/A | |
| STKE | Signal Transduction Knowledge Environment | Database of cellular signaling pathways. | Free | SVG | stke.sciencemag.org | [ |
| PharmGKB | The Pharmacogenetics and Pharmacogenomics Knowledge Base | Database to explore relationships among drugs, diseases and genes, including their variations and gene products. | Free | Y | [ | |
| Panther Classification System | Protein Analysis Through Evolutionary Relationships | Predict protein function and contains over 139 pathways mapped to protein sequences. | Free | CellDesigner | [ | |
| MetaCyc | Metabolic Encyclopedia of enzymes and pathways | Non-redundant, experimentally determined pathways from more than 900 different organisms. | Free | Y | metacyc.org | [ |
| aMAZE | - | Molecular interactions and cellular processes. | Free | N | [ | |
| CGAP | Cancer Genome Anatomy Project | Pathways are from BioCarta and KEGG. | Free | Y | cgap.nci.nih.gov/Pathways | [ |
| INOH Pathway Database | Integrating Network Objects with Hierarchies | Pathway database of different organisms which organizes pathway objects in an ontology-based system. | Free | INOH Client tool | N/A |
Collection of databases.
| Pathguide | The Pathway Resource List | Lists about 222 biological pathway databases. | Free | [ | |
| UBiC Bioinformatics Links Directory | UBC Bioinformatics Centre | Curated links to molecular resources, tools, and databases. | Free | bioinformatics.ubc.ca/resources/links_directory | [ |
| NAR Molecular Biology Database Collection | Nucleic Acids Research online Molecular Biology Database Collection | Provides external links to sequence, structure, and pathway databases. | Free | www3.oup.co.uk/nar/database/subcat/6/25 | [ |
Software tools.
| MAPPFinder | MicroArray Pathway Profiles Finder | View data in the context of Gene Ontology (GO) and GenMAPP biological pathways. | Free | Y | GO | [ | |
| GoMiner | - | Tool to classify genes within the Gene Ontology (GO) hierarchy framework. | Free | Y | GO | discover.nci.nih.gov/gominer | [ |
| EASE | Expression Analysis Systematic Explorer | Statistical tool to analyze gene lists by GO. | Free | Y | GO | david.abcc.ncifcrf.gov | [ |
| Onto-Express | - | Maps a list of genes onto the GO hierarchy. | Free | Y | GO | vortex.cs.wayne.edu/Projects.html#Onto-Express | [ |
| GoSurfer | - | Analyzes gene lists using GO and visualizes them as a hierarchical tree. | Free | Y | GO | bioinformatics.bioen.uiuc.edu/gosurfer | [ |
| FatiGO | Fast Assignment and Transference of Information | Web-based tool to analyze and compare GO terms in two gene lists. | Free | Y | GO | fatigo.bioinfo.cipf.es | [ |
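To make concrete what analyzing a gene list against GO can involve, the sketch below computes a generic over-representation p-value with the hypergeometric distribution. It is an illustration only and is not claimed to be the statistic implemented by any tool in the table above; the function name, parameter names, and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, hits):
    """Return P(X >= hits) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the hypergeometric probabilities for hits, hits + 1, ..., upper.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 genes on the platform, 200 annotated to a term,
# 100 genes in the list of interest, 5 of which carry the annotation.
p = enrichment_p_value(population=20000, annotated=200, selected=100, hits=5)
```

In practice one p-value is computed per GO term, so a correction for multiple testing would normally follow.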
Figure 5. Genome-wide integrative analysis to identify pathways disrupted in cancer. Genome-wide analyses, including copy number profiling, epigenetic profiling, and transcription profiling, performed on the same cancer sample could narrow down the number of candidate genes, which would in turn help to pinpoint the disrupted pathways involved in cancer.
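The narrowing step described in Figure 5 can be sketched as a set intersection followed by a pathway lookup. This is a simplified illustration under stated assumptions: all gene and pathway names are placeholders, and requiring a gene to be altered in all three profiles is only one possible integration rule.

```python
# Placeholder gene lists from three profiles of the same sample (hypothetical).
copy_number_hits = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
methylation_hits = {"GENE_B", "GENE_C", "GENE_E"}
expression_hits = {"GENE_B", "GENE_C", "GENE_D", "GENE_F"}

# Keep only genes that are altered in every dimension (assumed rule).
candidates = copy_number_hits & methylation_hits & expression_hits

# Placeholder pathway membership, standing in for a pathway database export.
pathway_members = {
    "PATHWAY_1": {"GENE_B", "GENE_C", "GENE_X"},
    "PATHWAY_2": {"GENE_D", "GENE_F"},
}

# Report each pathway that contains at least one candidate gene.
pathway_hits = {
    name: sorted(members & candidates)
    for name, members in pathway_members.items()
    if members & candidates
}
```

Here only `PATHWAY_1` is retained, because it is the only pathway containing genes supported by all three data types.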