Adeel Malik, Hemajit Singh, Munazah Andrabi, Syed Akhtar Husain, Shandar Ahmad.
Abstract
In this review, we survey bioinformatics databases and quantitative structure-activity relationship (QSAR) studies reported in the published literature. Databases ranging from the most general to specialized cancer-related ones are included. The most commonly used methods of structure-based analysis of molecules are reviewed, along with some case studies in which they have been applied to cancer research. This article is expected to be useful to general bioinformatics researchers interested in cancer, and it also provides an update for those already active in this field.
Year: 2007 PMID: 19458762 PMCID: PMC2675501
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
General Bioinformatics Databases.
| Database Name | Description | URL |
|---|---|---|
| DNA Data Bank of Japan (DDBJ) | All known nucleotide and protein sequences; International Nucleotide Sequence Database Collaboration | |
| EMBL Nucleotide Sequence Database | All known nucleotide and protein sequences; International Nucleotide Sequence Database Collaboration | |
| GenBank | All known nucleotide and protein sequences; International Nucleotide Sequence Database Collaboration | |
| NCBI Reference Sequence Project | Non-redundant collection of naturally-occurring biological molecules | |
| Ensembl | Annotated information on eukaryotic genomes | |
| UCSC Genome Browser | Genome assemblies and annotation | |
| UniGene | Non-redundant, gene-oriented clusters | |
| CSDBase | Cold shock domain-containing proteins | |
| DExH/D Family Database | DEAD-box, DEAH-box and DExH-box proteins | |
| Endogenous GPCR List | G protein-coupled receptors; expression in cell lines | |
| EXProt | Proteins with experimentally-verified function | |
| GenProtEC | | |
| Histone Database | Histone and histone fold sequences and structures | |
| HIV Molecular Immunology Database | HIV epitopes | |
| HIV RT and Protease Sequence Database | HIV reverse transcriptase and protease sequences | |
| Homeodomain Resource | Homeodomain sequences, structures and related genetic and genomic information | |
| HUGE | Large (>50 kDa) human proteins and cDNA sequences | |
| IMGT | Immunoglobulin, T cell receptor and MHC sequences from human and other vertebrates | |
| IMGT/HLA | Polymorphic sequences of human MHC and related genes | |
| IMGT/MHC Database | Major histocompatibility complex sequences | |
| InBase | All known inteins (protein splicing elements): properties, sequences, bibliography | |
| InterPro | Protein families and domains | |
| LGICdb | Ligand-gated ion channel subunit sequences | |
| Nuclear Protein Database (NPD) | Proteins localized in the nucleus | |
| NRMD | Nuclear receptor superfamily | |
| NUREBASE | Nuclear hormone receptors | |
| ooTFD | Transcription factors and gene expression | |
| PANTHER | Gene products organized by biological function | |
| Peptaibol | Peptaibol (antibiotic peptide) sequences | |
| Phospho.ELM | Protein phosphorylation sites | |
| PKR | Protein kinase sequences, enzymology, genetics and molecular and structural properties | |
| Prolysis | Proteases and natural or synthetic protease inhibitors | |
| Protein Information Resource (PIR) | Comprehensive, annotated, non-redundant protein sequence databases | |
| ProtoNet | Hierarchical clustering of protein sequences | |
| RTKdb | Receptor tyrosine kinase sequences | |
| SEVENS | 7-transmembrane helix receptors | |
| SWISS-PROT/TrEMBL | Curated protein sequences | |
| TIGRFAMs | Functional identification of proteins | |
| trEST, trGEN, Hits | Hypothetical protein sequences | |
| ASTRAL | Sequences of domains of known structure, selected subsets and sequence-structure correspondences | |
| BioMagResBank | NMR spectroscopic data from proteins, peptides, and nucleic acids | |
| CATH | Protein domain structures | |
| CKAAPs DB | Structurally-similar proteins with dissimilar sequences | |
| CSD | Crystal structure information for organic and metal organic compounds | |
| Database of Macromolecular Movements | Descriptions of protein and macromolecular motions, including movies | |
| Decoys ‘R’ Us | Computer-generated protein conformations based on sequence data | |
| DSMM | Database of Simulated Molecular Motions | |
| Gene3D | Precalculated structural assignments for genes within whole genomes | |
| GTOP | Protein fold predictions from genome sequences | |
| HIC-Up | Structures of small molecules (‘hetero-compounds’) | |
| HSSP | Structural families and alignments; structurally-conserved regions and domain architecture | |
| LPFC | Library of protein family core structures | |
| MMDB | All experimentally-determined three-dimensional structures, linked to NCBI Entrez | |
| ModBase | Annotated comparative protein structure models | |
| NDB | Nucleic acid-containing structures | |
| NTDB | Thermodynamic data for nucleic acids | |
| PALI | Phylogeny and alignment of homologous protein structures | |
| PASS2 | Structural motifs of protein superfamilies | |
| PDB | Structure data determined by X-ray crystallography and NMR | |
| PDB-REPRDB | Representative protein chains, based on PDB entries | |
| PDBsum | Summaries and analyses of PDB structures | |
| ProTherm | Thermodynamic data for wild-type and mutant proteins | |
| PSSH | Alignments between protein sequences and tertiary structures | |
| RNABase | RNA-containing structures from PDB and NDB | |
| SCOP | Familial and structural protein relationships | |
| SCOR | RNA structural relationships | |
| Sloop | Classification of protein loops | |
| Structure-Superposition Database | Pairwise superposition of TIM-barrel structures | |
| SUPERFAMILY | Assignments of proteins to structural superfamilies | |
| TESS | Transcription element search system | |
| Virgil | Database interconnectivity | |
Cancer-related bioinformatics databases.
| Database Name | Description | URL |
|---|---|---|
| Atlas of Genetics and Cytogenetics in Oncology and Haematology | Cancer-related genes, chromosomal abnormalities in oncology and haematology, and cancer-prone diseases | |
| Cancer Chromosomes | Cytogenetic, clinical and reference information on cancer-related aberrations | |
| CGED | Cancer gene expression database | |
| COSMIC | Catalogue of somatic mutations in cancer: sequence data, samples and publications | |
| Germline p53 Mutations | Mutations in human tumor and cell line p53 gene | |
| IARC TP53 Database | Human TP53 somatic and germline mutations | |
| MTB | Mouse tumor biology database: tumor types, genes, classification, incidence, pathology | |
| OncoMine | Cancer microarray data by gene or cancer type | |
| Oral Cancer Gene Database | Cellular and molecular data for genes involved in oral cancer | |
| RB1 Gene Mutation DB | Mutations in the human retinoblastoma (RB1) gene | |
| RTCGD | Mouse retroviral tagged cancer gene database | |
| SNP500Cancer | Re-sequenced SNPs from 102 reference samples | |
| SV40 Large T-Antigen Mutants | Mutations in SV40 large tumor antigen gene | |
| Tumor Gene Family Databases | Cellular, molecular and biological data about genes involved in various cancers | |
These sites could not be opened at the time of revising the manuscript.
Figure 1. Identical A and B chain residues of 1A1E in complex with its ligand (ACE-PTR-GLU-DIY). Ligand in red.
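A comparison like the one in Figure 1 can be approximated by reading the fixed-column ATOM records of a PDB-format file and listing residue positions where two chains carry the same residue. The following is only a minimal sketch, not the procedure used for the figure: the function names are hypothetical, the toy records are invented for illustration and are not taken from 1A1E, and insertion codes and alternate locations are ignored.

```python
def chain_residues(pdb_lines):
    """Map chain ID -> {residue number: residue name} from ATOM records."""
    chains = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20].strip()   # PDB columns 18-20
        chain_id = line[21]              # PDB column 22
        res_seq = int(line[22:26])       # PDB columns 23-26
        chains.setdefault(chain_id, {})[res_seq] = res_name
    return chains


def identical_residues(chains, a="A", b="B"):
    """Residue numbers present in both chains with the same residue name."""
    res_a, res_b = chains.get(a, {}), chains.get(b, {})
    return sorted(n for n in res_a if res_b.get(n) == res_a[n])


def toy_atom(serial, res_name, chain_id, res_seq):
    """Build a truncated ATOM record with the standard column layout."""
    return f"ATOM  {serial:>5}  CA  {res_name:>3} {chain_id}{res_seq:>4}"


# Hypothetical toy records (not real 1A1E data).
toy = [
    toy_atom(1, "MET", "A", 1), toy_atom(2, "GLY", "A", 2), toy_atom(3, "SER", "A", 3),
    toy_atom(4, "MET", "B", 1), toy_atom(5, "ALA", "B", 2), toy_atom(6, "SER", "B", 3),
]
shared = identical_residues(chain_residues(toy))
print(shared)  # [1, 3]
```

For a real entry, the lines of the downloaded PDB file would be passed to `chain_residues` in place of the toy list.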
Available QSAR and molecular descriptor programs.
| Name of the software | Brief Description | URL/Reference |
|---|---|---|
| PEST | Shape properties, wavelet decomposition properties, electrostatic potential, electronic kinetic energy density etc. | |
| Pharma Algorithm’s QSAR Builder | QSAR and QSPR modeling; excess molar refraction, H-bond acidity, H-donor capability, H-bond basicity, H-acceptor capability, hexadecane/gas partition coefficient, LogP partition coefficient, topological polar surface area (TPSA) | |
| Bioreason | QSAR, QSPR | |
| ClassPharmer | | |
| ChemTK | Molecule design, descriptors and modeling | |
| Molinspiration toolkit | Java-based software and free online calculations of fragments and basic properties/descriptors. | |
| ShapeSig | Shape descriptors and statistical analysis | |
| Cerius (QSAR module) | Modeling and QSAR, includes MOPAC Quantum mechanical calculations, alignments etc. | |
| CODESSA | QSAR program | |
| HASL | 3D QSAR | |
| QTRFIT | Rigid body superposition | |
| DRAGON | 1664 molecular descriptors | |
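The programs listed above compute molecular descriptors and relate them to activity. As a rough illustration of the simplest case, the sketch below fits a one-descriptor linear QSAR model by ordinary least squares. It is not the method of any listed package: the function name `fit_line` is hypothetical, and the descriptor and activity values are invented toy numbers, not measured data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data: one descriptor (e.g. LogP) and an activity value
# for four imaginary compounds.
logp = [1.0, 2.0, 3.0, 4.0]
activity = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(logp, activity)
predicted = slope * 2.5 + intercept  # activity predicted for a new compound
print(slope, intercept, predicted)   # 2.0 1.0 6.0
```

Practical QSAR models usually combine many descriptors and are validated on held-out compounds; the single-descriptor fit here only shows the basic descriptor-to-activity mapping.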