Yi-Ruen Lin, Hsin-Yuan Wei, Tsung-Lin Tsai, Thy-Hou Lin.
Abstract
BACKGROUND: The structures of disease-associated proteins are important for structure-based drug design against a particular disease. Until now, protein structures have usually been searched through a PDB id or some sequence information. In the HDAPD database presented here, however, the structure of a disease-associated protein can be searched directly by keying in the name of the associated disease.
DESCRIPTION: A search in HDAPD is initiated by keying in keywords for a disease, a protein name, a protein type, or a PDB id. The protein sequence is presented in FASTA format and can be copied directly for a BLAST search. HDAPD is also interfaced with Jmol so that users can view and manipulate a protein structure. Gene ontology data, namely cellular components, molecular functions, and biological processes, are provided when a hyperlink to Gene Ontology (GO) is clicked. HDAPD further provides a link to the KEGG map, which shows where the protein is placed in a metabolic pathway and how it relates to other proteins. The latest literature on the protein retrieved from PubMed (titles, journals, authors, and abstracts) is presented as a list of controllable length.
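As an illustration of the FASTA presentation mentioned in the description, the following is a minimal Python sketch of how a stored sequence could be rendered as a FASTA record ready to paste into a BLAST form. It is not taken from HDAPD; the function name `to_fasta`, its parameters, and the 60-character line width are assumptions made for this example.

```python
def to_fasta(identifier: str, description: str, sequence: str, width: int = 60) -> str:
    """Format a protein sequence as a single FASTA record (hypothetical helper)."""
    # Strip all whitespace and normalise residues to upper case.
    cleaned = "".join(sequence.split()).upper()
    # The header line starts with '>' followed by an identifier and free text.
    header = f">{identifier} {description}".rstrip()
    # Wrap the sequence into fixed-width lines.
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return header + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder identifier and residues, for illustration only.
    print(to_fasta("DEMO1", "example protein", "acde fghik", width=5), end="")
```

The output is plain text, so it can be copied into any tool that accepts FASTA input.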
Year: 2010 PMID: 20158919 PMCID: PMC2833151 DOI: 10.1186/1471-2105-11-88
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The protein structures collected in HDAPD can be routinely updated through six PHP-MySQL templates, labeled (a) to (f). The templates are used for entering (a) disease and protein names; (b) disease and protein classification; (c) disease and PDB code; (d) protein names, UniProt id, and GO id; (e) protein name, UniProt id, GO id, gene name, NCBI id, and KEGG id; and (f) protein name, protein description, source, authors, resolution, and method.
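The fields listed for the templates suggest a relational layout in which a disease name can be joined to PDB codes. The sketch below shows one way such a disease-name search could work, using Python's built-in `sqlite3` module instead of the PHP-MySQL stack named in the caption. The table names, column names, the helper `search_by_disease`, and the demo rows (including the placeholder PDB code `0000`) are all assumptions for illustration and do not reflect HDAPD's actual schema or contents.

```python
import sqlite3

# Hypothetical schema loosely following the template fields (a) to (f).
SCHEMA = """
CREATE TABLE disease (
    disease_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    classification TEXT
);
CREATE TABLE protein (
    protein_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    uniprot_id TEXT,
    go_id TEXT,
    kegg_id TEXT
);
CREATE TABLE structure (
    pdb_code TEXT PRIMARY KEY,
    protein_id INTEGER REFERENCES protein(protein_id),
    resolution REAL,
    method TEXT
);
CREATE TABLE disease_structure (
    disease_id INTEGER REFERENCES disease(disease_id),
    pdb_code TEXT REFERENCES structure(pdb_code)
);
"""


def search_by_disease(conn, keyword):
    """Return (disease, protein, pdb_code) rows whose disease name contains keyword."""
    sql = """
        SELECT d.name, p.name, s.pdb_code
        FROM disease d
        JOIN disease_structure ds ON ds.disease_id = d.disease_id
        JOIN structure s ON s.pdb_code = ds.pdb_code
        JOIN protein p ON p.protein_id = s.protein_id
        WHERE d.name LIKE ?
        ORDER BY s.pdb_code
    """
    # A parameterised query keeps the keyword from being interpreted as SQL.
    return conn.execute(sql, (f"%{keyword}%",)).fetchall()


def build_demo_db():
    """Create an in-memory database holding one placeholder record."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO disease VALUES (1, 'Lung cancer', 'Neoplasms')")
    conn.execute("INSERT INTO protein VALUES (1, 'Tumor protein 53', NULL, NULL, NULL)")
    conn.execute("INSERT INTO structure VALUES ('0000', 1, NULL, NULL)")  # placeholder code
    conn.execute("INSERT INTO disease_structure VALUES (1, '0000')")
    return conn


if __name__ == "__main__":
    print(search_by_disease(build_demo_db(), "lung"))
```

Keeping the disease-to-structure link in its own table lets one structure be attached to several diseases, which matches the many-to-many relationship implied by template (c).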
A comparison of the database contents and search functions provided by HDAPD with those provided by NCBI Entrez, EMBL, UniProt, and GHR.
| Databases | HDAPD | NCBI Entrez | EMBL | UniProt | GHR | |
|---|---|---|---|---|---|---|
| Web site | ||||||
| Functions | Disease types | ICD-10; Genes and Disease (classified diseases into 14 groups; 285 diseases) | (classified diseases into 17 groups) | |||
| Disease list | ∘ | ∘ | ||||
| Disease introduction | Gene Review; Genetics Home Reference | |||||
| Disease introduction | ∘ | ∘ | ∘ | |||
| Disease-associated proteins | ICD-10; Genes and Disease; OMIM | |||||
| Disease-associated protein list | ∘ | * | * | * | ||
| *A protein list is provided by typing in keywords. | ||||||
| Determined and annotated protein sequences | SwissProt, PIR, PRF, PDB | PRIDE | UniProtKB | |||
| Sequence database | ∘ | ∘ | ∘ | |||
| Determined protein structure | 3-D macromolecular structures | PDB | MMDB | PDBe | PDB | |
| PDB ID | ∘ | ∘ | ∘ | ∘ | ||
| Compound | ∘ | ∘ | ∘ | ◎ | ||
| Classification | ∘ | ∘ | ∘ | ◎ | ||
| Source | ∘ | ◎ | ∘ | ◎ | ||
| Resolution | ∘ | ◎ | ∘ | ∘ | ||
| Method | ∘ | ◎ | ∘ | ∘ | ||
| Author List | ∘ | ∘ | ∘ | ◎ | ||
| Accession Date | ∘ | ∘ | ∘ | ◎ | ||
| Protein sequence | ∘ | ∘ | ∘ | ◎ | ||
| Molecular viewer | ∘ | ◎ | ∘ | ◎ | ||
| ◎: A hyperlink to PDB is provided for a search result. | ||||||
| Gene ontology | GO | Taxonomy | BioCatalogue | GO | ||
| Gene Ontology (GO) | ∘ | ◎ | ∘ | ∘ | ||
| ◎: A hyperlink to PubMed of NCBI is provided for a search result. | ||||||
| Pathway | Pathway and systems of interacting molecules | KEGG | KEGG | BioModels | UniProtKB/Swiss-Prot | |
| Pathway description | ∘ | ∘ | ∘ | ∘ | ||
| Pathway map | ∘ | ∘ | ∘ | ∘ | ||
| Literature | Full text and journal articles | PubMed | PubMed | Medline Patents | PubMed SRS CiteXplore | PubMed |
| Literature extracting | ∘ | ∘ | ∘ | ∘ | ||
| Author | ∘ | ∘ | ∘ | ∘ | ◎ | |
| Journal | ∘ | ∘ | ∘ | ∘ | ◎ | |
| Relative date | ∘ | ∘ | ∘ | ∘ | ◎ | |
| Sorting | ∘ | |||||
| Date | ∘ | ∘ | ∘ | ∘ | ||
| ◎: A hyperlink to PubMed of NCBI is provided for a search result. | ||||||
PRIDE: Proteomics identification database, UniProtKB: UniProt knowledge base of protein sequences, UniRef: UniProt Non-redundant reference databases, UniParc: Non-redundant archive of protein sequences, PDB: Protein Data Bank, MMDB: The Molecular modeling database, PDBe: Macromolecular structures database, GO: Gene ontology, Taxonomy: NCBI Taxonomy database of organism names, BioCatalogue: BioCatalogue, SBO: Systems biology ontology, KEGG: Kyoto encyclopedia of genes and genomes, Reactome: Database of core biochemical pathways and reactions, BioModels: Database of mathematical models of biological interest, Rhea: Manually annotated database of chemical reactions created in collaboration with the Swiss Institute of Bioinformatics (SIB), PubMed: PubMed of NCBI, Medline: Citations and abstracts from many life-science journals, Patents: Biology-related abstracts of patent applications.
A comparison of the search results returned by HDAPD, NCBI Entrez, EMBL, UniProt, and GHR for two diseases (lung cancer and diabetes) and one disease-associated protein (tumor protein 53).
| Description | HDAPD | NCBI Entrez | EMBL | UniProt | GHR | |
|---|---|---|---|---|---|---|
| Disease & protein | Lung cancer | 293 | 699 | |||
| tumor protein 53 | 1 | 1 | ||||
| diabetes | 161 | 705 | ||||
| Protein sequences | Lung cancer | - | 20314 | 1524 | 765 | |
| tumor protein 53 | - | 679 | 35 | 32 | ||
| diabetes | - | 19353 | 751 | 991 | ||
| Protein structure | Lung cancer | 2155 | 17 | 33 | 2360 | |
| tumor protein 53 | 61 | 1 | 22 | 79 | ||
| diabetes | 906 | 254 | 547 | 1082 | ||
| GO: biological process | Lung cancer | 1872 | 0 | 625 | ||
| tumor protein 53 | 48 | 0 | 31 | |||
| diabetes | 1127 | 1 | 902 | |||
| GO: cellular component | Lung cancer | 2017 | 0 | 668 | ||
| tumor protein 53 | 11 | 0 | 31 | |||
| diabetes | 1080 | 0 | 895 | |||
| GO: molecular function | Lung cancer | 1953 | 0 | 662 | ||
| tumor protein 53 | 12 | 0 | 27 | |||
| diabetes | 1096 | 0 | 862 | |||
| Number of KEGG paths involved | Lung cancer | 893 | 56 | 1 | 26 | |
| tumor protein 53 | 20 | 128 | 0 | 2 | ||
| diabetes | 354 | 65 | 21 | 60 | ||
| literature | Lung cancer | 651248 | 173048 | 105635 | 2876 | |
| tumor protein 53 | 3831 | 15739 | 2818 | 251 | ||
| diabetes | 13635 | 339343 | 325435 | 11155 | ||
* The GO information in NCBI Entrez is indirectly provided through a hyperlink to PDB.