Andrei Kouranov, Lei Xie, Joanna de la Cruz, Li Chen, John Westbrook, Philip E Bourne, Helen M Berman.
Abstract
The RCSB Protein Data Bank (PDB) offers online tools, summary reports and target information related to the worldwide structural genomics initiatives from its portal at http://sg.pdb.org. There are currently three components to this site: Structural Genomics Initiatives contains information and links on each structural genomics site, including progress reports, target lists, target status, targets in the PDB and level of sequence redundancy; Targets provides combined target information, protocols and other data associated with protein structure determination; and Structures offers an assessment of the progress of structural genomics based on the functional coverage of the human genome by PDB structures, structural genomics targets and homology models. Functional coverage can be examined according to enzyme classification, gene ontology (biological process, cell component and molecular function) and disease.
Year: 2006 PMID: 16381872 PMCID: PMC1347482 DOI: 10.1093/nar/gkj120
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. August 2005 report from the structural genomics information portal showing structural genomics structures with sequence similarity <30% relative to solved structures in the PDB by year. Sequence comparisons are performed using the blastclust application (7).
Genome coverage
| | Function coverage | Cluster coverage |
|---|---|---|
| Genome sequences | 1.000 | 1.000 |
| PDB structures | 0.372 | 0.094 |
| SG targets | 0.324 | 0.156 |
| Homology models | 0.563 | 0.283 |
| PDB structures + SG targets | 0.515 | 0.239 |
| PDB structures + homology models | 0.595 | 0.303 |
| SG targets + homology models | 0.663 | 0.411 |
| PDB structures + SG targets + homology models | 0.687 | 0.428 |
Data are based upon 10801 functionally described human genome sequences from Ensembl, 942 PDB structures from human, 1680 structural genomics targets identified in human and 2823 homology models from SUPERFAMILY mapped on to the human genome. Cluster coverage is the ratio of the number of protein clusters that are structurally covered to the number of all clusters in the genome, for a given functional class and a specified sequence identity (40% in this case). Functional class and sequence identity are input parameters.
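The cluster coverage ratio defined above can be illustrated with a minimal Python sketch. It is not the portal's implementation: the function name `cluster_coverage`, the input layout and the sequence identifiers are hypothetical, and the sketch assumes that a cluster counts as structurally covered when at least one of its member sequences has a structure, target or model. The clusters themselves are assumed to have been computed beforehand at the chosen sequence identity for one functional class.

```python
from typing import Dict, Set


def cluster_coverage(clusters: Dict[str, Set[str]], covered: Set[str]) -> float:
    """Return the fraction of clusters with at least one covered member.

    clusters: hypothetical mapping of cluster ID -> member sequence IDs,
              precomputed at a fixed sequence identity (e.g. 40%).
    covered:  sequence IDs that have a structure, target or model.
    Assumption: one covered member is enough to cover the whole cluster.
    """
    if not clusters:
        raise ValueError("no clusters supplied")
    # Count clusters whose member set intersects the covered set.
    n_covered = sum(1 for members in clusters.values() if members & covered)
    return n_covered / len(clusters)


# Toy data with made-up identifiers, for illustration only.
clusters = {
    "c1": {"seqA", "seqB"},
    "c2": {"seqC"},
    "c3": {"seqD", "seqE"},
    "c4": {"seqF"},
}
covered = {"seqB", "seqE"}

print(f"{cluster_coverage(clusters, covered):.3f}")  # prints 0.500
```

Under this assumption, combining sources (for example PDB structures plus homology models) amounts to taking the union of their covered sets before calling the function, which is why a combined row can never score lower than either of its components.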
Figure 2. Normalized functional coverage of the human genome by sequence (from Ensembl; red), by structures from the PDB (blue), by structural genomics targets (green) and homology models from SUPERFAMILY (yellow). When viewing the figure from the online structural genomics portal, clicking on the appropriate bar of the histogram will produce a list of sequences or structures that define the distribution.