Alexander V Diemand, Holger Scheib.
Abstract
BACKGROUND: Two largely unsolved but increasingly urgent problems for modern biologists are a) analysing protein structures quickly and easily and b) comprehensively mining the wealth of information that the Protein Data Bank (PDB) distributes along with the 3D co-ordinates. Tools that address these problems need to be highly flexible and powerful, yet freely available and easy to learn.
Year: 2004 PMID: 15096277 PMCID: PMC419980 DOI: 10.1186/1471-2105-5-39
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Class diagram. The coloured rectangles symbolise the classes implemented in libmoltalk. Inheritance between classes is shown as lines starting with a triangle at the super-class and ending at the child class(es). Aggregates of instances of classes are marked with a line ending in a rhombus at the container class. Other dependencies between classes are shown as dashed lines. Below the class name in bold, the most important fields of a particular class are given. Classes are grouped by function: input/output classes in yellow, structural classes in blue, factories in orange, mathematical classes in green, classes representing a structural alignment in pink, and the selection class in magenta.
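The aggregation relations in the diagram can be pictured with a minimal Python sketch. This is not libmoltalk code: all class, field, and method names below are hypothetical stand-ins, and only the containment of chains in a structure, residues in a chain, and atoms in a residue is taken from the description. Inheritance and the factory classes are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Atom:
    """Leaf of the hierarchy: a named point in space (hypothetical fields)."""
    name: str
    x: float
    y: float
    z: float


@dataclass
class Residue:
    """Container for atoms."""
    number: int
    name: str
    atoms: List[Atom] = field(default_factory=list)

    def add_atom(self, atom: Atom) -> None:
        self.atoms.append(atom)


@dataclass
class Chain:
    """Container for residues, identified by a chain code."""
    code: str
    residues: List[Residue] = field(default_factory=list)

    def add_residue(self, residue: Residue) -> None:
        self.residues.append(residue)


@dataclass
class Structure:
    """Top-level container: chains are looked up by their code."""
    pdb_code: str
    chains: Dict[str, Chain] = field(default_factory=dict)

    def add_chain(self, chain: Chain) -> None:
        self.chains[chain.code] = chain

    def get_chain(self, code: str) -> Optional[Chain]:
        return self.chains.get(code)

    def remove_chain(self, code: str) -> None:
        # Removing an absent chain is silently ignored in this sketch.
        self.chains.pop(code, None)


def build_example() -> Structure:
    """Assemble a toy structure; every value here is a made-up placeholder."""
    residue = Residue(number=1, name="ALA")
    residue.add_atom(Atom(name="CA", x=0.0, y=0.0, z=0.0))
    chain = Chain(code="A")
    chain.add_residue(residue)
    structure = Structure(pdb_code="TEST")
    structure.add_chain(chain)
    return structure
```

Each container owns a collection of the next level down, which is what the rhombus-terminated lines in a class diagram conventionally express.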
The ten classes summarised in the group "structure", each labelled with its difficulty level and a selection of the methods available.
| Class | Difficulty level | Method |
| --- | --- | --- |
| Structure | Basic | Returns 4-letter PDB code |
| | | Returns HEADER, TITLE, REVDAT lines |
| | | Extracts date from HEADER line |
| | | Returns type of experimental method |
| | | Returns resolution as in REMARK2 lines |
| | | Writes out complete structure to a stream in PDB format |
| | | Returns enumerator over all chains |
| | | Returns chain for a given code |
| | | Removes a chain from structure |
| | | ... |
| Structure Factory | Xtra | Reads structure from directory or file |
| | | Offers parsing options from directory or file |
| | | ... |
| Chain | Basic | Returns code of this chain (as string/number) |
| | | Returns chain identifier consisting of PDB and chain code |
| | | Returns COMPND and SOURCE lines, EC code |
| | | Transforms all residues/atoms in chain by transformation matrix |
| | | Returns number of residues (amino acids and nucleic acids), standard amino acids, heterogeneous residues, solvent residues |
| | | Provides access to residues, heterogeneous residues, solvent residues |
| | | Adds new residue, heterogen, new solvent molecule to chain |
| | | Removes a residue, heterogen, solvent molecule from chain |
| | | Derives amino acid sequence from connected residues |
| | | Derives amino acid sequence with filled gaps ("X") where missing residues occur |
| | | Returns amino acid sequence from SEQRES entry |
| | | Computes geometric hashing table of all residues |
| | | Finds residues in chain which are close to given co-ordinates based on geometric hashing |
| | | ... |
| Chain Factory | Xtra | Creates a new chain with given code |
| Residue | Basic | Returns the residue name/number |
| | | Returns the name of the standard residue as the base of this modified residue as in MODRES lines |
| | | Returns description of residue modification as given in MODRES lines |
| | | Translates residue name into amino acid one letter code |
| | | Adds new atom to residue |
| | | ... |
| Residue Factory | Xtra | Creates new residue with number and name |
| | | ... |
| Atom | Basic | Returns atom name/number |
| | | Returns temperature factor for an atom |
| | | Returns chemical element |
| | | Returns partial charge of atom |
| | | Returns enumerator over all bonded atoms |
| | | Adds bond from this atom to given atom2 |
| | | Removes all bonds |
| | | Removes bond to atom2 |
| | | Sets atom to be of chemical type |
| | | ... |
| Coordinates | Basic | Calculates Euclidean distance between two co-ordinates |
| | | Returns x, y, z from co-ordinates |
| | | Transforms co-ordinates by transformation matrix |
| | | ... |
| Pairwise Structural Alignment | Basic | Provides access to first/second chain |
| | | Computes transformation based on superimposed chains |
| | | Re-computes transformation from selection of residue pairs |
| | | Calculates RMSD of structural alignment |
| | | Counts alignment positions in structural alignment |
| | | Counts aligned pairs only |
| | | Counts aligned pairs with distance below given cut-off |
| | | Reads external pairwise alignment from stream in T_Coffee format and re-computes structural alignment from this |
| | | Writes structural alignment to stream in T_Coffee library format |
| | | ... |
| Selection | Basic | Counts number of residues in this selection |
| | | Returns enumerator over selected residues |
| | | Includes/excludes a single residue to/from selection |
| | | Adds all selected residues from selection2 to this selection |
| | | Structurally aligns selection1 to selection2 and returns the resulting transformation matrix |
| | | ... |
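Three of the geometric operations listed for the Coordinates and Pairwise Structural Alignment classes can be illustrated with a short Python sketch. It shows the underlying arithmetic only and is not the libmoltalk implementation: the function names are hypothetical, the co-ordinates are assumed to be already superimposed, and no fitting of a transformation is attempted.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]
Pair = Tuple[Point, Point]


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two 3D co-ordinates."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def rmsd(pairs: Sequence[Pair]) -> float:
    """Root-mean-square deviation over aligned co-ordinate pairs.

    Assumes the two sets are already superimposed.
    """
    if not pairs:
        raise ValueError("RMSD is undefined for an empty alignment")
    squared = sum(distance(a, b) ** 2 for a, b in pairs)
    return math.sqrt(squared / len(pairs))


def count_pairs_within(pairs: Sequence[Pair], cutoff: float) -> int:
    """Number of aligned pairs whose distance is below the cut-off."""
    return sum(1 for a, b in pairs if distance(a, b) < cutoff)
```

For example, `distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))` evaluates to `5.0`.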
A: Information stored in PDBChainSaw for the G chain of bovine mitochondrial F1-ATPase (1OHHG) as an exemplar. In the field "chainid", the PDB four-letter code is followed by the single character chain identifier. The field "chainid2" is the concatenation of the PDB code and the ASCII value of the chain identifier, in this case, "71" corresponding to "G". B: The two other options to return the sequence from a protein structure stored in PDB file format.
| pdb | 1OHH |
| code | G |
| chainid | 1OHHG |
| chainid2 | 1OHH71 |
| residues | 94 |
| aminoacids | 94 |
| heterogens | 0 |
| solvent | 0 |
| sequence from co-ordinates (inferred) | ATLKDITRRLKSIKNIQKITKSMKMVAAAKXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX LCGAIHSSVAKQXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXTTSEQSAR MTAMDNASKNASEMIDKLTLTFNRTRQAVITKELIEIISGAAAL |
| organism | BOS TAURUS |
| compound | ATP SYNTHASE GAMMA CHAIN, MITOCHONDRIAL SYNONYM: BOVINE MITOCHONDRIAL F1-ATPASE GAMMA SUBUNIT |
| eccode | 3.6.1.34 |
| sequence from co-ordinates without 'X' | ATLKDITRRLKSIKNIQKITKSMKMVAAAKLCGAIHSSVAKQTTSEQSARMTAMDNASKNASEMIDKLTLT FNRTRQAVITKELIEIISGAAAL |
| sequence from SEQRES | ATLKDITRRLKSIKNIQKITKSMKMVAAAKYARAERELKPARVYGVGSLALYEKADIKTPEDKKKHLIIGVSS DRGLCGAIHSSVAKQMKSEAANLAAAGKEVKIIGVGDKIRSILHRTHSDQFLVTFKEVGRRPPTFGDASVIAL ELLNSGYEFDEGSIIFNRFRSVISYKTEEKPIFSLDTISSAESMSIYDDIDADVLRNYQEYSLANIIYYSLKE STTSEQSARMTAMDNASKNASEMIDKLTLTFNRTRQAVITKELIEIISGAAAL |
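The two chain identifiers and the relation between the gapped and ungapped co-ordinate sequences can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and the gap-stripping step assumes that every "X" marks a missing residue, so a genuine unknown residue coded as "X" would be removed as well.

```python
from typing import Tuple


def chain_ids(pdb_code: str, chain_code: str) -> Tuple[str, str]:
    """Build "chainid" and "chainid2" from a PDB code and a chain code.

    chainid  = PDB code + chain character
    chainid2 = PDB code + ASCII value of the chain character
    """
    if len(chain_code) != 1:
        raise ValueError("chain code must be a single character")
    chainid = pdb_code + chain_code
    chainid2 = pdb_code + str(ord(chain_code))
    return chainid, chainid2


def strip_gaps(sequence: str) -> str:
    """Drop the 'X' gap fillers and any whitespace left by line wrapping."""
    return "".join(ch for ch in sequence if ch != "X" and not ch.isspace())
```

Calling `chain_ids("1OHH", "G")` returns `("1OHHG", "1OHH71")`, matching the "chainid" and "chainid2" fields shown for the G chain.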
Figure 2. PDBChainSaw database schema. The schema of the tables in PDBChainSaw models the one-to-many relation between structures and their chains. Entries in the table "chsaw_pdb" are uniquely identified by the four-letter PDB code. Entries in the table "chsaw_chain" are linked to their parent structure through the PDB code; appending the single-character chain identifier to it gives their unique identifier field, "chainid". Since chain identifiers may be lower case or non-alphanumeric, a second identifier, "chainid2", was built from the numerical value of the chain identifier to allow for unique formatting in FASTA files.
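The one-to-many relation can be sketched with Python's built-in `sqlite3` module. The table names and the fields "pdb", "code", "chainid", and "chainid2" come from the text; the column types, the constraints, the choice of SQLite, and the helper names are assumptions made for illustration and need not match the actual PDBChainSaw schema.

```python
import sqlite3
from typing import List

# Assumed, simplified schema: one row per structure, many rows per chain.
SCHEMA = """
CREATE TABLE chsaw_pdb (
    pdb TEXT PRIMARY KEY
);
CREATE TABLE chsaw_chain (
    chainid  TEXT PRIMARY KEY,
    chainid2 TEXT NOT NULL,
    pdb      TEXT NOT NULL REFERENCES chsaw_pdb(pdb),
    code     TEXT NOT NULL
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding the 1OHH chain G example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO chsaw_pdb (pdb) VALUES (?)", ("1OHH",))
    conn.execute(
        "INSERT INTO chsaw_chain (chainid, chainid2, pdb, code) "
        "VALUES (?, ?, ?, ?)",
        ("1OHHG", "1OHH71", "1OHH", "G"),
    )
    conn.commit()
    return conn


def chains_of(conn: sqlite3.Connection, pdb_code: str) -> List[str]:
    """Return the chainid of every chain belonging to one structure."""
    rows = conn.execute(
        "SELECT chainid FROM chsaw_chain WHERE pdb = ? ORDER BY chainid",
        (pdb_code,),
    ).fetchall()
    return [row[0] for row in rows]
```

Keeping the PDB code as a foreign key in the chain table lets all chains of a structure be retrieved with a single query on "pdb".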