Abstract
ProTarget is a Web-based tool for the automatic prediction of fold novelty. It offers the structural genomics community a method for target selection by providing an online analysis of any new or pre-existing sequence for its relationship to any previously solved three-dimensional structure. ProTarget takes as input an amino acid sequence. Regions of this sequence that exhibit high similarity to an existing PDB (Protein Data Bank) sequence are removed, leaving one or more subsequences. Each of these subsequences is then analyzed against a clustering of the protein space to determine the likelihood of its representing a new structural superfamily. This likelihood is derived from the distance in the clustering between the (sub)sequence and sequences that have known structures. The output of ProTarget is a graphical visualization of the protein of interest together with the likelihood that a protein sequence represents a novel structural superfamily. ProTarget is updated regularly and currently covers over 160 000 protein sequences from the SwissProt and PDB databases. ProTarget is available at http://www.protarget.cs.huji.ac.il.
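The cropping step described in the abstract (removing regions similar to a PDB sequence and keeping what is left) can be illustrated with a minimal sketch. This is not ProTarget's implementation: the function name `crop_similar_regions`, the interval representation of hits, and the `min_length` cutoff are all assumptions made for illustration. In the actual tool the similar regions come from a sequence similarity search against the PDB; here they are simply passed in as 0-based, half-open intervals.

```python
from typing import List, Tuple


def crop_similar_regions(
    sequence: str,
    hits: List[Tuple[int, int]],
    min_length: int = 30,  # hypothetical cutoff, not taken from the source
) -> List[Tuple[int, str]]:
    """Return the subsequences left after removing PDB-similar regions.

    `hits` holds 0-based, half-open (start, end) intervals marking regions
    assumed to be similar to a PDB sequence. Each result is a pair
    (offset in the original sequence, remaining subsequence).
    """
    # Clamp intervals to the sequence and merge the overlapping ones.
    merged: List[List[int]] = []
    for start, end in sorted(hits):
        start, end = max(start, 0), min(end, len(sequence))
        if start >= end:
            continue  # empty or out-of-range interval
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    # Collect the gaps between merged intervals that are long enough.
    remaining: List[Tuple[int, str]] = []
    cursor = 0
    for start, end in merged:
        if start - cursor >= min_length:
            remaining.append((cursor, sequence[cursor:start]))
        cursor = end
    if len(sequence) - cursor >= min_length:
        remaining.append((cursor, sequence[cursor:]))
    return remaining


if __name__ == "__main__":
    # Toy 100-residue sequence with two overlapping hypothetical hits.
    toy = "ACDEFGHIKLMNPQRSTVWY" * 5
    for offset, fragment in crop_similar_regions(toy, [(20, 40), (35, 50)], min_length=10):
        print(offset, len(fragment), fragment)
```

Each returned fragment would then be the unit submitted to the clustering-based analysis; that scoring step depends on the clustering hierarchy itself and is not sketched here.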
Year: 2005 PMID: 15980586 PMCID: PMC1160150 DOI: 10.1093/nar/gki389
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. A schematic illustration of the ProTarget algorithm. Sequences are filtered using a BLAST similarity search to remove subsequences similar to those in the PDB, and inserted into the ProtoNet clustering hierarchy, where a prediction is provided. For details see (8).
Figure 2. Graphical view of the ProTarget output for a query protein. The query protein (AGRN_RAT) of 1959 amino acids is fragmented by the cropping method into 11 segments. Segments labeled ‘New SF’ (new superfamily) are colored orange; those labeled ‘Old SF’ are in white. Each segment is associated with its sequence and with either the confidence score for its being new (score = 1 indicates highest confidence) or a link to the best hit in the PDB. The inset shows the structure from the PDB that corresponds to segment 8. This is PDB 1PZ9, a 201 amino acid domain from the tail of AGRN_CHICK. The E-value of the similarity between the sequence of any segment and the best corresponding structure from the PDB is indicated in the header of the interactive frame that opens for that segment. An example of such a frame, for segment 2, is shown.