Hong Sain Ooi, Chia Yee Kwo, Michael Wildpaner, Fernanda L Sirota, Birgit Eisenhaber, Sebastian Maurer-Stroh, Wing Cheong Wong, Alexander Schleiffer, Frank Eisenhaber, Georg Schneider.
Abstract
Function prediction of proteins with computational sequence analysis requires the use of dozens of prediction tools with a bewildering range of input and output formats. Each of these tools focuses on a narrow aspect and researchers are having difficulty obtaining an integrated picture. ANNIE is the result of years of close interaction between computational biologists and computer scientists and automates an essential part of this sequence analytic process. It brings together over 20 function prediction algorithms that have proven sufficiently reliable and indispensable in daily sequence analytic work and are meant to give scientists a quick overview of possible functional assignments of sequence segments in the query proteins. The results are displayed in an integrated manner using an innovative AJAX-based sequence viewer. ANNIE is available online at: http://annie.bii.a-star.edu.sg. This website is free and open to all users and there is no login requirement.
Year: 2009 PMID: 19389726 PMCID: PMC2703921 DOI: 10.1093/nar/gkp254
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Sequence analytic algorithms
| Algorithm | Description | Parameters |
|---|---|---|
| CAST | Algorithm for low-complexity region (LCR) detection and selective masking | Threshold: 40 |
| IUPred | Prediction method for recognizing ordered and intrinsically unstructured/disordered regions in proteins | Prediction type: long disorder |
| SAPS | Statistical analysis of protein sequences with respect to amino acid composition and simple sequence motifs | n/a |
| SEG | Prediction of low complexity regions | Three parameter sets: (1) Window-size 12, Locut 2.2, Hicut 2.5; (2) Window-size 25, Locut 3.0, Hicut 3.3; (3) Window-size 45, Locut 3.4, Hicut 3.75 |
| Big-Π | Prediction of protein GPI lipid anchor cleavage sites | Taxon-specific learning set |
| NMT | Prediction of N-terminal N-myristoylation of proteins | Taxon-specific parameter set |
| PrePS – FT | Farnesylation prediction | n/a |
| PrePS – GGT1 | Geranylgeranylation prediction | n/a |
| PrePS – GGT2 | Rab geranylgeranylation prediction | n/a |
| PeroPS/PTS1 | Prediction of peroxisomal targeting signal 1 | Taxon-specific prediction function |
| DAS-TMfilter | Prediction of transmembrane regions | Quality cutoff: 0.72 |
| HMMTOP | Transmembrane topology prediction using Hidden Markov models | n/a |
| PHOBIUS | Combined transmembrane topology and signal peptide predictor | n/a |
| TMHMM | Transmembrane helix predictor | n/a |
| IMP-COIL | Prediction of coiled-coil regions; modified implementation of the Lupas algorithm | n/a |
| PROSITE | Pattern search in the PROSITE database | n/a |
| PROSITE-Profile | Profile search in the PROSITE database | n/a |
| HMMER | Profile Hidden Markov Models | SMART |
| IMPALA | Tool to compare a query sequence against a library of position-specific scoring matrices | Wolf-library |
| RPS-BLAST against CDD | Reverse-position-specific BLAST against the Conserved Domain Database (CDD) | |
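
To give a feel for what the window size and Locut values in the SEG row control, the following is a minimal sketch of window-based low-complexity detection. It is not the SEG algorithm itself: it only scans fixed-size windows, flags those whose Shannon entropy falls at or below Locut, and merges overlapping hits. The Hicut value is carried along but unused here, and the function names are hypothetical.

```python
import math
from collections import Counter

# (window size, Locut, Hicut) triples copied from the SEG row of the table.
SEG_PARAMETER_SETS = [(12, 2.2, 2.5), (25, 3.0, 3.3), (45, 3.4, 3.75)]


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a segment."""
    counts = Counter(segment)
    total = len(segment)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def low_complexity_windows(seq: str, window: int, locut: float):
    """Return merged half-open (start, end) spans covered by low-entropy windows.

    Simplification (assumption): a window counts as low complexity when its
    entropy is <= locut; no extension or refinement step is performed.
    """
    spans = []
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) <= locut:
            end = start + window
            if spans and start <= spans[-1][1]:
                # Overlaps or touches the previous hit: extend it.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return spans


if __name__ == "__main__":
    # Made-up test sequence with a poly-Q stretch in the middle.
    query = "MKTAYIAKQR" + "Q" * 20 + "DVLHEWSGNP"
    for window, locut, _hicut in SEG_PARAMETER_SETS:
        print(window, locut, low_complexity_windows(query, window, locut))
```

Larger windows with higher thresholds, as in the second and third parameter sets, tend to pick up longer and more weakly biased regions than the short, strict first set.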
Figure 1. Interactive sequence view. An example of the interactive sequence view, shown for the sequence of Dysferlin. The sequence features found by the various programs are organized in panes that group findings of similar functional significance. The color coding serves only to ease navigation.
Figure 2. Histogram view. This view shows how often each sequence feature occurs in the sequence set under investigation. The features are sorted by their number of occurrences in the set. Clicking the link provided with a feature name generates the sublist of sequences with this feature. In this example of Eco1-type proteins, the top four entries in the histogram relate to low-complexity regions and to short PROSITE motifs, which are less reliable predictions. The fifth entry indicates the occurrence of the KOG3014 domain model, which is characteristic of the Eco1 class of proteins necessary for the establishment of sister chromatid cohesion in mitosis.
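
The tallying behind a histogram view of this kind can be sketched as follows. This is an illustration only, not ANNIE's implementation: it assumes each feature is counted once per sequence (the caption does not say whether counts are per sequence or per hit), and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def feature_histogram(annotations):
    """Tally features over a sequence set.

    annotations: dict mapping a sequence id to an iterable of feature names.
    Returns (rows, sublists) where rows is a list of (feature, n_sequences)
    sorted by descending count (ties broken by name), and sublists maps each
    feature to the sorted ids of the sequences that carry it.
    """
    carriers = defaultdict(set)
    for seq_id, features in annotations.items():
        for feature in set(features):  # assumption: count once per sequence
            carriers[feature].add(seq_id)
    rows = sorted(
        ((feature, len(ids)) for feature, ids in carriers.items()),
        key=lambda row: (-row[1], row[0]),
    )
    sublists = {feature: sorted(ids) for feature, ids in carriers.items()}
    return rows, sublists


if __name__ == "__main__":
    # Placeholder sequence ids and hits, made up for illustration.
    example = {
        "seq1": ["SEG", "KOG3014", "SEG"],
        "seq2": ["SEG"],
        "seq3": ["KOG3014", "SEG"],
    }
    rows, sublists = feature_histogram(example)
    for feature, count in rows:
        print(feature, count, sublists[feature])
```

The `sublists` mapping plays the role of the clickable link in the view: looking up a feature name returns the sequences that carry it.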
Figure 3. Taxonomy view. The taxonomic distribution of the sequence set is displayed. The numbers in brackets give the number of sequences below a branch in the taxonomic tree and the number assigned to a particular taxon. For the given Eco1 example set, this view shows that the set contains one plant sequence (Arabidopsis thaliana), one trypanosome sequence, one fungal sequence and four sequences from Bilateria.
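
The two bracketed numbers per taxon can be computed from per-sequence lineages roughly as sketched below. This is a hedged illustration, not ANNIE's code: it assumes that the count below a branch includes sequences assigned directly to that taxon, the function name is hypothetical, and the lineage paths in the usage example are heavily simplified placeholders for the Eco1 set described in the caption.

```python
from collections import Counter


def taxonomy_counts(lineages):
    """Count sequences per taxon.

    lineages: iterable of root-to-taxon paths, one per sequence.
    Returns a dict mapping each path prefix (as a tuple) to
    (sequences_in_subtree, sequences_assigned_directly).
    """
    subtree = Counter()
    direct = Counter()
    for lineage in lineages:
        path = tuple(lineage)
        direct[path] += 1
        # Every ancestor on the path, and the taxon itself, gains one sequence.
        for depth in range(1, len(path) + 1):
            subtree[path[:depth]] += 1
    return {path: (subtree[path], direct[path]) for path in subtree}


if __name__ == "__main__":
    # Simplified, illustrative lineages: one plant, one trypanosome,
    # one fungal sequence and four sequences from Bilateria.
    eco1_set = [
        ("Eukaryota", "Viridiplantae", "Arabidopsis thaliana"),
        ("Eukaryota", "Trypanosomatidae"),
        ("Eukaryota", "Fungi"),
    ] + [("Eukaryota", "Metazoa", "Bilateria")] * 4
    for path, (below, assigned) in sorted(taxonomy_counts(eco1_set).items()):
        print("  " * (len(path) - 1) + f"{path[-1]} ({below}, {assigned})")
```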