Mikhail Yu Lobanov, Igor V Sokolovskiy, Oxana V Galzitskaya.
Abstract
We focus on multiple repeats of one amino acid (homorepeats) and have created a new database (named HRaP, at http://bioinfo.protres.ru/hrap/) of the occurrence of homorepeats and disordered patterns in different proteomes. HRaP is aimed at understanding the function of amino acid tandem repeats in different proteomes. The database includes 122 proteomes, 97 eukaryotic and 25 bacterial, which can be divided into 9 kingdoms and 5 phyla of bacteria. It includes 1,449,561 protein sequences and 771,786 sequences of proteins with GO annotations. We have determined homorepeats and patterns that are associated with some function. Through our web server, the user can do the following: (i) search for proteins with a given homorepeat in the 122 proteomes, including GO annotations for these proteins; (ii) search for proteins with a given disordered pattern, taken from the library of disordered patterns constructed on the clustered Protein Data Bank, in the 122 proteomes, including GO annotations for these proteins; (iii) analyze the lengths of homorepeats in different proteomes; (iv) investigate disordered regions in chosen proteins in the 122 proteomes; (v) study the coupling of different homorepeats in one protein; (vi) determine the longest runs of each amino acid within each proteome; and (vii) download the full list of proteins with a given homorepeat length.
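To make the notion of a homorepeat and of a "longest run" concrete, the following is a minimal Python sketch. It is not the HRaP implementation. The function names, the restriction to the 20 standard one-letter amino acid codes, and the default minimum length of 6 (borrowed from the threshold used in the figure captions) are assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple

# Assumption: only the 20 standard amino acids (one-letter codes) are considered.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def find_homorepeats(sequence: str, min_length: int = 6) -> List[Tuple[str, int, int]]:
    """Return (residue, start, length) for every maximal run of a single
    residue that is at least `min_length` long. `start` is 0-based."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # One residue followed by at least (min_length - 1) copies of itself.
    pattern = re.compile(r"([%s])\1{%d,}" % (AMINO_ACIDS, min_length - 1))
    return [
        (m.group(1), m.start(), m.end() - m.start())
        for m in pattern.finditer(sequence.upper())
    ]


def longest_runs(sequences: List[str]) -> Dict[str, int]:
    """Longest run of each amino acid over a collection of sequences
    (a hypothetical stand-in for one proteome)."""
    longest = {aa: 0 for aa in AMINO_ACIDS}
    run = re.compile(r"([%s])\1*" % AMINO_ACIDS)
    for seq in sequences:
        for m in run.finditer(seq.upper()):
            aa = m.group(1)
            longest[aa] = max(longest[aa], m.end() - m.start())
    return longest
```

For example, `find_homorepeats("MKQQQQQQQAL")` reports a single run of seven glutamines starting at index 2, while a sequence with only five consecutive alanines yields no hit at the default threshold.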
Year: 2013 PMID: 24150944 PMCID: PMC3965023 DOI: 10.1093/nar/gkt927
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Number of proteins that contain homorepeats as a function of homorepeat length, for the 20 amino acids in the D. discoideum proteome.
Figure 2. Dependence of the number of proteins with at least one homorepeat of ≥6 residues on proteome size, in 3617 proteomes.
Figure 3. A screenshot of HRaP results filtered for HomoRepeats across all 122 proteomes.
Figure 4. The percentage of proteins with at least one homorepeat of ≥6 residues in the 122 proteomes.
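The quantity plotted in Figure 4 can be illustrated with a short Python sketch. This is an assumption-laden illustration, not the authors' code: the function name is hypothetical, a proteome is represented simply as an iterable of sequence strings, and only the 20 standard amino acids are counted.

```python
import re
from typing import Iterable

# Assumption: a homorepeat is a run of >= 6 identical standard residues.
HOMOREPEAT_6 = re.compile(r"([ACDEFGHIKLMNPQRSTVWY])\1{5,}")


def percent_with_homorepeat(proteome: Iterable[str]) -> float:
    """Percentage of sequences containing at least one homorepeat of
    six or more residues. Returns 0.0 for an empty proteome."""
    total = 0
    hits = 0
    for seq in proteome:
        total += 1
        if HOMOREPEAT_6.search(seq.upper()):
            hits += 1
    return 100.0 * hits / total if total else 0.0
```

Applied to the toy input `["MSSSSSSK", "MAAAK", "QQQQQQ", "MK"]`, two of the four sequences contain a qualifying run, so the function returns 50.0.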