Abstract
Proteome-pI is an online database containing information about predicted isoelectric points for 5029 proteomes calculated using 18 methods. The isoelectric point, the pH at which a particular molecule carries no net electrical charge, is an important parameter for many analytical biochemistry and proteomics techniques, especially for 2D gel electrophoresis (2D-PAGE), capillary isoelectric focusing, liquid chromatography-mass spectrometry and X-ray protein crystallography. The database, available at http://isoelectricpointdb.org, allows the retrieval of virtual 2D-PAGE plots and the development of customised fractions of a proteome based on isoelectric point and molecular weight. Moreover, Proteome-pI facilitates statistical comparisons of the various prediction methods as well as biological investigation of protein isoelectric point space in all kingdoms of life. For instance, using Proteome-pI data, it is clear that Eukaryotes, which evolved tight control of homeostasis, encode proteins with pI values near the cell pH. In contrast, Archaea, which frequently live in extreme environments, can possess proteins with a wide range of isoelectric points. The database includes various statistics and tools for interactive browsing, searching and sorting. Apart from data for individual proteomes, datasets corresponding to major protein databases such as UniProtKB/TrEMBL and the NCBI non-redundant (nr) database have also been precalculated and made available in CSV format.
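The definition above (the pH at which the net charge is zero) can be illustrated with a minimal sketch. It assumes the Henderson-Hasselbalch equation and one illustrative set of pKa values; it is not claimed to reproduce any of the 18 methods used by the database, and the function names `net_charge` and `isoelectric_point` are hypothetical.

```python
# Illustrative pKa values (an assumption; prediction methods differ mainly in these).
PKA_POS = {"Nterm": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEG = {"Cterm": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def net_charge(seq: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    seq = seq.upper()
    # One free N-terminus and one free C-terminus, plus ionisable side chains.
    n_pos = {"Nterm": 1, **{aa: seq.count(aa) for aa in "KRH"}}
    n_neg = {"Cterm": 1, **{aa: seq.count(aa) for aa in "DECY"}}
    pos = sum(n / (1 + 10 ** (ph - PKA_POS[g])) for g, n in n_pos.items())
    neg = sum(n / (1 + 10 ** (PKA_NEG[g] - ph)) for g, n in n_neg.items())
    return pos - neg


def isoelectric_point(seq: str, tol: float = 1e-4) -> float:
    """Bisection on pH 0-14; net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid  # still positively charged: pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for peptide in ("DDDDEE", "ACDEFGHIK", "KKKKRR"):
        print(peptide, round(isoelectric_point(peptide), 2))
```

Swapping in a different pKa table is the only change needed to compare scales, which is one reason predictions from different methods can disagree for the same sequence.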
Year: 2016 PMID: 27789699 PMCID: PMC5210655 DOI: 10.1093/nar/gkw978
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
General statistics of the Proteome-pI database
| Kingdom | Number of proteomes | Total number of proteins | Mean number of proteins (±SD) | Mean size of proteins (±SD) | Mean mw of proteins, kDa (±SD) |
|---|---|---|---|---|---|
| Viruses | 504 | 20 920 | 42 ± 89 | 297 ± 375 | 33 ± 42 |
| Archaea | 135 | 318 388 | 2358 ± 920 | 283 ± 212 | 31 ± 23 |
| Bacteria | 3776 | 12 082 903 | 3200 ± 2510 | 311 ± 240 | 34 ± 26 |
| Eukaryote | 614 | 9 299 039 | 15 145 ± 11 830 | 438 ± 429 | 49 ± 48 |
| Eukaryote (major) | 614 | 8 629 591 | 14 055 ± 9899 | 434 ± 416 | 48 ± 46 |
| Eukaryote (minor) | 448 | 669 448 | 1494 ± 5130 | 495 ± 564 | 55 ± 63 |
mw—molecular weight in kDa; for more statistics, see Supplementary Table S1. ‘Major’ and ‘minor’ refer to splicing isoforms of proteins used for calculation of the statistics.
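The figures in this table can be cross-checked with simple arithmetic. The sketch below uses only values copied from the table and verifies three things: the per-kingdom proteome counts add up to the 5029 proteomes quoted in the abstract, the major and minor isoform counts add up to the eukaryote total, and each reported mean equals the total number of proteins divided by the number of proteomes. The variable names are illustrative.

```python
# (number of proteomes, total number of proteins, reported mean number of proteins)
rows = {
    "Viruses": (504, 20_920, 42),
    "Archaea": (135, 318_388, 2358),
    "Bacteria": (3776, 12_082_903, 3200),
    "Eukaryote": (614, 9_299_039, 15_145),
}
# Eukaryote totals split by splicing isoform, as listed in the table.
major, minor = 8_629_591, 669_448

total_proteomes = sum(n for n, _, _ in rows.values())
total_proteins = sum(t for _, t, _ in rows.values())

print("proteomes:", total_proteomes)
print("proteins:", total_proteins)
print("major + minor matches eukaryote total:", major + minor == rows["Eukaryote"][1])

for kingdom, (n, total, reported_mean) in rows.items():
    # Recompute the mean and compare with the rounded value from the table.
    print(kingdom, round(total / n), reported_mean)
```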
Amino acid frequency for the kingdoms of life in the Proteome-pI database
| Kingdom | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | Total amino acids |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Viruses | 6.61 | 1.76 | 5.81 | 6.04 | 4.25 | 5.79 | 2.15 | 6.53 | 6.35 | 8.84 | 2.46 | 5.41 | 4.62 | 3.39 | 5.24 | 7.06 | 6.06 | 6.50 | 1.19 | 3.94 | 6 150 189 |
| Archaea | 8.20 | 0.98 | 6.21 | 7.69 | 3.86 | 7.58 | 1.77 | 7.03 | 5.27 | 9.31 | 2.35 | 3.68 | 4.26 | 2.38 | 5.51 | 6.17 | 5.44 | 7.80 | 1.03 | 3.45 | 89 488 664 |
| Bacteria | 10.06 | 0.94 | 5.59 | 6.15 | 3.89 | 7.76 | 2.06 | 5.89 | 4.68 | 10.09 | 2.38 | 3.58 | 4.61 | 3.58 | 5.88 | 5.85 | 5.52 | 7.27 | 1.27 | 2.94 | 3 716 982 916 |
| Eukaryota | 7.63 | 1.76 | 5.40 | 6.42 | 3.87 | 6.33 | 2.44 | 5.10 | 5.64 | 9.29 | 2.25 | 4.28 | 5.41 | 4.21 | 5.71 | 8.34 | 5.56 | 6.20 | 1.24 | 2.87 | 3 743 221 293 |
| All | 8.76 | 1.38 | 5.49 | 6.32 | 3.87 | 7.03 | 2.26 | 5.49 | 5.19 | 9.68 | 2.32 | 3.93 | 5.02 | 3.90 | 5.78 | 7.14 | 5.53 | 6.73 | 1.25 | 2.91 | 7 555 843 062 |
*Similar statistics for all 5029 proteomes included in Proteome-pI are available online on individual subpages. For di-amino acid frequencies see Supplementary Table S2.
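Because the isoelectric point depends on the balance of acidic and basic residues, the table can be condensed into a rough composition summary. The sketch below copies the Asp, Glu, Lys, Arg and His columns and the total amino acid counts from the table. It is only a crude indicator under a simplifying assumption (histidine is counted as basic although it is only partly protonated near neutral pH), not a pI prediction, and the helper `charge_balance` is hypothetical.

```python
# Percentages copied from the table above (charged residues only).
freq = {
    "Viruses":   {"Asp": 5.81, "Glu": 6.04, "Lys": 6.35, "Arg": 5.24, "His": 2.15},
    "Archaea":   {"Asp": 6.21, "Glu": 7.69, "Lys": 5.27, "Arg": 5.51, "His": 1.77},
    "Bacteria":  {"Asp": 5.59, "Glu": 6.15, "Lys": 4.68, "Arg": 5.88, "His": 2.06},
    "Eukaryota": {"Asp": 5.40, "Glu": 6.42, "Lys": 5.64, "Arg": 5.71, "His": 2.44},
}
# Total amino acids per kingdom and the "All" row, copied from the last column.
totals = {
    "Viruses": 6_150_189,
    "Archaea": 89_488_664,
    "Bacteria": 3_716_982_916,
    "Eukaryota": 3_743_221_293,
}
all_total = 7_555_843_062


def charge_balance(f):
    """Return (acidic %, basic %) for one row of the table."""
    acidic = f["Asp"] + f["Glu"]
    basic = f["Lys"] + f["Arg"] + f["His"]
    return round(acidic, 2), round(basic, 2)


for kingdom, f in freq.items():
    acidic, basic = charge_balance(f)
    print(f"{kingdom:10s} acidic={acidic:5.2f}%  basic={basic:5.2f}%")

# The per-kingdom totals should add up to the "All" row.
print(sum(totals.values()) == all_total)
```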
Figure 1. Proteome-pI example report for Salmonella enterica. At the top, the average isoelectric point, precalculated fractions of proteins according to isoelectric point and virtual 2D-PAGE plot for the proteome are shown. In the next section, the user can retrieve a subset of proteins within specified isoelectric point and molecular weight ranges calculated using a particular method. Next, proteins with minimal and maximal isoelectric points are presented along with some general statistics.
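The subset retrieval described in the caption amounts to a range filter on two columns. The following is a minimal offline sketch of that idea, assuming a CSV table with hypothetical column names (`id`, `mw_kda`, `pi`) and placeholder rows; the actual layout of the database exports may differ, and `select_fraction` is not part of any Proteome-pI interface.

```python
import csv
import io

# Placeholder rows with hypothetical column names; not real database entries.
SAMPLE = """id,mw_kda,pi
protA,12.5,4.8
protB,48.0,6.9
protC,95.2,9.7
"""


def select_fraction(rows, pi_range, mw_range):
    """Keep rows whose pI and molecular weight fall inside both closed ranges."""
    pi_lo, pi_hi = pi_range
    mw_lo, mw_hi = mw_range
    return [
        r for r in rows
        if pi_lo <= float(r["pi"]) <= pi_hi and mw_lo <= float(r["mw_kda"]) <= mw_hi
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
fraction = select_fraction(rows, pi_range=(4.0, 7.0), mw_range=(10.0, 50.0))
print([r["id"] for r in fraction])
```

To filter by a specific prediction method, the `pi` column would be replaced by the column holding that method's values.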
Figure 2. Isoelectric points and molecular weights across kingdoms of life. Data for the proteomes of 135 Archaea, 127 viruses (>50 proteins), 3775 bacteria and 614 eukaryotes.