Alessandra Tiengo, Nicola Barbarini, Sonia Troiani, Luisa Rusconi, Paolo Magni.
Abstract
BACKGROUND: One of the topics of major interest in proteomics is protein identification. Protein identification can be achieved by analyzing the mass spectrum of a protein sample through different approaches. One of them, called Peptide Mass Fingerprinting (PMF), combines mass spectrometry (MS) data with search strategies in a suitable database of known proteins to provide a list of candidate proteins ranked by a score. To this end, several algorithms and software tools have been proposed. However, the scoring methods and, above all, the statistical evaluation of the results can be significantly improved.
Year: 2009 PMID: 19828071 PMCID: PMC2762060 DOI: 10.1186/1471-2105-10-S12-S11
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. PMF consists of three steps. (1) Preparation of the biological sample: a band or a spot of the electrophoretic gel is selected and digested by a suitable protease, such as trypsin. The resulting mixture of peptides is analyzed with a mass spectrometer, usually in MALDI-TOF configuration. (2) A reference protein database is created by reproducing step 1 in silico on a set of known proteins, also considering possible missed cleavages and post-translational modifications. (3) The acquired spectrum is matched against the theoretical spectra generated by in silico digestion of all proteins in the reference database (step 2), and a ranked list of candidate proteins is obtained.
The amino acids and their monoisotopic and average masses used for computing the molecular weight of a protein or a peptide are reported (source [20]).
| Amino acid | Symbol | Monoisotopic mass (Da) | Average mass (Da) |
| Alanine | A | 71.04 | 71.08 |
| Arginine | R | 156.10 | 156.19 |
| Asparagine | N | 114.04 | 114.10 |
| Aspartic acid | D | 115.03 | 115.09 |
| Cysteine | C | 103.01 | 103.14 |
| Glutamic acid | E | 129.04 | 129.12 |
| Glutamine | Q | 128.06 | 128.13 |
| Glycine | G | 57.02 | 57.05 |
| Histidine | H | 137.06 | 137.14 |
| Isoleucine | I | 113.08 | 113.16 |
| Leucine | L | 113.08 | 113.16 |
| Lysine | K | 128.09 | 128.17 |
| Methionine | M | 131.04 | 131.19 |
| Phenylalanine | F | 147.07 | 147.18 |
| Proline | P | 97.05 | 97.12 |
| Serine | S | 87.03 | 87.08 |
| Threonine | T | 101.05 | 101.11 |
| Tryptophan | W | 186.08 | 186.21 |
| Tyrosine | Y | 163.06 | 163.18 |
| Valine | V | 99.07 | 99.13 |
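The residue masses above can be turned into a peptide mass by summing them and adding one water molecule for the free termini. The following is a minimal sketch of that arithmetic, not the MsPI implementation: the function name `peptide_mass` and the water masses (18.01 Da monoisotopic, 18.02 Da average, rounded to the precision of the table) are assumptions that do not appear in the source.

```python
# Residue masses (Da) copied from the table above.
MONO = {
    "A": 71.04, "R": 156.10, "N": 114.04, "D": 115.03, "C": 103.01,
    "E": 129.04, "Q": 128.06, "G": 57.02, "H": 137.06, "I": 113.08,
    "L": 113.08, "K": 128.09, "M": 131.04, "F": 147.07, "P": 97.05,
    "S": 87.03, "T": 101.05, "W": 186.08, "Y": 163.06, "V": 99.07,
}
AVG = {
    "A": 71.08, "R": 156.19, "N": 114.10, "D": 115.09, "C": 103.14,
    "E": 129.12, "Q": 128.13, "G": 57.05, "H": 137.14, "I": 113.16,
    "L": 113.16, "K": 128.17, "M": 131.19, "F": 147.18, "P": 97.12,
    "S": 87.08, "T": 101.11, "W": 186.21, "Y": 163.18, "V": 99.13,
}

# Assumption: mass of one water molecule (H at the N-terminus plus OH at the
# C-terminus), rounded to two decimals. These values are not in the table.
WATER_MONO = 18.01
WATER_AVG = 18.02


def peptide_mass(sequence, monoisotopic=True):
    """Return the neutral mass (Da) of an unmodified peptide or protein."""
    table = MONO if monoisotopic else AVG
    water = WATER_MONO if monoisotopic else WATER_AVG
    # A KeyError here signals a symbol that is not one of the 20 listed residues.
    return sum(table[aa] for aa in sequence) + water


if __name__ == "__main__":
    print(round(peptide_mass("GG"), 2))                      # monoisotopic
    print(round(peptide_mass("GG", monoisotopic=False), 2))  # average
```

The sketch ignores post-translational modifications and charge state; a measured m/z value would also need the mass of the added proton(s) taken into account.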
The amino acids and the terminal groups that influence the pI of a protein with their charge polarity are shown. Four pKr values are tabulated for each of them, as defined by Lehninger [21], Solomon [22], Sillero [22], Rodwell [22]. In this work, Lehninger's values were used.
| Amino acid | Polarity | pKr Lehninger | pKr Solomon | pKr Sillero | pKr Rodwell |
| Glutamic acid (E) | - | 4.25 | 4.30 | 4.50 | 4.25 |
| Aspartic acid (D) | - | 3.65 | 3.90 | 4.00 | 3.68 |
| Cysteine (C) | - | 8.18 | 8.30 | 9.00 | 8.33 |
| Tyrosine (Y) | - | 10.07 | 10.10 | 10.00 | 10.07 |
| Histidine (H) | + | 6.00 | 6.00 | 6.40 | 6.00 |
| Lysine (K) | + | 10.53 | 10.50 | 10.40 | 11.50 |
| Arginine (R) | + | 12.48 | 12.50 | 12.00 | 11.50 |
| Carboxyl terminal (COOH) | - | 3.10 | 2.40 | 3.20 | 3.10 |
| Amino terminal (NH2) | + | 8.00 | 9.60 | 8.20 | 8.00 |
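A common way to estimate the pI from such pK values is to compute the net charge of the sequence with the Henderson-Hasselbalch relation and search for the pH at which it vanishes. The sketch below uses the Lehninger column and a simple bisection; it is an illustration under those assumptions, and the source does not confirm that MsPI computes the pI in exactly this way. The names `net_charge` and `isoelectric_point` are hypothetical.

```python
# pK values from the Lehninger column of the table above.
PK_NEGATIVE = {"E": 4.25, "D": 3.65, "C": 8.18, "Y": 10.07}
PK_POSITIVE = {"H": 6.00, "K": 10.53, "R": 12.48}
PK_C_TERM = 3.10  # carboxyl terminal, negative when deprotonated
PK_N_TERM = 8.00  # amino terminal, positive when protonated


def net_charge(sequence, ph):
    """Net charge of the sequence at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10.0 ** (ph - PK_N_TERM))
    charge -= 1.0 / (1.0 + 10.0 ** (PK_C_TERM - ph))
    for aa in sequence:
        if aa in PK_POSITIVE:
            charge += 1.0 / (1.0 + 10.0 ** (ph - PK_POSITIVE[aa]))
        elif aa in PK_NEGATIVE:
            charge -= 1.0 / (1.0 + 10.0 ** (PK_NEGATIVE[aa] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Bisection on pH in [0, 14]; the net charge decreases as pH increases."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0.0:
            low = mid   # still positively charged: the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    print(round(isoelectric_point("ACDEFGHIK"), 2))
```

Bisection is sufficient here because the net charge is a monotonically decreasing function of pH, so there is a single zero crossing in the search interval.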
The exceptions to the basic cleavage rules of trypsin are shown: AA-1 is the amino acid immediately preceding the K or R residue, AA0 is the residue at the cleavage site itself (K or R), and AA1 is the first amino acid after the cleavage site. In these cases, trypsin does not easily cut the amino-acid chain (source [23]).
| AA-1 | AA0 | AA1 |
| not W | K | P |
| not M | R | P |
| C | K | D |
| D | K | D |
| C | K | H |
| C | K | Y |
| C | R | K |
| R | R | H |
| R | R | R |
The hypotheses underlying the three scoring methods proposed by Samuelsson et al. are shown.
| Method | Mass tolerance | Peptide mass distribution |
| 1 | Absolute (Da) | Uniform |
| 2 | Relative (ppm) | Uniform |
| 3 | Relative (ppm) | Not uniform |
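The difference between an absolute and a relative mass tolerance can be illustrated with a short matching routine. This sketch only shows how the two tolerance hypotheses change which query masses count as matches; it does not implement the scores of Samuelsson et al., and the function names are hypothetical.

```python
def within_tolerance(observed, theoretical, tolerance, unit="Da"):
    """Check whether an observed mass matches a theoretical one.

    unit="Da":  fixed window, |observed - theoretical| <= tolerance
    unit="ppm": window proportional to the theoretical mass
    """
    delta = abs(observed - theoretical)
    if unit == "Da":
        return delta <= tolerance
    if unit == "ppm":
        return delta <= theoretical * tolerance * 1e-6
    raise ValueError("unit must be 'Da' or 'ppm'")


def count_matches(query_masses, theoretical_masses, tolerance, unit="Da"):
    """Number of query masses that match at least one theoretical mass."""
    return sum(
        any(within_tolerance(q, t, tolerance, unit) for t in theoretical_masses)
        for q in query_masses
    )


if __name__ == "__main__":
    queries = [1000.2, 3000.25]
    theoretical = [1000.0, 3000.0]
    print(count_matches(queries, theoretical, 0.3, "Da"))
    print(count_matches(queries, theoretical, 100, "ppm"))
```

With a 100 ppm tolerance the window is 0.1 Da at 1000 Da and 0.3 Da at 3000 Da, so a relative tolerance is stricter than 0.3 Da for light peptides and looser for peptides heavier than 3000 Da.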
The number of query masses in each sample band before and after contaminant mass removal by the MsPI routine is shown.
| Band | Number of query masses (before removal) | After removal, mass tolerance 0.3 Da | After removal, mass tolerance 100 ppm |
| 1 | 121 | 64 | 65 |
| 2 | 111 | 72 | 72 |
| 3 | 123 | 73 | 74 |
| 4 | 106 | 55 | 55 |
| 5 | 164 | 98 | 99 |
| 6 | 172 | 116 | 116 |
| 7 | 123 | 62 | 62 |
| 8 | 71 | 39 | 38 |
| 9 | 175 | 119 | 120 |
| 10 | 52 | 27 | 28 |
For MsPI 1, Mascot 1 and Piums the mass tolerance was set to 0.3 Da; in the other cases it was set to 100 ppm. For MsPI 1, MsPI 2 and MsPI 3, score 1, 2 and 3 were respectively used with uniform distribution. For each band the position of the "true" protein in the significant candidate list, the length of that list either using or not using the knowledge about the MW of the band, the MW and the pI of the "true" protein, the number of matching masses and the sequence coverage are reported.
| Gel band (#) | Software tool | Rank (# significant proteins without MW filtering) | Rank (# significant proteins with MW filtering) | MW (Da) | pI | Matches (#) | Coverage (%) |
| 1 | MsPI 1 | 2 (2) | 1 (1) | 123852 | 5.31 | 23 | 0.267 |
| | MsPI 2 | 7 (8) | 1 (1) | 123852 | 5.31 | 23 | 0.270 |
| | MsPI 3 | 2 (2) | 1 (1) | 123852 | 5.31 | 23 | 0.267 |
| | Mascot 1 | 1 (1) | 1 (1) | 124292 | 5.50 | 23 | 0.250 |
| | Mascot 2 | 1 (1) | 1 (1) | 124292 | 5.50 | 23 | 0.250 |
| | Piums | - | - (-) | - | - | 21 | 0.242 |
| 2 | MsPI 1 | 1 (2) | 1 (1) | 191785 | 5.32 | 36 | 0.252 |
| | MsPI 2 | 1 (4) | 1 (1) | 191785 | 5.32 | 35 | 0.252 |
| | MsPI 3 | 1 (2) | 1 (1) | 191785 | 5.32 | 35 | 0.252 |
| | Mascot 1 | 1 (1) | 1 (1) | 193260 | 5.48 | 36 | 0.250 |
| | Mascot 2 | 1 (3) | 1 (1) | 193260 | 5.48 | 35 | 0.250 |
| | Piums | 1 (1) | - | - | - | 36 | 0.257 |
| 3 | MsPI 1 | 2 (8) | 1 (1) | 104981 | 5.08 | 21 | 0.245 |
| | MsPI 2 | - (4) | - (-) | 104981 | 5.08 | 21 | 0.245 |
| | MsPI 3 | 2 (4) | 1 (1) | 104981 | 5.08 | 21 | 0.245 |
| | Mascot 1 | 1 (2) | 1 (1) | 105245 | 5.27 | 21 | 0.240 |
| | Mascot 2 | 1 (6) | 1 (2) | 105245 | 5.27 | 21 | 0.240 |
| | Piums | - (-) | - | - | - | 19 | 0.254 |
| 4 | MsPI 1 | 2 (4) | 1 (1) | 95382 | 6.38 | 20 | 0.286 |
| | MsPI 2 | 1 (5) | 1 (1) | 95382 | 6.38 | 20 | 0.286 |
| | MsPI 3 | 1 (6) | 1 (2) | 95382 | 6.38 | 20 | 0.286 |
| | Mascot 1 | 1 (1) | 1 (1) | 96246 | 6.41 | 22 | 0.320 |
| | Mascot 2 | 1 (1) | 1 (1) | 96246 | 6.41 | 22 | 0.320 |
| | Piums | - (-) | - | - | - | 19 | 0.341 |
| 5 | MsPI 1 | 2 (8) | 2 (3) | 84351 | 4.70 | 28 | 0.418 |
| | MsPI 2 | 3 (3) | 2 (2) | 84351 | 4.70 | 28 | 0.418 |
| | MsPI 3 | 2 (12) | 2 (3) | 84351 | 4.70 | 28 | 0.418 |
| | Mascot 1 | 1 (2) | 1 (2) | 83554 | 4.97 | 35 | 0.480 |
| | Mascot 2 | 1 (3) | 1 (2) | 83554 | 4.97 | 34 | 0.480 |
| | Piums | 1 (1) | - | - | - | 28 | 0.364 |
| 6 | MsPI 1 | 2 (8) | 1 (1) | 67900 | 7.37 | 24 | 0.515 |
| | MsPI 2 | - (1) | - (-) | 67900 | 7.37 | 23 | 0.509 |
| | MsPI 3 | 1 (6) | 1 (1) | 67900 | 7.37 | 23 | 0.509 |
| | Mascot 1 | 1 (2) | 1 (1) | 68519 | 7.58 | 24 | 0.510 |
| | Mascot 2 | 1 (4) | 1 (2) | 68519 | 7.58 | 23 | 0.500 |
| | Piums | - (-) | - | - | - | 19 | 0.446 |
| 7 | MsPI 1 | 1 (3) | 1 (1) | 53146 | 4.82 | 21 | 0.468 |
| | MsPI 2 | 1 (2) | 1 (1) | 53146 | 4.82 | 21 | 0.468 |
| | MsPI 3 | 1 (2) | 1 (1) | 53146 | 4.82 | 21 | 0.468 |
| | Mascot 1 | 1 (1) | 1 (1) | 53676 | 5.06 | 21 | 0.500 |
| | Mascot 2 | 1 (3) | 1 (1) | 53676 | 5.06 | 21 | 0.500 |
| | Piums | - (-) | - | - | - | 16 | 0.413 |
| 8 | MsPI 1 | - (2) | - (-) | 49674 | 4.53 | 8 | 0.232 |
| | MsPI 2 | - (1) | - (-) | 49674 | 4.53 | 8 | 0.232 |
| | MsPI 3 | - (1) | - (-) | 49674 | 4.53 | 8 | 0.232 |
| | Mascot 1 | - (-) | - (-) | 50095 | 4.78 | 8 | 0.230 |
| | Mascot 2 | - (-) | - (-) | 50095 | 4.78 | 8 | 0.230 |
| | Piums | - (-) | - | - | - | 8 | 0.232 |
| 9 | MsPI 1 | 1 (4) | 1 (1) | 45982 | 9.79 | 31 | 0.784 |
| | MsPI 2 | 1 (4) | 1 (1) | 45982 | 9.79 | 32 | 0.787 |
| | MsPI 3 | 1 (5) | 1 (1) | 45982 | 9.79 | 32 | 0.787 |
| | Mascot 1 | 1 (1) | 1 (1) | 46180 | 9.45 | 31 | 0.730 |
| | Mascot 2 | 1 (1) | 1 (1) | 46180 | 9.45 | 31 | 0.730 |
| | Piums | 1 (1) | - | - | - | 27 | 0.715 |
| 10 | MsPI 1 | 1 (9) | 1 (2) | 41611 | 5.10 | 13 | 0.349 |
| | MsPI 2 | 2 (5) | 2 (2) | 41611 | 5.10 | 13 | 0.379 |
| | MsPI 3 | 1 (6) | 1 (4) | 41611 | 5.10 | 13 | 0.379 |
| | Mascot 1 | 1 (4) | 1 (4) | 42052 | 5.29 | 13 | 0.370 |
| | Mascot 2 | 1 (4) | 1 (4) | 42052 | 5.29 | 13 | 0.370 |
| | Piums | 2 (2) | - | - | - | 13 | 0.380 |
Figure 2. The directory structure created during the installation of the MsPI tool.