Malik N Akhtar, Bruce R Southey, Per E Andrén, Jonathan V Sweedler, Sandra L Rodriguez-Zas.
Abstract
Neuropeptide identification in mass spectrometry experiments using database search programs developed for proteins is challenging. Unlike proteins, neuropeptides and prohormone peptides can be identified only when the complete sequence is detected from a single spectrum. This study compared the performance of three open-source programs used to identify proteins, OMSSA, X!Tandem and Crux, in identifying prohormone peptides. From a target database of 7850 prohormone peptides, 23550 query spectra were simulated across different scenarios. Crux was the only program that correctly matched all peptides regardless of p-value; at p-value < 1 × 10^-2, 33%, 64%, and >75% of the 5, 6, and ≥7 amino acid peptides, respectively, were detected. Crux also had the best performance in the identification of peptides from chimera spectra and in a variety of missing ion scenarios. OMSSA, X!Tandem and Crux correctly detected 98.9% (99.9%), 93.9% (97.4%) and 88.7% (98.3%) of the peptides at E- or p-value < 1 × 10^-6 (< 1 × 10^-2), respectively. OMSSA and X!Tandem outperformed the other programs in significance level and computational speed, respectively. A consensus approach is not recommended because some prohormone peptides were only identified by one program.
Year: 2012 PMID: 23082934 PMCID: PMC3516866 DOI: 10.1021/pr3007123
Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 4.466
Summary of the Peptides Used to Simulate the Query Spectra and Populate the Target Database
| | |
|---|---|
| Number of prohormones | 92 |
| Number of peptides | 7850 |
| Average (minimum, maximum) number of peptides per prohormone | 74.06 (1, 1139) |
| Average (minimum, maximum) peptide size (amino acids) | 75.23 (5, 255) |
| Percentage of peptides from UniProt | 3.35% |
| Percentage of peptides not from UniProt | 96.65% |
Number of Peptides Correctly Matched by X!Tandem, OMSSA and Crux for Precursor Charge States +1, +2, and +3 Across Various Scenarios
| | | correctly matched | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | OMSSA+X!Tandem+Crux | | | | | | OMSSA+Crux | | | | X!Tandem+Crux | | | Crux |
| | | significant | | | | | | | | | | | | | |
| scenario | charge | All | OC | OX | O | C | N | OC | O | C | N | XC | C | N | N |
| b + y ions + neutral mass loss | +1 | 7028 | 8 | 378 | 327 | 0 | 23 | 0 | 0 | 1 | 84 | 1 | 0 | 0 | 0 |
| | +2 | 7012 | 7 | 397 | 313 | 0 | 35 | 0 | 0 | 0 | 85 | 1 | 0 | 0 | 0 |
| | +3 | 7027 | 5 | 379 | 265 | 0 | 87 | 0 | 0 | 3 | 82 | 2 | 0 | 0 | 0 |
| b + y ions – neutral mass loss | +1 | 6874 | 5 | 503 | 339 | 0 | 3 | 41 | 0 | 1 | 84 | 0 | 0 | 0 | 0 |
| | +2 | 6888 | 5 | 485 | 340 | 0 | 3 | 44 | 0 | 0 | 85 | 0 | 0 | 0 | 0 |
| | +3 | 6978 | 3 | 389 | 337 | 0 | 8 | 50 | 0 | 1 | 84 | 0 | 0 | 0 | 0 |
| b ions + neutral mass loss | +1 | 6837 | 109 | 105 | 184 | 46 | 484 | 0 | 0 | 1 | 84 | 0 | 0 | 0 | 0 |
| | +2 | 6831 | 99 | 109 | 175 | 60 | 491 | 0 | 0 | 0 | 85 | 0 | 0 | 0 | 0 |
| | +3 | 6887 | 99 | 57 | 105 | 57 | 560 | 0 | 0 | 1 | 84 | 0 | 0 | 0 | 0 |
| y ions + neutral mass loss | +1 | 6911 | 126 | 118 | 221 | 17 | 370 | 2 | 0 | 1 | 84 | 0 | 0 | 0 | 0 |
| | +2 | 6905 | 116 | 133 | 202 | 11 | 397 | 1 | 0 | 1 | 84 | 0 | 0 | 0 | 0 |
| | +3 | 6897 | 99 | 133 | 157 | 24 | 454 | 1 | 0 | 2 | 83 | 0 | 0 | 0 | 0 |
| 50% ions + neutral mass loss | +1 | 6646 | 230 | 69 | 394 | 14 | 410 | 1 | 0 | 0 | 85 | 1 | 0 | 0 | 0 |
| | +2 | 6638 | 244 | 69 | 382 | 10 | 421 | 1 | 0 | 2 | 83 | 0 | 0 | 0 | 0 |
| | +3 | 6668 | 254 | 36 | 278 | 27 | 502 | 0 | 0 | 1 | 84 | 0 | 0 | 0 | 0 |
Correctly matched peptide regardless of E- or p-value level on all three, two or one of the database search programs. Missing columns (OMSSA+Crux, X!Tandem, OMSSA) are columns with “0” in all rows.
Detection at E- or p-value < 1 × 10^-6. All: OMSSA, X!Tandem and Crux; OC: only OMSSA and Crux E- or p-value < 1 × 10^-6; OX: only OMSSA and X!Tandem E-value < 1 × 10^-6; XC: only X!Tandem and Crux E- or p-value < 1 × 10^-6; O: only OMSSA E-value < 1 × 10^-6; C: only Crux p-value < 1 × 10^-6; N: No program E- or p-value < 1 × 10^-6. Missing columns (XC and X within OMSSA+X!Tandem+Crux, X in X!Tandem+Crux) are columns with “0” in all rows.
Simulated query scenarios: b + y ions + neutral mass loss: Match using all b- and y-ion series including neutral mass losses; b + y ions - neutral mass loss: Match using all b- and y-ion series excluding neutral mass losses; b ions + neutral mass loss: Match only using the b-ion series including neutral mass losses; y ions + neutral mass loss: Match only using the y-ion series including neutral mass losses; 50% ions + neutral mass loss: Match only using random 50% of all ions including neutral mass losses; 25% ions + neutral mass loss: Match only using random 25% of all ions including neutral mass losses.
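The scenarios above can be illustrated with a minimal sketch that builds theoretical fragment ions for a peptide and then restricts them by ion series, neutral losses, or a random fraction. This is not the simulation used in the study: the function names `fragment_ions` and `scenario`, the restriction to singly charged fragments, and the simplification of applying water and ammonia losses to every fragment are all assumptions made for illustration. Only standard monoisotopic masses are used.

```python
import random

# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
AMMONIA = 17.026549
PROTON = 1.007276


def fragment_ions(peptide, neutral_losses=True):
    """Return (label, m/z) pairs for singly charged b- and y-ions.

    Assumption: water and ammonia losses are added to every fragment,
    which is a simplification of residue-specific loss rules.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    ions = []
    for i in range(1, n):
        b = sum(masses[:i]) + PROTON               # N-terminal fragment
        y = sum(masses[n - i:]) + WATER + PROTON   # C-terminal fragment
        ions.append((f"b{i}", b))
        ions.append((f"y{i}", y))
        if neutral_losses:
            for tag, loss in (("-H2O", WATER), ("-NH3", AMMONIA)):
                ions.append((f"b{i}{tag}", b - loss))
                ions.append((f"y{i}{tag}", y - loss))
    return ions


def scenario(peptide, series="by", fraction=1.0, neutral_losses=True, seed=0):
    """Select ions for one hypothetical scenario, sorted by m/z.

    series: "by", "b", or "y"; fraction: share of ions kept at random.
    """
    ions = [ion for ion in fragment_ions(peptide, neutral_losses)
            if ion[0][0] in series]
    if fraction < 1.0:
        rng = random.Random(seed)
        keep = max(1, round(len(ions) * fraction))
        ions = rng.sample(ions, keep)
    return sorted(ions, key=lambda ion: ion[1])
```

Under these assumptions, `scenario(p, series="b")` corresponds to the "b ions + neutral mass loss" case and `scenario(p, fraction=0.5)` to the "50% ions + neutral mass loss" case.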
Figure 1. Venn diagram depicting the common and distinct true positive peptides identified from the three database search programs, X!Tandem, OMSSA, and Crux with peptide charge state +3 using (A) all ion information; (B) only y-ion series information; (C) only b-ion series information.
Figure 2. Comparison of OMSSA, Crux, and X!Tandem log10 (E- or p-values) averaged across peptide length and precursor charge states for all peptides and (inset) magnified for peptides up to 60 amino acids in length.
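One row of the table of correctly matched peptides can be collapsed into per-program totals by reading each column as a pair: the programs that matched the peptide and the programs for which the match was significant at 1 × 10^-6. The sketch below does this for the "b + y ions + neutral mass loss", charge +1 row. The `COLUMNS` encoding and the `tally` helper are illustrative names, not part of the study, and the column order is taken from the table header.

```python
# Each entry: (programs with a correct match, programs significant at 1e-6).
# O = OMSSA, X = X!Tandem, C = Crux; "" stands for the N (none) columns.
COLUMNS = [
    ("OXC", "OXC"), ("OXC", "OC"), ("OXC", "OX"),
    ("OXC", "O"), ("OXC", "C"), ("OXC", ""),
    ("OC", "OC"), ("OC", "O"), ("OC", "C"), ("OC", ""),
    ("XC", "XC"), ("XC", "C"), ("XC", ""),
    ("C", ""),
]


def tally(row):
    """Sum one table row into per-program matched and significant counts."""
    matched = {"O": 0, "X": 0, "C": 0}
    significant = {"O": 0, "X": 0, "C": 0}
    for (match_set, sig_set), count in zip(COLUMNS, row):
        for program in match_set:
            matched[program] += count
        for program in sig_set:
            significant[program] += count
    return matched, significant


# b + y ions + neutral mass loss, precursor charge +1
row = [7028, 8, 378, 327, 0, 23, 0, 0, 1, 84, 1, 0, 0, 0]
matched, significant = tally(row)
print(sum(row), matched, significant)
```

For this row the counts sum to the 7850 peptides in the target database, and the tally gives 7741 significant matches for OMSSA and 7407 for X!Tandem.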
Number of Peptides Unmatched, Mismatched and Correctly Matched at Various Significance Levels by X!Tandem, OMSSA and Crux Including or Excluding Neutral Mass Losses When All Ions from Both Series in the Query Are Available and for Precursor Charge State +1
| | OMSSA | | X!Tandem | | Crux | |
|---|---|---|---|---|---|---|
| significance | including | excluding | including | excluding | including | excluding |
| Unmatched | 1 | 0 | 69 | 115 | 0 | 0 |
| Mismatched | 0 | 0 | 16 | 11 | 0 | 0 |
| 0 | 0 | 0 | 4 | 2 | 1 | 1 |
| 1 | 1 | 0 | 73 | 73 | 118 | 129 |
| 2 | 11 | 1 | 91 | 93 | 214 | 236 |
| 3 | 48 | 2 | 82 | 80 | 171 | 226 |
| 4 | 24 | 10 | 33 | 30 | 160 | 178 |
| 5 | 24 | 75 | 75 | 69 | 151 | 170 |
| 6 | 49 | 4 | 91 | 95 | 172 | 172 |
| 7 | 73 | 5 | 83 | 74 | 171 | 213 |
| 8 | 28 | 63 | 47 | 13 | 194 | 200 |
| ≥9 | 7591 | 7690 | 7186 | 7195 | 6498 | 6325 |
| Prop >6 | 98.6% | 98.9% | 94.4% | 94.0% | 89.6% | 88.0% |
Significance threshold (t) for a match to be considered significant at E- or p-value < 1 × 10^-t.
Including or excluding neutral mass losses.
Unmatched: the program does not provide a match with the program setting.
Mismatched: the program provided an incorrect match.
Percentage of the matches that have E- or p-value < 1 × 10^-6.
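The percentage row can be reproduced from the counts by summing the rows with t ≥ 6 and dividing by the 7850 peptides in the target database. The sketch below does this for the three "including neutral mass losses" columns. The dictionary layout and the name `percent_significant` are illustrative, and the ≥9 row is stored under the key 9.

```python
TOTAL_PEPTIDES = 7850

# Correct matches per significance threshold t, "including" columns only.
# Key 9 holds the ">= 9" row of the table.
INCLUDING = {
    "OMSSA":    {0: 0, 1: 1, 2: 11, 3: 48, 4: 24, 5: 24,
                 6: 49, 7: 73, 8: 28, 9: 7591},
    "X!Tandem": {0: 4, 1: 73, 2: 91, 3: 82, 4: 33, 5: 75,
                 6: 91, 7: 83, 8: 47, 9: 7186},
    "Crux":     {0: 1, 1: 118, 2: 214, 3: 171, 4: 160, 5: 151,
                 6: 172, 7: 171, 8: 194, 9: 6498},
}


def percent_significant(counts_by_t, threshold=6, total=TOTAL_PEPTIDES):
    """Percentage of all peptides correctly matched with t >= threshold."""
    hits = sum(n for t, n in counts_by_t.items() if t >= threshold)
    return 100.0 * hits / total


for program, counts in INCLUDING.items():
    print(program, round(percent_significant(counts), 1))
```

The denominator is the full set of 7850 peptides, so unmatched and mismatched peptides lower the percentage even though they have no significance level of their own.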
Number of Peptides Unmatched, Mismatched and Correctly Matched at Various Significance Levels by X!Tandem, OMSSA and Crux When Either the b-, y-Ion Series, 50%, or 25% of the Ions in the Query Are Available for Precursor Charge State +1 and Including Neutral Mass Losses
| | OMSSA | | | | X!Tandem | | | | Crux | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| significance | b | y | 50% | 25% | b | y | 50% | 25% | b | y | 50% | 25% |
| Unmatched | 0 | 0 | 1 | 73 | 79 | 72 | 73 | 295 | 0 | 0 | 0 | 0 |
| Mismatched | 0 | 0 | 0 | 4 | 6 | 15 | 13 | 10 | 0 | 0 | 0 | 0 |
| 0 | 160 | 48 | 71 | 492 | 237 | 138 | 316 | 1133 | 0 | 2 | 9 | 60 |
| 1 | 84 | 62 | 72 | 182 | 109 | 113 | 170 | 284 | 93 | 131 | 151 | 322 |
| 2 | 87 | 86 | 85 | 184 | 149 | 156 | 133 | 228 | 229 | 196 | 243 | 302 |
| 3 | 100 | 99 | 87 | 178 | 122 | 113 | 166 | 229 | 215 | 155 | 188 | 218 |
| 4 | 94 | 88 | 106 | 160 | 96 | 98 | 104 | 140 | 154 | 169 | 180 | 183 |
| 5 | 90 | 89 | 88 | 167 | 105 | 109 | 157 | 136 | 167 | 140 | 188 | 209 |
| 6 | 64 | 77 | 81 | 133 | 137 | 139 | 109 | 120 | 188 | 173 | 190 | 273 |
| 7 | 94 | 73 | 86 | 113 | 104 | 105 | 122 | 146 | 167 | 196 | 209 | 326 |
| 8 | 93 | 90 | 74 | 106 | 89 | 103 | 127 | 131 | 168 | 226 | 244 | 365 |
| ≥9 | 6984 | 7138 | 7099 | 6058 | 6617 | 6689 | 6360 | 4998 | 6469 | 6462 | 6248 | 5592 |
| Prop >6 | 92.2 | 94.0 | 93.5 | 81.7 | 88.5 | 89.6 | 85.6 | 68.7 | 89.1 | 89.9 | 87.8 | 83.5 |
Significance threshold (t) for a match to be considered significant at E- or p-value < 1 × 10^-t.
b-, y-ion series, 50%, or 25% of the ions in the query are available.
Unmatched: the program does not provide a match with the program setting.
Mismatched: the program provided an incorrect match.
Percentage of the matches that have E- or p-value < 1 × 10^-6.
Figure 3. Venn diagram depicting the common and distinct peptides identified by all three database search programs with peptide charge state +3 using only (A) 50% or (B) 25% of all ion information.
Number of Peptides Unmatched, Mismatched and Correctly Matched at Various Significance Levels by X!Tandem, OMSSA and Crux When Either the b- or the y-Ion Series Is Used to Score the Match for Precursor Charge State +1 and Including Neutral Mass Losses
| | OMSSA | | X!Tandem | |
|---|---|---|---|---|
| significance | b | y | b | y |
| Unmatched | 415 | 365 | 75 | 74 |
| Mismatched | 11 | 11 | 10 | 11 |
| 0 | 122 | 47 | 248 | 151 |
| 1 | 63 | 50 | 113 | 108 |
| 2 | 76 | 69 | 187 | 179 |
| 3 | 76 | 66 | 270 | 214 |
| 4 | 103 | 87 | 746 | 591 |
| 5 | 62 | 81 | 1902 | 1785 |
| 6 | 102 | 90 | 1536 | 1948 |
| 7 | 108 | 80 | 842 | 861 |
| 8 | 116 | 126 | 690 | 648 |
| ≥9 | 6596 | 6778 | 1231 | 1280 |
| Prop >6 | 88.2% | 90.1% | 54.8% | 60.3% |
Significance threshold (t) for a match to be considered significant at E- or p-value < 1 × 10^-t.
b- or y-ions used to score the peptide match.
Unmatched: the program does not provide a match with the program setting.
Mismatched: the program provided an incorrect match.
Percentage of the matches that have E- or p-value < 1 × 10^-6.
Number of Peptides Identified from Chimera Spectra of Groups of 2–5 Peptides with Precursor Charge State +1, All Ions Are Available, and Including Neutral Mass Losses by X!Tandem, OMSSA and Crux
| | | number of peptides correctly matched in a spectrum with an | | | | | | percentage of peptides detected | |
|---|---|---|---|---|---|---|---|---|---|
| program | N pep | 0 | 1 | 2 | 3 | 4 | 5 | >2 | >6 |
| OMSSA | 2 | 11 | 213 | 580 | | | | 85.4 | 84.1 |
| | 3 | 3 | 25 | 64 | 34 | | | 67.5 | 61.9 |
| | 4 | 1 | 3 | 5 | 2 | 1 | | 47.9 | 33.3 |
| | 5 | 0 | 0 | 1 | 0 | 1 | 1 | 73.3 | 66.7 |
| | Total | 15 | 241 | 650 | 36 | 2 | 1 | 81.1 | 78.7 |
| X!Tandem | 2 | 0 | 799 | 5 | | | | 50.3 | 12.8 |
| | 3 | 59 | 67 | 0 | 0 | | | 17.7 | 0.5 |
| | 4 | 11 | 1 | 0 | 0 | 0 | | 2.1 | 0.0 |
| | 5 | 3 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0.0 |
| | Total | 73 | 867 | 5 | 0 | 0 | 0 | 42.8 | 10.2 |
| Crux | 2 | 0 | 10 | 794 | | | | 99.4 | 81.3 |
| | 3 | 0 | 0 | 3 | 123 | | | 99.2 | 61.6 |
| | 4 | 0 | 0 | 0 | 0 | 12 | | 100.0 | 20.8 |
| | 5 | 0 | 0 | 0 | 0 | 0 | 3 | 100.0 | 13.3 |
| | Total | 0 | 10 | 797 | 123 | 12 | 3 | 99.4 | 75.8 |
Number of peptides simulated in a chimera spectrum.
Percentage of correctly matched peptides with an E- or p-value < 1 × 10^-2.
Percentage of correctly matched peptides with an E- or p-value < 1 × 10^-6.
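The percentage of peptides detected in the chimera table can be recomputed from the count columns: each spectrum with k correctly matched peptides contributes k detected peptides, and the denominator is the number of spectra times the number of peptides per spectrum. The sketch below applies this to the OMSSA rows. The name `percent_detected` and the data layout are illustrative. That the count columns correspond to the 1 × 10^-2 threshold is an inference from the arithmetic, not something the table states.

```python
# OMSSA rows: peptides per chimera spectrum -> number of spectra with
# 0, 1, 2, ... correctly matched peptides (list index = peptides matched).
OMSSA = {
    2: [11, 213, 580],
    3: [3, 25, 64, 34],
    4: [1, 3, 5, 2, 1],
    5: [0, 0, 1, 0, 1, 1],
}


def percent_detected(spectra_by_hits, n_peptides):
    """Share of simulated peptides that were correctly matched, in percent."""
    detected = sum(k * n for k, n in enumerate(spectra_by_hits))
    simulated = n_peptides * sum(spectra_by_hits)
    return 100.0 * detected / simulated


for n_pep, counts in OMSSA.items():
    print(n_pep, round(percent_detected(counts, n_pep), 1))

# Pooled over all group sizes, weighting by the number of peptides.
total_detected = sum(k * n for c in OMSSA.values() for k, n in enumerate(c))
total_simulated = sum(n_pep * sum(c) for n_pep, c in OMSSA.items())
print("Total", round(100.0 * total_detected / total_simulated, 1))
```

For OMSSA this yields 85.4, 67.5, 47.9 and 73.3 for groups of 2 to 5 peptides and 81.1 overall, which agrees with the ">2" column of the table.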