Alexander W Bell, Eric W Deutsch, Catherine E Au, Robert E Kearney, Ron Beavis, Salvatore Sechi, Tommy Nilsson, John J M Bergeron.
Abstract
We performed a test sample study to try to identify errors leading to irreproducibility, including incompleteness of peptide sampling, in liquid chromatography-mass spectrometry-based proteomics. We distributed an equimolar test sample, comprising 20 highly purified recombinant human proteins, to 27 laboratories. Each protein contained one or more unique tryptic peptides of 1,250 Da to test for ion selection and sampling in the mass spectrometer. Of the 27 labs, members of only 7 labs initially reported all 20 proteins correctly, and members of only 1 lab reported all tryptic peptides of 1,250 Da. Centralized analysis of the raw data, however, revealed that all 20 proteins and most of the 1,250 Da peptides had been detected in all 27 labs. Our centralized analysis determined missed identifications (false negatives), environmental contamination, database matching and curation of protein identifications as sources of problems. Improved search engines and databases are needed for mass spectrometry-based proteomics.
Year: 2009 PMID: 19448641 PMCID: PMC2785450 DOI: 10.1038/nmeth.1333
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Initial Results of Test Sample reporting of 24 academic laboratories (1-24) and 3 vendors (A-C)
Groups I-IV classify the labs: group I scored 100%; group II made naming (N) errors; group III made naming errors as well as false-positive, contaminant and redundant identifications; group IV made these errors as well as errors attributed to acrylamide alkylation (AC), database searching (DB), excessive stringency (ST), under-sampling (A) or trypsinization (TR).
Columns after Gene and MW are the individual laboratories (numbered) and vendors (lettered), arranged left to right from Group I to Group IV.
| Gene | MW (kDa) | A | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | B | 9 | 1 | 11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | C | 2 | 2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| KHK | 33 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | D | + | + | + | + | + |
| ATPAF2 | 33 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | S | N | T | + | T |
| SETD3 | 34 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | D | + | + | + | + | + |
| SPRY2 | 35 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | T |
| GLB1L3 | 35 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | N | + | N | + | + | T | + |
| FYTTD1 | 36 | + | + | + | + | + | + | + | + | + | + | N | N | N | N | + | + | N | N | N | N | + | + | + | + | T | + | T |
| IHPK1 | 50 | + | + | + | + | + | + | + | + | + | N | + | + | + | N | + | + | + | + | N | N | + | + | + | + | + | S | T |
| IFRD1 | 50 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | N | + | + | N | + | + | N |
| GCNT3 | 51 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | T | + |
| EIF2S3 | 51 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | A | + | + | + |
| F2 | 70 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | N | + | + | + | A | T | S | T |
| FARP2 | 73 | + | + | + | + | + | + | + | + | + | + | + | + | + | N | + | + | + | + | + | N | + | + | + | + | + | + | + |
| ENOX1 | 73 | + | + | + | + | + | + | + | N | + | N | + | N | N | + | + | + | N | + | N | N | + | N | + | + | + | + | N |
| KLHL13 | 74 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | N | + | + | + | N | + | S | + |
| NIBP | 101 | + | + | + | + | + | + | + | + | N | + | N | + | + | + | + | N | + | N | + | + | + | + | + | + | + | + | + |
| MARS | 101 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | T | + |
| NUP210 | 106 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + |
| THBS4 | 106 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | S | A | T | + | + |
| KIAA0746 | 112 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | N | N | S | + | T | T | + |
| HIRA | 112 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | N |
Designed Peptide Mass Complexity Reporting of 24 academic laboratories and 3 vendors
Initial (grey shading) and updated (black shading) reporting of peptides of mass 1250 ± 5 Da. Analysis scoring was calculated from the fraction of correct peptide identifications and the accuracy of reporting peptides of mass 1250 Da, whereas the Report Scoring was based on the fraction of correct peptide identifications reported ÷ the number identified by the centralized analysis. Results not reported (NR); no raw data (NRD); submitted and data reprocessing difference (DRD). DRDs are indicated by fewer peptides identified by the centralized analysis as compared to the number reported.
| Laboratory / Vendor | ||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gene | A | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | B | 9 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | C | 2 | 2 | 2 | C | 2 | 2 | |
| KHK | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | |||||||||||||||
| ATPAF2 | + | + | + | + | + | + | + | + | + | + | ||||||||||||||||||||||
| SETD3 | + | + | + | + | + | + | + | + | + | |||||||||||||||||||||||
| SPRY2 | + | + | + | + | + | + | + | + | + | + | + | + | ||||||||||||||||||||
| GLB1L3 | + | + | + | + | + | + | + | + | + | + | ||||||||||||||||||||||
| FYTTD1 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | |||||||||||||||
| IHPK1 | + | + | + | + | + | + | + | + | + | + | ||||||||||||||||||||||
| IFRD1 | + | + | + | + | + | + | + | + | + | |||||||||||||||||||||||
| GCNT3 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | |||||||||||||||||
| EIF2S3 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | ||||||||||||||
| F2 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | |||||||||||||||||
| FARP2 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | |||||||||||||
| ENOX1 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | ||||||||||||
| KLHL13 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | |||||||||||||
| NIBP | + | + | + | + | + | |||||||||||||||||||||||||||
| MARS | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | |||||||||||||
| NUP210 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | |||||||
| NUP210 | + | + | + | + | + | + | ||||||||||||||||||||||||||
| THBS4 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | |||||||||||||||
| KIAA0746 | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | ||||||||||||||||
| HIRA | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | ||||||||||||
| HIRA | + | + | + | + | + | + | ||||||||||||||||||||||||||
Cysteine-containing peptides.
iTRAQ labeling employed.
Alkylated peptide reported.
Reported peptide contains 1 missed trypsin cleavage site.
Peptides assigned at <95% confidence reported.
Methionine residue oxidized.
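The Report Scoring and the DRD flag described in the table legend above can be sketched as follows. This is only an illustrative reading of the legend, not the authors' scoring code: the function names and the example counts are hypothetical, and the score is assumed to be the number of correctly reported peptides divided by the number identified by the centralized analysis.

```python
def report_score(correct_reported: int, centrally_identified: int) -> float:
    """Assumed Report Scoring: correct reported peptides / peptides found centrally."""
    if centrally_identified <= 0:
        raise ValueError("centralized analysis must identify at least one peptide")
    return correct_reported / centrally_identified


def is_drd(reported: int, centrally_identified: int) -> bool:
    """Flag a data reprocessing difference (DRD): the centralized analysis
    identified fewer peptides than the lab reported."""
    return centrally_identified < reported


if __name__ == "__main__":
    # Hypothetical counts for one lab, for illustration only.
    print(report_score(correct_reported=11, centrally_identified=22))
    print(is_drd(reported=5, centrally_identified=3))
```

Under this reading, a lab that reports every peptide the centralized analysis found scores 1.0, and a DRD marks the opposite situation, where the lab's own processing yielded more peptides than the reprocessed raw data.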
Figure 1. Number of tandem mass spectra assigned to tryptic peptides
(a) Comparison of protein abundances (% total redundant peptides) from the centralized analysis of the raw data collected from the 27 labs, before (left side) and after (right side) removal of individual lab contaminants, including keratins and trypsin. (b) Peptide heat map representation for each of the 20 proteins (gene symbol) from the centralized analysis of the raw data from all 27 labs, revealing the frequency of observation of a given peptide as well as its position in the protein sequence. Blue, the 1250 Da peptides; red, all other tryptic peptides. Raw data from lab 24 were excluded (see Online Methods). The scale bar represents the number of redundant peptides and is linear from 1 to 500 peptides.
Figure 2. Discrepancies between reported data and centralized analysis identify erroneous reporting
Peptide heat map comparisons of the centralized analysis compiled for all 27 labs (Total), with the data from selected individual labs indicated below for the proteins (a) ATPAF2, (b) SETD3 and (c) F2. Blue, the 1250 Da peptides; red, all other tryptic peptides. Scale bar represents the number of redundant peptides. Missed cleavages account for the different degree of shading for peptides of mass 1250 Da.
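To make the 1250 ± 5 Da window and the role of missed cleavages concrete, here is a minimal in silico digest sketch. It rests on several assumptions that the record does not state: masses are taken as neutral monoisotopic masses, trypsin is modeled by the common simplified rule (cut after K or R unless followed by P), and modifications such as alkylation, oxidation or iTRAQ labeling are ignored. All function names are hypothetical.

```python
# Standard monoisotopic residue masses in Da; no modifications are modeled.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(sequence, max_missed=1):
    """Return (peptide, missed_cleavages) pairs under a simplified trypsin rule."""
    if not sequence:
        return []
    # Cut after K or R unless the next residue is P (assumed rule).
    cuts = [0]
    for i in range(len(sequence) - 1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(sequence))
    peptides = []
    for start in range(len(cuts) - 1):
        for missed in range(max_missed + 1):
            end = start + 1 + missed
            if end >= len(cuts):
                break
            peptides.append((sequence[cuts[start]:cuts[end]], missed))
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def in_window(mass, target=1250.0, tolerance=5.0):
    """True if the mass lies within target ± tolerance (Da)."""
    return abs(mass - target) <= tolerance


def window_peptides(sequence, max_missed=1):
    """Peptides from the digest whose mass falls in the 1250 ± 5 Da window."""
    hits = []
    for pep, missed in tryptic_peptides(sequence, max_missed):
        mass = peptide_mass(pep)
        if in_window(mass):
            hits.append((pep, missed, mass))
    return hits
```

Allowing `max_missed=1` produces both the fully cleaved peptide and its longer neighbors, so the spectra for one region of a protein can be split across several peptide forms, and a longer form generally falls outside the 1250 ± 5 Da window.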