Ting Huang, Roland Bruderer, Jan Muntel, Yue Xuan, Olga Vitek, Lukas Reiter.
Abstract
In bottom-up, label-free discovery proteomics, biological samples are acquired in a data-dependent (DDA) or data-independent (DIA) manner, with peptide signals recorded in an intact (MS1) and fragmented (MS2) form. While DDA has only the MS1 space for quantification, DIA contains both MS1 and MS2 at high quantitative quality. DIA profiles of complex biological matrices such as tissues or cells can contain quantitative interferences, and the interferences at the MS1 and the MS2 signals are often independent. When comparing biological conditions, the interferences can compromise the detection of differential peptide or protein abundance and lead to false positive or false negative conclusions.
We hypothesized that the combined use of MS1 and MS2 quantitative signals could improve our ability to detect differentially abundant proteins. Therefore, we developed a statistical procedure incorporating both MS1 and MS2 quantitative information of DIA. We benchmarked the performance of the MS1-MS2-combined method to the individual use of MS1 or MS2 in DIA using four previously published controlled mixtures, as well as in two previously unpublished controlled mixtures. In the majority of the comparisons, the combined method outperformed the individual use of MS1 or MS2. This was particularly true for comparisons with low fold changes, few replicates, and situations where MS1 and MS2 were of similar quality. When applied to a previously unpublished investigation of lung cancer, the MS1-MS2-combined method increased the coverage of known activated pathways.
Since recent technological developments continue to increase the quality of MS1 signals (e.g. using the BoxCar scan mode for Orbitrap instruments), the combination of the MS1 and MS2 information has a high potential for future statistical analysis of DIA data.
Keywords: Cancer Biomarker(s); Label-Free Quantification; Lung Cancer; Mass Spectrometry; Quantification; SWATH-MS
Year: 2019 PMID: 31888964 PMCID: PMC7000113 DOI: 10.1074/mcp.RA119.001705
Source DB: PubMed Journal: Mol Cell Proteomics ISSN: 1535-9476 Impact factor: 5.911
Fig. 1. Quantitative data structure in label-free discovery proteomics. (A) Schematic representation of the acquisition layout of data-dependent acquisition methods with regular MS1 scans. The lower panels show the extracted ion current in MS1, which can be used for quantification. (B) Schematic representation of the acquisition layout of a data-independent acquisition experiment with a regular MS1 and MS2 pattern. The lower panels show the two extracted ion currents, which can be used for quantification.
Fig. 2. MS1 and MS2 quantification characteristics in DIA. (A) The MS1 and MS2 quantitative signals can be viewed as technical replicates from the same biological samples. (B) Extracted ion currents of two peptides derived from spike-in proteins in samples 1, 3, and 5 of the Spike-in-biol-var-OT dataset. Interferences were identified manually as signals that do not follow the predefined pattern of differential abundance. (C) The CVs of all precursors were calculated at the condition level for the controlled datasets, separately for MS1 and MS2 and separately for each condition. The graph displays the counts of precursors with CVs below 20%. (D) Pearson correlation between the precursor abundances in MS1 and MS2 space in the controlled datasets. The median is indicated by the red dotted line.
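The two summary statistics named in panels (C) and (D) can be illustrated with a minimal sketch. It assumes the CV is the sample standard deviation divided by the mean of replicate intensities within one condition, and that the Pearson correlation is computed on paired MS1 and MS2 abundances of one precursor across runs. Whether the original analysis used raw or log-transformed intensities is not stated in the caption, and the function names and input layout below are hypothetical.

```python
import math
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (sample SD / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


def count_below_cv(intensities_by_precursor, threshold=20.0):
    """Count precursors whose replicate CV within one condition is below threshold.

    intensities_by_precursor: dict mapping a precursor id to the list of
    replicate intensities of one condition (hypothetical layout).
    """
    return sum(
        1
        for reps in intensities_by_precursor.values()
        if len(reps) >= 2 and cv_percent(reps) < threshold
    )


def pearson(x, y):
    """Pearson correlation between paired abundances, e.g. MS1 vs. MS2."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Running `count_below_cv` once on the MS1 intensities and once on the MS2 intensities of each condition would give counts comparable to those plotted in panel (C).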
Fig. 3. Benchmarking of the MS1-MS2-combined method. (A) Statistical inference of differential abundance was performed for the spike-in datasets. The 200 proteins with the smallest adjusted p values were sorted by their p value. Next, the number of true positive differentially abundant proteins was displayed as a function of the size of the candidate list, which contains true and false positives. The dotted line indicates a perfect candidate list containing only true positives (slope = 1). Inset: the number of true positives in the list of 200 proteins with the smallest adjusted p values. (B) As in (A), but for the mixed proteome datasets. (C) Statistical detection of differentially abundant proteins was performed as above for subsets of the Spike-in-biol-var-OT dataset with decreasing maximal true fold change. The leftmost plot is based on samples 1 to 5, the second from the left on samples 1 to 4, the third from the left on samples 1 to 3, and the rightmost plot on samples 1 to 2. The resulting candidate lists were analyzed as above. (D) Statistical detection of differentially abundant proteins was performed as above for subsets of the Spike-in-biol-var-OT dataset with decreasing numbers of replicates. The resulting candidate lists were analyzed as above.
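The candidate-list curve described in panel (A) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the record layout `(protein, p_value, adjusted_p_value)`, the function name, and the tie handling are hypothetical. The sketch selects the `top_n` proteins with the smallest adjusted p values, orders them by p value, and returns the running count of true positives at each list position.

```python
def true_positive_curve(records, true_positives, top_n=200):
    """Cumulative true positives along a ranked candidate list.

    records: iterable of (protein, p_value, adjusted_p_value) tuples
             (hypothetical layout).
    true_positives: set of proteins known to be differentially abundant,
                    e.g. the spiked-in proteins of a controlled mixture.
    """
    # Keep the top_n proteins with the smallest adjusted p values.
    selected = sorted(records, key=lambda r: r[2])[:top_n]
    # Order the selected candidates by their p value.
    ranked = sorted(selected, key=lambda r: r[1])

    curve = []
    found = 0
    for protein, _p, _adj_p in ranked:
        if protein in true_positives:
            found += 1
        curve.append(found)
    return curve
```

A perfect candidate list yields the curve 1, 2, 3, ..., which corresponds to the dotted line with slope 1. The last element of the returned list is the number shown in the inset.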
Fig. 4. Differential abundance testing with the MS1-MS2-combined method in clinical samples. (A) Twelve healthy lung and 12 cancer samples (six adenocarcinomas and six squamous cell carcinomas) were analyzed by mass spectrometry. The resulting data were subjected to principal component analysis. (B) Statistical detection of differentially abundant proteins was performed with the MS1-based, the MS2-based, and the MS1-MS2-combined method. The overlap of differentially abundant proteins (FDR < 0.05) was calculated on the protein level. (C) The candidate lists from the testing approaches were compared with the candidate list of an independent lung cancer study by Tenzer et al. (32). (D) The functional analyses were generated through the use of IPA (33). The figure plots the activation states of the pathways according to IPA.
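The protein-level overlap in panel (B) amounts to set operations on three candidate lists. The sketch below assumes each method produces a mapping from protein to adjusted p value and that a protein counts as a candidate when that value is below 0.05. The function names and the region labels are hypothetical.

```python
def candidates(adj_p_by_protein, fdr=0.05):
    """Proteins whose adjusted p value is below the FDR threshold."""
    return {protein for protein, q in adj_p_by_protein.items() if q < fdr}


def three_way_overlap(ms1, ms2, combined):
    """Sizes of the seven regions of a three-set Venn diagram."""
    return {
        "MS1 only": len(ms1 - ms2 - combined),
        "MS2 only": len(ms2 - ms1 - combined),
        "combined only": len(combined - ms1 - ms2),
        "MS1 and MS2 only": len((ms1 & ms2) - combined),
        "MS1 and combined only": len((ms1 & combined) - ms2),
        "MS2 and combined only": len((ms2 & combined) - ms1),
        "all three": len(ms1 & ms2 & combined),
    }
```

The same `candidates` sets could be intersected with an external candidate list for a comparison such as the one in panel (C).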