Literature DB >> 24571364

Improved intensity-based label-free quantification via proximity-based intensity normalization (PIN).

Susan K Van Riper¹, Ebbing P de Jong, LeeAnn Higgins, John V Carlis, Timothy J Griffin.

Abstract

Researchers are increasingly turning to label-free MS1 intensity-based quantification strategies within HPLC-ESI-MS/MS workflows to reveal biological variation at the molecule level. Unfortunately, HPLC-ESI-MS/MS workflows using these strategies produce results with poor repeatability and reproducibility, primarily due to systematic bias and complex variability. While current global normalization strategies can mitigate systematic bias, they fail when faced with complex variability stemming from transient stochastic events during HPLC-ESI-MS/MS analysis. To address these problems, we developed a novel local normalization method, proximity-based intensity normalization (PIN), based on the analysis of compositional data. We evaluated PIN against common normalization strategies. PIN outperforms them in dramatically reducing variance and in identifying 20% more proteins with statistically significant abundance differences that other strategies missed. Our results show the PIN enables the discovery of statistically significant biological variation that otherwise is falsely reported or missed.

Entities: Chemical Disease Gene Species

Mesh：

Substances：

Year: 2014 PMID： 24571364 PMCID： PMC3993879 DOI： 10.1021/pr400866r

Source DB: PubMed Journal: J Proteome Res ISSN： 1535-3893 Impact factor: 4.466

Introduction

Differential quantification of complex mixtures using high-performance liquid chromatography coupled to electrospray ionization and tandem mass spectrometry (HPLC–ESI–MS/MS) can help researchers study biological variation at the molecular level and gain insights into the molecular machinery of cellular activity and disease progression.[1,2] Researchers reveal biological variation most often by comparing two or more populations, typically collecting data for thousands of distinct molecules in each population, for example, genes, peptides, or metabolites, and then statistically analyzing the differences among populations. Here a population comprises biological or technical replicates having a biological state in common, for example, healthy or diseased. HPLC–ESI–MS/MS workflows aimed at discovering biological variation fall into two categories: labeled and label-free quantification strategies. Labeled quantification strategies are popular because they allow researchers to analyze multiple peptide mixtures (often derived from protein samples) in a single HPLC–ESI–MS/MS run. (While we use peptide as a representative analyte, the set of analytes is not limited to peptides.) Researchers compute relative abundance (fold changes) of resulting ion intensities between the concurrently analyzed samples and determine which peptides (or inferred proteins) are differentially abundant. Unfortunately, labeled strategies require expensive labeling reagents. Furthermore, the number of unique labels is limited. Thus, labeling strategies do no scale to the level required for large-scale comparative studies, for example, a clinical study with tens, hundreds, or even thousands of biological samples. Because labeled strategies do not scale, researchers are increasingly turning to label-free quantification strategies, either intensity-based (MS1) or spectral counting (MS2). Spectral counting is straightforward but is biased toward peptides derived from high abundance proteins because spectral counting requires multiple MS2 scans matched to each protein to obtain statistically valid results. Intensity-based quantification is less straightforward than spectral counting, requiring the area under the curve computation using MS1 scans. However, intensity-based quantification is well-suited to study lower abundance peptides,[3] which are often more interesting than higher abundance peptides. Intensity-based strategies require repeatability and reproducibility,[4] which are inherently problematic in HPLC–ESI–MS/MS workflows and lead to excessive false-positive and false-negative results. Researchers will eventually discard the false-positive peptides via hypothesis-driven experiments, such as selected reaction monitoring (SRM), but false-negative results are detrimental because these candidates are never pursued. As a result, poor repeatability and reproducibility cause researchers to miss possible insights and draw incorrect inferences. Poor repeatability and reproducibility in HPLC–ESI–MS/MS chromatographic data stem from extraneous variability, which includes systematic bias (sample variability and instrument variability) and complex variability. Sources of sample variability include incomplete enzymatic digestion and pipetting errors made during sample preparation. It tends to be global, affecting each peptide in a sample.[5,6] Instrument variability can stem from physical changes in the liquid chromatography, mass spectrometry (MS) hardware, or environment, including HPLC performance degradation, MS calibration drift, and volatiles in the lab air that affect ionization efficiency. Instrument variability is global in nature because each change similarly affects each ion’s intensity in a run. Complex variability is local in nature. It stems from signal distortion due to transient stochastic events that occur during an HPLC–ESI–MS/MS run. For example, variability in ESI performance due to the mobile phase composition or flow rate fluctuations distorts the measured ion signal.[7,8] It is complex because each event affects only a narrow temporal window within an HPLC–ESI–MS/MS run, where window widths vary and can overlap among runs when analyzing separate samples. Normalization strategies in HPLC–ESI–MS/MS workflows attempt to remove systematic biases from the data before statistical inference.[9] Traditionally, normalization strategies use a combination of a global scaling function and a peptide selection method. Global scaling functions include median scale, mean scale, quantile, ranking, and least-squares fitting using linear or polynomial regression.[5,6] Unfortunately, these global scaling functions often require a complete matrix on which to compute, specifically, no missing data. However, missing data are prevalent in HPLC–ESI–MS/MS workflows.[9,10] While it is possible to impute missing values, it is recommended to do so after normalization.[9] Thus, the selection of peptides for inclusion in the global scaling function is critical. Peptide selection methods include: (1) common within sample (CWS);[5] (2) top L order statistics (LOS);[6,11] (3) percentage of peptides present (PPP);[12] and (4) peptides with rank invariant peptide (RIP).[10] Webb-Robertson et al. developed a useful application named Statistical Procedure for the Analyses of LC–MS proteomics Normalization Strategies (SPANS), which recommends the best global scaling function and peptide selection method combinations based on rigorous statistical tests.[10] Unfortunately, SPANS only includes global scaling functions and, regardless of the peptide selection method, global scaling functions cannot capture and mitigate complex variability.[11] Although largely unexplored,[13] complex variability during an HPLC–ESI–MS/MS run seems inevitable, even when researchers follow strict protocols. The National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) studies provide an example. CPTAC established standard operating procedures to enable interlaboratory comparisons of proteomic studies, particularly in the context of cancer biomarker discovery.[14] In their sixth study, they used their standard operating procedures to produce publicly available, community reference data sets generated from a yeast proteome digest with 48 spiked-in proteins (UPS1 standard from Sigma Aldrich). Rudnick et al. found irregularities attributed to electrospray instability[14] in one of the technical replicates (sample C, replicate 2) from this data set generated by the instrument aliased LTQ-XL-OrbitrapP@65. The chromatogram’s distinctive sawtooth pattern (Supplemental Figure 1 in the Supporting Information) is a textbook example of complex variability. While Rudnick et al. reported modestly diminished peptide identification performance for this replicate analysis, we suspected that the complex variability also diminished intensity-based peptide quantification performance. Further investigating this CPTAC data reveals shortcomings in current normalization strategies. Examining the extracted chromatograms plots (XCs), each representing all peptide signals in a run (Experimental Procedures), reveals a trough in replicate 2 (REP 2) (Figure 1a) corresponding to the same time frame as the observed electrospray instability in the TIC generated from the raw data (Supplemental Figure 1 in the Supporting Information). Ideally, normalization would produce nearly identical XCs between replicates. Unfortunately, global scaling functions fail to mitigate complex variability (Figure 1b). Furthermore, global normalization can have unintended consequences and can adversely affect regions where no complex variability exists. Figure 1b shows that two regions of the XC for REP 2 now have more extraneous variability than before normalization, potentially disguising true biological variation.

Figure 1

Complex variability in technical replicates. Extracted chromatograms (Experimental Procedures) from three HPLC–ESI–MS/MS technical replicates (CPTAC Study 6, LTQ-XL-OrbitrapP@86) show that even well-controlled HPLC–ESI–MS/MS experiments are vulnerable to complex variability. (a) Extracted chromatogram for Sample C’s replicate 2’s un-normalized data (EXP C – REP 2, long-dashed red line) contains a distinctive trough during the same time frame as the electrospray instability in its corresponding TIC (Supplemental Figure 1 in the Supporting Information). (b) Same data normalized by median scale result in extracted chromatograms where the distinctive trough remains. Extracted chromatograms for replicates 1 and 3 (EXP C – REP 1, solid blue line, and EXP C – REP 3, short-dashed orange line) only slightly diverge. However, replicate 2’s (EXP C – REP 2, long-dashed red line) extracted chromatograms show that the complex variability is now exaggerated, showing the adverse effects of median scale normalization. (c) Same data normalized by PIN result in similar extracted chromatograms for each of the three replicates. PIN removes the trough in replicate 2 (EXP C – REP 2, long-dashed red line). This demonstrates PIN’s ability to mitigate complex variability.

Experimental Procedures

Saliva Sample Collection and Processing

We collected and processed salivary endogenous peptides as previously described.[15] In brief, we collected all saliva samples according to protocols approved by the University of Minnesota Institutional Review Board. Donors declared that they were healthy nonsmokers and were free of confounding conditions.

Endogenous Peptide Isolation

Clarified saliva was prepared from fresh whole saliva samples by centrifuging at 3000g at 4 °C for 15 min, followed by 16 100g at 4 °C for 1 min. The supernatant was mixed in a 10:1 ratio with denaturing buffer consisting of 4% SDS, 100 mol/L dithiothreitol, and 100 mmol/L Tris, pH 7.4. The samples were boiled for 5 min, cooled to room temperature, then added to a centrifugal filter (Amicon Ultra, 0.5 mL, 10 kDa, Millipore). Two hundred microliters of buffered urea (8 mol/L urea with 100 mmol/L tris pH 8.5) was added to the sample, and the mixture was centrifuged at 14 000g at room temperature for 40 min. An additional 200 μL of buffered urea was added, and the sample was centrifuged at 14 000g at room temperature for 40 min. The filters were discarded, and the collected peptides were alkylated, by the addition of iodoacetamide in buffered urea to 50 mmol/L in the dark for 20 min. MCX (Oasis 3 cc, 60 mg, Waters) cleanup was performed by diluting the samples to 3 mL with 2% formic acid and H2O to pH ≤3. The MCX columns were equilibrated with 2 mL of 1:1 methanol: water followed by the addition of the entire sample, then washed with 3 mL of 0.1% formic acid and 2 mL of methanol; peptides were eluted with 1 mL of 95% methanol/5% ammonium hydroxide. The eluted peptides were dried in a speed-vacuum, redissolved in water, and quantified by a modified BCA assay (Thermo Scientific, Waltham, MA) using trypsin-digested saliva as a standard. Three micrograms of peptides were further purified and concentrated using the STAGE-tip protocol.[16]

Instrument Variability Experiment Sample Preparation

Fresh saliva was collected from a single donor and processed for isolation of endogenous peptides. Sufficient sample quantity was loaded in a single autosampler vial to run three replicate injections in succession.

Sample Variability Experiment Sample Preparation

Fresh saliva was collected from a single donor and divided into three portions. Each aliquot was processed for endogenous peptide isolation with the identical protocol, placed into individual autosampler vials and analyzed in succession.

Serial Dilution Experiment Sample Preparation

Fresh saliva was collected from a single donor and processed for isolation of endogenous peptides and aliquoted with increasing amounts (0.5, 1.0, 1.5, 2.0, 2.5, and 3 μg) into individual vials. Five hundred fmol of bradykinin was added to each vial.

HPLC–ESI–MS/MS

We analyzed the resulting peptide mixtures by online HPLC–ESI–MS/MS on an LTQ-Orbitrap XL mass spectrometer (Thermo Scientific, Waltham, MA) equipped with an Eksigent (Eksigent Technologies, Redwood City, CA) 1DLC nanoflow system and a MicroAS autosampler. An in-house, pulled tip capillary column with a 100 μm inner diameter was packed to 13 cm with Magic C18AQ 5 μm, 200 Å pore particles (Michrom Bioresources). Peptide mixtures were dissolved in an aqueous solution containing 2% ACN with 0.1% formic acid and then separated by a 2–40% ACN gradient in 0.1% formic acid over 60 min at 250 nL/min. Full-scan mass spectra were acquired in the Orbitrap at 60 000 resolution at m/z 400, followed by tandem mass spectrometry (MS/MS) in the LTQ of the five most intense ions from the full scan. Further details of the mass spectrometer settings were previously reported.[15]

Clinical Proteomic Tumor Analysis Consortium Study 6

National Cancer Institute’s CPTAC network’s Study 6[14] data set LTQ-XL-OrbitrapP@65, which was downloaded from Tranche, is now available at https://cptac-data-portal.georgetown.edu/cptac/public. Because data acquisition for this study was performed in profile mode, we converted the .raw files to mzXML files using msConvert version 3.0.3364 specifying centroid = true.

Peptide Signal Extraction

Peptide signals were extracted using an in-house software application. In brief, the software takes in a list of mzXML files and processes the file sequentially. Each mzXML file was processed scan by scan. First, monoisotopic peak clusters were detected and deisotoped. Second, after all scans were processed, extracted ion chromatograms (XICs) were constructed from the deisotoped peaks. Next, peptide signals were then constructed from XICs. The peptide signals’ m/z values and retention times were adopted from the XIC’s apex peak. In lieu of computing an XIC’s area under the curve, a peptide signal’s intensity was computed by summing its XIC peak intensities. (Note that XICs were truncated to eliminate the exceedingly trailing chromatographic peaks. We arbitrarily selected 2 min from the XICs apex peak.) Finally, corresponding peptide signals were grouped across multiple analyses based on m/z and retention time tolerances. As a quality measure, we required at least two MS2 scans (among all sequentially analyzed files) corresponding to each peptide signal. This software and Thermo .raw files used for analyses are available from the authors upon request.

Protein and Peptide Identification

All MS/MS data were analyzed using Sequest version 27, rev 12 (Thermo Scientific). The steps for searches are described. First, we downloaded the yeast Uniprot FASTA database (ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/proteomes/YEAST.fasta.gz, February 3, 2012) and the cRAP contaminant FASTA database (downloaded from ftp://ftp.thegpm.org/fasta/cRAP, February 28, 2012). The cRAP UPS entries were replaced with entries from an updated UPS FASTA database (http://www.sigmaaldrich.com/life-science/proteomics/mass-spectrometry/ups1-and-ups2-proteomic.html, February 22, 2012). The YEAST and cRAP FASTA databases were concatenated and designated as the forward database. Each protein sequence was then reversed with a perl script (Matrix Science, Boston, MA), designated as the decoy database, and concatenated to the forward database. The resulting database contained 13 532 proteins. For the C versus E experiment, Sequest parameters included a fragment ion mass tolerance of 1 Da, oxidation of methionine as a variable modification, and 2 Da mass tolerance. Scaffold version 3.6.1 (Proteome Software, Portland, OR) was used to validate MS/MS-based peptide and protein identifications. Peptide identifications were accepted if they met minimum criteria 7 ppm precursor mass tolerance, one tryptic terminus minimum, six amino acid minimum length, and 90.0% probability as specified by the Peptide Prophet algorithm.[17] Protein identifications were accepted if they could be established at >80.0% probability (Protein Prophet)[18] and contained at least one identified peptide. Proteins with similar peptides that could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony. The resulting false discovery rate (FDR) within Scaffold was 1.4% at the protein level and 0.1% at the peptide level (Supplemental Files 1 and 2 in the Supporting Information).

Statistical Analysis

Unless otherwise specified, all statistical analyses were performed using the R statistical package, version 2.14-0 2011-10-31, R.app 1.41 or 3.02 2013-09-25. Two sample t tests were conducted using the R function t.test and the default confidence level was 0.95. Prior to multivariate analyses, for example, pooled estimate of variance, data were first log-transformed to obtain a normal distribution prior to computing variance.

Extracted Chromatograms

We generated XCs by first determining each peptide’s XIC and then summing their recorded intensities within each scan. The resulting summed intensities (y axis) were plotted over time (x axis) using R’s lowess function (locally weighted polynomial smoother); the smoothing span parameter set to 0.07 for Figure 4 and 0.05 for Figure 6. All other lowess default parameters were accepted.

Figure 4

Serial dilution - PIN versus median scale minus versus average (MA) pots. (a) MA plot visually demonstrates that PIN outperforms median-scale normalization in mitigating systematic bias. (a) MA plot of un-normalized data for six different loading amounts reveals systematic bias (regression lines diverge from y = 0). (b) MA plot of median scale data for six different loading amounts demonstrates a slight improvement in systematic bias (regression lines begin to converge around y = 0). (c) MA plot for PIN normalized data for six different loading amounts demonstrates a substantial improvement in systematic bias (regression lines converge around y = 0).

Figure 6

Peptide signal extracted chromatograms PIN versus spiked-in standard normalization for six different loading amounts. (a) Un-normalized extracted chromatogram plot shows systematic bias. (The 2.0 and 3.0 μg extracted chromatograms are not in the correct order.) (b) Spiked-in (bradykinin) normalized extracted chromatogram plots show that the 2.0 μg extracted chromatogram is now above the 1.0 μg extracted chromatogram but is still below the 2.0 μg extracted chromatogram. (c) PIN normalized extracted chromatogram shows convergence, indicating the removal of systematic bias. (d) Scaling PIN normalized data by the original loading amounts plot shows extracted chromatograms monotonically increasing and in the correct order.

Current Normalization Strategies

Linear Regression

Linear regression normalization was performed by applying least-squares regression on minus versus average (MA) scatter plots using a pairwise iterative algorithm.[5,6] First, the algorithm selected peptide signals using the CWS method and then MA-transformed peptide signal intensities for each pair of runs using eqs 1 and 2 in Supplementary Note 2 in the Supporting Information. Fitted data were generated in R with the function lm, for example, lm(m∼a), and subtracted from observed ratios with eq 3. The data then were deconvoluted using eq 4 in Supplementary Note 2 in the Supporting Information. We performed the iteration process twice because the difference between the mean of all intensity ratios from the previous iteration was <0.005, as previously described.[6]

Loess

Author: Loess normalization was performed in R using the normalizeCyclicLoess function found in the limma package.[19] The algorithm first selected peptide signals using the CWS method; next, the algorithm selected log2 transformed intensities (as required by normalizeCyclicLoess) prior to submitting them to normalizeCyclicLoess using default parameters. The algorithm returned results to their nonlog scale.

Quantile

Quantile normalization was performed in R using the normalizeQuantile function found in the limma package.[19] The algorithm selected peptide signals using the CWS method prior to normalizedQuantile analysis with default parameters.

Reference Run

Normalization was performed by selecting a single run as a reference. Peptide signal intensities for all nonreference data were normalized to the reference run. The algorithm first selected signals using the CWS method prior to computing the median of peptide signal intensity ratios, which was used as a normalization factor.[6] The first replicate was arbitrarily selected as the reference run.

Median Scale (Central Tendency)

Median scale was performed by scaling peptide signal intensities values within each run by the median of peptide signals selected using the CWS method.

PIN Neighborhood Construction

While we mathematically define PIN’s neighborhood construction (see Results and Discussion), in practice we define a peptide signal’s neighborhood boundary using a retention time window around the peptide signal’s retention time (Figure 2). The retention time window can be static or dynamic. A static retention time window is centered at the peptide signal’s retention time, plus and minus a specified time period, for example, 2 min. A dynamic retention time window is based on the width of a peptide signal’s XIC. Once the neighborhood boundaries are established, the neighborhood is then populated with all peptide signal XIC peaks within the retention time window. A peptide signal’s intensity is then normalized by computing the proportion of signal within the neighborhood. The proportion is computed by dividing the peptide signal’s intensity by the sum of neighborhood XIC peak intensities. For all analyses discussed herein, neighborhood boundaries were determined by a dynamic retention time window corresponding to the peptide signal’s XIC width.

Figure 2

PIN’s neighborhood construction. XICs for three peptide signals (A–C) are shown in three dimensions. Here peptide signal B’s neighborhood boundaries (RT window depicted by the horizontal red lines) are determined by its XIC width. Peptide signal B’s neighborhood construction includes all XIC peaks within its retention time window (peaks in red and blue). In this case, the neighborhood includes a portion of peptide signals A’s XIC and a portion of peptide signal B’s XIC.

Results and Discussion

Overview of the PIN Strategy

In PIN, we take a nontraditional approach. We observe that complex variability, such as ESI instability, affects bounded temporal regions within chromatographic data, as opposed to systematic bias which affects the entire HPLC–ESI–MS/MS run. Therefore, to mitigate complex variability, PIN normalizes each peptide signal’s intensity by computing its proportion of intensity relative to its neighboring peptide signals (Figure 2). Mathematically, we define a peptide signal’s neighborhood aswhere N is the set of neighboring peptide signals, n is a peptide signal, rtmin is the index of the peptide signal corresponding to the neighborhood’s lower retention time boundary, and rtmax is the index of the peptide signal corresponding to the neighborhood’s upper retention time boundary. With the neighborhood defined, a peptide signal’s normalized intensity, that is, its proportion of neighborhood intensity, iswhere n is the intensity of peptide signal j and n is the intensity of peptide signal i falling within the neighborhood retention time boundaries. The premise for this new approach is that we view biological samples and HPLC–ESI–MS/MS chromatographic data as compositional. Mathematically, a composition is defined as “... x of D parts is a D x 1 vector (x1, x2, ..., x) of positive components whose sum is 1”.(20) Because the components’ quantities must sum to 1, compositional data are an example of sum-constrained data.[21] While the concept of sum-constrained data may seem foreign to most, sum-constrained data are actually quite prevalent. For example, simple percentages and parts per million (ppm) are sum-constrained measurements. Percentages are sum-constrained because the total is constrained to 100, and ppm measurements are sum-constrained because the total is constrained to one million.[20,22] In analyzing compositional data, statistical analyses must be done with care because naïve analysis of compositional data can lead to incorrect inferences.[23] This is because compositional data loses its absolute abundance information. The only abundance information that remains for a single component is relative to the other components making up the whole.[21] In other words, a component’s abundance is its proportion to the whole. Prior to statistical analyses, compositional data must meet two conditions: (1) the components of interest must be relatively small parts of the whole; (2) components within the whole must remain relatively constant in size and composition.[22] When these conditions are met, univariate statistical tests, such as a student’s t test, on compositional data should not lead to incorrect inferences.[22] The rationale for our treatment of biological samples as compositions is two-fold. First, biological systems, whether at the molecular, cellular, or organ level, are dynamic and interactive. For example, within a proteome, the presence, absence, or change in abundance of one or more proteins can affect the presence, absence, or change in abundance of one or more other proteins in the system. Therefore, the abundance of a particular protein, relative to other proteins is important. As a result, researchers reveal biological variation by finding differences in a biological sample’s composition.[24] Second, sample collection and preparation impose constraints on the number of proteins available for measurement.[20,22] For example, aliquoting a portion of a sample, perhaps based on a Bradford assay, puts a cap on the amount of protein used for comparison. As a result, we deem biological samples as compositional. The rationale for our treatment of chromatographic data as compositional stems from the fact that constraints are imposed within HPLC–ESI–MS/MS workflows. For example, during ESI, coeluting peptides compete for a finite number of charges and a limited space on droplets.[25,26] The finite number of charges and the limited space are constraints. As a result, mass spectral intensities of ionized peptides depend on their coeluting peptides,[27] and thus we deem the resulting chromatographic data are compositional. Therefore, when we treat chromatographic data resulting from the analysis of proteomic samples as compositional, we can detect statistically significant differences in proportions across populations. This compositional data meets the two prerequisite conditions for its statistical analysis: (1) the amount of a single component (peptide) is small relative to the whole, which remains true in its corresponding chromatographic data, and (2) in the vast majority of biological systems, the core proteome and its digested peptides, accounts for more than 90% of the measured protein mass in a sample; that is, only 10% is compositionally different between similar biological samples.[28] This also remains true in the corresponding chromatographic data. Thus, when we view complex biological samples and chromatographic data as compositional, we can use simple univariate statistics (such as student’s t test) to reveal biological variation. We also reason that temporal regions within in chromatographic data generated by HPLC–ESI–MS/MS workflows are subcompositions. This reasoning stems from the fact that when using the same (or very similar) LC column, peptides elute in approximately the same order.[29] As a result, retention time windows within resulting chromatograms will contain approximately the same set of peptide signals when analyzing different samples with overall similar composition. Thus, this region becomes a subcomposition, which, for mathematical purposes, is just another composition. We can do so even when complex variability is present because complex variability similarly affects peptide signal intensities within close temporal proximity. Finally, when we view regions (also called neighborhoods) of chromatographic data as compositional, we can detect statistically significant differences in peptide signal proportions across populations. This is true because the data again meet the two prerequisite conditions for comparisons of compositional data: (1) provided that neighborhood boundaries are properly set, a peptide signal’s neighborhood population is large enough so that the intensity of a peptide signal remains small relative to the sum of its neighbors’ intensities and (2) provided that similar LC (column type and gradient duration) is used, neighborhoods will contain sufficiently overlapping populations. With underlying assumptions as a basis, we assessed our PIN strategy’s ability to mitigate systematic bias and complex variability and reveal statistically significant biological variation.

Results

As an initial evaluation, we applied PIN to the CPTAC data described in the Introduction – data that alluded to the capabilities of global scaling functions and motivated our development of a new method. Use of PIN on this data produced the ideal result expected from normalization (Figure 1c): nearly identical XCs between replicates, even for the run containing the complex variability due to electrospray instability. Encouraged by these results, we further evaluated PIN vis-à-vis median scale and other current normalization strategies. We applied several different peptide selection strategies and global scaling functions to archived data sets from four experiments: Instrument Variability, Sample Variability, Serial Dilution (as a proxy for loading amount differences), and CPTAC Study 6. The results of these experiments, when taken together, demonstrate PIN’s superior mitigation of systematic bias and complex variability while retaining biological variability.

Reduction in CV and PEV

To assess reduction in instrument variability, we generated three replicates by analyzing a single aliquot of salivary endogenous peptides using an autosampler and HPLC–ESI–MS/MS three times consecutively. (See Experimental Procedures and Supplemental Tables 3 and 4 in the Supporting Information.) First, we assessed reduction in variability between un-normalized data, data normalized by five global scaling functions, and data normalized by PIN. We used two commonly employed metrics: pooled estimate of variance (PEV) and median standard deviation coefficient of variance (CV). Although we evaluated numerous scaling functions,[5,6] we only report PIN and the five best performing normalization strategies determined by PEV and CV reduction. For the global scaling functions, we employed the CWS peptide selection strategy prior to normalization. For PIN, we employed the CWS peptide selection strategy after normalization. Results are shown in Figure 3. In the instrument variability experiment, PIN outperformed the five best current normalization strategies by reducing PEV by 73% and CV by 46%, compared with an average of 13 and 19% respectively for the global scaling functions.

Figure 3

PIN versus five current normalization strategies. Four experiments show PIN’s superior performance in reducing variance using two commonly used measurements: pooled estimate of variance (PEV) and coefficient of variation (CV). (a,c) PIN (right-most column in each figure) versus five common normalization strategies in reducing PEV for each of the four experiments. (b,d) PIN (again, right-most column in each figure) versus the same five common normalization strategies in reducing CV for each of the four experiments. To assess reduction in the variability resulting from sample handling, we followed the same protocol as the instrument variability experiment, except we prepared three aliquots of salivary endogenous peptides in parallel and analyzed each aliquot using an autosampler and HPLC–ESI–MS/MS. (See the Experimental Procedures and Supplemental Tables 5 and 6 in the Supporting Information.) Again, we employed CV and PEV and compared PIN’s results to the top five performing normalization strategies. PIN continued to outperform these normalization strategies by reducing PEV by 71% and CV by 41% compared with an average of 11 and 8%, respectively, for global scaling functions. To assess reduction in variability resulting from loading amount differences, we conducted serial dilution experiments using a complex mixture of salivary endogenous peptides and bradykinin as a spiked-in standard. (See the Experimental Procedures and Supplemental Tables 7 and 8 in the Supporting Information.) Traditionally, researchers have conducted serial dilution experiments to determine peptide abundances in a concentrated sample or produce calibration curves. We used serial dilution experiments unconventionally, as a proxy for loading amount differences We prepared six aliquots of this mixture by combining increasing amounts (0.5, 1.0, 1.5, 2.0, 2.5, and 3.0 μg) of salivary endogenous peptides with an equal amount of bradykinin (50 fmol). We then analyzed the six mixtures via HPLC–ESI–MS/MS. We again employ PEV and CV and compared PIN to the top five performing normalization strategies. PIN reduced PEV by 75% and CV by 55% compared with an average of 34 and 23%, respectively, for global scaling functions. To assess reduction in variability in the face of biological variation, we used data from the CPTAC Study 6 data set for instrument aliased LTQ-XL-OrbitrapP@65. In brief, CPTAC Study 6 evaluated mixtures of yeast with Sigma UPS1 spiked in at five different levels (A–E), each level three times greater than the previous level.[14] Each sample was then analyzed three times by HPLC–ESI–MS/MS. We selected samples C and E because sample C contained complex variability and sample E contained nine times the amount of UPS1 (Experimental Procedures, Supplemental Tables 9 and 10 in the Supporting Information). Using the C versus E data set, PIN again outperformed global scaling function PIN reduced PEV by 61% and CV by 19% while global scaling functions, on average, PEV by 9% and surprisingly increased CV by 14% (Figure 3d, fourth row). In this case, global scaling functions had a negative effect rather than a positive effect in normalizing intensities.

PIN and Systematic Bias Mitigation

To assess PIN’s ability to mitigate systematic bias, we employed MA plots. Again, we used the serial dilution data set as a proxy for loading amount differences. We chose to visualize loading amount differences because it was a classic example of systematic bias (Figure 4 and Supplementary Figure 2 in the Supporting Information). When we plotted the un-normalized peptide signal intensities, we observed divergent regression lines, indicative of systematic bias due to loading amount differences (Figure 4a). As described in Supplemental Note 1 in the Supporting Information, data with no systematic bias would result in regression lines lying on the horizontal line positioned at 0 on the y axis. We then plotted the same data after normalization using median scale as the global scaling function (Figure 4b). We observed that the regression lines have slightly converged and were repositioned below and above y = 0, indicating improvement in systematic bias. Finally, we plotted the data after normalization using PIN (Figure 4c). We observed that the regression lines on the right end of the plot were then positioned on or very near y = 0. However, we also noted that the lines remained below the horizontal flat line on the left end of the plot. Unlike median scale, the regression lines converged, making the regression lines nearly indistinguishable. From these observations, we concluded that using PIN performed well to mitigate the systematic bias, although a small amount remains. Furthermore, we concluded that PIN outperformed median scale normalization in making systematic bias consistent between runs. Serial dilution - PIN versus median scale minus versus average (MA) pots. (a) MA plot visually demonstrates that PIN outperforms median-scale normalization in mitigating systematic bias. (a) MA plot of un-normalized data for six different loading amounts reveals systematic bias (regression lines diverge from y = 0). (b) MA plot of median scale data for six different loading amounts demonstrates a slight improvement in systematic bias (regression lines begin to converge around y = 0). (c) MA plot for PIN normalized data for six different loading amounts demonstrates a substantial improvement in systematic bias (regression lines converge around y = 0).

Absolute Abundance, Relative Abundance, and Compositional Measurements

Typically, in a serial dilution experiment, the data are normalized using an internal or spiked in peptide with the goal of determining absolute abundance and relative abundance, rather than composition. The standard metric for absolute abundance is the coefficient of correlation (R2), with the goal of perfect correlation (R2 = 1.0). The standard metric for relative abundance is fold change, with the goal of monotonically increasing fold changes corresponding perfectly to the original amounts loaded. Here we expected fold changes of 0.5, 1.0, 1.5, 2.0, 2.5, and 3.0 corresponding to the saliva peptide load amounts of 0.5, 1.0, 1.5, 2.0, 2.5, and 3.0 μg, respectively. We first plotted the un-normalized intensity for a single salivary peptide (GPGIFPPPPPQP), indicative of other salivary peptides found in each of the six serial dilution runs and computed the R2 and fold changes (Figure 5a,e). We noted that R2 = 0.81, and fold changes were compressed and not monotonically increasing. We then plotted the spike-in normalized intensity (with 500 fmol bradykinin peptide) of the same peptide and computed the R2 and fold changes (Figure 5b,e). The measured correlation improved to R2 = 0.98, and the fold changes were now monotonically increasing.

Figure 5

PIN versus spiked-in standard normalization for a single peptide (GPGIFPPPPPQP) intensity with six different sample loading amounts. (a) Un-normalized intensity plot shows mediocre correlation (R2 = 0.82). (b) Spiked-in (bradykinin) normalized intensity plot shows improved correlation (R2 = 0.98) but does not remove systematic bias stemming from loading amount differences (slope is very high). (c) PIN normalized intensity plot shows decreased correlation (R2 = 0.81) but removes systematic bias stemming from loading amount differences (slope = 0.01). (d) PIN normalized data followed by original loading amount scaling shows improved correlation (R2 = 0.99), which is better than spiked-in standard normalization. (e) Table showing fold changes for each intensity plot (a–d) using the 1.0 μg sample loading amount as the common denominator, while PIN compresses the fold changes. However, when we use the serial dilution experiment as a proxy for loading amount differences, then, our goal should not be to achieve perfect correlation to absolute amounts; rather, our goal is to find no compositional differences (slope = 0.0). This stems from the fact that even though overall loading amounts varied between these samples, within each sample the proportion of any given peptide to the whole did not change because overall composition did not change. When considering composition, a flat line indicates no systematic bias. Furthermore, relative fold changes should all be equal to 1, indicating no changes in the composition. We plotted the PIN-normalized intensities for the same peptide and computed the slope and fold changes (Figure 5c,e). We observed that normalizing with PIN achieves a slope = 0.01, and fold changes are near 1, indicating little change in biological variation. Because we knew the loading amounts, we estimated the absolute abundance of peptides initially loaded onto the HPLC column by scaling PIN normalized intensities by the run’s loading amount. Scaling PIN normalized data by the loading amount showed R2 = 0.99 (Figure 5d) and monotonically increasing fold changes (Figure 5e). We also observed fold changes were compressed. In this case, scaled PIN outperformed spike-in normalization for estimating absolute peptide abundance and in estimating fold changes. We next turned our attention from a single peptide fold change to the analysis of composite XCs representing all peptide signals in a run. When we plotted the un-normalized XCs, we observed a clear separation of monotonically increasing XCs, indicating the presence of systematic bias (Figure 6a). However, upon closer inspection, we found that the XCs for the 2.0 μg and 3.0 μg samples were not in the expected order. (The 2.0 μg XC lay above the 3.0 μg XC.) We then plotted the spike-in normalized XC and observed that the 3.0 μg XC shifted its position but still lay below the 2.0 μg XC (Figure 6b). Next we plotted the XCs for the PIN normalized data and observed a convergence of the XCs (Figure 6c). Furthermore, we observed an undulation in the XCs, consistent across the different runs. Finally, we plotted the PIN normalized data scaled by loading amounts and observed that the XCs became monotonically increasing and were in the correct order for nearly all time points (Figure 6d), with the exception of the lowest loading amounts. Peptide signal extracted chromatograms PIN versus spiked-in standard normalization for six different loading amounts. (a) Un-normalized extracted chromatogram plot shows systematic bias. (The 2.0 and 3.0 μg extracted chromatograms are not in the correct order.) (b) Spiked-in (bradykinin) normalized extracted chromatogram plots show that the 2.0 μg extracted chromatogram is now above the 1.0 μg extracted chromatogram but is still below the 2.0 μg extracted chromatogram. (c) PIN normalized extracted chromatogram shows convergence, indicating the removal of systematic bias. (d) Scaling PIN normalized data by the original loading amounts plot shows extracted chromatograms monotonically increasing and in the correct order.

Detecting Biological Variation

Finally, we assessed PIN vis-à-vis common normalization strategies in the context of detecting of biological variation. Again, we used the CPTAC Study 6 and selected samples C and E. Here we used a student’s t test to determine the number of differentially abundant (or proportional) UPS1 and yeast proteins and peptides found. We used SPANS[10] to recommend and perform the optimal peptide select strategy and global scaling function combinations. SPANS performed correctly, finding no global scaling functions were appropriate for the CPTAC C versus E data set. These results confirmed our previous findings that global scaling functions cannot capture and mitigate complex variability. Nonetheless, SPANS was run using three normalization strategies: (1) LOS (5%) peptide selection strategy with mean global scaling; (2) RIP peptide selection strategy with mean global scaling; and (3) RIP peptide selection strategy with median global scaling (Supplemental Table 11 in the Supporting Information). We compared SPANS results to PIN normalized results (Figure 7).

Figure 7

CPTAC C versus E statistically different UPS1 and yeast proteins. The UP designation indicates proteins present lower abundance (or proportion for PIN) in sample E compared with sample C. The DOWN designation indicates lower abundance (or proportion in PIN) in sample E compared with sample C. (a) UPS1 proteins statistically different between samples C and E. (b) Yeast proteins that are statistically different between samples C and E. (c) Number of true positives, false-positives, and false-negatives for UPS1 and yeast proteins. PIN outperformed median and mean global scaling functions in finding statistically significant UPS1 proteins (Figure 7a). With PIN, we found 36 statistically different UPS1 proteins going up (true positives) and only 1 UPS1 protein going down in E versus C (false-positive). Global scaling functions performed abysmally, finding fewer UPS1 proteins as statistically different compared with the un-normalized data. PIN found 20% more (6) UPS1 proteins as statistically different that were false negatives in the un-normalized data (Figure 7c). One important point is that when we treat chromatographic data as compositional data, we also measure statistically significant decreases in background yeast proteins in sample C versus E. We assert these are true positives, reflecting physical realities of these samples. Our reasoning is that when viewing the proteome as compositional, then it follows that as the proportion of UPS1 proteins goes up, the proportion of yeast proteins must go down. It is also well known that increasingly abundant peptides can suppress ionization of coeluting peptides,[30,31] leading to decreased proportional intensities of some yeast peptides in response to increasing load of UPS1 peptides. This may be even more prevalent an affect within adding relatively large amounts of the UPS1 standard into the yeast background. Therefore, with PIN, we expect to find a larger number of yeast peptides with reduced proportion to the whole when compared with other scaling functions (Figure 7b,c).

Discussion

To our knowledge, we are the first to demonstrate that biological variation can be revealed in proteomic chromatographic data by viewing it as compositional. Here we also introduce PIN, a new local method that normalizes a peptide signal’s intensity by computing its proportional intensity relative to its neighborhood. We show that PIN dominates competing normalization strategies when measuring reduction in variability and finding biological variation, even when complex variability is present. The only related work is very recent work by Lyutvinskiy et al., which describes a new instrumental response correction method, a local normalization method, for significant variations in the eluent and analyte composition for improving accuracy of predictive models.[32] We compared and contrasted PIN with their method and found some similarities and some striking differences. Both methods employ temporally bounded neighborhoods to normalize each peptide signal. However, the composition of the neighborhoods is decidedly different in the two methods. While PIN takes an unbiased approach to populating its neighborhood by including all peptide signal XIC peaks within a retention time window, their method takes a biased approach by populating their neighborhoods using only peptides confidently identified from MS2 spectra. Furthermore, the method for correction is not clearly defined, making it difficult to evaluate. In one instance, they describe an abundance alignment method using unchanged peptides as internal standards. In another instance, they report using the square root of the median of all peptides within the retention time window. Unfortunately, their report did not evaluate performance vis-à-vis common normalization strategies, making it difficult to compare to our PIN strategy. They did report that their normalization strategy improved their predictive model, but it is unclear to us whether the improvement stemmed from the local nature of their strategy or if other common normalization strategies would have produced similar results. Regardless of the differences, we both agree on the central need to account for complex variability in intensity-based quantification. A key benefit of PIN over common normalization strategies is its simultaneous mitigation of multiple types of measurement error, including systematic bias and complex variability. This benefit stems from treating chromatographic data as compositions and temporal regions within the data as subcompositions. Thus, any bias affecting the composition as a whole will also affect subcompositions. As a result, when PIN mitigates complex variability within the subcomposition, it inherently also mitigates systematic bias. A second benefit of PIN over common normalization strategies is that PIN requires no a priori knowledge of the type or source of measurement error. For example, a pipetting error doubling the amount of the sample does not necessarily correspond to a linear response in corresponding ion intensities in the resulting data. Therefore, simply normalizing by a single global scaling factor, such as the median of measured ion intensities, is not accurate. Computing the appropriate global scaling factor to mitigate systematic bias involves a priori knowledge of both absolute loading amounts and absolute ion counts. However, because PIN treats chromatographic data as compositional, the only information about peptide abundance and ion counts is relative information; thus, PIN requires no knowledge of absolute loading amounts or absolute ion counts. A third benefit of PIN over common normalization strategies utilizing global scaling functions is that PIN does require a complete matrix on which to compute and thus implicitly handles the missing value problem. PIN simply requires a neighborhood populated with peptide signals. Then, to compare a single peptide signal’s PIN normalized intensities (proportions) across runs, its neighborhoods must be compositionally similar. If needed, missing values can still be imputed after normalization but prior to statistical inference, as is recommended.[9] Of course, as with any method, PIN’s performance depends on some factors. First, PIN relies on high-resolution instrumentation and accurate mass measurements (<10 ppm error) to extract and quantify relevant chromatographic information. Fortunately, such systems are routinely available.[33] Second, as with other methods, the peptide elution order for analyzed mixtures must be similar; PIN requires similar order to form similar neighborhoods. This means mixtures analyzed using different types of chromatographic systems may not be easily compared. Third, although PIN tends to compress the dynamic range of fold changes, the end results are unaffected because it is fold change statistical significance (not numerical value) that counts for determining biological variation. However, if the original loading amount is known, for example, in a serial dilution experiment, that information can be used to scale results and accurately compute fold changes. Fourth, with any normalization method, overfitting and underfitting data is a concern. With PIN, we first construct neighborhoods using a temporal window, and the width of this window primarily controls data fitting quality. We conducted several experiments (results not shown) varying the window width and found that varying the window width between 2 to 5 min as well setting window size to the temporal width of a true peptide signal’s XIC made little difference in the results. However, setting the window <2 min and >5 min tended to overfit and underfit the data, respectively. We expect that optimal window width will correlate somewhat with chromatographic characteristics, primarily gradient duration. Therefore, we utilize the width of the peptide signal’s XIC as input to a function dynamically computing each peptide signal’s temporal boundaries. Despite PIN’s improvements in reducing variability and the number of false-negatives reported, MS-based results must still be confirmed via hypothesis-driven experiments, such as a Western blot or targeted MS. Targeted MS, for example, SRM, is a powerful tool gaining popularity over traditional biochemical analyses due to the ever-increasing scale of today’s high-throughput experiments.[13] Such experiments require as input a transition list or an inclusion list containing feature values (m/z – retention time pairs, which are simply peptide signals). Because PIN operates on a list of peptide signals, each with an associated MS/MS spectrum containing potential SRM transitions, the software implementing PIN could easily be extended to construct these lists. Thus, PIN is well-suited to drive targeted MS experiments. Given that common normalization strategies cannot capture and correct systematic bias and complex variability inherently present in HPLC–ESI–MS/MS workflows, we expect PIN to dramatically improve intensity-based quantification from HPLC–ESI–MS/MS data. Furthermore, although we studied complex protein digests and endogenous peptide mixtures, we believe PIN will be widely applicable to many omics fields using HPLC–ESI–MS/MS to analyze complex mixtures, for example, lipidomics, glycomics, and metabolomics. The upshot will, we expect, be reproducibility- and repeatability-improved, and otherwise falsely reported or missed statistically significant biological variation will be discovered.

27 in total

1. Mechanistic investigation of ionization suppression in electrospray ionization.

Authors: R King; R Bonfiglio; C Fernandez-Metzler; C Miller-Stein; T Olah
Journal: J Am Soc Mass Spectrom Date: 2000-11 Impact factor: 3.109

2. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search.

Authors: Andrew Keller; Alexey I Nesvizhskii; Eugene Kolker; Ruedi Aebersold
Journal: Anal Chem Date: 2002-10-15 Impact factor: 6.986

Review 3. Ion suppression in mass spectrometry.

Authors: Thomas M Annesley
Journal: Clin Chem Date: 2003-07 Impact factor: 8.327

Review 4. Mass spectrometry-based chemoproteomic approaches.

Authors: Marcus Bantscheff
Journal: Methods Mol Biol Date: 2012

5. Quantitative analysis of the intra- and inter-individual variability of the normal urinary proteome.

Authors: Nagarjuna Nagaraj; Matthias Mann
Journal: J Proteome Res Date: 2011-01-05 Impact factor: 4.466

6. Sample collection and handling considerations for peptidomic studies in whole saliva; implications for biomarker discovery.

Authors: Ebbing P de Jong; Susan K van Riper; Joseph S Koopmeiners; John V Carlis; Timothy J Griffin
Journal: Clin Chim Acta Date: 2011-08-24 Impact factor: 3.786