
Isotopic signature transfer and mass pattern prediction (IsoStamp): an enabling technique for chemically-directed proteomics.

Krishnan K Palaniappan, Austin A Pitcher, Brian P Smart, David R Spiciarich, Anthony T Iavarone, Carolyn R Bertozzi.

Abstract

Directed proteomics applies mass spectrometry analysis to a subset of information-rich proteins. Here we describe a method for targeting select proteins by chemical modification with a tag that imparts a distinct isotopic signature detectable in a full-scan mass spectrum. Termed isotopic signature transfer and mass pattern prediction (IsoStamp), the technique exploits the perturbing effects of a dibrominated chemical tag on a peptide's mass envelope, which can be detected with high sensitivity and fidelity using a computational method. Applying IsoStamp, we were able to detect femtomole quantities of a single tagged protein from total mammalian cell lysates at signal-to-noise ratios as low as 2.5:1. To identify a tagged peptide's sequence, we performed an inclusion list-driven shotgun proteomics experiment where peptides bearing a recoded mass envelope were targeted for fragmentation, allowing for direct site mapping. Using this approach, femtomole quantities of several targeted peptides were identified in total mammalian cell lysate, while traditional data-dependent methods were unable to identify as many peptides. Additionally, the isotopic signature imparted by the dibromide tag was detectable on a 12-kDa protein, suggesting applications in identifying large peptide fragments, such as those containing multiple or large posttranslational modifications (e.g., glycosylation). IsoStamp has the potential to enhance any proteomics platform that employs chemical labeling for targeted protein identification, including isotope coded affinity tagging, isobaric tagging for relative and absolute quantitation, and chemical tagging strategies for posttranslational modification.

Substances：
Peptides
Proteins

Year: 2011 PMID： 21604797 PMCID： PMC3220624 DOI： 10.1021/cb100338x

Source DB: PubMed Journal: ACS Chem Biol ISSN： 1554-8929 Impact factor: 5.100

Common goals of mass spectrometry (MS)-based proteomics experiments are to identify, characterize, and quantify proteins and their posttranslational modifications from biological samples.(1) A popular strategy for protein identification is the bottom-up shotgun proteomics approach. In this method, a mixture of proteins is subjected to proteolytic digestion, the resulting peptides are separated by liquid chromatography (LC) and detected by MS, and their parent proteins are inferred from the assigned peptide sequences.(2) To convert MS data acquired from proteolytic digests into protein identifications, tandem MS can be used to obtain sequence information for individual peptides, followed by comparison to an in silico proteolytic digest of an organism’s proteome.[3−5] Typically, only the most abundant peptides are selected for fragmentation, whereas data for those peptides in relatively low quantities are not obtained.(1) An inherent problem in shotgun proteomics is identifying proteins of low abundance, such as biomarkers for disease states, against a background of proteins whose concentrations can span up to 12 orders of magnitude.[6,7]

Directed proteomics strategies seek to address the sample complexity problem by focusing the analysis on a defined protein subset.[8−10] In one approach, proteins of interest are selectively enriched prior to proteolytic digestion, thereby forgoing the shotgun method altogether.[11,12] Alternatively, there is growing interest in the use of chemical tags that perturb the mass envelope of target peptides so as to render them more detectable.
The progenitors of this approach are the isotope-coded affinity tag (ICAT) and isobaric tags for relative and absolute quantitation (iTRAQ), techniques now commonly used for quantitative comparative proteomics.[13−15] Chemical tags have also been elegantly employed to mark sites of protein posttranslational modifications(16) including glycosylation,(17) lipidation,[18−20] and phosphorylation.(21) Tags have also been used for labeling protein N-termini,(22) sites of cysteine oxidation,(23) enzyme active sites,(24) and points of cross-linking.(25)

The halogens bromine and chlorine can be advantageous components of chemical tags for MS by virtue of their unique isotopic distributions. Unlike the proteogenic elements, which exist as one predominant isotope, bromine and chlorine have two abundant isotopes that create unique patterns in a mass spectrum: 79Br and 81Br are found in a 1:1 ratio, and 35Cl and 37Cl are found in a 3:1 ratio (isotopic ratios of proteogenic elements are given in Supplementary Table 1).(26) Although this feature has been well exploited in the field of small molecule and metabolite characterization,[27−31] its use in proteomics-related applications has been limited.
The first example by Goodlett, Aebersold, and co-workers used a dichloride tag to discriminate peptides with and without a cysteine residue from digested protein samples.(32) Likewise, N-terminal labeling of peptides with a monobromide tag facilitated sequence identification by tandem MS.(33) Recently, Hang and co-workers used a monobromide cleavable tag to enrich for newly synthesized proteins in bacteria.(34) In addition to their distinctive isotopic signatures, bromine and chlorine have a negative mass defect that can endow a modified peptide with a unique fractional mass, a feature which Amster and co-workers made artful use of for peptide mass fingerprinting analysis.[35−37]

To date, halogen profiling methods have not been extended to directed proteomic analysis of samples as complex as human cell or tissue lysates. To achieve this goal would require the ability to discriminate a halogen tag’s signature on peptides over a wide mass range, in multiple charge states, and against a background of >100,000 peptides, capabilities that present methods lack.(38) Here we report that a dibromide tag in concert with a novel computational pattern-searching algorithm enables detection of labeled peptides from complex biological samples with unprecedented sensitivity and fidelity.

The overall approach, termed isotopic signature transfer and mass pattern prediction (abbreviated IsoStamp), was employed as illustrated in Figure 1. Cell lysates containing a chemically tagged protein were digested with trypsin, and the resulting peptides were analyzed by LC–MS in full-scan mode. Tagged peptides were detected using a pattern-searching algorithm and inventoried to form an inclusion list. The same sample was then subjected to a directed LC–MS/MS experiment where fragmentation was only performed on precursor ions defined by the inclusion list, allowing for direct site mapping.
Unlike an intensity-driven data-dependent LC–MS/MS analysis, the IsoStamp method is not limited to identifying peptides of relatively high abundance. Instead, by rendering labeled peptides detectable in a full-scan mass spectrum, IsoStamp is an enabling tool for chemically-directed proteomics, maximizing the identification of peptides of interest from information-dense MS data.

Figure 1

The IsoStamp method improves shotgun proteomics by allowing tagged peptides to be detectable in full-scan mass spectra, facilitating an inclusion-list-driven directed LC–MS/MS experiment. (a) A mixture of proteins where some are chemically tagged (star) is subjected to proteolytic digestion producing (b) a mixture of peptides. (c) Peptides are separated using LC, and full-scan mass spectra are collected. (d) Tagged peptides are identified using a pattern-searching algorithm and inventoried into an (e) inclusion list (rt = retention time). (f1–3) The same sample is then subjected to a directed LC–MS/MS experiment where (f3) MS/MS analysis is only performed on (f2) precursor ions defined in the inclusion list. (g) Data are finally subjected to a database search for parent protein identification


Results and Discussion

Bromine and Chlorine Impart Unique Isotopic Signatures on Labeled Molecules

Mass spectra of low molecular weight (MW) compounds bearing a single bromine or chlorine atom show two major ions, M and M + 2, with equal or skewed peak heights, respectively. Compounds with two halogen atoms produce symmetrical (2 × Br) or skewed (2 × Cl) triplets with major peaks at M, M + 2, and M + 4. These unique isotopic patterns are evident in the mass spectra (Figure 2, panel b) for the halogenated tyrosine analogues 1–4 shown in Figure 2 (panel a), which were synthesized as iodoacetamide derivatives (Supplementary Figure 1). In principle, the uniqueness of the triplet patterns associated with the dibromide and dichloride motifs could facilitate the identification of tagged peptides from complex proteolytic digests. However, in larger molecules (i.e., MW > 1000) the halogen isotopic patterns are obscured due to the influence of heavy isotopes of C (13C, 1%), H (2H, 0.02%), and N (15N, 0.1%) on the overall mass envelope. To illustrate the point, we alkylated the surface-exposed cysteine residues of bovine serum albumin (BSA) with tags 1–4, digested the modified protein, and analyzed the peptides by LC–MS (Figure 2, panel c). Representative data corresponding to the tryptic peptide SLHTLFGDELC*K are shown in Figure 2, panel d. The isotopic envelope of each tagged peptide reflects a convolution of the parent peptide’s intrinsic isotopic distribution, as seen in the mass spectrum of the iodoacetic acid alkylated version (Figure 2, panel c), with the appropriate halogen pattern (Figure 2, panel b). These data illustrate that the dibromide tag imparts a more distinctive signature on a peptide’s mass envelope than the other halogen tags. Computational simulations suggested a similar advantage of the dibromide tag for peptides of MW up to at least 5,000 Da (Supplementary Figure 2). Still, the complexity of a tagged peptide’s mass spectrum prevented manual searches for isotopic envelopes in complex mixtures.
Instead, labeled peptides were detected computationally using a pattern-searching algorithm.
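The convolution described above can be illustrated with a minimal Python sketch. It is not the authors' software: the isotope abundances are approximate textbook values, and the "peptide" is a toy envelope built from 60 carbon atoms only, an assumption made purely to show how the halogen triplet is smeared by 13C while the dibromide still lifts the M + 2 peak above its neighbors.

```python
# Approximate natural abundances, keyed by nominal mass offset in Da.
BR = {0: 0.5069, 2: 0.4931}   # 79Br, 81Br (about 1:1)
CL = {0: 0.7576, 2: 0.2424}   # 35Cl, 37Cl (about 3:1)
C13 = {0: 0.9893, 1: 0.0107}  # 12C, 13C


def convolve(a, b):
    """Combine two independent isotope patterns (offset -> probability)."""
    out = {}
    for offset_a, p_a in a.items():
        for offset_b, p_b in b.items():
            key = offset_a + offset_b
            out[key] = out.get(key, 0.0) + p_a * p_b
    return out


def repeat(pattern, count):
    """Pattern for `count` atoms of the same element."""
    out = {0: 1.0}
    for _ in range(count):
        out = convolve(out, pattern)
    return out


def tallest(pattern):
    """Nominal offset of the most intense peak."""
    return max(pattern, key=pattern.get)


dibromide = repeat(BR, 2)    # near-symmetric triplet at M, M+2, M+4
dichloride = repeat(CL, 2)   # skewed triplet, M is tallest

# Toy "peptide": 60 carbons only (illustrative assumption).
carbon_envelope = repeat(C13, 60)
tagged_br2 = convolve(carbon_envelope, dibromide)
tagged_cl2 = convolve(carbon_envelope, dichloride)

for name, pattern in [("Br2", tagged_br2), ("Cl2", tagged_cl2)]:
    print(name, "tallest peak at M +", tallest(pattern))
```

In this toy case the dibromide-tagged envelope peaks at M + 2, whereas the dichloride-tagged envelope still peaks at M, which is one way to see why the dibromide signature is easier to tell apart from an untagged peptide.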

Figure 2

Halogenated tags impart distinct isotopic patterns on peptides. (a) Iodoacetamide-derivatized halogen tags synthesized from tyrosine. (b) Mass spectra of the halogen tags. (c) In a model experiment, BSA was alkylated on cysteine residues with iodoacetic acid or halogen tags 1–4 and then digested with trypsin. (d) Mass spectra of the modified BSA tryptic peptide corresponding to residues 89–100. C* refers to a cysteine residue alkylated by either iodoacetic acid or tags 1–4.

Although several peak-picking and isotope distribution prediction algorithms exist,[25,39−43] they were not designed, or have not been demonstrated, to search for any user-defined isotope pattern among the sample complexity present in mammalian whole cell lysate. Additionally, some of these approaches imposed restraints or required prior knowledge regarding the sequences of the targeted peptides. While such information is useful for many applications, including multiple reaction monitoring,[8,10] we sought to develop a versatile computational algorithm that could extract any isotopic signature from complex MS data without imposing any restrictions based on the structure or reactivity of the chemical tag or on the amino acid composition of the labeled peptides.

Development of a Pattern-Searching Algorithm

The algorithm described here analyzes peaks from a full-scan mass spectrum and matches real data with simulated data generated by convoluting each predicted peptide’s isotopic envelope with the pattern produced by a given tag. The algorithm received two inputs from the user: (1) a centroided mzXML data file and (2) a parameter file that includes the MW and isotopic pattern of the tag, charge states to be considered in the search, and weighting factors used to tune selectivity and sensitivity. The output included the m/z values and retention times of tagged species, which form an inclusion list for a subsequent directed LC–MS/MS analysis. The algorithm has two key steps. First, the full-scan MS data were analyzed to identify putative isotopic signature matches for a given elemental composition. Key to this step is a data-dependent approximation of the contributions of non-halogens to the observed isotopic envelope, while allowing for the inevitable imperfections in MS data derived from complex protein samples. In the second step, the putative matches from the first step were analyzed using a graph-theoretic construct to reduce false positives. Peaks contributing to a putative pattern match are tracked as a function of LC elution time and number of charge states detected to add confidence that they derive from a real species.
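As a rough illustration of how a simulated reference pattern might be generated for one candidate peak, the following Python sketch estimates an elemental composition from the peak's m/z and charge, builds the untagged envelope, and convolutes it with a dibromide pattern. Several things here are assumptions rather than details from the paper: the commonly cited averagine coefficients stand in for Supplementary Table 2, only the two bromine atoms are treated as the "tag" (the rest of the tag is lumped into the averagine estimate), isotope abundances are approximate, and all function names are hypothetical.

```python
# Approximate natural isotope patterns: nominal mass offset -> abundance.
ISOTOPES = {
    "C": {0: 0.9893, 1: 0.0107},
    "H": {0: 0.99988, 1: 0.00012},
    "N": {0: 0.99636, 1: 0.00364},
    "O": {0: 0.99757, 1: 0.00038, 2: 0.00205},
    "S": {0: 0.9499, 1: 0.0075, 2: 0.0425, 4: 0.0001},
}
# Commonly cited averagine unit (assumed here, not taken from the paper).
AVERAGINE = {"C": 4.9384, "H": 7.7583, "N": 1.3577, "O": 1.4773, "S": 0.0417}
AVERAGINE_MASS = 111.1254   # Da per averagine unit
PROTON = 1.00728            # Da
NEUTRON = 1.00335           # approximate spacing between isotope peaks, Da
BR2_TAG = {0: 0.257, 2: 0.500, 4: 0.243}  # two bromines, about 1:2:1
BR2_MASS = 157.837          # 2 x 79Br, Da


def convolve(a, b, keep=8):
    """Combine two isotope patterns, keeping only the first `keep` peaks."""
    out = {}
    for offset_a, p_a in a.items():
        for offset_b, p_b in b.items():
            key = offset_a + offset_b
            if key < keep:
                out[key] = out.get(key, 0.0) + p_a * p_b
    return out


def averagine_composition(mass):
    """Estimate atom counts for a peptide of the given neutral mass."""
    units = mass / AVERAGINE_MASS
    return {element: round(n * units) for element, n in AVERAGINE.items()}


def envelope(composition):
    """Isotopic envelope of an untagged molecule with this composition."""
    pattern = {0: 1.0}
    for element, count in composition.items():
        for _ in range(count):
            pattern = convolve(pattern, ISOTOPES[element])
    return pattern


def reference_pattern(mz, charge, tag_pattern, tag_mass):
    """Predicted (m/z, relative height) peaks if `mz` is a tagged peptide's M peak."""
    neutral = charge * (mz - PROTON)
    peptide = averagine_composition(neutral - tag_mass)
    pattern = convolve(envelope(peptide), tag_pattern)
    top = max(pattern.values())
    return [(mz + offset * NEUTRON / charge, height / top)
            for offset, height in sorted(pattern.items())]


# Hypothetical candidate peak: m/z 900.0 observed at charge 2+.
peaks = reference_pattern(900.0, 2, BR2_TAG, BR2_MASS)
for peak_mz, height in peaks[:5]:
    print(f"{peak_mz:.4f}  {height:.2f}")
```

Repeating the same call for each charge state listed in the parameter file would give one reference pattern per hypothesis for a given peak.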

Step 1. Identifying Putative Pattern Matches

The algorithm took a list of peaks from the full-scan mass spectrum and divided them into sets that were possibly isotopically related. Each of these sets was searched for the presence of a desired isotopic pattern as follows. First, each peak in the chosen data set was presumed to represent a peptide. Knowing the charge state and m/z for that hypothetical peptide, the program predicted its mass and estimated its elemental composition using the “averagine” model (Supplementary Table 2).(44) We confirmed the accuracy of the averagine method by comparing the predicted elemental compositions of 20,000 human tryptic peptides (based on MW) with their actual elemental compositions, revealing a median deviation of less than 4% (Supplementary Figure 3). From the estimated elemental composition, the isotopic pattern of the unlabeled hypothetical peptide was predicted. Then, the isotopic pattern of a chemical tag (i.e., tags 1–4) was convoluted with the predicted peptide's isotopic envelope, generating a reference pattern that was compared with the experimental data to determine a fitness score. The program also sampled reference patterns that model untagged peptides and instrument noise. Additional reference patterns can be incorporated to account for common sources of false positives in a sample-dependent manner.

Each reference pattern (R) was scaled in the intensity dimension to produce an optimal alignment with the data (D). This was accomplished by determining the scaling factor k by a binary search such that the sum of the squared differences (SSD) between each peak in the reference pattern (r ∈ R) and its counterpart in the actual data set (d ∈ D) was minimized:

$$\mathrm{SSD}(k) = \sum_{r \in R,\ d \in D} (k\,r - d)^2$$

After intensity alignment, the score for the entire pattern was calculated from σ, a measure of peak intensity variance, and f, a scoring function for each peak that produced a value in the range [0,1]. The function f was defined in terms of erfc(x), the complement of the Gaussian error function, with a parameter that measured the tightness of the peak matching in the intensity dimension. A lower bound of ε was imposed on the function to reduce round-off errors in floating point arithmetic and to allow for robustness against contaminating peaks when used in a Bayesian system. In short, this system allowed for the identification of isotopic envelopes in actual MS data that do not perfectly match theoretically determined isotopic envelopes by virtue of overlapping peaks from other molecular species.

After scores of all patterns of interest were determined, the best match was found using a Bayesian approach in which the P(pattern) terms were user-defined weighting factors, determined experimentally, that describe the probability that any peak in the data set was caused by a molecular species with the isotopic distribution described by pattern. These weighting factors allowed us to increase the specificity of the program for a selected pattern, thereby eliminating false positives, or conversely, to increase the number of hits, though potentially at the cost of more false positives.
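The scaling, scoring, and selection steps can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the published implementation: the scaling factor uses the closed-form least-squares minimizer of the SSD in place of the binary search described above, the per-peak function takes the assumed form max(ε, erfc(|k·r − d| / (σ√2))), the pattern score is assumed to be the product of per-peak values, and the "Bayesian" choice is taken to be the pattern maximizing score × P(pattern). All names, reference patterns, and numbers are hypothetical.

```python
import math

EPSILON = 1e-6  # assumed floor on each per-peak score


def scale_factor(reference, data):
    """k minimizing sum((k*r - d)**2); closed form instead of a binary search."""
    numerator = sum(r * d for r, d in zip(reference, data))
    return numerator / sum(r * r for r in reference)


def peak_score(expected, observed, sigma):
    """Assumed per-peak score in [EPSILON, 1] based on erfc."""
    x = abs(expected - observed) / (sigma * math.sqrt(2))
    return max(EPSILON, math.erfc(x))


def pattern_score(reference, data, sigma):
    """Assumed pattern score: product of per-peak scores after scaling."""
    k = scale_factor(reference, data)
    score = 1.0
    for r, d in zip(reference, data):
        score *= peak_score(k * r, d, sigma)
    return score


def best_pattern(data, references, priors, sigma):
    """Pick the pattern with the largest score x prior; also return posteriors."""
    weighted = {name: pattern_score(ref, data, sigma) * priors[name]
                for name, ref in references.items()}
    total = sum(weighted.values())
    posteriors = {name: value / total for name, value in weighted.items()}
    return max(posteriors, key=posteriors.get), posteriors


# Hypothetical observed intensities at M .. M+4 and two reference patterns.
observed = [130.0, 95.0, 300.0, 170.0, 185.0]
references = {
    "dibromide": [0.46, 0.30, 1.00, 0.61, 0.63],
    "untagged": [1.00, 0.65, 0.25, 0.07, 0.02],
}
priors = {"dibromide": 0.01, "untagged": 0.99}  # user-tunable weights

winner, posteriors = best_pattern(observed, references, priors, sigma=20.0)
print(winner)
```

Lowering the prior for the tagged pattern makes the search more conservative, while raising it admits more hits, which mirrors the sensitivity and specificity trade-off described in the text. The floor ε keeps a single badly matched peak from driving the product to zero.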

Step 2. Reducing False Positives with a Graph-Theoretic Approach

Naive pattern matching, as described above, can produce a significant number of false positive matches depending on the complexity of the data. However, information from neighboring spectra is known to reduce false positive detections while enhancing true positive identifications.[41,45] Therefore, our algorithm exploits two features of LC–MS data: peptides are often detected in multiple charge states and in several adjacent scans. To implement these features, a graph-theoretic approach was employed wherein each potential match was treated as a node in a graph. Edges were drawn between two nodes if the nodes could have come from the same molecular species and had sufficiently similar LC elution times. After edges were built, the graph was decomposed into disjoint subsets, where all nodes in a given subset could have been produced by the same peptide. Each of these subsets was then scored on a number of factors, including the number of nodes in the set and the number of unique charge states detected. Because matches that were made by chance are unlikely to score highly on these criteria, this process filters out false positive matches. A schematic representation of the graph-theoretic model is provided in Supplementary Figure 4, and an analysis of its effect on true positive and false positive identifications is shown in Supplementary Figure 5. Using a modern desktop computer (3.66 GHz, 4 GB RAM), an average LC–MS data file was searched using standard settings in less than 2 min.
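The grouping step can be pictured with a small Python sketch that treats each putative match as a node, links nodes with matching neutral mass and nearby retention time, and extracts connected components with a union-find structure. The tolerances, the `Match` record, and the summary fields are assumptions for illustration; the paper does not specify its edge criteria or scoring weights at this level of detail.

```python
from dataclasses import dataclass

PROTON = 1.00728  # Da


@dataclass(frozen=True)
class Match:
    mz: float      # m/z of the putative tagged species
    charge: int
    rt: float      # retention time in minutes


def neutral_mass(match):
    return match.charge * (match.mz - PROTON)


def group_matches(matches, ppm=10.0, rt_window=0.5):
    """Split matches into disjoint subsets that could share one parent species."""
    parent = list(range(len(matches)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(matches)):
        for j in range(i + 1, len(matches)):
            a, b = matches[i], matches[j]
            tolerance = ppm * 1e-6 * neutral_mass(a)
            same_mass = abs(neutral_mass(a) - neutral_mass(b)) <= tolerance
            close_rt = abs(a.rt - b.rt) <= rt_window
            if same_mass and close_rt:
                parent[find(i)] = find(j)  # draw an edge

    groups = {}
    for index, match in enumerate(matches):
        groups.setdefault(find(index), []).append(match)
    return list(groups.values())


def summarize(group):
    """Two of the factors mentioned in the text: node count and charge states."""
    return {"nodes": len(group),
            "charge_states": len({m.charge for m in group})}


# Hypothetical matches: one species seen in two scans and two charge states,
# plus one isolated (likely spurious) hit.
matches = [
    Match(900.000, 2, 30.10),
    Match(900.001, 2, 30.20),
    Match(600.33576, 3, 30.15),
    Match(750.400, 2, 55.00),
]
groups = group_matches(matches)
confident = [g for g in groups if summarize(g)["nodes"] >= 2]
print([summarize(g) for g in groups])
```

An isolated node forms a subset of size one and would score poorly, so a simple threshold on node count or charge-state count is enough to drop it in this toy example.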

Application of IsoStamp in a Model LC–MS Experiment

As mentioned previously, the complexity of unfractionated cell or tissue lysates renders the identification of low abundance proteins by shotgun proteomics a challenging endeavor. We therefore sought to test the sensitivity of IsoStamp in identifying a single labeled protein from whole cell lysates. BSA was chosen as a model protein for labeling because it contains 35 cysteine residues that are spread throughout the entire protein. The protein comprises 50 tryptic peptides of which 24 possess cysteine residues, including 15 with a single cysteine residue (no missed cleavages and a mass range of 600–2500 Da). We generated detergent lysates of Jurkat cells, a human T-lymphoma cell line, and added known amounts of BSA that had been alkylated on its cysteine residues with dibromide tag 1. After digestion with trypsin, the sample was separated by in-line reversed-phase LC and analyzed on an LTQ-Orbitrap XL mass spectrometer. Figure 3, panel a shows a representative full-scan mass spectrum from the LC–MS data collected for a sample derived from 150 fmol of 1-labeled BSA in 10 μg of Jurkat whole cell lysate, representing 0.1% of the total protein content. When the full-scan mass spectrum was searched using the pattern matching software, we identified several halogen-labeled BSA peptides, collectively reflecting 32% coverage of single cysteine-containing peptides (Supplementary Table 3). The mass envelope of one such peptide, LKPDPNTLC*DEFK (residues 139–151), illustrates the unique isotopic pattern for a dibromide-labeled peptide (Figure 3, panel b). Notably, the pattern (in black) was computationally found 4 orders of magnitude below that of the most abundant ion, at a signal-to-noise ratio (S/N) of 2.5:1, despite the presence of intervening peaks (light gray) within the envelope.
Using conventional shotgun proteomics methodologies in samples of this complexity, peaks at such a low level of intensity might be excluded from tandem MS analysis.(7)
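The stated 0.1% figure can be checked with a short calculation. The sketch below assumes an average molar mass of about 66.4 kDa for BSA, a value not given in the text, and converts the spiked molar amount into a mass fraction of the lysate.

```python
BSA_MW = 66_430.0  # g/mol, approximate average mass of BSA (assumption)


def mass_fraction(amount_mol, molar_mass, total_protein_g):
    """Fraction of total protein mass contributed by the spiked protein."""
    return amount_mol * molar_mass / total_protein_g


# 150 fmol of labeled BSA in 10 ug of whole cell lysate.
spiked = mass_fraction(150e-15, BSA_MW, 10e-6)
print(f"{spiked:.2%}")
```

Under that assumption, 150 fmol corresponds to roughly 10 ng of BSA, or about one part in a thousand of the 10 μg lysate.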

Figure 3

The dibromide motif can be recognized at low signal-to-noise ratios. (a) Representative full-scan mass spectrum from LC–MS data derived from a trypsin digest of 150 fmol of dibromide-labeled BSA in 10 μg of Jurkat whole cell lysate. (b) The zoomed-in region shows a dibromide-labeled peptide (in black) LKPDPNTLC*DEFK at a S/N of 2.5:1. C* refers to a cysteine residue alkylated with dibromide tag 1.


The Dibromide Tag Is Superior to the Other Halogenated Tags with Respect to Sensitivity and False Positive Identifications

A central feature of the IsoStamp algorithm is that the user can tune its parameters to balance sensitivity against false positive identifications. Using BSA as a substrate, we compared the performance of the dibromide tag to the other halogenated tags. To determine the relative number of false positives, we first established searching parameters that found 50% of true halogen-labeled BSA peptides in a sample that contained 3 pmol of halogen-labeled BSA in 10 μg of Jurkat whole cell lysate. The true positives were defined by a manual search of LC–MS data (explained in Supplementary Methods) and are listed in Supplementary Table 4. The relative number of false positives was then determined by searching MS data derived from 10 μg of Jurkat whole cell lysate without BSA (and thus no real positives, Figure 4, panel a). Compared to the dibromide tag, the dichloride tag produced >30-fold more false positives while the monobromide tag produced >120-fold more false positives. Overall, the dibromide tag outperformed the dichloride and monobromide tags by a substantial margin.

Figure 4

The dibromide motif is superior to other halogen motifs with respect to the number of false positives and sensitivity. (a) Number of false positives identified in Jurkat whole cell lysate without BSA using searching conditions that found 50% of true positives for each halogen tag. (b) Sensitivity engendered by each halogen tag was determined by titrating 3.0–0.03 pmol of halogen-labeled BSA into 10 μg of Jurkat whole cell lysate and analyzing the tryptic digest by LC–MS.

To determine the sensitivities of the halogen tags, we found searching parameters for each tag that fixed the maximum number of false positive identifications at 100. We then performed a titration experiment where known quantities of halogen-labeled BSA were added to 10 μg of Jurkat whole cell lysate. Each mixture was digested with trypsin and subjected to LC–MS analysis, and the resulting data were searched for the tag’s isotopic pattern. The proportion of single cysteine-containing BSA peptides detected as a function of protein concentration is shown in Figure 4, panel b, where each computational match was manually verified (the computational detection rate for each peptide can be found in Supplementary Table 3). At all protein concentrations, the dibromide-labeled peptides were detected with a higher frequency than peptides labeled with other tags. While the data appear to converge at low protein concentrations, this may reflect the detection limits of the instrument rather than the capabilities of the pattern-searching algorithm. Overall, the dibromide isotopic signature was detected approximately twice as often as the dichloride and three times as often as the monobromide signatures (for the false negative rate and ROC analysis see Supplementary Figures 6 and 7, respectively).
We should note that we were unable to determine the sensitivity and the relative number of false positives for the monochloride tag; reasonable searching parameters (i.e., with an acceptable number of false positives) to detect 50% of true positives in a sample containing 3 pmol of monochloride-labeled BSA in 10 μg of Jurkat whole cell lysate could not be found due to the minimal perturbation of this tag on the natural isotopic pattern of peptides (see Supplementary Figure 3).

Application of IsoStamp in a Model Directed Shotgun Proteomics Experiment

After establishing the advantage of using a dibrominated tag for detecting labeled peptides in complex mixtures, we tested its utility in a model directed shotgun proteomics experiment. As illustrated in Figure 1, we added known amounts of dibromide-labeled BSA to 10 μg of Jurkat whole cell lysate. Each mixture was digested with trypsin and then subjected to LC–MS analysis in full-scan mode. Data were processed to identify and inventory all dibromide-labeled peptides, generating an inclusion list that contained the M + 2 m/z value in the labeled peptide’s isotopic envelope and a retention time window. The same sample was then subjected to an LC–MS/MS experiment using the inclusion list to trigger fragmentation only if the listed ion abundance was above a threshold and appeared in the correct retention time window. This approach allowed us to focus the analysis on peptides of interest by using their recoded mass envelopes as an indicator, allowing for direct site mapping. In this model directed LC–MS/MS experiment we focused on single cysteine-containing peptides. An example of one precursor ion (m/z = 883.26) from the inclusion list (Figure 5, panel a) clearly displays the dibromide pattern, and its sequence and site of modification were identified using collision-induced dissociation (CID) fragmentation when the full isotopic envelope was isolated (Figure 5, panel b). A database search identified the ion as BSA peptide YIC*DNQDTISSK (residues 286–297). Fragment ions that contain the dibromide tag also displayed the perturbed isotopic envelope, as shown for the y+ ion (Figure 5, panel c), strengthening confidence in the site of modification.
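The inclusion-list logic described above might be sketched as follows in Python. The entry layout, the ±1 min retention time window, the 10 ppm tolerance, and the intensity threshold are all assumptions chosen for illustration; real instrument methods define these fields in vendor-specific ways. The M + 2 peak is placed using the approximate 81Br − 79Br mass difference divided by the charge.

```python
from dataclasses import dataclass

BR_ISOTOPE_SPACING = 1.99795  # Da, approximate 81Br - 79Br mass difference


@dataclass(frozen=True)
class InclusionEntry:
    mz: float        # m/z of the M + 2 peak of the tagged peptide
    charge: int
    rt_start: float  # minutes
    rt_end: float    # minutes


def make_entry(monoisotopic_mz, charge, rt_apex, rt_half_width=1.0):
    """Build one inclusion-list entry from a detected tagged species."""
    return InclusionEntry(
        mz=monoisotopic_mz + BR_ISOTOPE_SPACING / charge,
        charge=charge,
        rt_start=rt_apex - rt_half_width,
        rt_end=rt_apex + rt_half_width,
    )


def should_fragment(entry, mz, rt, intensity, min_intensity=1e3, ppm=10.0):
    """Trigger MS/MS only for a listed ion, in its window, above threshold."""
    in_mz = abs(mz - entry.mz) <= entry.mz * ppm * 1e-6
    in_rt = entry.rt_start <= rt <= entry.rt_end
    return in_mz and in_rt and intensity >= min_intensity


# Hypothetical tagged peptide detected at m/z 700.35 (2+), eluting at 42.0 min.
entry = make_entry(700.35, 2, 42.0)
print(entry)
print(should_fragment(entry, 701.349, 42.3, 5e3))   # listed ion, in window
print(should_fragment(entry, 701.349, 45.0, 5e3))   # outside the RT window
```

Targeting the M + 2 peak rather than the monoisotopic peak makes sense for a dibromide-labeled peptide because that peak is expected to be the most intense in the envelope.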

Figure 5

IsoStamp with the dibromide tag in a model directed shotgun proteomics experiment. (a) The isotopic envelope of a precursor ion that was selected for fragmentation (m/z = 883.26, highlighted in blue) and its isolation window (highlighted in yellow). (b) The CID fragmentation spectrum and the peptide assignment for the 883.26 ion, indicating that it is a dibromide-labeled BSA peptide (C* refers to a cysteine residue alkylated with dibromide tag 1). (c) The y+ fragment ion, which contains the dibromide tag, also displays a perturbed isotopic envelope. (d) Numbers of unique peptides detected using the data-dependent and the directed approaches. The indicated amounts of BSA were added to 10 μg of Jurkat whole cell lysate prior to digestion and LC–MS/MS analysis.

Next, we compared the number of single cysteine-containing BSA peptides identified using either the directed approach or the conventional data-dependent approach (DDA) in which peptides do not bear a detectable tag.(2) The directed approach identified more single cysteine-containing peptides at all tested concentrations of labeled BSA in 10 μg of Jurkat whole cell lysate (Figure 5, panel d and Supplementary Table 5). The results of this analysis highlight the ability of chemically directed proteomics methods to increase both the number of distinct peptides and the number of low-abundance peptides identified in complex mixtures. Because the unique isotopic signature can be maintained in the MS/MS spectra by using a wide isolation window, modification sites are readily identified, which can assist in identifying sites of posttranslational modifications.

The impressive performance of the dibromide tag motivated us to explore its use in the identification of large peptides or small proteins. In addition to improving coverage and confidence in protein identifications, the analysis of large MW fragments enables studies of multiple posttranslational modifications (PTMs) that might occur on a single protein molecule, such as glycosylation.(46) Toward this goal, we labeled a single cysteine residue in the small protein Barstar (11.7 kDa) with dibromide tag 1. The purified intact protein (100 pmol) was analyzed by LC–MS, and computational analysis using the averagine model could distinguish the dibromide-labeled protein from the unlabeled protein (Supplementary Figure 8). Efforts are underway to refine the computational analysis so that we can identify a similarly sized protein at lower concentrations and in the presence of a complex mixture.

Conclusion

In summary, we have shown that recoding a peptide’s isotopic envelope, in combination with a sophisticated pattern-searching algorithm, can enhance the performance of shotgun proteomics. The IsoStamp method, an extension of Isotopic Distribution Encoding Tagging,(32) exploits the perturbing effects of a dibrominated chemical tag on a peptide’s isotopic envelope, elevating the intensity of the M + 2 peak above the leading two peaks. While building additional halogens into a tag could yield an even more distinguishable isotopic signature, sensitivity for its detection may be compromised, as the same total signal intensity will be split among more peaks. The dibromide signature strikes a balance by enabling high-fidelity pattern matching with good sensitivity. The IsoStamp method can be employed in any proteomics experiment in which a tag is covalently bound to target peptides; in principle, any dibrominated tagging reagent could be used. This concept pertains to many quantitative labeling approaches (e.g., ICAT, iTRAQ), where an isotopically labeled tagging reagent is used to impart peptides with a defined mass shift or a particular reporter mass. Importantly, the IsoStamp method is distinct from these approaches. First, the perturbing effects of halogens on the isotopic envelope of a peptide are substantially greater than those of heavy H, N, C, or O isotopes. Second, IsoStamp relies solely on this isotopic perturbation to identify a labeled peptide, with no requirement of a defined mass shift or a specific reporter mass fragment. We view IsoStamp as complementary to quantitative approaches, where the two proteomics tools can be merged into one experiment with the use of light and heavy dibromide tags. More generally, we view IsoStamp as an enabling tool for chemical proteomics.
The method can be incorporated into any proteomics experiment requiring a chemical tag, including emerging bioorthogonal ligation strategies that install uniquely reactive functionalities at sites of posttranslational modifications.(16) Affinity-based proteomics experiments in which tags are covalently bound to enzyme active site residues(24) and protein chemical cross-linking studies(47) can also benefit from integration of the IsoStamp method. In all cases, including a dibromide signature in the covalently bound tag could improve detection and identification of labeled peptides. Finally, the computationally detectable isotopic envelope pattern central to the IsoStamp method can be generated in ways other than covalent chemical labeling with bromine atoms. For example, metabolic labeling with a mixture of isotopolog substrates, molecules that differ only in their isotopic composition, could be used to biosynthetically endow biomolecules with similarly distorted isotopic envelopes. Consequently, we envision numerous future applications of IsoStamp in glycomics and metabolomics in addition to proteomics.
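The trade-off between a distinctive signature and per-peak sensitivity can be illustrated with a short calculation. The sketch below considers only the contribution of the bromine atoms (natural abundances of roughly 50.69% for 79Br and 49.31% for 81Br) and ignores the peptide's own C, H, N, O, and S isotopes, so it is a simplification and not the pattern-searching algorithm used in this work. The function name is hypothetical.

```python
from math import comb

# Natural abundances of the two stable bromine isotopes (standard values).
P_BR79 = 0.5069
P_BR81 = 0.4931


def bromine_pattern(n_br):
    """Fractions of signal at M, M+2, ..., M+2n for a tag with n_br bromines.

    Binomial distribution over the number of 81Br atoms; the peptide's own
    isotopes are deliberately ignored (simplifying assumption).
    """
    return [
        comb(n_br, k) * P_BR81**k * P_BR79 ** (n_br - k)
        for k in range(n_br + 1)
    ]


for n in (1, 2, 3, 4):
    pattern = bromine_pattern(n)
    print(n, [round(p, 3) for p in pattern], "tallest peak:", round(max(pattern), 3))
```

With two bromines the pattern is close to 1:2:1, so the M + 2 peak carries about half of the signal. With three or four bromines the signal is spread over more peaks and the tallest one carries a smaller share, consistent with the sensitivity concern raised above.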

Methods

Synthesis of Cysteine-Alkylating Tags

See Supporting Information.

BSA Labeling with Cysteine-Alkylating Tags

A 100 μg aliquot of a 2 mg mL–1 solution of BSA in 250 mM ammonium bicarbonate was reduced by adding DTT to a concentration of 2.5 mM and incubating at 55 °C for 30 min. After cooling to RT, the halogenated tag was added to a concentration of 10 mM from a 500 mM solution in DMF. The reaction proceeded for 1 h in the dark before quenching with 5 μL of 1 M DTT for 30 min. Excess tag was removed by size exclusion chromatography using a Bio-Rad Micro Bio-Spin 6 column.
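The volumes implied by this protocol can be worked out with a simple dilution calculation. The sketch below is illustrative only: it assumes the volume of DTT added for the reduction step is negligible (its stock concentration is not stated), and the helper name is hypothetical.

```python
def stock_volume_ul(sample_ul, stock_mm, final_mm):
    """Volume of stock to add so the mixture reaches final_mm.

    Solves stock_mm * V = final_mm * (sample_ul + V) for V.
    """
    if final_mm >= stock_mm:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_ul * final_mm / (stock_mm - final_mm)


# 100 ug of BSA at 2 mg/mL (equivalently 2 ug/uL) corresponds to 50 uL.
sample_ul = 100.0 / 2.0

# Tag added to 10 mM from a 500 mM stock in DMF.
tag_ul = stock_volume_ul(sample_ul, stock_mm=500.0, final_mm=10.0)

# Quench: 5 uL of 1 M (1000 mM) DTT added to the labeling reaction.
quench_ul = 5.0
total_ul = sample_ul + tag_ul + quench_ul
quench_dtt_mm = quench_ul * 1000.0 / total_ul

print(f"sample: {sample_ul:.0f} uL, tag stock: {tag_ul:.2f} uL")
print(f"DTT after quench: {quench_dtt_mm:.0f} mM")
```

Under these assumptions the tag stock amounts to about 1 μL, and the quench brings DTT to a several-fold molar excess over the 10 mM tag.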

Preparation of Jurkat Whole Cell Lysate

Jurkat cells were lysed in a buffer containing 1% Triton X-100, 20 mM Tris pH 7.4, 150 mM NaCl, and protease inhibitors (inhibitor cocktail III from Calbiochem). Following lysis, the sample was precipitated using 9 volumes of acetone and kept at −20 °C for 4 h followed by centrifugation at 13,000 rpm for 20 min at 4 °C. The supernatant was removed, and the pellet was resolubilized in 8 M urea buffered to pH 8.0. A BCA assay was performed to determine protein concentration. Samples were diluted to 1 mg mL–1.

Serial Dilutions of Halogen-Labeled BSA in Jurkat Whole Cell Lysate

For full-scan LC–MS analysis, halogen-labeled BSA samples were serially diluted into 10 μL of 1 mg mL–1 Jurkat whole cell lysate at amounts of 3.0, 1.50, 0.80, 0.30, 0.15, 0.08, and 0.03 pmol. For data-dependent or directed LC–MS/MS analysis, iodoacetamide- or dibromide-labeled BSA samples, respectively, were serially diluted into 20 μL of 1 mg mL–1 Jurkat whole cell lysate at amounts of 6.0, 3.0, 1.60, and 0.60 pmol (amounts were doubled so that two sequential injections could be performed). The samples were trypsin digested (50:1 protein/enzyme) at 37 °C for 16 h and desalted using Millipore C18 ZipTips or C18 spin columns (Nest Group) according to the manufacturer’s instructions.
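To put the spike-in amounts in perspective, the sketch below converts picomoles of BSA into mass and compares it with the lysate protein in each sample. It assumes a BSA molecular weight of about 66.4 kDa, a value not given in the text, and it computes the 50:1 trypsin amount from the lysate protein alone. The function and constant names are illustrative.

```python
BSA_MW_DA = 66_400          # assumption: approximate molecular weight of BSA
LYSATE_UG_PER_UL = 1.0      # 1 mg/mL is the same as 1 ug/uL
PROTEIN_TO_ENZYME = 50.0    # 50:1 protein/enzyme ratio from the protocol


def spike_summary(bsa_pmol, lysate_ul):
    """Mass of spiked BSA, its share of lysate protein, and trypsin needed."""
    lysate_ug = lysate_ul * LYSATE_UG_PER_UL
    bsa_ng = bsa_pmol * BSA_MW_DA / 1000.0  # pmol * g/mol gives pg; /1000 gives ng
    return {
        "bsa_ng": bsa_ng,
        "bsa_percent_of_lysate": 100.0 * bsa_ng / (lysate_ug * 1000.0),
        "trypsin_ug": lysate_ug / PROTEIN_TO_ENZYME,
    }


# Full-scan LC-MS series: amounts spiked into 10 uL of lysate.
for pmol in (3.0, 1.50, 0.80, 0.30, 0.15, 0.08, 0.03):
    s = spike_summary(pmol, lysate_ul=10.0)
    print(f"{pmol:>5} pmol -> {s['bsa_ng']:.1f} ng BSA "
          f"({s['bsa_percent_of_lysate']:.3f}% of lysate protein)")
```

Under these assumptions the highest spike is roughly 2% of the lysate protein by mass and the lowest is roughly 0.02%, which conveys how small the tagged fraction is relative to the background.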

LC–MS and Data Analysis

See Supporting Information.
