Literature DB >> 35613060

Improved Peptide Backbone Fragmentation Is the Primary Advantage of MS-Cleavable Crosslinkers.

Lars Kolbowski¹, Swantje Lenz¹, Lutz Fischer¹, Ludwig R Sinn¹, Francis J O'Reilly¹, Juri Rappsilber^1,2.

Abstract

Proteome-wide crosslinking mass spectrometry studies have coincided with the advent of mass spectrometry (MS)-cleavable crosslinkers that can reveal the individual masses of the two crosslinked peptides. However, recently, such studies have also been published with noncleavable crosslinkers, suggesting that MS-cleavability is not essential. We therefore examined in detail the advantages and disadvantages of using the commonly used MS-cleavable crosslinker, disuccinimidyl sulfoxide (DSSO). Indeed, DSSO gave rise to signature peptide fragments with a distinct mass difference (doublet) for nearly all identified crosslinked peptides. Surprisingly, we could show that it was not these peptide masses that proved the main advantage of MS cleavability of the crosslinker, but improved peptide backbone fragmentation which reduces the ambiguity of peptide identifications. This also holds true for another commonly used MS-cleavable crosslinker, DSBU. We show furthermore that the more intricate MS3-based data acquisition approaches lack sensitivity and specificity, causing them to be outperformed by the simpler and faster stepped higher-energy collisional dissociation (HCD) method. This understanding will guide future developments and applications of proteome-wide crosslinking mass spectrometry.

Entities: Chemical

Mesh：

Substances：

Year: 2022 PMID： 35613060 PMCID： PMC9178559 DOI： 10.1021/acs.analchem.1c05266

Source DB: PubMed Journal: Anal Chem ISSN： 0003-2700 Impact factor: 8.008

Introduction

Crosslinking combined with mass spectrometry (crosslinking MS) is a powerful tool for detecting protein–protein interactions and the structural characterization of proteins. Many key advances have been made in recent years to expand the complexity of the samples that can be analyzed with this technology. These include the database search software,[1−3] false discovery rate (FDR) estimation,[4] and the enrichment of crosslinked peptides.[5−7] One of the key problems when identifying crosslinked peptides is that one must, in principle, identify two peptides from the same MS1 signal. The search space is therefore initially very large, comprising every pairwise combination of the peptides that are in the database, i.e., (n2 + n)/2 crosslinked peptides (n = number of linear peptides in the database). This large search space can be reduced experimentally by separating the crosslinked peptides during the measurement by help of an MS-cleavable crosslinker such as disuccinimidyl sulfoxide (DSSO)[8] or any of its alternatives.[9] The conceptual advantage of MS-cleavable crosslinkers is evident. The crosslinker readily cleaves upon activation in the mass spectrometer, releasing the individual peptides and thereby enabling the measurement of their individual masses. In the case of the most popular MS-cleavable crosslinker DSSO, the crosslinker cleaves preferentially at two different sites, leading to different crosslinker remnants (also called stubs) for each peptide (Figure a). The asymmetric cleavage of this crosslinker produces a pair of alkene (A) and sulfenic acid (S) stub fragments.[8] The S stub fragment commonly loses water, forming the unsaturated thiol (T). The two most frequently observed stub peaks per peptide, the A and the T fragment, form a signature doublet signal with a distinct mass difference, allowing their detection and subsequent calculation of peptide masses.[10]

Figure 1

Statistics on frequency and intensity of peptide doublet peaks. (a) Illustration of DSSO cleavage and the resulting signature peptide doublets with the distinct mass difference Δm. Numbers annotate the intensity rank of the peaks, with the rank of the more intense of the doublet peaks being the rank of the whole doublet. (b) Ratio of identified target–target (TT) crosslink spectrum matches (CSMs) (self and heteromeric) that contain one (lighter color) or both (darker color) peptide doublets in each dataset (5% CSM-level FDR). Datasets using stepped HCD (sHCD) are shown in orange-red, while collision-induced dissociation (CID)-MS3-based methods are in blue. (c) Fraction of detected doublets from (b) passing each intensity rank cutoff. Shown is the cumulative proportion of CSMs containing doublets. Datasets are colored as in (b). Synapse (Syn); Ribosome (Ribo). Knowing the individual peptide masses simplifies the database search, as it reduces the search space to pairwise combinations of peptides with these masses. With the individual peptides released in the mass spectrometer, one can also design more intricate data acquisition approaches. The two peptides can be fragmented individually using MS3, which provides separate fragment information of the two—now linear—peptides. For this, generally the crosslinked peptide is fragmented with a low-energy CID fragmentation first, to preferentially cleave the crosslinker instead of the peptide backbone. Then, signature doublets are selected for MS3. This approach is routinely employed by studies that use the protein interaction reporter (PIR) crosslinker[7] and DSSO, while some others using DSSO supplement this with a complementary electron transfer dissociation (ETD) MS2 spectrum.[11] In an alternative acquisition method, stepped HCD (sHCD), only a single MS2 spectrum is recorded for each crosslinked peptide pair. The peptide is subjected to multiple different collision energies, and the fragments are recorded in a single MS2 spectrum. This spectrum should contain the signature doublet (from lower fragmentation energies) as well as additional backbone fragments (from higher fragmentation energies). These spectra can be searched in most crosslinking search tools, with optional filtering for spectra containing cleaved signature peaks during[2] or after[12] search. Despite the clean crosslinker cleavage producing dominant signature peaks in proof-of-concept data of either approach, there is a lack of statistical data of how often this happens in general. It is unclear how many crosslinked peptides give rise to doublets, how prominent these doublets are, and how successful doublet selection is at covering the peptides. It is therefore unknown how many crosslinked spectra are left unidentified when relying on these doublets. sHCD compared favorably to CID methods in the number of crosslinks identified,[13] but a methodical analysis comparing the information contained in their fragmentation spectra is missing and yet is crucial for future design of crosslinkers and acquisition methods. MS-cleavable crosslinkers have been the tool of choice in many proteome-wide crosslinking MS studies, and it has been suggested that large-scale crosslinking MS depends on MS-cleavable crosslinkers.[14] While conceptually appealing, these advantages and potential limitations of MS-cleavable crosslinkers have yet to be analyzed in detail in “real-world” scenarios—some comparisons exist, but usually only comparing a few crosslink spectrum matches (CSMs). We systematically investigated the influence of the popular MS-cleavable crosslinkers DSSO and DSBU on the fragmentation of crosslinked peptides. We achieve this using crosslinker search software that does not rely on the cleaved stubs for identification. This allowed us to clarify how widespread the cleavage of DSSO and DSBU actually is and to probe the gain of knowing the individual peptide masses for identifying crosslinks.

Methods

Database Search and FDR Filtering

Mass spectrometry raw data were processed using MSconvert[16] (v3.0.11729) to convert to mgf-file format. A linear peptide search using xiSEARCH was employed to determine median precursor and fragment mass errors. Peak list files were then recalibrated to account for mass shifts during measurement prior to analysis using xiSEARCH3 1.7.6.1 with the following settings: MS1 error tolerances of 3 ppm; MS2 error tolerance of 5 ppm for the Escherichia coli lysate dataset and 15 ppm for the others; up to two missing precursor isotope peaks; tryptic digestion specificity with up to two missed cleavages; modifications: carbamidomethylation (Cys, + 57.021464 Da) as fixed and oxidation (Met, + 15.994915 Da), deamidation (Asn and Gln, + 0.984016 Da), methylation (Glu and Asp, + 14.015650 Da), amidated crosslinker (Lys and protein N-terminus, DSSO–NH2: +175.03031 Da; BS3–NH2: 155.09463 Da; DSBU–NH2 + 213.11134 Da), and hydrolyzed crosslinker (Lys and protein N-terminus, DSSO–OH: +176.01433 Da; BS3–OH: +156.07864 Da; DSBU–OH + 214.095357 Da) as variable modifications; maximum number of variable modifications per peptide: 1; losses: −CH3SOH, −H2O, −NH3, and additionally masses for crosslinker-containing ions were defined accounting for its cleavability (DSSO A: 54.01056 Da, S: 103.99320 Da, T: 85.98264 Da; DSBU A: 85.05276 Da, B: 111.032028 Da). Crosslink sites for both reagents were allowed for side chains of Lys, Tyr, Ser, Thr, and the protein N-terminus. Note that we included a “noncovalent” crosslinker with a mass of zero to flag spectra potentially arising from gas-phase-associated peptides.[17] These spectra were removed prior to false discovery rate (FDR) estimation. Results were filtered prior to FDR to matches having a minimum of three matched fragments per peptide, a delta score of >15% of the match score and a peptide length of at least six amino acids. Additionally, identifications of peptide sequences that are found in two or more proteins were removed. FDR was estimated using xiFDR[18] (v2.1.2) on a unique CSM level to 5% grouped by self- and heteromeric matches. Results of the reanalysis are deposited in PRIDE with the accession number PXD032821.

Data Evaluation

CSMs passing FDR were re-annotated with pyXiAnnotator v0.3.4 (https://github.com/Rappsilber-Laboratory/pyXiAnnotator/) with peptide, b-, and y-type ions using MS2 tolerances as described above. The resulting matched fragments were used to check for the occurrence of DSSO A-T doublets and to calculate fragment sequence coverages. We calculated the sequence coverage for our CSMs conservatively, as the ratio of matched N-terminal and C-terminal sequence fragments to the number of theoretically possible sequence fragments (i.e., 100% sequence coverage would mean the detection of at least one fragment from the N-terminal and one from the C-terminal series between all amino acid residues of a peptide). For the doublet rank evaluation, the deisotoped ranks from pyXiAnnotator were used. To evaluate the MS3 triggering behavior, the MS3 precursor m/z was extracted from the scan header and compared with the fragment annotation result of the corresponding MS2 CSM. If the MS3 precursor matched a crosslinked peptide stub fragment with 20 ppm error tolerance, it was counted as correctly triggered.

Results and Discussion

Prevalence of Peptide Doublets in Fragmentation Spectra of DSSO Crosslinked Peptides

We analyzed three publicly available datasets of DSSO crosslinking experiments coming from three different labs, differing in acquisition method and sample complexity (Table ). The dataset of crosslinked E. coli lysate was acquired using sHCD with a low, medium, and high normalized collision energy for each MS2.[4] sHCD is also one of two acquisition methods used to record a dataset of crosslinked, purified 70S ribosomes.[13] In addition to this, Stieger et al. also employed a CID-MS2-HCD-MS3 approach. For this, first, a low-energy CID-MS2 was acquired. Then, MS3 was triggered when doublets of the correct mass difference (32 Da for A-T) were detected (Figure a). Finally, the third dataset called here “Synapse dataset” covered crosslinked mouse synaptosomes and was acquired with a CID-MS2-MS3 + ETD-MS2 approach.[15] As in the Ribosome dataset, a low-energy CID-MS2 was acquired for doublet detection. Then, MS3 was acquired as described above, supplemented by an additional ETD-MS2 on the same MS1 precursor.

Table 1

Overview of Analyzed Datasets

sample	crosslinker	acquisition method	variable modifications used in reanalysis	# MS2 spectra	PRIDE accession	ref
E. coli lysate	DSSO	stepped HCD (sHCD)	oxidation (M), methylation (D, E), deamidation (N, Q), BS3/DSSO −OH; −NH₂ (K, nterm)	3,469,500	PXD019120	(4)
E. coli lysate	BS3	stepped HCD (sHCD)		3,817,562	PXD019120	(4)
Mus musculus synaptosomes	DSSO	CID-MS2-MS3 + ETD-MS2	DSSO −OH; −NH₂ (K, nterm)	2,925,993	PXD010317	(15)
Mus musculus synaptosomes	DSSO	CID-MS2-MS3 + ETD-MS2	DSSO −OH; −NH₂ (K, nterm)	2,925,993	PXD015160	(15)
E. coli 70S ribosome	DSSO	stepped HCD (sHCD)	DSSO −OH; −NH₂ (K, nterm)	128,152	PXD011861	(13)
E. coli 70S ribosome	DSSO	CID-MS2-HCD-MS3	DSSO −OH; −NH₂ (K, nterm)	44,436	PXD011861	(13)
Drosophila melanogaster extract	DSBU	stepped HCD (sHCD)	DSBU −OH; −NH₂ (K, nterm)	3,390,787	PXD012546	(2)

To assess the prevalence of doublets in the fragmentation spectra of crosslinked peptides, we re-searched the datasets using a search algorithm that does not rely on peptide doublets for crosslink identification. After database search and filtering to 5% heteromeric (inter protein) CSM-level FDR,[18] we looked for signature A and T stub fragment doublet peaks of the identified peptides and the intensity rank of these doublets in each spectrum (Figure a). Even though we did not require doublets to identify crosslinked peptides, they were very common features in our CSMs. We found doublets frequently for at least one peptide, independent of dataset and acquisition method (90–98%) (Figure b). The same trend holds true also for DSBU (Figure S1). The CID acquisitions displayed a higher proportion of CSMs with both peptide doublets detected compared to the sHCD datasets. If one looks at only the common identifications of CID and HCD to make up for the difference in number of identifications, the amount of doublets detected for both peptides increases noticeably for sHCD (71%), making the difference to CID (81%) less pronounced (Figure b) as does considering only single stub peaks (Figure S3). We next looked at the intensity of the doublet peaks across these datasets, as this is important for their use during acquisition and data analysis (Figures c and S2 for DSBU). In the majority of the spectra, the more abundant doublet is among the most intense peaks, independent of the fragmentation method used. In fact, a doublet peak is frequently the most abundant peak (34–53% of the doublet-containing spectra). Almost all (94–98%) doublet-containing spectra have a peak of the more intense peptide doublet among the 20 most intense peaks (87% for DSBU). Spectra typically displayed in publication figures suggest that also the less intense doublet is seen prominently in CID spectra. However, this was only the case for 10% (Ribosome) or 20% (Synapse) of the doublet-containing CID spectra of our investigated data. Nevertheless, it is seen among the top 20 peaks in 78% (Ribosome) or 91% (Synapse) of the doublet-containing CID spectra. For the sHCD data, the doublet ranks are lower, yet still approximately 70% of spectra have them among the 20 most intense peaks (Figure c). In conclusion, the first doublet is among the most intense peaks for the majority of CSMs independent of the fragmentation method, albeit to a lesser extent for DSBU. While the second doublet increases confidence in doublet calling, only one peptide doublet is necessary for deriving both peptide masses, given that we know the precursor mass. The visibility of the second peptide doublet is crucial, however, for the successful selection of both peptides for MS3. We therefore investigated how successful selecting doublets from CID-MS2 spectra for MS3 was at covering one or both crosslinked peptides, and if this more complex approach produces more confident identifications than HCD-MS2.

Speed of HCD Outperforms Higher Sequence Coverage of CID + MS3

The ratio of identified doublets and their intensity ranks are important criteria for selecting peptides for MS3 fragmentation. However, absolute numbers of crosslink identifications may also be influenced by other aspects, such as backbone fragmentation and acquisition speed. We used the Ribosome dataset to compare these aspects, as it uses both methods on the same sample. Here, sHCD leads to 1.4 times more residue pairs identified than CID-MS3.[13] When comparing the common CSMs between CID and sHCD, the overall sequence coverage in sHCD is higher compared to low-energy CID (Figure a). This comes as no surprise, as low-energy CID is primarily applied to separate the crosslinked peptides and not for peptide backbone fragmentation. It is intentionally combined with MS3 scans and ETD fragmentation to provide additional sequence information. When we include the corresponding MS3 scans, the sequence coverage increases noticeably compared to that of low-energy CID alone. The overall coverage from combining fragments from CID and MS3 surpasses the sHCD coverage. Therefore, the backbone fragmentation does not explain the higher number of CSMs for sHCD.

Figure 2

Speed of HCD outperforms higher sequence coverage of CID + MS3 in the Ribosome dataset. (a) Sequence coverage of common CSMs (n = 776) identified in both sHCD and CID. Additionally, sequence coverage of CID spectra combined with their respective MS3 scans is shown. Sequence coverage differs between the two crosslinked peptides, which accordingly are defined as better or worse. Boxplots depict the median (middle line), upper and lower quartiles (boxes), and 1.5 times the interquartile range (whiskers). Asterisks indicate significance calculated by a two-sided Wilcoxon signed-rank (p-value > 0.05: n.s., p-value < 0.0001: ****). (b) Number of acquired MS scans per fragmentation method. Error bars show the 0.95 confidence interval (n = 7). (c) Number of triggered MS3 scans per MS2 scan, for CSMs, linear peptide spectrum matches, and crosslinker-modified linear peptide spectrum matches, respectively. Error bars show the 0.95 confidence interval. (d) Proportion of common CID CSMs having no doublets, only one, or both peptide doublets correctly triggered for MS3. MS3 acquisition schemes require multiple scan and fragmentation events, while sHCD only acquires a single MS2 scan. This difference in complexity and, more importantly, acquisition speed is reflected in the number of total MS2 scans acquired, which on average is almost 3 times lower for the CID-MS3 method because a lot of acquisition time is spent on acquiring the additional MS3 scans (Figure b). The drastically lower sampling of precursors for fragmentation will consequently lead to the reduced detection of crosslinked peptides, which subsequently results in a lower number of crosslink identifications. This is exacerbated by many MS3 spectra being acquired for crosslinker-modified and even for unmodified linear peptides (Figure c). Despite this excessive MS3 triggering, for only 41% of the CSMs, MS3 was triggered correctly on both peptide doublets (Figure d). This is also reflected in the wider spread of sequence coverage for the worse fragmented peptide (Figure a), which is crucial for the unambiguous identification of both linked peptides.[19] Note also that for this peptide, the sequence coverage is not significantly increased in CID + MS3 over sHCD. In this dataset, the speed of sHCD compensates for its slightly lower sequence coverage. sHCD also shows a more symmetric fragmentation of both peptides (Figure a), as the MS3 approach is limited by its dependency on triggering on the correct doublets (Figure d). Further development of MS3 approaches should focus on a more sensitive and selective MS3 selection, which in part is governed by the yield of the crosslinker cleavage.

Peptide Doublets for Quality Control

While some database search algorithms have been built around inference of peptide masses from doublets, others have been built without relying on them. Unarguably, peptide masses are useful information. In an attempt to quantify their value, we investigated the target–decoy CSMs (as a representation of the random matches) for the occurrence of peptide doublets. Because heteromeric CSMs are the focus of most biological research questions and are also more challenging to identify, we focused on those for the analysis. A substantial fraction of random matches has matching peptide doublets (>47% of heteromeric target–decoy CSMs, Figure a). However, their extent varies considerably between the datasets. The highest proportion of doublets among target–decoy CSMs is found in the Ribosome dataset (75 or 70% for sHCD and CID, respectively). The E. coli dataset contains at least one doublet in 66% of the target–decoy CSMs, while this proportion decreases to 47% for the Synapse dataset. The amount of identified doublets present in target–decoy matches seems less dependent on the acquisition method, and more on the sample and database.

Figure 3

Peptide doublets as a quality control metric for heteromeric identifications. (a) Percentage of heteromeric target–decoy CSMs that contain one or two peptide doublets across datasets at 5% CSM-level FDR. (b) Proportion of heteromeric CSMs at 5% CSM-level FDR when filtering spectra to contain a peptide doublet compared to unfiltered data. (c) Score distribution of heteromeric matches in the E. coli dataset. Distribution of targets and target–decoy matches with and without filtering for peptide doublets. Dashed lines show the resulting score cutoffs at 5% FDR. (d) FDR (interpolated values for visualization) of unfiltered and peptide doublet filtered E. coli data. Synapse (Syn); Ribosome (Ribo). Although heteromeric target–decoy CSMs contain peptide doublets, they do so less often than the heteromeric target–target matches (21–49 percentage points less for a single doublet and 33–50 percentage points less for both doublets, Figure S4). Based on this difference, we investigated the effect of using this metric as a quality filter. We prefiltered the search results to those spectra that contain at least one peptide with a detected doublet and then reestimated 5% CSM-level FDR. The gains using this approach are very much dependent on the complexity of the dataset (Figure b). Unsurprisingly, the Synapse dataset, which had the least target-decoys containing a matching doublet, shows the largest gains using this approach (19%). However, the E. coli dataset only gains 5% in heteromeric CSMs, even though there is a large difference in the proportion of peptide doublets between target and false matches (97% vs 66%; Figures S4 and 3a). This led us to investigate the score distribution of doublet-containing matches in more detail (Figure c). The vast majority of high-scoring target–target CSMs contain at least one doublet and are therefore not removed, while targets without a matched peptide doublet tend to have lower scores. In this lower-scoring region, there is a steep increase in target–decoy matches, which is only slightly reduced by pre-filtering for a doublet. The effect becomes more apparent when looking at the FDR at different score thresholds. While the increase in error is not as steep for the filtered matches as for the unfiltered, it still grows exponentially (Figure d). This holds true also for the Ribosome datasets and to a lesser extent for the Synapse dataset (Figures S5–S7). The moderate gains of using doublets for post-search filtering also suggest that using them during the search will offer only moderate gains. Presumably, spectra of high quality, which contain doublets, also tend to contain sufficient peptide fragment peaks so that identification is possible without relying on peptide mass information.

Comparison of a Cleavable to a Noncleavable Crosslinker

Noncleavable crosslinkers are widely believed to be unsuitable for complex samples.[14,20,21] This bases on an assumption: not knowing the individual peptide masses for a crosslinked peptide would require an exhaustive combination of all peptides in the database. This would lead to an explosion of the search space. However, there are multiple large-scale studies that have successfully employed a noncleavable crosslinker despite these assumptions.[4,12,22,23] These are based on a detailed understanding of how crosslinked peptides fragment[24] that offered a computational solution to knowing the individual peptide masses which was then implemented in the search algorithm xiSEARCH3. In light of the successful usages of both types of crosslinkers, we decided to compare their spectral information to understand any costs and benefits. In addition to DSSO, the published E. coli dataset also contains data from the noncleavable crosslinker BS3. As the data for both crosslinkers were prepared and acquired in a very comparable manner, this dataset offers an opportunity to directly compare the effects of BS3 to DSSO on a complex mixture analysis. Importantly, because of its size and the high number of CSMs identified, the dataset is well suited for statistical evaluation. A manual side-by-side comparison of CSMs identified in both datasets suggests DSSO to have richer spectra with more fragments. Especially, fragments containing the crosslinking site appear to be more present, mostly as fragments containing an A/S/T stub of DSSO (Figures a and S8). We then performed a statistical evaluation of this observation over common CSMs of the two crosslinkers (Figure b). This confirmed that DSSO led indeed to a significantly higher sequence coverage than BS3. While the coverage of linear fragments is very similar between the two crosslinkers, the coverage of link site-containing fragments is significantly higher for DSSO, especially for the worse fragmenting peptide (Figure S9). Link site-containing fragments contain the full second peptide (+P) or, additionally for cleavable crosslinkers, just a cleaved crosslinker stub. Indeed, A/S/T stub fragments are the major source of link site-containing fragments for DSSO, while +P coverage is lower than that of BS3. This means that the increased sequence coverage for DSSO stems exclusively from cleaved crosslinker fragments.

Figure 4

Comparison of noncleavable crosslinker BS3 to the MS-cleavable crosslinker DSSO. (a) Example MS2 spectrum of a high-scoring CSM identified in both datasets. (Top) CSM from the BS3 dataset. (Bottom) The same peptide m/z-species identified in the DSSO dataset. Unique fragments are highlighted in bold. (b) Sequence coverage of all, linear and link site-containing fragments of all common CSMs (n = 12919). For DSSO, link site-containing fragments are additionally separated into fragments containing the full second peptide (+P) or only the cleaved crosslinker stub (A/S/T). Boxplots depict the median (middle line), upper and lower quartiles (boxes), and 1.5 times the interquartile range (whiskers). Asterisks indicate significance calculated by a two-sided Wilcoxon signed rank (p-value < 0.0001: ****). (c) Target–target and target–decoy score distributions of heteromeric CSMs for BS3 and DSSO. Scores were normalized to their respective score cutoff at 10% FDR. (d) Number of heteromeric CSMs passing 5% CSM-level FDR for BS3 and DSSO. As a control, DSSO was additionally searched as a noncleavable crosslinker and also filtered for the presence of peptide doublets. The better sequence coverage of DSSO-linked peptides improves the separation of true from false CSMs (Figure c). For heteromeric matches, DSSO has a larger area under the curve, and especially more high-scoring targets, effectively leading to an increase in heteromeric CSMs. While for BS3 3308 heteromeric CSMs were identified, the DSSO dataset resulted in more than twice as many (7316, +121%) (Figure d). For self-CSMs, only 29% more CSMs were identified with DSSO than with BS3 (Figure S10), indicating that self-CSMs are approaching exhaustive coverage at the given experimental detection limit. Similar results were seen when including retention time data of heteromeric and self-CSMs.[22] To investigate the effect of the cleaved crosslinker fragments on the overall crosslink search performance, we performed another search in which the DSSO crosslinker was treated as noncleavable. In this search, only 1866 heteromeric CSMs were identified (−74%). Filtering these results for doublet-containing results, as described before, increased identifications to 3064. This is, however, still a loss of 58% of CSMs compared to the search considering DSSO as cleavable. Also in other datasets, crosslinker cleavability was central to search success (Figure S11). Collectively, these observations demonstrate that A/S/T stub fragments play a central role in the success of DSSO for crosslinking mass spectrometry, especially for more complex samples.

Conclusions

Our work finds a surprisingly limited value of doublet information stemming from crosslinker cleavage for the identification of crosslinks. Nonetheless, we find cleavable crosslinkers to lead to the identification of substantially more heteromeric CSMs. We pinpoint improved sequence coverage as the major contributor to this. This has implications for how to conduct crosslinking studies and the future development of the methodology. First, as many suspected but possibly not for the right reasons, cleavable crosslinkers are preferable for crosslink mixture analyses. Second, sHCD is the recommended acquisition method as it achieves almost the same sequence coverage as CID-MS3, but is much faster. CID-MS3 currently lacks speed, specificity, and sensitivity. Consequently, future developments of crosslinkers and acquisition methods should focus primarily on sequence information, without compromising acquisition speed. Current choices governing acquisition schemes rely on experimental comparisons, to which we add a methodological understanding of the key parameters that govern crosslink identification. With this, we hope to pave the way for simplified, cost-effective, and standardized workflows that a wider number of labs can use.

23 in total

1. Proteome-wide profiling of protein assemblies by cross-linking mass spectrometry.

Authors: Fan Liu; Dirk T S Rijkers; Harm Post; Albert J R Heck
Journal: Nat Methods Date: 2015-09-28 Impact factor: 28.547

Review 2. Divide and conquer: cleavable cross-linkers to study protein conformation and protein-protein interactions.

Authors: Andrea Sinz
Journal: Anal Bioanal Chem Date: 2016-10-12 Impact factor: 4.142

Review 3. Anatomy of a crosslinker.

Authors: Adam Belsom; Juri Rappsilber
Journal: Curr Opin Chem Biol Date: 2020-08-21 Impact factor: 8.822

Review 4. Cross-Linking Mass Spectrometry: An Emerging Technology for Interactomics and Structural Biology.

Authors: Clinton Yu; Lan Huang
Journal: Anal Chem Date: 2017-11-21 Impact factor: 6.986

Review 5. To Cleave or Not To Cleave in XL-MS?

Authors: B Steigenberger; P Albanese; A J R Heck; R A Scheltema
Journal: J Am Soc Mass Spectrom Date: 2019-12-27 Impact factor: 3.109

6. Quirks of Error Estimation in Cross-Linking/Mass Spectrometry.

Authors: Lutz Fischer; Juri Rappsilber
Journal: Anal Chem Date: 2017-03-24 Impact factor: 6.986

7. PhoX: An IMAC-Enrichable Cross-Linking Reagent.

Authors: Barbara Steigenberger; Roland J Pieters; Albert J R Heck; Richard A Scheltema
Journal: ACS Cent Sci Date: 2019-08-19 Impact factor: 14.553

8. Noncovalently Associated Peptides Observed during Liquid Chromatography-Mass Spectrometry and Their Effect on Cross-Link Analyses.

Authors: Sven H Giese; Adam Belsom; Ludwig Sinn; Lutz Fischer; Juri Rappsilber
Journal: Anal Chem Date: 2019-02-01 Impact factor: 6.986

Review 9. Cleavable Cross-Linkers and Mass Spectrometry for the Ultimate Task of Profiling Protein-Protein Interaction Networks in Vivo.

Authors: Manuel Matzinger; Karl Mechtler
Journal: J Proteome Res Date: 2020-11-05 Impact factor: 4.466

10. In-cell architecture of an actively transcribing-translating expressome.

Authors: Francis J O'Reilly; Liang Xue; Andrea Graziadei; Ludwig Sinn; Swantje Lenz; Dimitry Tegunov; Cedric Blötz; Neil Singh; Wim J H Hagen; Patrick Cramer; Jörg Stülke; Julia Mahamid; Juri Rappsilber
Journal: Science Date: 2020-07-31 Impact factor: 47.728