A Sensitive and Controlled Data-Independent Acquisition Method for Proteomic Analysis of Cell Therapies.

Camille Lombard-Banek^1,2, Kerstin I Pohl³, Edward J Kwee¹, John T Elliott¹, John E Schiel^1,2.

Abstract

Mass spectrometry (MS)-based proteomic measurements are uniquely poised to impact the development of cell and gene therapies. With the adoption of rigorous instrumental performance qualifications (PQs), large-scale proteomics can move from a research to a manufacturing control tool. Especially suited, data-independent acquisition (DIA) approaches have distinctive qualities to extend multiattribute method (MAM) principles to characterize the proteome of cell therapies. Here, we describe the development of a DIA method for the sensitive identification and quantification of proteins on a Q-TOF instrument. Using the improved acquisition parameters, we defined a control strategy and highlighted some metrics to improve the reproducibility of SWATH acquisition-based proteomic measurements. Finally, we applied the method to analyze the proteome of Jurkat cells that here serves as a model for human T-cells. Raw and processed data were deposited in PRIDE (PXD029780).

Entities: Chemical

Keywords: SWATH acquisition; biopharmaceutical; bottom-up proteomics; cell therapies; data-independent acquisition; mass spectrometry; performance qualification (PQ); quality control (QC)

Substances:
Proteome

Year: 2022 PMID: 35404046 PMCID: PMC9087334 DOI: 10.1021/acs.jproteome.1c00887

Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 5.370

Introduction

Complex therapies, whereby viruses or whole cells act as the drug, are emergent new treatments requiring new characterization strategies. For example, chimeric antigen receptor T-cells (CAR-Ts) are modified patient T-cells that utilize the existing biological properties of these immune cells to target and kill cancer cells. CAR-Ts are obtained by engineering the patient’s T-cells to express a receptor on their surface—the CAR—specific to the surface receptors on the targeted malignancy. The genomic information for the CAR protein is incorporated in the cell via transduction with a viral vector (retrovirus or lentivirus).[1] A series of complex cell sorting, activation, and expansion steps are required before reintroducing the transduced CAR-T cell product back into the patient. Details on the manufacturing of CAR-T cells are available in recent reviews.[1,2] Current state-of-the-art analyses of CAR-T drug products rely on the measurements of a few select proteins.[3−5] For example, fluorescence-activated-cell sorting (FACS) measures T-cell population purity and CAR expression using fluorescently labeled antibodies against T-cell surface markers (CD4 and/or CD8) or the CAR, respectively. The drug’s pharmacological activity is assessed via activation using beads decorated with the tumor surface receptor followed by cytokine-release assays.[3,4,6] Cytokine-release assays measure signaling proteins (interleukin, interferon, and growth factors) released by the CAR-Ts. Although useful, these assays only measure a limited number of quality attributes of the raw material (T-cells) or the product (CAR-Ts). 
Identifying attributes that better predict the quality of the product could better position these drugs from a last resort to the second or first line of treatments, which requires better characterization of the manufacturing process and the final product.[4,7] MS enables the characterization of a large number of proteins in a single label-free (i.e., no antibodies) experiment.[8,9] We have recently reviewed the potential benefits of MS-based proteomics in addressing the challenges to characterize cell therapies.[8] Two main data acquisition strategies exist to measure proteins in an untargeted fashion (also referred to as shotgun approaches): data-dependent and data-independent acquisitions (DDA and DIA, respectively). In DDA, peptide ions are selected for fragmentation using a narrow isolation window following a top-N scheme.[10,11] In a top-N scheme, the N most abundant precursor ions are selected for fragmentation per instrument cycle time. The fundamental nature of DDA renders the identification of the same peptide/proteins in replicate runs stochastic. Therefore, label-free quantification using DDA often leads to missing peptide information across replicates, referred to as missing values, which decreases quantitative coverage and statistical power. Strategies such as multiplexing with isobaric tags partially remediate the issue, albeit at a high cost.[11,12] Conversely, in DIA, broad isolation windows scanning across the entire m/z range enable the fragmentation of all peptide ions regardless of their intensity, leading to a significant decrease in missing values.[3,13,14] Mirroring the current multi-attribute-method (MAM) employed for single protein-drug molecules,[15−17] DIA can quantify multiple proteins with high precision for more complex biopharmaceutical systems like cell therapies. In MAM, a preliminary run is performed in DDA mode to determine the list of peptide identities and their respective retention times. 
Consecutive runs are then performed and compared in MS-only mode (no fragmentation) to quantify peptides of interest.[15] Similarly, in DIA, DDA is performed first to build a list of peptide-query-parameters (PQPs) used to extract peptide and protein identities from raw DIA. PQPs encompass fragment ion (transition) lists for each identified peptide and their respective retention times.[18] Proteins are quantified using the sum of the integrated area under the extracted chromatogram curve of the peptide fragment ions. During MS-based proteomic analysis, multiple factors that have been summarized elsewhere[19−21] contribute to the technical variability of the measurements. NanoLC-MS instruments contribute in large part to the measurement variability. The National Institute of Standards and Technology (NIST) and the National Cancer Institute (NCI) have established 46 metrics to evaluate the performance of LC-MS systems, called Mass Spectrometry Quality Control (MSQC). These 46 metrics correspond to 6 categories critical to LC-MS measurements: chromatography, dynamic sampling, ion source, MS1 signal, MS2 signal, and peptide identification.[20] Several informatics tools have been developed to monitor instrument performances.[21−24] Now, technical variability in large-scale bottom-up proteomics by DIA can be assessed using these principles and experimental design borrowed from MAM and/or clinical proteomics.[8] MS-based proteomics has proven beneficial to shed light on the mechanism of action of CAR-Ts[25−27] and is poised to identify additional process and/or product quality attributes by monitoring cell health at critical stages of the manufacturing process. Expansion of large-scale MS-based proteomics from the research setting to the process development requires stringent performance qualifications (PQs) to be fit-for-purpose. 
Here, we describe the development of a sensitive and controlled MS-based proteomic acquisition method on a quadrupole time-of-flight system using DIA toward the analysis of CAR-T cell therapies. We first established the conditions that provided the highest sensitivity and reproducibility, including the separation, data-dependent acquisition for PQP library building, and DIA for quantification. Then, we provide guidelines and metrics to extend DIA from a research setting to the biopharma space using the MSQC principles. Finally, we applied the strategy to the measurement of a Jurkat cell digest. Jurkat cells are immortalized lymphoblastic T-cells and are being used to produce a CAR-T mimetic to be employed as a method development tool and system suitability test.[28]

Experimental Section

Reagents

Reagents were purchased at reagent grade or higher. Standard K562 protein digests were from SCIEX (Framingham, MA) or Promega (Madison, WI). PepCalMix, containing 20 heavy labeled peptides, was from SCIEX. Dithiothreitol (DTT, #39255) and iodoacetamide (IAA, #39271) were procured in no-weigh format from Thermo Fisher Scientific (Waltham, MA). MS-grade trypsin/Lys-C protease mix was from Thermo Fisher Scientific (#A41007). Solvents for liquid chromatography (LC) mass spectrometry (MS) measurements were purchased at LC-MS grade from Honeywell (Charlotte, NC).

Preparation of Peptide and Protein Digest Standards

Commercial PepCalMix (SCIEX), containing 20 heavy labeled peptides, was at a concentration of 1 pmol/μL (stock solution). Aliquots of 10 μL each were stored at −80 °C until further use. For nanoLC-MS measurements, 1 μL of the PepCalMix aliquot was diluted in 99 μL of 5% v/v acetic acid in 10% v/v acetonitrile containing water (final peptide concentration: 10 fmol/μL). Commercial K562 digests were reconstituted to 2 μg/μL in 0.1% v/v formic acid in water and stored at −80 °C in 10 μL aliquots. Prior to nanoLC-MS measurements, 9 μL of 0.1% v/v formic acid in 2% v/v acetonitrile containing water and 1 μL of the stock PepCalMix solution (1 pmol/μL) were added to the 10 μL K562 digest aliquot. The final K562 peptide concentration was 1 μg/μL. To assess the sensitivity of our acquisition method, we built calibration curves using the PepCalMix. Different amounts of PepCalMix were spiked into a 0.5 μg/μL K562 digest solution. A total of 6 dilutions were prepared with the following final PepCalMix concentrations: 0.01, 0.1, 1, 10, 50, and 100 nmol/L.
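The dilution arithmetic in this section can be checked with a few lines of Python. This is a minimal sketch, not part of the published workflow; the helper name `dilute` is hypothetical, and the only assumption is that volumes are additive on mixing.

```python
def dilute(stock_conc: float, stock_vol: float, added_vol: float) -> float:
    """Concentration after mixing stock with other liquid (C1*V1 = C2*V2).

    Assumes volumes are additive; units of the result follow stock_conc.
    """
    return stock_conc * stock_vol / (stock_vol + added_vol)


# PepCalMix alone: 1 uL of 1 pmol/uL (= 1000 fmol/uL) stock into 99 uL diluent.
pepcal_fmol_per_ul = dilute(1000.0, 1.0, 99.0)

# K562 sample: 10 uL of 2 ug/uL digest + 9 uL solvent + 1 uL PepCalMix stock.
k562_ug_per_ul = dilute(2.0, 10.0, 9.0 + 1.0)

# PepCalMix carried into that same 20 uL K562 mixture, in fmol/uL.
pepcal_in_k562_fmol_per_ul = dilute(1000.0, 1.0, 10.0 + 9.0)

print(f"PepCalMix working solution: {pepcal_fmol_per_ul:g} fmol/uL")
print(f"K562 digest in final mix:   {k562_ug_per_ul:g} ug/uL")
print(f"PepCalMix in K562 mix:      {pepcal_in_k562_fmol_per_ul:g} fmol/uL")
```

The first two values reproduce the 10 fmol/μL and 1 μg/μL stated above; the third is implied by the stated volumes rather than reported in the text.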

Jurkat Cell Culture and Preparation for Proteomic Analysis

Jurkat cells (ATCC) were cultured in T-75 flasks using RPMI-1640 media (ATCC) supplemented with 10% heat-inactivated fetal bovine serum (Gibco). Cells were passaged to maintain a cell density between 2 × 10⁵ and 2 × 10⁶ cells/mL. The desired number of cells was counted using a Multisizer 3 Coulter Counter (Beckman Coulter, Sykesville, MD) and aliquoted into Protein LoBind Tubes (Eppendorf). Cells were washed three times with Dulbecco’s phosphate buffered saline without calcium and magnesium (Gibco), centrifuging at 200g between washes. Cells were frozen at −20 °C. Jurkat cell digests were obtained following the manufacturer-recommended S-TRAP (Protifi, Farmingdale, NY) protocol. Briefly, 5 × 10⁶ cells were lysed with 50 μL of lysis solution provided in the S-TRAP mini kit. Cysteine residues were reduced (50 mmol/L DTT, 20 min, 75 °C, 1000 rpm) and alkylated (150 mmol/L IAA, 20 min, RT). The protein extract was acidified with 5 μL of 12% v/v phosphoric acid. Then, 350 μL of 90% v/v methanol in 100 mmol/L triethylammonium bicarbonate (TEAB) was added to the protein solution, which was then loaded onto the S-TRAP column. Proteins were digested with 7.5 μg of trypsin/Lys-C in 100 mmol/L TEAB for 1.5 h at 47 °C. Peptides were recovered by centrifugation at 1000g for 1 min and successive addition of 0.2% v/v formic acid in water and 0.2% v/v formic acid in 80% v/v acetonitrile in water. Finally, the sample was dried to completeness in a vacuum concentrator (Labconco, Kansas City, MO).

Peptide Separation by NanoLC

Peptide separation was performed using an Eksigent NanoLC 425 system (SCIEX, Framingham, MA) in a nanoflow setting with trap and elute configuration. Peptide samples were loaded onto a C18 trap column (ChromXP C18, 3 μm and 120 Å, Eksigent, 350 μm × 0.5 mm, Part #5016752) using an isocratic delivery of 100% solvent A (water containing 0.1% v/v formic acid) at a flow rate of 2 μL/min for 5 min. Then, peptides were separated on a nanoLC column. The organic solution (solvent B) was composed of acetonitrile with 0.1% v/v formic acid. The different columns tested, their properties, and the gradient conditions used for the K562 samples are recapitulated in Table S1. Once the ideal column condition was established for our experiment, peptides were separated on a nanoAcquity nanoLC column (Waters, Milford, MA, 75 μm × 250 mm) packed with BEH-C18 (1.7 μm, 300 Å) at a flow rate of 200 nL/min. PepCalMix samples were separated using the following 13 min gradient starting at 2% solvent B: 12 min 40% solvent B; 13 min 80% solvent B; 17 min 2% solvent B. The different elution gradients for the K562 digest sample are reported in Table S2. Peptide samples prepared to test the instrument/method sensitivity and the Jurkat digest samples were separated using the 90 min gradient listed in Table S2.

Data-Dependent Acquisition Mass Spectrometry

Eluting peptides were ionized using an OptiFlow Turbo V ion source (nanoESI) and mass analyzed and detected in the positive ion mode by a fast-scanning quadrupole time-of-flight instrument (TripleTOF 6600+ system, SCIEX, Framingham, MA). The initial acquisition method was set as a “top 30” experiment. TOF-MS scans were acquired with an accumulation time of 250 ms over the m/z range 400 to 1250. Precursor ions were selected for fragmentation with a 0.7 amu isolation width and fragmented by CID with nitrogen using the manufacturer-optimized rolling collision energy. MS/MS spectra were collected in high-sensitivity mode for ions with charges between 2 and 5 and counts per second above 150, with the following settings: accumulation time, 50 ms; m/z range, 100 to 1500; dynamic exclusion on for 7 s for columns #1 and #2 and for 5 s for columns #3, #4, and #5. Any changes applied for DDA method development purposes are summarized in Table S3.
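The accumulation times above fix the longest possible DDA cycle. The following Python sketch estimates it under a simplifying assumption that is ours, not the vendor's: the cycle is the sum of the TOF-MS accumulation and N full MS/MS accumulations, with inter-scan overhead ignored. The function names are hypothetical.

```python
def dda_cycle_time_ms(top_n: int, ms1_accum_ms: int, ms2_accum_ms: int) -> int:
    """Worst-case cycle time when all top-N slots are used (overhead ignored)."""
    return ms1_accum_ms + top_n * ms2_accum_ms


def max_top_n(cycle_ms: int, ms1_accum_ms: int, ms2_accum_ms: int) -> int:
    """Largest N that fits a fixed cycle time for the given accumulation times."""
    return (cycle_ms - ms1_accum_ms) // ms2_accum_ms


# Initial "top 30" method: 250 ms TOF-MS scan, 50 ms per MS/MS scan.
initial_cycle = dda_cycle_time_ms(top_n=30, ms1_accum_ms=250, ms2_accum_ms=50)
print(f"initial DDA cycle: {initial_cycle / 1000:.2f} s")

# Holding that cycle constant, a shorter TOF-MS scan frees room for more precursors.
# The 100 ms value is an illustrative input, not a setting taken from Table S3.
print("top-N at 100 ms TOF-MS:", max_top_n(initial_cycle, 100, 50))
```

This kind of trade-off is what allows the number of selected precursors and the TOF-MS accumulation time to be adjusted together at constant cycle time.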

Data Independent Acquisition (SWATH Acquisition)

The same ionization conditions and instrument were used as for the DDA methods. Initially, TOF-MS scans were acquired with the following parameters: accumulation time, 50 ms; m/z range, 400 to 1250. Consecutive SWATH acquisition scans were acquired with the following conditions: accumulation time, 50 ms; m/z range, 100 to 1500; SWATH acquisition window, 20 amu with 1 amu overlap. For method evaluation purposes, the isolation window schemes and the mass ranges were changed and are reported in Table S4 and Table S6.
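A fixed-width window scheme of this kind can be sketched in Python as follows. The layout convention is an assumption made for illustration: each window is 20 amu wide and starts 19 amu after the previous one, so neighbors share 1 amu, and the last window is clipped at the upper m/z limit. The instrument software may place the overlap differently, so the resulting count should not be read as the published method. The function name `fixed_windows` is hypothetical.

```python
def fixed_windows(mz_start: float, mz_end: float, width: float, overlap: float):
    """Return (low, high) isolation windows tiling [mz_start, mz_end].

    Assumed convention: consecutive windows are offset by (width - overlap).
    """
    step = width - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the window width")
    windows = []
    low = mz_start
    while True:
        high = min(low + width, mz_end)  # clip the final window
        windows.append((low, high))
        if high >= mz_end:
            break
        low += step
    return windows


windows = fixed_windows(400.0, 1250.0, width=20.0, overlap=1.0)
print(len(windows), "windows; first:", windows[0], "last:", windows[-1])
```

Narrowing the precursor m/z range passed to such a function directly reduces the number of windows, which is the mechanism by which a tighter scan range shortens the DIA cycle.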

Database Search and Protein Identifications

Raw mass spectra acquired by DDA were processed for protein identification in ProteinPilot software running the Paragon search engine[29,30] (v5.0.2.0, SCIEX). MS/MS spectra were searched against the SwissProt canonical human database (containing 20,396 entries). The search was performed in “Thorough ID” mode, which automatically adjusts the mass tolerance to the resolution of the MS and MS/MS acquisitions. Carbamidomethylation of cysteines, trypsin for digestion, and TripleTOF 6600+ system were set as search defaults. “Thorough ID” mode allows all possible variable modifications and up to 2 missed cleavages per peptide. Proteins and peptides are reported at a 1% false discovery rate (FDR). Proteins are reported with a minimum of 1 peptide identification above 95% confidence. The lists of proteins identified by DDA for each condition are available in Table S5.

Protein Quantification

Peak extractions from the DIA experiments were performed in PeakView software 2.0 with the SWATH acquisition microapp (SCIEX) using DDA-generated PQP libraries. The PQPs were generated for nonredundant and unmodified peptides identified from a combined search of K562 and Jurkat samples (total of 26 DDA MS files) in ProteinPilot software. The following criteria were used for MS/MS peak extraction and protein quantification: 6 transitions/peptide; 10 min retention time tolerance; 75 ppm mass tolerance; peptide identification scoring less than 1% FDR; up to 6 peptides/protein. Retention times were aligned using the spiked-in PepCalMix. The PeakView software only allows for setting 1 transition per peptide and 1 peptide per protein. Extracted peptides were filtered for 1% FDR. Quantified proteins were filtered to contain a quantitative value across all three replicates. The lists of proteins quantified by SWATH acquisition are available in Table S6.
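The roll-up and completeness filter described here can be illustrated in Python. This is a sketch with made-up toy areas, not the PeakView implementation: it sums fragment-ion peak areas per protein and replicate, then keeps only proteins with a value in every replicate. The transition and peptide caps listed above are not modeled, and the function names are hypothetical.

```python
from collections import defaultdict

# Toy rows: (protein, peptide, fragment ion, replicate, integrated peak area).
rows = [
    ("P1", "pepA", "y4", 1, 100.0),
    ("P1", "pepA", "y5", 1, 50.0),
    ("P1", "pepA", "y4", 2, 110.0),
    ("P1", "pepA", "y4", 3, 90.0),
    ("P2", "pepB", "y3", 1, 20.0),
    ("P2", "pepB", "y3", 2, 25.0),  # P2 has no value in replicate 3
]


def protein_areas(rows):
    """Sum fragment-ion areas per protein and replicate."""
    totals = defaultdict(lambda: defaultdict(float))
    for protein, _peptide, _fragment, replicate, area in rows:
        totals[protein][replicate] += area
    return {protein: dict(by_rep) for protein, by_rep in totals.items()}


def complete_cases(totals, replicates):
    """Keep proteins that have a quantitative value in every replicate."""
    return {
        protein: by_rep
        for protein, by_rep in totals.items()
        if all(rep in by_rep for rep in replicates)
    }


totals = protein_areas(rows)
quantified = complete_cases(totals, replicates=(1, 2, 3))
print(sorted(quantified))
```

Requiring a value in all three replicates makes the reported protein counts conservative, since a single missing extraction removes the protein from the list.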

Statistical Data Analysis

Data analysis and parsing were performed using custom scripts written in R and run in RStudio. All measurements were performed in technical triplicates to calculate the means and the relative standard deviations (RSDs).
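For reference, the RSD of a set of replicate measurements is the sample standard deviation divided by the mean, expressed in percent. The short Python sketch below shows the calculation on toy values; it is not the authors' R script, and the function name `rsd_percent` is hypothetical.

```python
from statistics import mean, median, stdev


def rsd_percent(values):
    """Relative standard deviation (%), using the sample standard deviation."""
    avg = mean(values)
    if avg == 0:
        raise ValueError("RSD is undefined for a zero mean")
    return 100.0 * stdev(values) / avg


# Toy peak areas for three proteins measured in technical triplicate.
triplicates = {
    "protein_1": [100.0, 110.0, 90.0],
    "protein_2": [2000.0, 2040.0, 1960.0],
    "protein_3": [50.0, 65.0, 35.0],
}

rsds = {name: rsd_percent(areas) for name, areas in triplicates.items()}
for name, value in rsds.items():
    print(f"{name}: RSD = {value:.1f}%")
print(f"median RSD = {median(rsds.values()):.1f}%")
```

The median of the per-protein RSDs is the summary statistic used when comparing columns and acquisition conditions.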

Data and Script Sharing

Scripts used for the analyses of data are made available on GitHub (https://github.com/Lombardbanekc/CART-SWATH-MS-Data-Processing.git). RAW data and search results files for DDA and DIA experiments have been deposited to the ProteomeXchange server (PXD029780).

Results and Discussion

Designing a Sensitive, Reproducible, and Controlled Method to Quantify Proteins from Cells

Multiple components of the data acquisition workflow were evaluated to enable a sensitive and robust quantification of the proteome of cell-based therapies and to provide suggestions for best practices on implementing instrument controls to expand the application of DIA to the biopharma space. Figure 1 summarizes our approach. First, we revised the chromatography by evaluating a total of five reversed-phase (C18) commercial columns from three different vendors (see Table 1), and we then evaluated different tuning parameters for DDA and DIA (Table S3 and Table S4). Although our main application is to perform DIA for quantification, DDA evaluation is still critical as it is used to build the list of PQPs necessary to extract peptide signals and identify proteins from DIA raw files.[18]

Figure 1

Overview of the analytical measurement parameters that were evaluated to build a sensitive method and draw our instrument performance qualifications (PQ) metrics.

Table 1

Summary of the Properties of the Five Columns Tested and the Corresponding Number of Identifications by DDA from 1 μg Injections of K562 Commercial Digest^a

column #	vendor	length (cm)	particle size (μm)	pore size (Å)	# peptides identified	# proteins identified
1	A	15	3	120	17614 ± 186 (22717)	2599 ± 34 (2969)
2	B	15	3	300	18071 ± 590 (22473)	2613 ± 78 (2961)
3	C	15	1.7	130	23081 ± 468 (28231)	3228.3 ± 14 (3550)
4	C	25	1.7	130	28794 ± 131 (35336)	3557 ± 21 (3968)
5	C	25	1.7	300	29772 ± 184 (36151)	3774 ± 165 (4120)

Error is represented by the standard deviation of the number of identifications. Numbers in parentheses represent the combined identifications from triplicate measurements.

The chromatography is a critical component of nanoLC-MS-based proteomic experiments because improved separation can notably enhance method sensitivity.[31−33] For example, reducing the column diameter and support particle size improved peak shape and increased the signal-to-noise ratio for peptides by ≈2-fold.[34] Tuning the MS parameters to fit the application has also been shown to increase the proteome coverage in some cases. Multiple studies using TOF or trapping instruments such as Orbitraps have evaluated the importance of some acquisition parameters in improving the number of proteins identified/quantified.[35−38] For DDA experiments, the chromatographic conditions and the acquisition parameters were evaluated based on the numbers of peptides and proteins identified with 1% FDR. For DIA experiments, we evaluated the numbers of peptides and proteins quantified, the quantification range, and the reproducibility of the quantification measured by the RSD. Once our method was established, we devised an instrument control strategy and defined some metrics to survey to strengthen the applicability of our developed method to biopharmaceutical products like cell therapies (Figure ). Finally, we applied our method to Jurkat cells, an immortalized cancer T-cell line currently used to build precompetitive CAR engineered cells.[28]

Evaluating the Importance of the Chromatography in Improving the Identified Proteome Coverage

We evaluated the peptide separation for both DDA and DIA in tandem to make a concerted decision on the column to choose for our application. The columns were selected to represent different particle sizes, pore sizes, lengths, and support particle properties. The properties of each column are recorded in Table 1, and the gradients used for each column are summarized in Table S1. In this study, we purposely kept the vendors and models hidden to remain impartial, in keeping with the NIST mission. The goals of the evaluation are (1) to demonstrate that different performance metrics are unique to the instrument setup and (2) to evaluate the effect of the column properties on the measurements. During DDA, using 1 μg of a commercial K562 digest on the column, we found that the columns with the smallest particle size led to much higher numbers of peptides and proteins identified per run; more than 3000 proteins were identified per replicate for each of these 3 columns (Table 1). The proteins identified using the different columns were mostly complementary (Figure 2A). Columns #4 and #5, which were the longest, yielded many unique protein identifications compared to the other three (Figure 2A). The increase in proteins identified using the columns with the smallest particle size (columns #3, #4, and #5) was proportionally distributed across the three main cell compartments: cytosol, nucleus, and membrane (Figure 2B). This finding suggests that none of the columns were biased toward the cellular location of the proteins.

Figure 2

Column evaluation on protein identification and quantification using the 90 min gradients described in Table S1. (A) Venn diagrams showing the overlap of proteins identified from combined replicate DDA runs between all five columns from 1 μg of K562 digest. (B) Distribution of the cell compartments of the combined identified proteins for each column condition. (C) Quantitative dynamic range for each column by DIA for 500 ng of K562 digest. (D) Reproducibility of the quantification between technical replicates as measured by the RSD.

Next, we evaluated each column for DIA, using 500 ng (Figure 2C,D) and 200 ng (Figure S1) of K562 protein digest on the column. For consistency, we processed the DIA data from each column using the same list of PQPs obtained as described in the Experimental Section. Retention times were aligned using the spiked-in PepCalMix (20 heavy labeled peptides), and the same parameters were used to extract the data (see Experimental Section). Figure 2C shows the number of proteins quantified as well as their dynamic range. All five columns presented a similar dynamic range of ≈5 orders of magnitude. Surprisingly, column #3 underperformed: fewer proteins were quantified than with the other four columns, despite good results in DDA experiments. Column #5 performed the best, with close to 3000 proteins quantified across all three replicates using 500 ng of protein digest, which is ≈13% more than the next best one, column #4 (Figure 2C). Moreover, when analyzing only 200 ng of digest, we still quantified ≈2650 proteins using column #5 (Figure S1A). We evaluated the repeatability of our quantification by measuring the RSD across technical triplicate measurements (Figure 2D and Figure S1B). For all the columns, the median RSDs were below 15%, demonstrating the quality of DIA quantification measurements. 
Interestingly, the distribution of RSDs for the shorter columns (columns #1, #2, and #3) spread more than for the two longer columns (columns #4 and #5). Column #5 presented an outstandingly low median RSD of ≈4% (Figure 2D) and a tight distribution around the median value. The RSD values were only slightly increased when measuring 200 ng (Figure S1B). On the basis of the different properties of the chosen columns, we can attribute the improvement in the numbers of proteins identified and quantified using columns #4 and #5 to two main factors: particle size and column length. These results are not particularly surprising and have been well documented previously for DDA-only experiments.[31,34] Here, we demonstrated that these important column attributes also play a critical role in the quality of the DIA-based quantification, as illustrated by the low median RSDs obtained when using columns #4 and #5. These improvements can be attributed to better chromatographic properties. Indeed, increased resolution of the chromatographic separation and improved chromatographic peak shapes lead to better peak extraction of DIA data. Better resolution of the chromatographic separation decreases the occurrence of coeluting peptides, which decreases tandem mass spectra complexity; better peak shape improves peak statistics. Overall, DDA and DIA-based quantitative results suggested that column #5 was the most favorable for our future experiments, and it was used for the remainder of this study. The gradient duration on column #5 was further refined using 500 ng of K562 digest (Figure 3A,B). We aimed to balance gain in protein identifications and throughput and evaluated 4 different gradient durations for DDA experiments: 45, 60, 90, and 120 min. Peptide elution windows of 90 and 120 min are standard for untargeted large-scale bottom-up proteomic experiments by DDA. 
We also considered shorter gradients, not typically used in large-scale bottom-up proteomics, because of the fast scanning rate of our quadrupole time-of-flight instrument. As DDA experiments are here used to build PQP libraries, we purposely report the combined number of protein identifications (Figure 3A and Table S5). Interestingly, we found that the gain in the number of protein identifications got smaller as we increased the gradient duration (Figure 3A). When increasing from 90 to 120 min, the number of identified proteins improved by ≈6%, while the throughput decreased by ≈30% (Figure 3A). The diminishing improvement as the gradient time increased can be attributed to the increase in chromatographic peak width and acquisition redundancy (Figure S2).

Figure 3

Revision of the acquisition conditions for DDA and DIA modes using 500 ng on-column of a K562 digest. DDA results are shown for combined data from three technical replicates. Protein identifications from DIA were filtered to include proteins that had quantification values across all three replicates. (A,B) The gradient length was evaluated for DDA (A) and DIA (B) modes. Gradient descriptions are available in Table S2. (C) Several acquisition parameters were studied for DDA acquisition. Details are reported in Table S3. (D) The window scheme was evaluated for DIA. Conditions are described in Table S4. Key: # Protein IDs, number of proteins identified; # Protein Quant., number of proteins quantified; Pept. sep. window, peptide separation window; ctrl, control.

DIA runs are typically acquired using a gradient similar to or shorter than that used for DDA. Therefore, as we established a 90 min gradient for DDA runs, we eliminated the 120 min gradient from the evaluation for DIA. In DIA, the increases in protein identifications were smaller than with DDA (Figure 3B). Indeed, we only noticed an increase of at most 5% when lengthening the gradient duration using the traditional DIA method. It is worth noting that, unlike with DDA, the % increase was more prominent as we lengthened the gradient duration. However, this observation could reflect a bias in data analysis since the PQP library was built using a 90 min gradient. Changing the gradient also impacted the reproducibility of the quantification. With the 90 min gradient, the percentage of proteins quantified with a percent relative standard deviation (% RSD) below 10% was higher than with the 60 or 45 min gradients (≈55% vs ≈46% and 43%, Figure S2A); the percentage of proteins quantified with % RSD below 20% was highest with the 60 min gradient (≈79% vs ≈73%, Figure S2A). 
Overall, we noted that the gradient length played a minor role in improving the number of proteins identified and quantified in DIA, especially compared to the notable improvements resulting from the column change. Most recently, using the newest generation LC system, quantification of ≈2000 proteins was achieved in a 1 min span using this type of instrument and an advanced DIA strategy.[39]

Evaluation of the MS Acquisition Parameters on the Protein Identification and Quantification

Using column #5 and a 90 min gradient, we assessed the role of parameters, deemed critical by the instrument vendor, in improving the number of proteins identified by DDA and the quality of protein quantification by DIA (Figure 3C,D). For DDA, we evaluated the number of precursors selected per instrument cycle, the TOF-MS accumulation time, and the collision energy spread (CES). The number of selected precursors and the TOF-MS accumulation time were adjusted in unison to keep the cycle time constant. MS acquisition parameters are recapitulated in Table S3. Each change was evaluated against the suggested method from the SCIEX performance evaluation guidebook (ctrl, Figure 3C). We tested adjusting the cycle time to 1.2 s to match the median chromatographic peak width (fwhm) of 12 s, which resulted in fewer proteins identified (data not shown). Lowering the TOF-MS accumulation time and increasing the number of selected precursors (cond1) improved the identification numbers the most (≈2%, Figure 3C). We also evaluated the effect of adding CES to fragment peptides. When using a CES, peptides are fragmented in 10 increments of collision energy (CE), spanning CE–CES to CE+CES, over the TOF-MS/MS accumulation time (here 50 ms). We found that setting the CES to 3 (cond2, Figure 3C) gave better results. Finally, for our final method, we combined the values that led to the highest improvements in identification and adjusted the m/z TOF-MS scan range to more closely match the span of m/z values present in the digest (Figure S4). We found that changing the m/z range did not affect the number of protein and peptide identifications (data not shown). However, the m/z range affects the DIA by reducing the number of fragmentation windows, leading to a lower cycle time for the same window width and providing better chromatographic peak statistics for reproducible quantification. Pino et al. 
recently demonstrated the advantages of a narrower m/z scan range for DIA on trapping instruments.[38] The cumulative advantages of more precursors selected and improved CES led to an ≈5% increase in proteins identified. This improvement, although small, led to an expansion of the list of PQPs for SWATH acquisition data processing. For DIA, we only evaluated the windowing scheme. All DIA data were searched against the same list of PQPs as defined in the Experimental Section. The different conditions were evaluated against the suggested method provided by SCIEX, which uses a scheme of 32 fixed windows, but with the TOF-MS m/z scan range adjusted to our sample of choice: 375 to 1000 (ctrl, Figure 3D). Conditions A (condA) and B (condB) were based on schemes of 32 and 30 variable windows, respectively. We found that 30 variable windows led to higher numbers of proteins quantified, with an ≈12% increase compared to the control. Interestingly, the number of proteins with % RSD lower than 10% was the same across all three conditions (Figure S3B), but the number of proteins with % RSD below 20% was lower with the variable window schemes (69% vs 65%, Figure S3B). This is likely due to the variable window scheme methods identifying more low-abundance proteins compared to the fixed window scheme (Figure S5).
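The link between cycle time and chromatographic peak statistics can be made concrete with a short Python sketch. It assumes, for illustration only, that a SWATH cycle is one TOF-MS accumulation plus one 50 ms accumulation per window with no overhead; the accumulation times actually used for each condition are those in Table S4 and may differ. The function names are hypothetical.

```python
def swath_cycle_time_s(n_windows: int, ms1_accum_ms: float = 50.0,
                       ms2_accum_ms: float = 50.0) -> float:
    """Cycle time for one TOF-MS scan plus one MS/MS scan per window (overhead ignored)."""
    return (ms1_accum_ms + n_windows * ms2_accum_ms) / 1000.0


def points_per_peak(peak_width_s: float, cycle_time_s: float) -> float:
    """Average number of sampling points across a chromatographic peak width."""
    if cycle_time_s <= 0:
        raise ValueError("cycle time must be positive")
    return peak_width_s / cycle_time_s


FWHM_S = 12.0  # median peak width (fwhm) reported in the text

# DDA test case from the text: a 1.2 s cycle against a 12 s fwhm.
print(f"DDA, 1.2 s cycle: {points_per_peak(FWHM_S, 1.2):.1f} points per fwhm")

# Fewer DIA windows shorten the cycle and add points across each peak.
for n in (32, 30):
    cycle = swath_cycle_time_s(n)
    print(f"{n} windows: cycle {cycle:.2f} s, "
          f"{points_per_peak(FWHM_S, cycle):.1f} points per fwhm")
```

Under these assumptions the gain from removing two windows is modest, which is consistent with the window layout, rather than the cycle time alone, driving the difference between conditions.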

Evaluation of DIA Method Sensitivity

From the above method revisions, we determined the following DIA parameters: separation, column #5 and 90 min gradient; DIA, condition B. Using this method, we evaluated the quantitative performance of our approach by DIA. We established the sensitivity of our DIA method by creating samples containing different concentrations of the PepCalMix (0.01–100 nmol/L) in a constant K562 cell digest background. This sample mix better represents complex peptide samples, which can suffer major interferences due to coeluting and cofragmenting peptides during SWATH acquisition analyses. After extracting the data using the PeakView software, we built calibration curves to establish our analytical method’s lower limit of detection (LLOD) and lower limit of quantification (LLOQ). A representative calibration curve for peptide sequence AVGANPEQLTR (m/z 583.3136, z = 2+) is presented in Figure S6. Results for all peptides are presented in Table 2. Out of 19 peptides detectable in the method m/z range, we confidently detected 16. The three peptides labeled as not detected did not pass the FDR cutoff of 1%, likely due to the presence of interfering peptides from the K562 digest (Table 2). Of the 16 detected peptides, 12 had an LLOQ of 0.1 nmol/L and four had an LLOQ of 1 nmol/L. Considering that 1 μL of the sample mixture was injected into the column for analysis, the absolute LLOQs are 0.1 fmol and 1 fmol, respectively. For most peptides, we never reached the LLOD, and so we noted the LLOD to be below our last measured concentration of 0.01 nmol/L (10 amol). For three peptides, the LLOD was estimated to be between the last two measured concentrations, 0.01 and 0.1 nmol/L or 10 to 100 amol (Table 2).
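The conversion from spiked concentration to absolute on-column amount follows from 1 nmol/L being equal to 1 fmol/μL. The Python sketch below applies it to the six calibration levels for a 1 μL injection; the function name `on_column_fmol` is hypothetical.

```python
def on_column_fmol(conc_nmol_per_l: float, injection_ul: float) -> float:
    """Amount loaded on column in fmol (1 nmol/L equals 1 fmol/uL)."""
    return conc_nmol_per_l * injection_ul


calibration_levels = [0.01, 0.1, 1.0, 10.0, 50.0, 100.0]  # nmol/L, from the text
INJECTION_UL = 1.0

amounts_fmol = [on_column_fmol(c, INJECTION_UL) for c in calibration_levels]
for conc, fmol in zip(calibration_levels, amounts_fmol):
    print(f"{conc:>6g} nmol/L -> {fmol:g} fmol ({fmol * 1000:g} amol)")
```

The two lowest levels correspond to 10 and 100 amol on column, matching the LLOD bracket quoted above, and the 0.1 and 1 nmol/L levels give the 0.1 and 1 fmol absolute LLOQs.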

Table 2

Quantification Sensitivity Measured Using PepCalMix Spiked into a K562 Digest with a 90 min Gradient (Table S1) on Column #5 and DIA Condition B (Table S4)a

sequence	m/z	z	LLOQ (nmol/L)	LLOD (nmol/L)	R²
AETSELHTSLK	408.5501	3	1	<0.01	0.89
GAYVEVTAK	473.2602	2	0.1	0.01–0.1	0.95
IGNEQGVSR	485.2530	2	ND	ND	NA
LVGTPAEER	491.2656	2	1	<0.01	0.94
LDSTSIPVAK	519.7997	2	0.1	0.01–0.1	0.93
AGLIVAEGVTK	533.3233	2	0.1	<0.01	0.95
LGLDFDSFR	540.2734	2	1	<0.01	0.84
GFTAYYIPR	549.2863	2	0.1	<0.01	0.95
SGGLLWQLVR	569.8340	2	ND	ND	NA
AVGANPEQLTR	583.3136	2	0.1	<0.01	0.99
SAEGLDASASLR	593.8005	2	0.1	<0.01	0.95
VFTPLEVDVAK	613.3496	2	0.1	<0.01	0.90
VGNEIQYVALR	636.3527	2	0.1	<0.01	0.95
YIELAPGVDNSK	657.3450	2	0.1	0.01–0.1	0.91
DGTFAVDGPGVIAK	677.8583	2	ND	ND	NA
YDSINNTEVSGIR	739.3615	2	1	<0.01	0.96
SPYVITGPGVVEYK	758.9105	2	0.1	<0.01	0.96
ALENDIGVPSDATVK	768.9034	2	0.1	<0.01	0.96
AVYFYAPQIPLYANK	883.4738	2	0.1	<0.01	0.99

A representative example of the calibration curves is presented in Figure S6. Key: LLOQ, lower limit of quantification; LLOD, lower limit of detection; ND, not detected; NA, not applicable.

These LLOQ and LLOD values demonstrate very good sensitivity over a broad range of peptide properties. This raises confidence in the ability of the method to quantify proteins present in cells at low levels in an untargeted manner.

Establishing a Performance Qualification (PQ) Strategy for DIA

We designed a PQ strategy to ensure reproducible analysis of scarce material such as engineered T-cells. The design of our control strategy follows the MSQC principles as described in the introduction.[20] Figure highlights the overall strategic design and some figures of merit for controlled parameters. Two primary samples are used for PQ. In our case, one is a simple mixture of peptides (QC1; PepCalMix, 20 heavy-labeled peptides, ABSCIEX); the other is a complex commercial K562 cell digest (QC2, Figure A). The PepCalMix is used to control the chromatographic separation, ionization efficiency, MS1 signal, and MS2 signal, and to perform punctual mass calibration using a calibration LC run. The calibration LC run is specific to the TripleTOF 6600+ system used for this study; punctual calibration should be performed as needed following the mass spectrometer manufacturer's recommendations. After the calibration run is completed, a calibration report is produced containing data on peptide ion intensities, mass resolution for each peptide, retention times, and mass accuracy at the MS1 and MS2 levels. We present some representative data for one select peptide from the PepCalMix (Figure B,C). The K562 digest was run in both DDA mode (K562(D)) and SWATH acquisition mode (K562(S)). The K562 digest helps assess the ion source, the MS1 and MS2 signals, dynamic ion sampling, and peptide identification (Figure D).

Figure 4

Development of an instrument control using our established acquisition method (column #5, 90 min gradient, cond4 for DDA, and condB for DIA runs). (A) Experimental strategy for implementation of an instrument performance qualification for large-scale proteomic measurements of cell therapies. (B) Extracted ion chromatogram for the AVGANPEQLTR peptide (m/z 583.3136, z = 2+) and representative mass spectrum (inset). (C) Some metrics of control for nanoLC measurements using the PepCalMix, here showcasing the AVGANPEQLTR peptide. (D) Control metrics for the measurements of a complex commercial digest in DDA mode using protein and peptide identifications. (E) DIA control metrics. The selection of the proteins for DIA control is presented in Figure S7. The dots represent the mean quantitative values, and the whiskers span 3 × standard deviation. Key: blue lines indicate high and low limits for system compliance; red line indicates the median measurements; RT, retention time; Res., resolution; Int., intensity.

Using the PepCalMix AutoCal runs on our instrument, we can track the performance of the nanoLC-MS system over time. Key parameters to monitor are presented in Figure BC for the select peptide AVGANPEQLTR (m/z 583.3136, z = 2+). By monitoring the extracted ion chromatograms for each of the peptides contained in the PepCalMix, we can evaluate the efficiency of the separation (Figure B). For example, we show that the peak width at half-maximum for the AVGANPEQLTR peptide was ≈0.05 min, corresponding to a plate number of ≈875 000. For most of the peptides, we recorded plate numbers between ≈175 000 and 675 000. From the generated calibration reports, we extracted the information on the retention time (RT), the mass resolution (Res), and the intensity (Int) for each peptide across different days.
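As a rough illustration of how a plate number relates to peak width, the sketch below uses the common half-height formula N = 8 ln 2 × (tR / W½)² ≈ 5.54 × (tR / W½)², which assumes a Gaussian peak. The text does not state which formula or retention time was used, so both the formula choice and the back-calculated retention time are assumptions for illustration only; under gradient elution such values are best read as apparent plate numbers.

```python
import math

# 8 * ln(2) ~= 5.545, the half-height constant for a Gaussian peak.
HALF_HEIGHT_FACTOR = 8.0 * math.log(2.0)


def plate_number(retention_time_min, fwhm_min):
    """Apparent plate number from retention time and peak width at half-maximum."""
    return HALF_HEIGHT_FACTOR * (retention_time_min / fwhm_min) ** 2


def implied_retention_time(plates, fwhm_min):
    """Retention time (min) that would give `plates` for a given half-height width."""
    return fwhm_min * math.sqrt(plates / HALF_HEIGHT_FACTOR)


# Back-calculate the retention time implied by the reported values
# (FWHM ~0.05 min, ~875 000 plates), assuming the formula above applies.
rt_implied = implied_retention_time(875_000, 0.05)
```

Under these assumptions, the reported width and plate number would correspond to a retention time of roughly 20 min, which is one way to sanity-check values read from a calibration report.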
On the basis of the collected data, we established thresholds that the system must meet to be considered compliant and for the analysis to move forward, beyond the ability of the mass spectrometer to perform the automated mass calibration in both MS and MS/MS modes. For RTs, the high and low thresholds were defined as the mean ± 3 × standard deviation. For resolution and intensity, we defined lower thresholds only, based on manufacturer recommendations and empirical data. We can evaluate more parameters using the complex digest (K562), including the data processing pipeline (Figure D). In DDA mode, we established a lower threshold for the numbers of proteins and peptides that we expect to be identified, based on 3 × standard deviation of the mean. Here, we expect to identify more than 3350 proteins and 26 900 peptides for 1 μg of digest on the column and a 90 min gradient (Figure D, top). We also completed the evaluation of the K562 digest in DIA mode (Figure D, bottom). Using the 18 K562 DIA runs performed to establish the LLODs and LLOQs, we created a list of 20 proteins spanning the entire quantification range, each present in all 18 runs with an RSD below 10%. The selection of the proteins based on their quantitative values is shown in Figure S7. These PQ samples control for our platform’s quantitative repeatability and reproducibility without the variation existing in lab-prepared biological samples. In our measurement control design, we recommend performing two DIA runs at the beginning of the sample sequence and one at the end to assess any discrepancies potentially occurring between the first and last sample. Overall, the instrument and data collection control presented here enables robust and reproducible MS data collection for the measurement of scarce samples. The different thresholds presented in this manuscript are specific to our LC-MS instrument setup and represent a guideline more than hard-set values. 
Different instruments likely produce different results and require careful evaluation before starting any measurements of biopharma or clinically relevant samples. For example, in the sections above, we demonstrated that different columns led to different metrics in peak shape, retention times, numbers of protein and peptide identifications, and quantitative metrics. Our control strategy design applies to DIA experiments on any instrument, but the thresholds should be evaluated on a case-by-case basis.
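The threshold logic described in this section can be sketched as follows. This is a minimal illustration, not the authors' implementation: the retention-time history is hypothetical, the function names are invented for this example, and the only values taken from the text are the 3 × standard deviation rule and the DDA lower limits of 3350 proteins and 26 900 peptides.

```python
from statistics import mean, stdev


def two_sided_limits(history, k=3.0):
    """Low and high control limits as mean -/+ k x sample standard deviation."""
    m, s = mean(history), stdev(history)
    return m - k * s, m + k * s


def is_compliant(value, low=None, high=None):
    """True if value lies within the limits; a limit of None is not checked."""
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


# Hypothetical retention times (min) for one PepCalMix peptide on previous days.
rt_history = [20.0, 20.1, 19.9]
rt_low, rt_high = two_sided_limits(rt_history)

rt_ok = is_compliant(20.2, rt_low, rt_high)       # within mean +/- 3 SD
rt_drifted = is_compliant(20.5, rt_low, rt_high)  # outside the high limit

# Lower-limit-only check, using the DDA identification thresholds from the text.
ids_ok = is_compliant(3400, low=3350) and is_compliant(27_500, low=26_900)
```

Because the thresholds are instrument-specific, the history lists would need to be rebuilt from each laboratory's own PQ runs before such a check is used to gate a sample sequence.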

Application of Our DIA Approach to a Jurkat Cell Sample

To evaluate our method on a sample relevant to our application, we performed DIA on Jurkat cell samples (Figure ). Jurkat cells are immortalized cancer T-cells, which are similar to primary patient T-cells and can serve as a model system for T-cell-based therapies. Indeed, our main target application is to study the proteomic changes linked to the manufacturing of CAR-T cells, which are directly derived from patient T-cells. The proteins from the cells were extracted and digested using the S-TRAP protocol.[40] From ≈500 ng of proteins on the column, we were able to quantify a total of ≈2500 proteins spanning four orders of magnitude (Figure A). Among the quantified proteins, we found critical markers of T-cells. For example, we were able to quantify the four extracellular subunits of the T-cell receptor complex (CD3E, CD3G, CD247, and CD3D),[41−43] the receptor CD5,[44] the receptor CD28,[45,46] and the transmembrane cell surface protein leukosialin (SPN or CD43).[47] Together, these markers are often used to isolate T-cells from plasma by fluorescence-activated cell sorting. These CD receptors are required for T-cell activation and are present in all stages of T-cell differentiation.

Figure 5

Application of our acquisition method to a T-cell model case: Jurkat cells. (A) Protein quantification by DIA of Jurkat cells prepared by the S-TRAP lysis and digestion method. Proteins highlighted by sky-blue dots relate to T-cell biology. Some proteins, typically used as markers for T-cell identification, are identified in the graph. (B) Gene ontology annotation of biological processes occurring in unmodified Jurkat cells. The subclassifications for the cellular and metabolic processes are presented in Figure S8.

Using gene ontology (GO) annotation, here with PantherDB,[48] we could map the quantified proteins to the leading cellular functions (Figure B). These data can now serve as a baseline to study the effect of engineering T-cells/Jurkat cells to express proteins of choice on their surfaces, such as a CAR. Moreover, after establishing a set sample preparation protocol, these types of data are helpful as metrics for a complete system suitability standard, which controls for both the sample preparation and the instrument.

Conclusions

We have developed a sensitive DIA method to enable the measurement of emerging cell therapy products. We also established a PQ control strategy to ensure the method’s reproducibility. Our control strategy was based on the MSQC principles, which evaluate key components of the LC-MS instrument and data acquisition. With this stringent control strategy, the method can potentially be applied in a regulated environment, as encountered in the biopharma industry. Moreover, we applied the approach to Jurkat cells and were able to identify important markers of T-cells that can be used later as metrics to design complete system suitability standards.

44 in total

1. Quantifying protein interaction dynamics by SWATH mass spectrometry: application to the 14-3-3 system.

Authors: Ben C Collins; Ludovic C Gillet; George Rosenberger; Hannes L Röst; Anton Vichalkovski; Matthias Gstaiger; Ruedi Aebersold
Journal: Nat Methods Date: 2013-10-27 Impact factor: 28.547

Review 2. Less label, more free: approaches in label-free quantitative mass spectrometry.

Authors: Karlie A Neilson; Naveid A Ali; Sridevi Muralidharan; Mehdi Mirzaei; Michael Mariani; Gariné Assadourian; Albert Lee; Steven C van Sluyter; Paul A Haynes
Journal: Proteomics Date: 2011-01-17 Impact factor: 3.984

3. Suspension Trapping (S-Trap) Is Compatible with Typical Protein Extraction Buffers and Detergents for Bottom-Up Proteomics.

Authors: Dalia Elinger; Alexandra Gabashvili; Yishai Levin
Journal: J Proteome Res Date: 2019-02-20 Impact factor: 4.466

4. The Human Proteome Organization-Proteomics Standards Initiative Quality Control Working Group: Making Quality Control More Accessible for Biological Mass Spectrometry.

Authors: Wout Bittremieux; Mathias Walzer; Stefan Tenzer; Weimin Zhu; Reza M Salek; Martin Eisenacher; David L Tabb
Journal: Anal Chem Date: 2017-03-30 Impact factor: 6.986

5. An Introduction to Analytics for Autologous Cell and Gene Therapies.

Authors: Christoph Meyer; Erik Rutjens; Thomas Merlin
Journal: Chimia (Aarau) Date: 2020-03-25 Impact factor: 1.509

6. Subnanogram proteomics: impact of LC column selection, MS instrumentation and data analysis strategy on proteome coverage for trace samples.

Authors: Ying Zhu; Rui Zhao; Paul D Piehowski; Ronald J Moore; Sujung Lim; Victoria J Orphan; Ljiljana Paša-Tolić; Wei-Jun Qian; Richard D Smith; Ryan T Kelly
Journal: Int J Mass Spectrom Date: 2017-09-01 Impact factor: 1.986