Camille Lombard-Banek1,2, Kerstin I Pohl3, Edward J Kwee1, John T Elliott1, John E Schiel1,2. 1. National Institute of Standards and Technology, Material and Measurements Laboratory, Gaithersburg, Maryland 20899, United States. 2. Institute for Bioscience and Bioengineering Research, Rockville, Maryland 20850, United States. 3. SCIEX, Framingham, Massachusetts 01701, United States.
Abstract
Mass spectrometry (MS)-based proteomic measurements are uniquely poised to impact the development of cell and gene therapies. With the adoption of rigorous instrumental performance qualifications (PQs), large-scale proteomics can move from a research tool to a manufacturing control tool. Data-independent acquisition (DIA) approaches are especially well suited to extend multiattribute method (MAM) principles to the characterization of the proteome of cell therapies. Here, we describe the development of a DIA method for the sensitive identification and quantification of proteins on a Q-TOF instrument. Using the improved acquisition parameters, we defined a control strategy and highlighted metrics to improve the reproducibility of SWATH acquisition-based proteomic measurements. Finally, we applied the method to analyze the proteome of Jurkat cells, which serve here as a model for human T-cells. Raw and processed data were deposited in PRIDE (PXD029780).
Keywords:
SWATH acquisition; biopharmaceutical; bottom-up proteomics; cell therapies; data-independent acquisition; mass spectrometry; performance qualification (PQ); quality control (QC)
Complex therapies,
whereby viruses or whole cells act as the drug,
are emergent new treatments requiring new characterization strategies.
For example, chimeric antigen receptor T-cells (CAR-Ts) are modified
patient T-cells that utilize the existing biological properties of
these immune cells to target and kill cancer cells. CAR-Ts are obtained
by engineering the patient’s T-cells to express a receptor
on their surface—the CAR—specific to the surface receptors
on the targeted malignancy. The genomic information for the CAR protein
is incorporated in the cell via transduction with a viral vector (retrovirus
or lentivirus).[1] A series of complex cell
sorting, activation, and expansion steps are required before reintroducing
the transduced CAR-T cell product back into the patient. Details on
the manufacturing of CAR-T cells are available in recent reviews.[1,2]

Current state-of-the-art analyses of CAR-T drug products rely on
the measurements of a few select proteins.[3−5] For example,
fluorescence-activated-cell sorting (FACS) measures T-cell population
purity and CAR expression using fluorescently labeled antibodies against
T-cell surface markers (CD4 and/or CD8) or the CAR, respectively.
The drug’s pharmacological activity is assessed via activation
using beads decorated with the tumor surface receptor followed by
cytokine-release assays.[3,4,6] Cytokine-release assays measure signaling proteins (interleukin,
interferon, and growth factors) released by the CAR-Ts. Although useful,
these assays only measure a limited number of quality attributes of
the raw material (T-cells) or the product (CAR-Ts). Identifying attributes
that better predict the quality of the product could help move
these drugs from a last resort to a second- or first-line treatment,
which requires better characterization of the manufacturing process
and the final product.[4,7]

MS enables the characterization of a large number of proteins in
a single label-free (i.e., no antibodies) experiment.[8,9] We have recently reviewed the potential benefits of MS-based proteomics
in addressing the challenges to characterize cell therapies.[8] Two main data acquisition strategies exist to
measure proteins in an untargeted fashion (also referred to as shotgun
approaches): data-dependent and data-independent acquisitions (DDA
and DIA, respectively). In DDA, peptide ions are selected for fragmentation
using a narrow isolation window following a top-N scheme.[10,11] In a top-N scheme, the N most abundant precursor ions are selected
for fragmentation per instrument cycle time. The fundamental nature
of DDA renders the identification of the same peptide/proteins in
replicate runs stochastic. Therefore, label-free quantification using
DDA often leads to missing peptide information across replicates,
referred to as missing values, which decreases quantitative coverage
and statistical power. Strategies such as multiplexing with isobaric
tags partially remediate the issue, albeit at a high cost.[11,12] Conversely, in DIA, broad isolation windows scanning across the
entire m/z range enable the fragmentation
of all peptide ions regardless of their intensity, leading to a significant
decrease in missing values.[3,13,14] Mirroring the current multi-attribute-method (MAM) employed for
single protein-drug molecules,[15−17] DIA can quantify multiple proteins
with high precision for more complex biopharmaceutical systems like
cell therapies. In MAM, a preliminary run is performed in DDA mode
to determine the list of peptide identities and their respective retention
times. Consecutive runs are then performed and compared in MS-only
mode (no fragmentation) to quantify peptides of interest.[15] Similarly, in DIA, DDA is performed first to
build a list of peptide-query-parameters (PQPs) used to extract peptide
and protein identities from raw DIA. PQPs encompass fragment ion (transition)
lists for each identified peptide and their respective retention times.[18] Proteins are quantified using the sum of the
integrated area under the extracted chromatogram curve of the peptide
fragment ions.

During MS-based proteomic analysis, multiple factors that have
been summarized elsewhere[19−21] contribute to the technical variability
of the measurements. NanoLC-MS instruments contribute in large part
to the measurement variability. The National Institute of Standards
and Technology (NIST) and the National Cancer Institute (NCI) have
established 46 metrics to evaluate the performance of LC-MS systems,
called Mass Spectrometry Quality Control (MSQC). These 46 metrics
correspond to 6 categories critical to LC-MS measurements: chromatography,
dynamic sampling, ion source, MS1 signal, MS2 signal, and peptide
identification.[20] Several informatics tools
have been developed to monitor instrument performance.[21−24] Now, technical variability in large-scale bottom-up proteomics by
DIA can be assessed using these principles and experimental designs
borrowed from MAM and/or clinical proteomics.[8]

MS-based proteomics has proven beneficial in shedding light on the
mechanism of action of CAR-Ts[25−27] and is poised to identify additional
process and/or product quality attributes by monitoring cell health
at critical stages of the manufacturing process. Expansion of large-scale
MS-based proteomics from the research setting to the process development
requires stringent performance qualifications (PQs) to be fit-for-purpose.
Here, we describe the development of a sensitive and controlled MS-based
proteomic acquisition method on a quadrupole time-of-flight system
using DIA toward the analysis of CAR-T cell therapies. We first established
the different conditions that provided the highest sensitivity and
reproducibility, including the separation, data-dependent acquisition
for PQPs library building, and DIA for quantification. Then, we provide
guidelines and metrics to extend DIA from a research setting to the
biopharma space using the MSQC principles. Finally, we applied the
strategy to the measurement of a Jurkat cell digest. Jurkat cells
are immortalized lymphoblastic T-cells and are being used to produce
a CAR-T mimetic to be employed as a method development tool and system
suitability test.[28]
Experimental Section
Reagents
Reagents were purchased at reagent grade or
higher. Standard K562 protein digests were from SCIEX (Framingham,
MA) or Promega (Madison, WI). PepCalMix, containing 20 heavy labeled
peptides, was from SCIEX. Dithiothreitol (DTT, #39255) and iodoacetamide
(IAA, #39271) were procured in no-weigh format from Thermo Fisher
Scientific (Waltham, MA). MS-grade trypsin/Lys-C protease mix was
from Thermo Fisher Scientific (#A41007). Solvents for liquid-chromatography
(LC) mass spectrometry (MS) measurements were purchased at LC-MS grade
from Honeywell (Charlotte, NC).
Preparation of Peptide and Protein Digest Standards
The commercial PepCalMix (SCIEX),
containing 20 heavy labeled peptides,
was at a concentration of 1 pmol/μL (stock solution). Aliquots of
10 μL each were stored at −80 °C until further use.
For nanoLC-MS measurements, 1 μL of the PepCalMix aliquot was
diluted in 99 μL of 5% v/v acetic acid in 10% v/v acetonitrile
containing water (final peptide concentration: 10 fmol/μL).

Commercial K562 digests were reconstituted to 2 μg/μL
in 0.1% v/v formic acid in water and stored at −80 °C
in 10 μL aliquots. Prior to nanoLC-MS measurements, 9 μL
of 0.1% v/v formic acid in 2% v/v acetonitrile containing water and
1 μL of the stock PepCalMix solution (1 pmol/μL) were
added to the 10 μL K562 digest aliquot. The final K562 peptide
concentration was 1 μg/μL.

To assess the sensitivity of our acquisition method, we built the
calibration curve using the PepCalMix. Different amounts of PepCalMix
were spiked into a 0.5 μg/μL K562 digest solution. A total
of 6 dilutions were prepared with the following final PepCalMix concentrations:
0.01, 0.1, 1, 10, 50, and 100 nmol/L.
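The dilutions described in this section follow simple C1·V1 = C2·V2 arithmetic, which the short sketch below reproduces as a check. It is an illustration only: the helper name `diluted_conc` is hypothetical, volumes are assumed to be additive, and the PepCalMix concentration in the spiked K562 sample is derived here from the stated volumes rather than reported in the text. Note that 1 fmol/μL equals 1 nmol/L.

```python
def diluted_conc(stock_conc, stock_vol_ul, total_vol_ul):
    """Concentration after bringing stock_vol_ul of stock to total_vol_ul (C1*V1 = C2*V2)."""
    return stock_conc * stock_vol_ul / total_vol_ul


# PepCalMix working solution: 1 uL of 1 pmol/uL stock (1000 fmol/uL) + 99 uL diluent
pepcal_working_fmol_ul = diluted_conc(1000.0, 1.0, 1.0 + 99.0)

# K562 sample: 10 uL digest at 2 ug/uL + 9 uL diluent + 1 uL PepCalMix stock
k562_total_ul = 10.0 + 9.0 + 1.0
k562_final_ug_ul = diluted_conc(2.0, 10.0, k562_total_ul)

# Derived value (not stated in the text): PepCalMix in the spiked K562 sample
pepcal_in_k562_fmol_ul = diluted_conc(1000.0, 1.0, k562_total_ul)

print(pepcal_working_fmol_ul, k562_final_ug_ul, pepcal_in_k562_fmol_ul)
```

The first two values match the 10 fmol/μL and 1 μg/μL final concentrations given above.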
Jurkat Cell Culture and Preparation for Proteomic Analysis
Jurkat cells (ATCC) were
cultured in T-75 flasks using RPMI-1640
media (ATCC) supplemented with 10% heat-inactivated fetal bovine serum
(Gibco). Cells were passaged to maintain a cell density between 2
× 10⁵ and 2 × 10⁶ cells/mL. The desired
number of cells was counted using a Multisizer 3 Coulter Counter (Beckman
Coulter, Sykesville, MD) and aliquoted into Protein LoBind Tubes (Eppendorf).
Cells were washed three times with Dulbecco’s phosphate buffered
saline without calcium and magnesium (Gibco), centrifuging at 200 × g between washes. Cells were frozen at −20 °C.

Jurkat cell digests were obtained following the manufacturer-recommended
S-TRAP (Protifi, Farmingdale, NY) protocol. Briefly, 5 × 10⁶ cells were lysed with 50 μL of lysis solution provided
in the S-TRAP mini kit. Cysteine residues were reduced (50 mmol/L
DTT, 20 min, 75 °C, 1000 rpm) and alkylated (150 mmol/L IAA,
20 min, RT). The protein extract was acidified with 5 μL of 12%
v/v phosphoric acid. Then, 350 μL of 90% v/v methanol in 100
mmol/L triethylammonium bicarbonate (TEAB) was added to the protein
solution, which was then loaded onto the S-TRAP column. Proteins were
digested with 7.5 μg of trypsin/Lys-C in 100 mmol/L TEAB for
1.5 h at 47 °C. Peptides were recovered by centrifugation at
1000 × g for 1 min and successive addition of 0.2% v/v
formic acid in water and 0.2% v/v formic acid in 80% v/v acetonitrile
in water. Finally, the sample was dried to completeness in a vacuum
concentrator (Labconco, Kansas City, MO).
Peptide Separation by NanoLC
Peptide separation was
performed using an Eksigent NanoLC 425 system (SCIEX, Framingham,
MA) in a nanoflow setting with trap and elute configuration. Peptide
samples were loaded onto a C18 trap column (ChromXP C18–3 μm
and 120 Å, Eksigent, 350 μm × 0.5 mm, Part #5016752)
using an isocratic delivery of 100% solvent A (water containing 0.1%
v/v formic acid) at a flow rate of 2 μL/min for 5 min. Then,
peptides were separated on a nanoLC column. The organic solution (solvent
B) was composed of acetonitrile with 0.1% v/v formic acid. The different
columns tested, their properties, and the gradient conditions used
for the K562 samples are recapitulated in Table S1. Once the ideal column condition was established for our
experiment, peptides were separated on a nanoAcquity nanoLC column
(Waters, Milford, MA, 75 μm × 250 mm) packed with BEH-C18
(1.7 μm × 300 Å) at a flow rate of 200 nL/min. PepCalMix
samples were separated using the following 13 min gradient starting
at 2% solvent B: 12 min 40% solvent B; 13 min 80% solvent B; 17 min
2% solvent B. The different elution gradients for the K562 digest
sample are reported in Table S2. Peptide
samples prepared to test the instrument/method sensitivity and the
Jurkat digest samples were separated using the 90 min gradient listed
in Table S2.
Data-Dependent Acquisition Mass Spectrometry
Eluting
peptides were ionized using an OptiFlow Turbo V ion source (nanoESI)
and mass analyzed and detected in the positive ion mode by a fast-scanning
quadrupole-time-of-flight instrument (TripleTOF 6600+ system, SCIEX,
Framingham, MA). The initial acquisition method was set as a “top
30” experiment. TOF-MS scans were acquired with an accumulation
time of 250 ms over the m/z range
400 to 1250. Precursor ions were selected for fragmentation with 0.7
amu isolation width, fragmented by CID with nitrogen using manufacturer
optimized rolling collision energy. MS/MS spectra were collected in
high-sensitivity mode for ions presenting charges between 2 and 5,
counts per second above 150, with the following settings: accumulation
time, 50 ms; m/z range, 100 to
1500; dynamic exclusion, 7 s for columns #1 and #2 and 5
s for columns #3, #4, and #5. Any changes applied for DDA method development
purposes are summarized in Table S3.
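The precursor selection rules above (top 30, charge 2 to 5, more than 150 counts per second, dynamic exclusion) can be sketched in a few lines. This is a simplified illustration and not the instrument's actual selection logic: the `Precursor` class, the `select_top_n` function, and the m/z matching tolerance `mz_tol` used for dynamic exclusion are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    mz: float
    charge: int
    cps: float  # counts per second in the TOF-MS survey scan


def select_top_n(precursors, excluded, now, n=30, min_cps=150.0,
                 charges=range(2, 6), exclusion_s=7.0, mz_tol=0.05):
    """Pick up to n of the most intense eligible precursors for one cycle.

    `excluded` maps a previously selected m/z to the time (s) it was selected
    and is updated in place. `mz_tol` is a hypothetical matching tolerance.
    """
    def is_excluded(mz):
        return any(abs(mz - m) <= mz_tol and now - t < exclusion_s
                   for m, t in excluded.items())

    candidates = [p for p in precursors
                  if p.charge in charges and p.cps > min_cps
                  and not is_excluded(p.mz)]
    chosen = sorted(candidates, key=lambda p: p.cps, reverse=True)[:n]
    for p in chosen:
        excluded[p.mz] = now  # start (or restart) the exclusion timer
    return chosen


# Toy survey scan with made-up values, using n=2 to keep the example small
scan = [Precursor(500.0, 2, 1000), Precursor(600.0, 1, 5000),
        Precursor(700.0, 3, 100), Precursor(800.0, 2, 400),
        Precursor(900.0, 4, 300)]
exclusion = {}
first_cycle = select_top_n(scan, exclusion, now=0.0, n=2)
second_cycle = select_top_n(scan, exclusion, now=1.0, n=2)
print([p.mz for p in first_cycle], [p.mz for p in second_cycle])
```

In the toy scan, the singly charged ion and the ion below the intensity threshold are never selected, and ions picked in the first cycle are skipped in the second because they are still within the exclusion time.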
Data Independent Acquisition (SWATH Acquisition)
The
same ionization conditions and instrument were used as for the DDA methods.
Initially, TOF-MS scans were acquired with the following parameters:
accumulation time, 50 ms; m/z range,
400 to 1250. Consecutive SWATH acquisition scans were acquired with
the following conditions: accumulation time, 50 ms; m/z range, 100 to 1500; SWATH acquisition window,
20 amu with 1 amu overlap. For method evaluation purposes, the isolation
window schemes and the mass ranges were changed and are reported in Table S4 and Table S6.
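The initial fixed-window settings above can be turned into a window list and a rough cycle-time estimate. The sketch below makes several assumptions that the text does not state: the windows tile the TOF-MS precursor range (400 to 1250), "1 amu overlap" means adjacent windows share 1 m/z unit, the last window is clipped to the range, and instrument overhead between scans is ignored. The function names are hypothetical.

```python
def fixed_windows(mz_lo, mz_hi, width, overlap):
    """Tile [mz_lo, mz_hi] with equal-width windows; neighbors share `overlap` m/z units."""
    if not 0 <= overlap < width:
        raise ValueError("overlap must be non-negative and smaller than width")
    step = width - overlap
    windows = []
    start = mz_lo
    while True:
        end = min(start + width, mz_hi)  # clip the last window to the range
        windows.append((start, end))
        if end >= mz_hi:
            return windows
        start += step


def cycle_time_ms(n_windows, ms1_ms, ms2_ms):
    """One TOF-MS scan plus one MS/MS scan per window; overhead is ignored (assumption)."""
    return ms1_ms + n_windows * ms2_ms


windows = fixed_windows(400, 1250, width=20, overlap=1)
print(len(windows), windows[0], windows[-1])
print(cycle_time_ms(len(windows), ms1_ms=50, ms2_ms=50), "ms per cycle")
```

Because the cycle time grows linearly with the number of windows, a narrower precursor range or wider windows shortens the cycle and gives more sampling points across each chromatographic peak.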
Database Search and Protein Identifications
Raw mass
spectra acquired by DDA were processed for protein identification
in ProteinPilot software running the Paragon search engine[29,30] (v5.0.2.0, SCIEX). MS/MS spectra were searched against the SwissProt
canonical human database (containing 20,396 entries). The search was
performed in “Thorough ID” mode, which automatically
adjusts the mass tolerance to the resolution of the MS and MS/MS acquisitions.
Carbamidomethylation of cysteines, trypsin for digestion, and TripleTOF
6600+ system were set as search defaults. “Thorough ID”
mode allows all possible variable modifications including up to 2
missed cleavages per peptide. Proteins and peptides are reported at a
1% false discovery rate (FDR). Proteins are reported with a minimum
of 1 peptide identification above 95% confidence. The lists of proteins
identified by DDA for each condition are available in Table S5.
Protein Quantification
Peak extractions from the DIA
experiments were performed in PeakView software 2.0 using the SWATH
acquisition microapp (SCIEX) with DDA-generated PQP libraries. The
PQPs were generated for nonredundant and unmodified peptides identified
from a combined search of K562 and Jurkat samples (total of 26 DDA
MS files) in ProteinPilot software. The following criteria were used
for MS/MS peak extraction and protein quantification: 6 transitions/peptide;
10 min retention time tolerance; 75 ppm mass tolerance; peptide identification
scoring less than 1% FDR; up to 6 peptides/protein. Retention times
were aligned using the spiked-in PepCalMix. The PeakView software
only allows for setting 1 transition per peptide and 1 peptide per
protein. Extracted peptides were filtered for 1% FDR. Quantified proteins
were filtered to contain a quantitative value across all three replicates.
The lists of proteins quantified by SWATH acquisition are available
in Table S6.
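The completeness filter described above (keep only proteins with a quantitative value in all three replicates) can be sketched as follows. This is an assumed, minimal representation and not the authors' script: the function name `quantified_in_all`, the use of `None` for a missing value, and the protein labels and areas are placeholders invented for the example.

```python
def quantified_in_all(protein_areas, n_replicates=3):
    """Keep proteins that have a value (not None) in every replicate."""
    return {
        protein: areas
        for protein, areas in protein_areas.items()
        if len(areas) == n_replicates and all(a is not None for a in areas)
    }


# Made-up summed fragment-ion areas per protein across three technical replicates
example = {
    "protein_A": [1.2e6, 1.1e6, 1.3e6],
    "protein_B": [4.0e4, None, 4.4e4],   # missing in replicate 2, so dropped
    "protein_C": [8.8e5, 9.1e5, 9.0e5],
}
complete = quantified_in_all(example)
print(sorted(complete))
```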
Statistical Data Analysis
Data analysis and parsing
were performed using custom R scripts run in RStudio.
All measurements were performed in technical triplicates to calculate
the means and the relative-standard-deviations (RSDs).
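For reference, the RSD of a set of replicate measurements is the standard deviation divided by the mean, expressed as a percentage. The sketch below is written in Python for illustration (the authors' own scripts are in R) and assumes the sample standard deviation (n − 1 denominator), which the text does not specify; the function name `rsd_percent` is hypothetical.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%), using the sample standard deviation (assumption)."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined for a mean of zero")
    return 100.0 * stdev(values) / m


# Made-up triplicate peak areas for one protein
triplicate = [100.0, 110.0, 90.0]
print(mean(triplicate), rsd_percent(triplicate))
```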
Data and Script Sharing
Scripts used for the analyses
of data are made available on GitHub (https://github.com/Lombardbanekc/CART-SWATH-MS-Data-Processing.git). RAW data and search results files for DDA and DIA experiments
have been deposited to the ProteomeXchange server (PXD029780).
Results and Discussion
Designing a Sensitive, Reproducible, and Controlled Method to Quantify Proteins from Cells
Multiple components of the data
acquisition workflow were evaluated to enable a sensitive and robust
quantification of the proteome of cell-based therapies and to provide
suggestions for best practices on implementing instrument controls
to expand the application of DIA to the biopharma space. Figure 1 summarizes our approach.
First, we revised the chromatography by evaluating a total of five
reversed-phase (C18) commercial columns from three different vendors
(see Table 1), and
we then evaluated different tuning parameters for DDA and DIA (Table S3 and Table S4). Although our main application is to perform DIA for quantification,
DDA evaluation is still critical as it is used to build the list of
PQPs necessary to extract peptide signals and identify proteins from
DIA raw files.[18]
Figure 1
Overview of the analytical measurement
parameters that were evaluated
to build a sensitive method and draw our instrument performance qualifications
(PQ) metrics.
Table 1
Summary of the Properties of the Five Columns Tested and the Corresponding Number of Identifications by DDA from 1 μg Injections of K562 Commercial Digest (a)

column #   vendor   length (cm)   particle size (μm)   pore size (Å)   # peptides identified   # proteins identified
1          A        15            3                    120             17614 ± 186 (22717)     2599 ± 34 (2969)
2          B        15            3                    300             18071 ± 590 (22473)     2613 ± 78 (2961)
3          C        15            1.7                  130             23081 ± 468 (28231)     3228.3 ± 14 (3550)
4          C        25            1.7                  130             28794 ± 131 (35336)     3557 ± 21 (3968)
5          C        25            1.7                  300             29772 ± 184 (36151)     3774 ± 165 (4120)

(a) Error is represented by the standard
deviation of the number of identifications. Numbers in parentheses
represent the combined identifications from triplicate measurements.
The chromatography is a critical
component of nanoLC-MS-based proteomic
experiments because improved separation can notably enhance method
sensitivity.[31−33] For example, reducing the column diameter and support
particle size improved peak shape and increased the signal-to-noise
ratio for peptides by ≈2-fold.[34] Tuning the MS parameters to fit the application has also been shown
to be sometimes valuable in increasing the proteome coverage. Multiple
studies using TOF or trapping instruments such as Orbitraps have evaluated
the importance of some acquisition parameters in improving the number
of proteins identified/quantified.[35−38] For DDA experiments, the chromatographic
conditions and the acquisition parameters were evaluated based on
the numbers of peptides and proteins identified with 1% FDR. For DIA
experiments, we evaluated the numbers of peptides and proteins quantified,
the quantification range, and the reproducibility of the quantification
measured by the RSD. Once our method was established, we devised an
instrument control strategy and defined some metrics to survey to
strengthen the applicability of our developed method to biopharmaceutical
products like cell therapies (Figure 1). Finally, we applied our method to Jurkat cells,
an immortalized cancer T-cell line currently used to build precompetitive
CAR-engineered cells.[28]
Evaluating the Importance of the Chromatography in Improving the Identified Proteome Coverage
We evaluated the peptide
separation for both DDA and DIA in tandem to make a concerted decision
on the column to choose for our application. Each column tested was
selected to represent different particle sizes, pore sizes, lengths,
and support particle properties. The properties of each column are
recorded in Table 1, and the gradients used for each column are summarized in Table S1. In this study, we purposely kept the
vendors and models hidden to remain impartial, in keeping with the NIST
mission. The goals of the evaluation are (1) to demonstrate that different
performance metrics are unique to the instrument setup and (2) to
evaluate the effect of the column properties on the measurements.
During DDA, using 1 μg of a commercial K562 digest on the column,
we found that the columns with the smallest particle size led to much
higher numbers of peptides and proteins identified per run; more than
3000 proteins were identified per replicate for each of these 3 columns
(Table 1). The proteins
identified using the different columns were mostly complementary (Figure 2A). Columns #4 and
#5, which were the longest, yielded many unique protein identifications
compared to the other three (Figure 2A). The increase in proteins identified using the columns
with the smallest particle size (columns #3, #4, and #5) was proportionally
distributed across the three main cell compartments: cytosol, nucleus,
and membrane (Figure 2B). This finding suggests that none of the columns were biased toward
the cellular location of the proteins.
Figure 2
Column evaluation on
protein identification and quantification
using the 90 min gradients described in Table S1. (A) Venn diagrams showing the overlap of proteins identified
from combined replicate DDA runs between all five columns from 1 μg
of K562 digest. (B) Distribution of the cell compartments of the combined
identified proteins for each column condition. (C) Quantitative dynamic
range for each column by DIA for 500 ng of K562 digest. (D) Quantitative
reproducibility of the quantification between technical replicates
as measured by the RSD.
Next, we evaluated each
column for DIA, using 500 ng (Figure 2C,D) and 200 ng (Figure S1) of K562 protein digest on the column.
For consistency, we processed the DIA data from each column using
the same list of PQPs obtained as described in the Experimental Section. Retention times were aligned using the
spiked-in PepCalMix (20 heavy labeled peptides), and the same parameters
were used to extract the data (see Experimental Section). Figure 2C shows
the number of proteins quantified as well as their dynamic range.
All five columns presented a similar dynamic range of ≈5 orders
of magnitude. Surprisingly, column #3 underperformed: fewer
proteins were quantified than with the other four columns, despite
good results in DDA experiments. Column #5 performed the best
with close to 3000 proteins quantified across all three replicates
using 500 ng of protein digest, which is ≈13% more than the
next best one, column #4 (Figure 2C). Moreover, when analyzing only 200 ng of digest,
we still quantified ≈2650 proteins using column #5 (Figure S1A). We evaluated the repeatability of
our quantification by measuring the RSD across technical triplicate
measurements (Figure 2D and Figure S1B). For all the columns,
the median RSDs were below 15%, demonstrating the quality of DIA quantification
measurements. Interestingly, the distribution of RSDs for the shorter
columns (columns #1, #2, and #3) spread more than for the two longer
columns (columns #4 and #5). Column #5 presented an outstandingly
low median RSD of ≈4% (Figure 2D) and a tight distribution around the median value.
The RSD values were only slightly increased when measuring 200 ng
(Figure S1B).

On the basis of the different properties of the chosen columns,
we can attribute the improvement in the numbers of proteins identified
and quantified using columns #4 and #5 to two main factors: particle
size and column length. These results are not particularly surprising
and have been well documented previously for DDA-only experiments.[31,34] Here, we demonstrated that these important column attributes also
play a critical role in the quality of the DIA-based quantification,
as illustrated by the low median RSDs obtained when using columns
#4 and #5. These improvements can be attributed to better chromatographic
properties. Indeed, increased resolution of the chromatographic separation
and improved chromatographic peak shapes lead to better peak extraction
of DIA data. Better resolution of the chromatographic separation decreases
the occurrence of coeluting peptides, which decreases tandem mass
spectra complexity; better peak shape improves peak statistics. Overall,
DDA and DIA-based quantitative results suggested that column #5 was
the most favorable for our future experiments, and it was used for the
remainder of this study.

The gradient duration on column #5 was further refined using 500
ng of K562 digest (Figure 3A,B). We aimed to balance the gain in protein identifications
against throughput and evaluated 4 different gradient durations for DDA
experiments: 45, 60, 90, and 120 min. Peptide elution windows of 90
and 120 min are standard for untargeted large-scale bottom-up proteomic
experiments by DDA. We also considered shorter gradients, not typically
used in large-scale bottom-up proteomics, due to the fast-scanning
rate of our quadrupole-time-of-flight instrument. As DDA experiments
are here used to build PQP libraries, we purposely report the combined
number of protein identifications (Figure 3A and Table S5). Interestingly, we found that the gain in the number of protein
identifications got smaller as we increased the gradient duration
(Figure 3A). When increasing
from 90 to 120 min, the number of identified proteins improved by
≈6%, while the throughput decreased by ≈30% (Figure 3A). The diminishing
improvement as the gradient time increased can be attributed to the
increase in chromatographic peak width and acquisition redundancy
(Figure S2).
Figure 3
Revision of the acquisition
conditions for DDA and DIA modes using
500 ng on-column of a K562 digest. DDA results are shown for combined
data from three technical replicates. Protein identifications from
DIA were filtered to include proteins that had quantification values
across all three replicates. (A,B) The gradient length was evaluated
for DDA (A) and DIA (B) modes. Gradient descriptions are available
in Table S2. (C) Several acquisition parameters
were studied for DDA acquisition. Details are reported in Table S3. (D) The window scheme was evaluated
for DIA. Conditions are described in Table S4. Key: # Protein IDs, Numbers of protein identified; # Protein Quant.,
Numbers of protein quantified; Pept. sep. window, Peptide separation
window; ctrl, control.
DIA runs are typically
acquired using gradients similar to or shorter
than those used for DDA. Therefore, as we established a 90 min gradient for DDA
runs, we eliminated the 120 min gradient from the evaluation for DIA.
In DIA, the increases in protein identifications were smaller than
with DDA (Figure 3B).
Indeed, we noted a maximum increase of only 5% when lengthening
the gradient duration using the traditional DIA method. It is worth
noting that, unlike with DDA, the % increase was more prominent as
we lengthened the gradient duration. However, this observation could
reflect a bias in data analysis since the PQP library was built using
a 90 min gradient.

Changing the gradient also impacted the reproducibility of the
quantification. With the 90 min gradient, the percentage of proteins quantified
with a percent relative standard deviation (% RSD) below 10% is higher
than with the 60 or 45 min gradients (≈55% vs ≈46% and 43%, Figure S2A); the percentage of proteins quantified
with % RSD below 20% is the highest with the 60 min gradient (≈79%
vs ≈73%, Figure S2A). Overall, we
noted that the gradient length played a minor role in improving the
number of proteins identified and quantified in DIA, especially compared
to the notable improvements resulting from the column change. Most
recently, using the newest generation LC system, quantification of
≈2000 proteins was achieved in a 1 min span using this type
of instrument and an advanced DIA strategy.[39]
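The reproducibility figures quoted in this section (median RSD, and the percentage of proteins with % RSD below 10% or 20%) are simple summaries of the per-protein RSD distribution. A minimal sketch of how such summaries could be computed is given below; the function name `percent_below` and the RSD values are made up for illustration and do not come from the study.

```python
from statistics import median


def percent_below(rsds, cutoff):
    """Percentage of proteins whose % RSD is strictly below `cutoff`."""
    if not rsds:
        raise ValueError("no RSD values supplied")
    return 100.0 * sum(r < cutoff for r in rsds) / len(rsds)


# Made-up per-protein % RSD values from triplicate measurements
rsds = [4.0, 8.0, 12.0, 18.0, 25.0]
print(median(rsds), percent_below(rsds, 10.0), percent_below(rsds, 20.0))
```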
Evaluation of the MS Acquisition Parameters on the Protein Identification and Quantification
Using column #5 and a 90 min gradient,
we assessed the role of parameters, deemed critical by the instrument
vendor, in improving the number of proteins identified by DDA and
the quality of protein quantification by DIA (Figure 3C,D). For DDA, we evaluated the number of
precursors selected per instrument cycle, the TOF-MS accumulation
time, and the collision energy spread (CES). The number of selected
precursors and the TOF-MS accumulation time were adjusted in unison
to maintain the cycle time constant. MS acquisition parameters are
recapitulated in Table S3. Each change
was evaluated against the suggested method from the SCIEX performance
evaluation guidebook (ctrl, Figure 3C). We tested adjusting the cycle time to 1.2 s to
match the median chromatographic peak width (FWHM) of 12 s, which resulted
in fewer proteins identified (data not shown). Lowering the TOF-MS
accumulation time and increasing the number of selected precursors
(cond1) improved the identification numbers the most (≈2%, Figure 3C). We also evaluated
the effect of adding CES to fragment peptides. When using a CES, peptides
are fragmented in 10 increments of collision energy (CE), spanning
CE–CES to CE+CES, over the TOF-MS/MS accumulation time (here
50 ms). We found that setting the CES to 3 (cond2, Figure 3C) gave better results. Finally,
for our final method, we combined the values that led to the highest
improvements in identification and adjusted the m/z TOF-MS scan range to more closely match the span
of m/z values present in the digest
(Figure S4). We found that changing the m/z range did not affect the number of
protein and peptide identifications (data not shown). However, a narrower m/z range benefits DIA by reducing the
number of fragmentation windows, which shortens the cycle time for
the same window width and provides better chromatographic peak statistics
for reproducible quantification. Pino et al. recently demonstrated
the advantages of a narrower m/z scan range for DIA on trapping instruments.[38] The cumulative advantages of selecting more precursors and using an improved
CES led to an ≈5% increase in proteins identified. This improvement,
although small, led to an expansion of the list of PQPs for SWATH
acquisition data processing.

For DIA, we only evaluated the
windowing scheme. All DIA data were searched against the same list
of PQPs as defined in the method section. The different conditions
were evaluated against the suggested method provided by SCIEX, which
uses a 32 fixed windows scheme, but with the TOF-MS m/z scan range adjusted to our sample of choice:
375 to 1000 (ctrl, Figure D). Condition A (condA) and B (condB) were based on a 32 and
30 variable windows scheme, respectively. We found that 30 variable
windows led to a higher number of proteins quantified, with an ≈12%
increase compared to the control. Interestingly, the number of proteins
with % RSD lower than 10% was the same across all three conditions
(Figure S3B), but the percentage of proteins
with % RSD below 20% was lower with the variable window schemes than with
the fixed window scheme (65% vs 69%, Figure S3B). This is likely due to
the variable window schemes identifying more low-abundance
proteins than the fixed window scheme (Figure S5).
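The relationship between the m/z scan range, the number of DIA windows, the cycle time, and the number of data points across a chromatographic peak can be sketched as follows. This is an illustrative calculation only: the function names are hypothetical, the accumulation times (25 ms per MS/MS window, 250 ms for the TOF-MS scan) and the wider 350 to 1250 m/z range are assumed values rather than the parameters in Table S4, and window overlaps and instrument overhead are ignored.

```python
import math


def fixed_window_count(mz_start, mz_end, window_width):
    """Number of fixed-width windows needed to cover the m/z range."""
    return math.ceil((mz_end - mz_start) / window_width)


def cycle_time_s(n_windows, msms_accum_ms, tof_ms_accum_ms):
    """Cycle time in seconds: one TOF-MS scan plus one MS/MS scan per window."""
    return (tof_ms_accum_ms + n_windows * msms_accum_ms) / 1000.0


def points_per_peak(peak_width_s, cycle_s):
    """Average number of cycles (data points) acquired across a peak."""
    return peak_width_s / cycle_s


# 32 fixed windows over m/z 375 to 1000, as in the control condition.
width = (1000 - 375) / 32
narrow = fixed_window_count(375, 1000, width)

# Hypothetical wider range at the same window width needs more windows.
wide = fixed_window_count(350, 1250, width)

for label, n in (("375-1000", narrow), ("350-1250", wide)):
    cycle = cycle_time_s(n, msms_accum_ms=25, tof_ms_accum_ms=250)
    pts = points_per_peak(12.0, cycle)  # 12 s median fwhm from the text
    print(f"{label}: {n} windows, cycle {cycle:.3f} s, {pts:.1f} points/peak")
```

Under these assumptions, the narrower range needs fewer windows at the same width, so each cycle is shorter and more points are collected across a 12 s peak.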
Evaluation of DIA Method Sensitivity
From the above method revisions, we determined the following DIA parameters: separation,
column #5 and 90 min gradient; DIA, condition B. Using this method,
we evaluated the quantitative performance of our DIA approach.
We established the sensitivity of our DIA method by creating samples
containing different concentrations of the PepCalMix (0.01–100
nmol/L) in a constant K562 cell digest background. This sample mix
better represents complex peptide samples, which can suffer major
interferences due to coeluting and cofragmenting peptides during SWATH
acquisition analyses. After extracting the data using the PeakView
software, we built calibration curves to establish our analytical
method’s lower limit of detection (LLOD) and lower limit of
quantification (LLOQ). A representative calibration curve for peptide
sequence AVGANPEQLTR (m/z 583.3136, z = 2+) is presented in Figure S6. Results for all peptides are presented in Table 2. Out of 19 peptides detectable in the method
m/z range, we confidently detected 16. The three peptides
labeled as not detected did not pass the FDR cutoff of 1%, likely due
to the presence of interfering peptides from the K562 digest (Table 2). Of the
16 detected peptides, 12 had an LLOQ of 0.1 nmol/L and four had an
LLOQ of 1 nmol/L. Considering that 1 μL of the sample mixture
was injected into the column for analysis, the absolute LLOQs are
0.1 fmol and 1 fmol, respectively. For most peptides, we never reached
the LLOD, and so we report the LLOD as below our lowest measured concentration
of 0.01 nmol/L (10 amol). For three peptides, the LLOD was estimated
to be between the two lowest measured concentrations, 0.01 and 0.1 nmol/L,
or 10 to 100 amol (Table 2).
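The conversion from spiked concentration to absolute on-column amount used above is simple unit arithmetic: 1 nmol/L equals 1 fmol/μL, so multiplying a concentration in nmol/L by the injection volume in μL gives the amount in fmol. The helper names below are hypothetical and the snippet is only a worked check of the values quoted in the text.

```python
def on_column_fmol(conc_nmol_per_l, injection_ul):
    """Amount on column in fmol (1 nmol/L is the same as 1 fmol/uL)."""
    return conc_nmol_per_l * injection_ul


def fmol_to_amol(amount_fmol):
    """Convert femtomoles to attomoles."""
    return amount_fmol * 1000.0


injection_ul = 1.0  # injection volume stated in the text

for conc in (1.0, 0.1, 0.01):  # nmol/L levels discussed in the text
    fmol = on_column_fmol(conc, injection_ul)
    print(f"{conc} nmol/L -> {fmol} fmol ({fmol_to_amol(fmol):.0f} amol)")
```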
Table 2
Quantification Sensitivity Measured Using PepCalMix Spiked into a K562 Digest with a 90 min Gradient (Table S1) on Column #5 and the DIA Condition B (Table S4)a

sequence          m/z        z   LLOQ (nmol/L)   LLOD (nmol/L)   R²
AETSELHTSLK       408.5501   3   1               <0.01           0.89
GAYVEVTAK         473.2602   2   0.1             0.01–0.1        0.95
IGNEQGVSR         485.2530   2   ND              ND              NA
LVGTPAEER         491.2656   2   1               <0.01           0.94
LDSTSIPVAK        519.7997   2   0.1             0.01–0.1        0.93
AGLIVAEGVTK       533.3233   2   0.1             <0.01           0.95
LGLDFDSFR         540.2734   2   1               <0.01           0.84
GFTAYYIPR         549.2863   2   0.1             <0.01           0.95
SGGLLWQLVR        569.8340   2   ND              ND              NA
AVGANPEQLTR       583.3136   2   0.1             <0.01           0.99
SAEGLDASASLR      593.8005   2   0.1             <0.01           0.95
VFTPLEVDVAK       613.3496   2   0.1             <0.01           0.90
VGNEIQYVALR       636.3527   2   0.1             <0.01           0.95
YIELAPGVDNSK      657.3450   2   0.1             0.01–0.1        0.91
DGTFAVDGPGVIAK    677.8583   2   ND              ND              NA
YDSINNTEVSGIR     739.3615   2   1               <0.01           0.96
SPYVITGPGVVEYK    758.9105   2   0.1             <0.01           0.96
ALENDIGVPSDATVK   768.9034   2   0.1             <0.01           0.96
AVYFYAPQIPLYANK   883.4738   2   0.1             <0.01           0.99

a A representative example of the calibration curves is presented in Figure S6. Key: LLOQ, lower limit of quantification; LLOD, lower limit of detection; ND, not detected.
These LLOQ and LLOD values demonstrate very good sensitivity
over a broad range of peptide properties. It raises confidence in
the ability of the method to quantify proteins present in cells at
low levels in an untargeted manner.
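A calibration curve of the kind summarized by the R² column in Table 2 can be fitted with an ordinary least-squares regression. The sketch below is an assumption-laden illustration, not the PeakView-based procedure used in the study: the five concentration levels, the peak areas (generated from an ideal linear response), the choice to fit on log10-transformed values, and the `linear_fit` helper are all hypothetical. The criteria used to assign LLOQ and LLOD are not stated in this section, so they are not implemented here.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical spike levels (nmol/L) within the 0.01-100 nmol/L range.
conc = [0.01, 0.1, 1.0, 10.0, 100.0]
# Synthetic peak areas from an ideal linear response (not study data).
area = [2000.0 * c for c in conc]

# Fitting on a log-log scale is an assumption made for this sketch, so that
# the lowest levels are not dominated by the highest one.
log_conc = [math.log10(c) for c in conc]
log_area = [math.log10(a) for a in area]

slope, intercept, r2 = linear_fit(log_conc, log_area)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.4f}")
```

With real data, interference from the background digest at the lowest levels would show up as a departure from the fitted line and a lower R².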
Establishing a Performance Qualification (PQ) Strategy for DIA
We designed a PQ strategy
to ensure reproducible analysis of scarce
material such as engineered T-cells. The design of our control strategy
follows the MSQC principles described in the introduction.[20] Figure 4 highlights the overall strategic design and some figures
of merit for controlled parameters. Two primary samples are used for
PQ. In our case, one is a simple mixture of peptides (QC1)—PepCalMix
(20 heavy labeled peptides, ABSCIEX); the other is a complex commercial
K562 cell digest (QC2, Figure 4A). The PepCalMix is used to control the chromatographic separation,
ionization efficiency, MS1 signal, and MS2 signal, and to perform periodic
mass calibration using a calibration LC run. The calibration LC run
is specific to the TripleTOF 6600+ system used for this study; on other instruments, mass
calibration should be performed as needed following the mass spectrometer
manufacturer's recommendations. After the calibration run is completed,
a calibration report is produced that contains data on peptide ion intensities,
mass resolution for each peptide, retention times, and mass accuracy
at the MS1 and MS2 levels. We present some representative
data for one select peptide from the PepCalMix (Figure 4B,C). The K562 digest was run in both DDA
mode (K562(D)) and SWATH acquisition mode (K562(S)). The K562 digest
helps assess the ion source, the MS1 and MS2 signals, dynamic ion
sampling, and peptide identification (Figure 4D).
Figure 4
Development of an instrument control using our
established acquisition
method (column #5, 90 min gradient, cond4 for DDA, and condB for DIA
runs). (A) Experimental strategy for implementation of an instrument
performance qualification for large-scale proteomic measurements of
cell therapies. (B) Extracted ion chromatogram for the AVGANPEQLTR
peptide (m/z 583.3136, z = 2+) and representative mass spectrum (inset). (C) Some metrics
of control for nanoLC measurements using the PepCalMix, here showcasing
the AVGANPEQLTR peptide. (D) Control metrics for the measurements
of a complex commercial digest in DDA mode using protein and peptide
identifications. (E) DIA control metrics. The selection of the proteins
for DIA control is presented in Figure S7. The dots represent the mean quantitative values, and the whiskers
span 3 × standard deviation. Key: blue lines indicate high and
low limits for system compliance; red line indicates the median measurements;
RT, retention time; Res., resolution; Int., intensity.
Using the PepCalMix AutoCal runs in our instrument, we can
track
the performances of the nanoLC-MS system over time. Key parameters
to monitor are presented in Figure 4B,C for the select peptide AVGANPEQLTR (m/z 583.3136, z = 2+). By monitoring
the extracted ion chromatograms for each of the peptides contained
in the PepCalMix, we can evaluate the efficiency of the separation
(Figure 4B). For example,
we show that the peak width at half-maximum for the AVGANPEQLTR peptide
was ≈0.05 min, corresponding to ≈875 000 theoretical plates.
For most of the peptides, we recorded plate numbers between ≈175 000
and 675 000.

From the generated calibration reports,
we extracted the information
on the retention time (RT), the mass resolution (Res), and the intensity
(Int) for each peptide across different days. On the basis of the
collected data, we established thresholds that the system must meet
to be considered compliant and for the analysis to move forward, beyond
the ability of the mass spectrometer to perform the automated mass
calibration in both MS and MS/MS modes. For RTs, the high and low
thresholds were defined as the mean ± 3 × standard deviation.
For resolution and intensity, we defined only lower thresholds,
based on manufacturer recommendations and empirical data. We can evaluate
more parameters using the complex digest (K562), including the data
processing pipeline (Figure 4D). In DDA mode, we established a minimum number of proteins
and peptides that we expect to identify, set at 3 ×
standard deviation below the mean. Here, we expect to identify
more than 3350 proteins and 26 900 peptides for 1 μg
of digest on the column and a 90 min gradient (Figure 4D, top). We also completed the evaluation
of the K562 digest in DIA mode (Figure 4D, bottom). Using the 18 K562 DIA runs performed to
establish the LLODs and LLOQs, we created a list of 20 proteins that span
the entirety of the quantification range, have RSDs below 10%, and are
present in all 18 runs. The selection of the proteins based on their
quantitative values is shown in Figure S7. These PQ samples control for our platform’s quantitative
repeatability and reproducibility without the variation existing in
lab-prepared biological samples. In our measurement control design,
we recommend performing two DIA runs at the beginning of the sample
sequence and one at the end to assess any discrepancies potentially
occurring between the first and last sample.

Overall, the instrument
and data collection control presented here
enables robust and reproducible MS data collection for the measurement
of scarce samples. The different thresholds presented in this manuscript
are specific to our LC-MS instrument setup and represent a guideline
more than hard-set values. Different instruments likely produce different
results and require careful evaluation before starting any measurements
of biopharma or clinically relevant samples. For example, in the sections
above, we demonstrated that different columns led to different metrics
in peak shape, retention times, numbers of protein and peptide identifications,
and quantitative metrics. Our control strategy design applies to DIA
experiments on any instrument, but the thresholds should be evaluated
on a case-by-case basis.
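The PQ checks described in this section can be expressed as a short compliance test. This is a minimal sketch under stated assumptions: the function names, the retention-time history, the run values, and the resolution and intensity minima are hypothetical placeholders, and only the 3350 protein and 26 900 peptide minima come from the text. The plate number uses the standard half-height formula N = 5.54 (tR / W½)², with an assumed retention time of 20 min because the retention time of the AVGANPEQLTR peptide is not given here.

```python
from statistics import mean, stdev


def plate_number(retention_time_min, fwhm_min):
    """Theoretical plates from peak width at half height: 5.54 * (tR / W)^2."""
    return 5.54 * (retention_time_min / fwhm_min) ** 2


def rt_limits(history, k=3.0):
    """Low and high retention-time limits as mean -/+ k * sample SD."""
    m, s = mean(history), stdev(history)
    return m - k * s, m + k * s


def peptide_compliant(run, rt_window, min_resolution, min_intensity):
    """True if RT is inside the window and Res/Int meet their minima."""
    low, high = rt_window
    return (low <= run["rt"] <= high
            and run["res"] >= min_resolution
            and run["int"] >= min_intensity)


def dda_compliant(n_proteins, n_peptides, min_proteins=3350, min_peptides=26900):
    """DDA check against the minimum identification counts quoted in the text."""
    return n_proteins > min_proteins and n_peptides > min_peptides


# Hypothetical values for illustration only (not study data).
rt_history = [20.0, 20.1, 19.9, 20.0, 20.0]  # minutes, previous PQ runs
window = rt_limits(rt_history)
good_run = {"rt": 20.05, "res": 32000, "int": 1.2e5}
late_run = {"rt": 20.50, "res": 32000, "int": 1.2e5}

print(f"plates at tR=20 min, fwhm=0.05 min: {plate_number(20.0, 0.05):.0f}")
print("good run compliant:", peptide_compliant(good_run, window, 30000, 1.0e5))
print("late run compliant:", peptide_compliant(late_run, window, 30000, 1.0e5))
print("DDA compliant:", dda_compliant(3500, 27500))
```

As noted above, the thresholds themselves are instrument-specific, so the limits passed to these functions would need to be re-established for each LC-MS setup.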
Application of Our DIA Approach to a Jurkat Cell Sample
To evaluate our method on a sample relevant to
our application, we
performed DIA on Jurkat cell samples (Figure 5). Jurkat cells are immortalized cancer
T-cells, which are similar to patients' primary T-cells and
can serve as a model system for T-cell-based therapies. Indeed, our
main target application is to study the proteomic changes linked to
the manufacturing of CAR-T cells, which are directly derived from
patients' T-cells. The proteins from the cells were extracted
and digested using the S-TRAP protocol.[40] From ≈500 ng of protein on the column, we were able to quantify
a total of ≈2500 proteins spanning 4 orders of magnitude (Figure 5A). Among the quantified
proteins, we found critical markers of T-cells. For example, we were
able to quantify the four extracellular subunits of the T-cell receptor
complex (CD3E, CD3G, CD247, and CD3D),[41−43] the receptor CD5,[44] the receptor CD28,[45,46] and the transmembrane cell surface protein leukosialin (SPN or CD43).[47] Together, these markers are often used to isolate
T-cells from plasma by fluorescence activated cell sorting. These
CD receptors are required for T-cell activation and are present in
all stages of T-cell differentiation.
Figure 5
Application of our acquisition method
to a T-cell model case—Jurkat
cells. (A) Protein quantification by DIA of Jurkat cells prepared
by S-TRAP lysis and digestion method. Proteins highlighted by sky-blue
dots relate to T-cell biology. Some proteins, typically used as markers
for T-cell identification, are identified in the graph. (B) Gene ontology
annotation of biological processes occurring in unmodified Jurkat
cells. The subclassification for the cellular and metabolic processes
are presented in Figure S8.
Using gene ontology (GO) annotation, here using PantherDB,[48] we could map the quantified proteins to the
leading cellular functions (Figure 5B). These data can now serve as a baseline to study
the effect of engineering T-cells/Jurkat cells to express proteins
of choice on their surfaces, such as a CAR. Moreover, after establishing
a set sample preparation protocol, these types of data are helpful
as metrics for a complete system suitability standard, which controls
for the sample preparation and the instrument.
Conclusions
We have developed a sensitive DIA method to enable the measurements
of emerging cell therapy products. We also established a PQ control
strategy to ensure the method’s reproducibility. Our control
strategy was based on the MSQC principles, which evaluate key components
of the LC-MS instrument and data acquisition. With this stringent control
strategy, the method can potentially be applied in a regulated environment,
as encountered in the biopharma industry. Moreover, we applied
the approach to Jurkat cells and were able to identify important markers
of T-cells that can be used later as metrics to design complete system
suitability standards.