Imaging of proteoforms in human tissues is hindered by low molecular specificity and limited proteome coverage. Here, we introduce proteoform imaging mass spectrometry (PiMS), which increases the size limit for proteoform detection and identification by fourfold compared to reported methods and reveals tissue localization of proteoforms at <80-μm spatial resolution. PiMS advances proteoform imaging by combining ambient nanospray desorption electrospray ionization with ion detection using individual ion mass spectrometry. We demonstrate highly multiplexed proteoform imaging of human kidney, annotating 169 of 400 proteoforms of <70 kDa using top-down MS and a database lookup of ~1000 kidney candidate proteoforms, including dozens of key enzymes in primary metabolism. PiMS images reveal distinct spatial localizations of proteoforms to both anatomical structures and cellular neighborhoods in the vasculature, medulla, and cortex regions of the human kidney. The benefits of PiMS are poised to increase proteome coverage for label-free protein imaging of tissues.
Proteoforms are the protein-level products of gene expression and posttranslational modifications (PTMs) functioning as key effectors in human health and disease (, ). In addition to understanding of their molecular compositions, interactions, and biological function, comprehensive characterization of the human proteoform landscape also requires mapping of their spatial distributions in human tissues and organs (). Protein-level imaging using antibody-based optical microscopy has revealed distinct cell types, functional tissue units, and subcellular structures (). These techniques use enzymes, metals, and fluorophores as reporters to obtain high-resolution maps of protein targets in tissues (). In recent years, highly multiplexed antibody-based imaging assays such as CODEX (, ), IBEX (), and Cell-DIVE () have drastically increased the number of protein targets that can be probed in a single experiment. Alternatively, mass spectrometry (MS)–based imaging assays, including imaging mass cytometry and multiplexed ion beam imaging, use antibodies labeled with rare earth metals to detect protein localization for up to ~40 protein targets at once (, ). Despite the substantial advances in spatial resolution and sensitivity, antibody-based approaches require prior knowledge of the protein targets and do not provide proteoform-level information (, ).
MS-based top-down proteomics (, ) has been widely used for proteoform characterizations (). Modern MS instrumentation has reached the sensitivity for spatially resolved top-down proteomics suitable for imaging experiments (, ). Matrix-assisted laser desorption/ionization (MALDI) is widely used for protein imaging (, ) because of the broad mass range of proteome sampling (–). However, MALDI predominantly generates singly charged ions, which gives limited fragment information for direct top-down identification of intact proteins ().
This challenge may be addressed using matrix-assisted laser desorption electrospray ionization (MALDESI), which combines MALDI with extractive ESI to generate multiply charged ions of peptides and proteins extracted from tissues (). Alternatively, multiply charged protein ions may be generated using liquid extraction–based ambient ionization methods () including DESI (), liquid extraction surface analysis (), and nanospray desorption electrospray ionization (nano-DESI) (). These techniques are particularly advantageous in top-down analysis of intact proteoforms in the imaging mode. Among these techniques, nano-DESI, which uses a subnanoliter dynamic liquid bridge as a sampling probe, enables imaging of biomolecules in tissues with a spatial resolution down to 10 μm (, ).
One major challenge in proteoform imaging using liquid extraction–based techniques is the detection of low-abundance, high-mass proteoforms in the congested MS spectra produced by ionizing complex mixtures of biomolecules extracted from the sample. Until now, imaging and identification of intact proteoforms directly from tissue have been limited to <20-kDa species (, , –), with one report leading to the identification of 29-kDa Zn2+-bound carbonic anhydrase 2 (). Here, we address this challenge using individual ion MS (I2MS). I2MS is a new Orbitrap-based charge detection technique (–) using individual ions. Compared to traditional ensemble MS techniques, the I2MS workflow is compatible with 500× dilute samples and yields a 10- to 20-fold enhancement in resolving power (). In particular, we combine nano-DESI () with I2MS () to create proteoform imaging MS (PiMS) for tissue imaging and direct identification of proteoforms up to ~70 kDa. We show ~400 isotopically resolved proteoform assignments from the human kidney and confidently identify 20 proteoforms in the 20- to 70-kDa range using tandem MS (MS/MS), illuminating differences in kidney architecture from the medulla, cortex, and vasculature.
Incorporating I2MS, PiMS yielded 169 proteoform assignments/identifications based on MS/MS and intact mass matching [±5 parts per million (ppm)] at 80-μm spatial resolution and demonstrates the potential to visualize the proteinaceous structures comprising human tissues.
RESULTS
Overview of PiMS workflow
PiMS illustrated in Fig. 1 combines nano-DESI imaging () with data acquisition and processing for individual ions (see Supplementary Text and fig. S1) (). Specifically, we perform nano-DESI line scans on tissue, during which proteoforms are sampled as multiply charged ions distributed across multiple charge states (Fig. 1, top left). Instead of unresolved signals from overlapping charge states of protein mixtures typically observed for ensembles of ions, PiMS provides information of individual protein ions in the mass domain even when their charge states overlap in the mass-to-charge domain. Enabled by direct charge assignment to individual ion signals, PiMS generates mass-domain information of the individual proteoforms with resolution of their 13C isotopic peaks in each pixel of the imaging data (fig. S1). This allows for confident assignment of proteoform masses with better than 2-ppm accuracy at 1σ in each pixel of the imaging data (Fig. 1, middle left). Beyond proteoform-specific images (Fig. 1, bottom left), molecular identification is achieved using direct top-down MS/MS off the tissue (Fig. 1, top right) supplemented by database searching from known intact proteoform masses (Fig. 1, bottom right).
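To make the mass-domain conversion concrete, the following minimal sketch shows how a directly assigned charge turns the m/z of an individual ion into a neutral mass, so that ions of the same proteoform in different charge states fall into the same mass bin. It assumes protonated ions in positive mode; the function names, the bin width, and the example values are illustrative assumptions and do not represent the processing software used in this study.

```python
PROTON_MASS = 1.007276  # Da, mass of the proton assumed to carry each charge


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass of an [M + zH]z+ ion from its m/z and assigned charge."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


def mass_histogram(ions, bin_width=0.2):
    """Bin individual ions, given as (m/z, charge) pairs, into a mass-domain histogram.

    The bin width in Da is an arbitrary illustrative choice.
    """
    counts = {}
    for mz, charge in ions:
        index = round(neutral_mass(mz, charge) / bin_width)
        counts[index] = counts.get(index, 0) + 1
    return {index * bin_width: n for index, n in sorted(counts.items())}


# Two ions of one hypothetical 25,542-Da species observed at charge 20 and 25:
# they are far apart in m/z but land in the same mass bin.
example_ions = [(1278.107276, 20), (1022.687276, 25)]
histogram = mass_histogram(example_ions)
```

In this toy input the two ions differ by more than 250 m/z units, yet both convert to approximately 25,542 Da, which is the property that lets overlapping charge-state envelopes be separated in the mass domain.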
Fig. 1.
Illustration of the PiMS workflow for proteoform imaging and identification.
(A) Scanning approach (top), detection of proteoforms in the mass domain (middle), and image reconstruction (bottom). (B) Two approaches to identify proteoforms using either direct fragmentation of proteoform ions and spectral readout by individual ion MS/MS (top) or database lookup of accurate mass values (IMT, bottom). Scale bars, 1 mm.
Human kidney proteoforms detected by PiMS
We used PiMS to examine the proteoforms and their localizations in a 10-μm-thick human kidney tissue section. Encouragingly, we immediately expanded the mass detection range for proteoform imaging to >70 kDa. Figure 2A shows the full PiMS spectrum from 5 to 72 kDa from a sum of 16,500 MS scans (~8 million single ions). This spectrum contains ~400 proteoform masses above 0.1% relative abundance that are isotopically resolved. A complex group of proteoforms in the 68- to 80-kDa range was observed but not individually resolved because of limited resolution of a particularly dense set of proteoforms in this spectral range. Therefore, these were not included in either proteoform counts or annotation efforts in this study (fig. S2). Spectral attributes include a dynamic range of ~200 (using a signal-to-noise ratio of 3 as the limit of detection) and a mass resolution (m/Δm) of ~100,000. PiMS images can be constructed for any of the 242 proteoform masses with relative abundance above 1% in the full PiMS spectrum of Fig. 2A.
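Constructing an image for a selected proteoform mass amounts to counting, pixel by pixel, the individual ions whose assigned mass falls within a window around that mass. The sketch below illustrates the idea under simplifying assumptions: ions are given as hypothetical (line, scan, mass) records, each scan along a line is treated as one pixel, and a single narrow ppm window stands in for the full isotopic envelope. The record layout and the 10-ppm default are assumptions made for illustration only.

```python
def ion_image(ions, target_mass, n_lines, n_scans, tol_ppm=10.0):
    """Count ions within tol_ppm of target_mass for every (line, scan) pixel.

    ions: iterable of (line, scan, mass_da) tuples (hypothetical record layout).
    Returns a list of rows, one row per line scan.
    """
    half_window = target_mass * tol_ppm * 1e-6  # window half-width in Da
    image = [[0] * n_scans for _ in range(n_lines)]
    for line, scan, mass in ions:
        if abs(mass - target_mass) <= half_window:
            image[line][scan] += 1
    return image


# Toy data: five ions over a 2-line by 3-scan grid.
toy_ions = [
    (0, 0, 25542.01),
    (0, 0, 25542.00),
    (1, 2, 25542.10),
    (1, 1, 25600.00),  # outside the window, ignored
    (1, 2, 25541.90),
]
image = ion_image(toy_ions, target_mass=25542.0, n_lines=2, n_scans=3)
```

A practical implementation would integrate over the isotopic distribution of the proteoform rather than a single window, but the per-pixel counting logic stays the same.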
Fig. 2.
Sum of mass-domain spectrum obtained from PiMS of human kidney.
(A) Full-scale PiMS spectrum in 5 to 72 kDa summed from 16,500 MS scans; regions occupied by abundant blood proteins (hemoglobin subunits and albumin) are labeled in the spectrum; red asterisks denote the key glycolytic enzymes found in the spectrum. (B) PiMS full spectrum in the 18- to 56-kDa range zoomed in from (A); major proteins identified using a variety of approaches are labeled in the spectrum; aside from glycolytic enzymes (red asterisks), enzymes involved in a few other major metabolic pathways (Krebs cycle, gluconeogenesis, and oxidative phosphorylation) found in PiMS are labeled in the spectrum. (C) Theoretical (red triangles) and experimentally observed (black trace) isotopic distributions of the four glycolytic enzymes [labeled with red asterisks in (A) and (B)] together with their PiMS images depicted in a schematic diagram of the glycolysis metabolic pathway. The cortex and the medulla of the kidney section imaged are labeled in the autofluorescence image at the top. Scale bars, 1 mm. Selected mass range of 22.2 to 22.7 kDa (D) and 41.4 to 42.1 kDa (E) showing identified proteoforms (spectrum in black, theoretical isotopic distributions in color). (F) GO analysis of biological pathways found enriched in the 169 identified proteoforms from the PiMS experiment shown in the −log10(P) scale.
To identify these proteoforms, we manually annotated the full PiMS spectrum shown in Fig. 2A based on the results of an intact mass tag (IMT) search against a custom database. In this approach, we compared the shape and mass accuracy of the isotopic distributions of the proteoforms in the PiMS spectrum with the theoretical proteoforms in the database. The database contains 1000 kidney candidate proteoforms and was constructed from the top 500 most abundant proteins identified in a bottom-up proteomics study of human kidney tissues (table S1) ().
To maximize the number of proteoform identifications, we combined additional matches from top-down identification (details discussed in the next section) and manually inspected PTMs recorded in the Swiss-Prot database (). As a result, we manually annotated 169 proteoforms in the entire mass range using a ±5-ppm mass tolerance (list of proteoforms shown in table S2). Three technical replicates of PiMS on kidney tissue sections from the same patient show >85% overlap in the identified proteoforms (list of proteoforms shown in tables S3 and S4; Venn diagram of overlapping proteoforms shown in fig. S8). Figure 2 (D and E) shows two zoomed regions of the full PiMS spectrum with theoretical proteoform matches highlighted in color. Figure 2D shows various unmodified proteoforms in the mass range of 22.2 to 22.7 kDa captured by the custom database, demonstrating that the search included a variety of proteins in the kidney proteome. The mass range of 41.4 to 42.1 kDa shown in Fig. 2E contains proteoforms of γ-actin (highlighted in pink) and 3-ketoacyl-CoA thiolase (highlighted in blue) exhibiting diverse PTMs and their combinations. Aside from the monoacetylated and dimethylated proteoform of γ-actin identified by top-down MS/MS, other modified proteoforms in the displayed mass range were manually annotated. We note that the annotated proteoforms were manually confirmed by practitioners with the knowledge necessary to greatly suppress instances of false assignments. Clearly, further investigation is needed to estimate the false discovery rate for automated identification in PiMS, including the use of Bayesian priors.
We annotated the most abundant proteins in Fig. 2 (A and B) to demonstrate the portion of the human kidney proteome captured by PiMS. Not unexpectedly, blood proteins (hemoglobin subunits, apolipoprotein A-1, and albumin; Fig. 2A) were found at highest abundance because of the highly vascularized nature of the kidney.
Meanwhile, we captured many proteins prevalent in cellular pathways that are naturally abundant in human cells. In Fig. 2B, we labeled the most abundant proteins to give a brief overview of the molecular functions and biological pathways observed in PiMS. From low to high mass range shown in Fig. 2B, we found molecular chaperone [α-crystallin B chain (CRYAB)], signaling modulator [phosphatidylethanolamine-binding protein 1 (PEBP1)], proteins for cellular detoxification [mitochondrial manganese superoxide dismutase (SOD2) and glutathione S-transferase A1 and A2 (GSTA1 and GSTA2)] and homeostasis (carbonic anhydrase 1 and 2), and structural proteins (actins and vimentin). Moreover, proteins participating in central metabolic pathways are dominant. In particular, we found key enzymes in the Krebs cycle [malate dehydrogenase (MDH)] and gluconeogenesis [fructose-1,6-bisphosphatase 1 (FBP1)] and 27 subunits of protein complexes in the electron transport chain of oxidative phosphorylation (blue asterisks for 6 subunits in the 18- to 56-kDa mass range; others recorded in table S2). More intriguingly, four key enzymes in glycolysis [triosephosphate isomerase (TPI), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), phosphoglycerate kinase (PGK), and α-enolase (ENO)] were imaged and identified (Fig. 2C and red asterisks in Fig. 2, A and B). PiMS images of the four detected glycolytic enzymes are shown in Fig. 2C with largely even distributions across the entire tissue section spanning from the kidney medulla to the cortex, confirming the presence of glycolysis in many of the kidney cell types. A more complete investigation of the biological pathways found for the 169 identified proteoforms is demonstrated by a Gene Ontology (GO) analysis shown in Fig. 2F, which indicates that cellular metabolic processes (glycolysis, the Krebs cycle, and oxidative phosphorylation) are the predominant biological pathways observed in the PiMS experiment.
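The mass-tolerance part of the IMT lookup described above can be sketched as a simple ppm comparison between observed and theoretical masses. This is an illustration only: the annotation in this study also relied on the shape of the isotopic distributions and on manual inspection, neither of which is modeled here. The function names are hypothetical, and the database entries are the rounded masses quoted in this article for GSTA1 and GSTA2, used as placeholders rather than exact theoretical values.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def imt_search(observed_masses, database, tol_ppm=5.0):
    """Return (observed mass, candidate name, ppm error) for every match within tol_ppm."""
    hits = []
    for observed in observed_masses:
        for name, theoretical in database.items():
            error = ppm_error(observed, theoretical)
            if abs(error) <= tol_ppm:
                hits.append((observed, name, round(error, 2)))
    return hits


# Placeholder database built from rounded masses quoted in the text.
toy_database = {
    "GSTA1": 25542.0,
    "GSTA2 (canonical)": 25573.0,
    "GSTA2 (variant)": 25587.0,
}
hits = imt_search([25542.05, 25580.0], toy_database)
```

At 25 kDa, a ±5-ppm tolerance corresponds to roughly ±0.13 Da, so the second toy mass (7 Da away from either GSTA2 entry) returns no match, whereas the first is assigned to GSTA1 at about 2 ppm.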
Top-down characterizations of kidney proteoforms in PiMS
Human kidney proteoforms were characterized using MS/MS. Direct fragmentation of protein ions >20 kDa is challenging because of the low abundance of their fragment ions, particularly larger fragments (>15 kDa) that result from cleavage in the middle of the protein sequence (). To overcome this limitation, we used I2MS for the readout of top-down fragmentation spectra to capture those large fragment ions typically buried under the noise level in ensemble MS/MS experiments (). In particular, we selected >20-kDa proteoforms observed at >4% relative abundance for MS/MS, aiming for their most abundant locations in the tissue section. For each target proteoform, we selected a <0.8 m/z (mass/charge ratio) wide isolation window corresponding to the most abundant charge state of the proteoform obtained from PiMS data at MS1 level (Fig. 1, right, and see Materials and Methods). By matching the originally observed intact mass with the subsequent fragmentation data, we confidently identified 20 proteoforms of >20 kDa (Table 1 and fig. S7, A to U).
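As an illustration of how an isolation window follows from MS1-level PiMS data, the sketch below computes the m/z of a chosen charge state from the intact mass and places a window around it. It assumes protonated ions and treats the proteoform as a single mass value; the 0.8 m/z width is used as the upper bound stated above, and the charge state of 50 for vimentin is a hypothetical value chosen only for the example.

```python
PROTON_MASS = 1.007276  # Da, mass of the proton assumed to carry each charge


def mz_of(mass: float, charge: int) -> float:
    """m/z of an [M + zH]z+ ion for a given neutral mass and charge."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return mass / charge + PROTON_MASS


def isolation_window(mass: float, charge: int, width: float = 0.8):
    """Lower and upper m/z bounds of a window centered on the chosen charge state."""
    center = mz_of(mass, charge)
    return center - width / 2, center + width / 2


# Vimentin (53,530 Da in the text) at a hypothetical charge state of 50.
low, high = isolation_window(53530.0, 50)
```

With these inputs the window is centered near m/z 1071.6, and because adjacent charge states of a ~53.5-kDa ion are spaced by about 21 m/z at this charge, a sub-unit window selects a single charge state.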
Table 1.
Proteoforms identified by on-tissue MS/MS.
PFR, proteoform record number; N/A, not available; ATPase, adenosine triphosphatase.
In Fig. 3, we highlight two representative >20-kDa proteoforms confidently identified by MS/MS. The 11- to 14.5-kDa region of the fragmentation spectrum of monoacetylated GSTA1 (25,542 Da; Fig. 3A) contains abundant complementary fragments of a 14–amino acid–long sequence tag (B95-B108 and Y113-Y126), contributing to the confident identification of this proteoform [graphical fragment map (GFM), shown in Fig. 3A]. The PiMS image of the GSTA1 proteoform in Fig. 3A shows that this proteoform is localized to the kidney cortex region. On the higher-mass end, vimentin (53,530 Da) was also identified (fragment spectrum and GFM shown in Fig. 3B), with 63 isotopically resolved >15-kDa fragment ions above 1% relative abundance matching the sequence fragments of vimentin (a few representative ones shown in Fig. 3B). The PiMS image of vimentin (Fig. 3B), which shows its localization to the vasculature, also supports the identity of the proteoform ().
Fig. 3.
On-tissue identification of proteoforms by MS/MS.
Two representative human kidney proteoforms, GSTA1 (A) and vimentin (B). The PiMS images of the two proteoforms are shown on the top right of each panel along with an autofluorescence image of an adjacent section as a reference. On the bottom left of each panel, expanded regions of the fragment spectrum are displayed with the major matching fragment ions annotated. GFMs are shown on the bottom right of each panel. Scale bars, 1 mm.
The identification of the >60-kDa proteoforms by MS/MS is increasingly challenging. We obtained 3% sequence coverage for a ~66.4-kDa proteoform putatively identified as albumin (fig. S7T). Poor sequence coverage obtained for this proteoform is likely due to the presence of 17 disulfide linkages known to occur in albumin, thereby making PTM localization challenging. In one attempt to identify a proteoform centered at 70,900 Da, we found that the precursor proteoform with the best database retrieval score was mesothelin isoform 2 (fig. S7U). The deviation in precursor mass (67,938 Da compared to 70,900 Da) may be attributed to modifications and/or isoform expression, which are not captured in the database. Despite these challenges, we were able to readily identify 20 human proteoforms ranging from 20 to 70 kDa in molecular mass using top-down MS/MS in PiMS.
Creation of a kidney proteoform map
PiMS images allow for direct visualization of submillimeter anatomical structures and functional tissue units of human kidney sections with proteoform-level precision. PiMS images of kidney tissue containing the cortex, medulla, and vasculature regions show distinct differences in the distribution of proteoforms across these vastly different anatomical regions (Fig. 4; optical image shown in Fig. 4A). The identification of kidney internal structures was supported by autofluorescence microscopy (Fig. 4B) () and periodic acid–Schiff (PAS) staining histology (Fig. 4C).
Fig. 4.
Kidney proteoform maps.
Optical (A) and autofluorescence (B) images of adjacent human kidney sections containing the cortex, medulla, and vasculature regions. (C) PAS staining of an adjacent section from the same kidney. (D to I) PiMS images of individual proteoforms that selectively illuminate different anatomical regions and cellular neighborhoods; the name and UniProt accession of the proteoforms are depicted next to the images with their color scale. Composite image of (J) apolipoprotein A-1 (blue) (D), GSTA2 (red) (E), and albumin (green) (F); (K) GSTA2 (blue) (E), α-crystallin B chain (red) (G), and transgelin-2 (green) (H); and (L) GSTA2 (blue) (E), α-crystallin B chain (red) (G), and vimentin (green) (I). Scale bars, 1 mm.
PiMS images of proteoforms in the kidney show distinct localizations (Fig. 4, D to I). Apolipoprotein A-1 (Fig. 4D) and GSTA2 (Fig. 4E) were found to be enhanced in the medulla and cortex regions, respectively. Blood-abundant albumin (Fig. 4F) and α-crystallin B chain (Fig. 4G), a molecular chaperone, were expressed in multiple regions of the kidney, with particular enhancement in the inner and outer medulla, respectively. In addition, transgelin-2 (Fig. 4H) and vimentin (Fig. 4I) were abundant in highly focused regions near the artery.
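Fig. 4 (J to L) overlays three single-proteoform images as separate color channels. A minimal sketch of such a tricolor overlay is given below. It assumes that each image is a list of rows on the same pixel grid and that each channel is scaled independently to its own maximum; the scaling actually used for the figures is not specified here, so this normalization is an assumption.

```python
def normalize(image):
    """Scale an intensity image to the 0..1 range by its own maximum."""
    peak = max(max(row) for row in image)
    if peak == 0:
        return [[0.0 for _ in row] for row in image]
    return [[value / peak for value in row] for row in image]


def tricolor_composite(red, green, blue):
    """Merge three same-sized intensity images into rows of 8-bit (R, G, B) tuples."""
    r, g, b = normalize(red), normalize(green), normalize(blue)
    return [
        [
            (int(round(255 * r[i][j])), int(round(255 * g[i][j])), int(round(255 * b[i][j])))
            for j in range(len(r[i]))
        ]
        for i in range(len(r))
    ]


# Toy 2 x 2 images standing in for three proteoform ion images.
toy_red = [[0, 4], [4, 0]]
toy_green = [[1, 0], [0, 0]]
toy_blue = [[0, 0], [0, 0]]
composite = tricolor_composite(toy_red, toy_green, toy_blue)
```

Independent per-channel scaling makes each proteoform visible regardless of its absolute abundance, at the cost of hiding abundance differences between channels.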
Assessing proteoform biology: Differences in space and molecular composition
Next, we created composite PiMS proteoform images to enable more efficient readout of anatomical regions and functional tissue units in the kidney. Using the abovementioned single PiMS images, we created a series of tricolor composite PiMS images (Fig. 4, J to L). By combining images of apolipoprotein A-1 (Fig. 4D, blue), GSTA2 (Fig. 4E, red), and albumin (Fig. 4F, green), we highlighted vascular regions within the kidney (Fig. 4J, labeled as bulk vasculature), including the bulk vasculature and the inner medulla. In addition, the distinct points of vascularization in the cortex may be aligned with glomeruli (green). This is further supported by GSTA2 (red) localized to the tubules surrounding the glomeruli (). Figure 4K is a composite PiMS image of transgelin-2 (Fig. 4H, green), GSTA2 (Fig. 4E, blue), and α-crystallin B chain (Fig. 4G, red). Using this image, we highlight the large artery via the specific localization of transgelin-2 (Fig. 4K, green). Previous studies have shown the specific localization of transgelin-2 to the smooth muscle cells (SMCs). This observation is consistent with the abundance of SMC in arteries, whereas veins are mainly composed of stromal cells (). Last, the combination of the spatial distribution of vimentin (Fig. 4I, green), GSTA2 (Fig. 4E, blue), and α-crystallin B chain (Fig. 4G, red) provides the best outline of the vasculature in the tissue section (Fig. 4L). Vimentin is a filamentous protein found in most of the blood vessels and connective tissues. The localization of vimentin to the bulk blood vessel regions observed by PiMS is consistent with this. Most of the scattered vimentin spots observed in the cortex fit well into the dark spots corresponding to glomeruli. Moreover, vimentin is found in many scattered locations in the inner medulla region. 
These illuminated spots correspond to the spatially dispersed peritubular capillaries, which form a complex three-dimensional network in the medulla and become dispersed when the tissue is cross-sectioned.
Moreover, a major advantage of PiMS over antibody-based imaging approaches lies in the ability to determine the molecular composition of proteoforms in an untargeted fashion. Antibody-based imaging approaches do not distinguish different proteoforms of a single protein, whereas PiMS can capture sequence differences and modifications. Subtle differences of protein sequences and modifications become especially challenging to detect for high-mass proteins. Direct top-down identification of >20-kDa proteoforms from tissue enabled by PiMS allows for the characterization of highly similar kidney protein isoforms that originate from allelic coding single-nucleotide polymorphisms (cSNPs).
Three major proteoforms of N-terminal acetylated GST subunits, GSTA1 and GSTA2, were observed in kidney tissue with localizations to the cortex region (Fig. 5, B, D, and E). Figure 5C shows the mass-domain spectrum of GSTA1 and GSTA2 proteoforms. A single proteoform was detected from GSTA1 (25,542 Da), showing two alleles with the same sequence. In contrast, we identified two GSTA2 proteoforms, the canonical form at 25,573 Da and another form at 25,587 Da representing a 14-Da mass shift. Both proteoforms were observed at similar abundances with highly similar tissue localization, which is characteristic of nonspecific biallelic tissue expression resulting from a cSNP (Fig. 5, D and E). Fragmentation data were able to localize the 14-Da mass shift to the region between Pro110 and Gln113 from the N terminus (regions highlighted in light red in Fig. 5, D and E, and fig. S7D).
A UniProt search shows a Ser111➔Thr natural variant of GSTA2 (highlighted in red), confirming that the 14-Da mass shift corresponds to a proteoform resulting from a common cSNP (allele frequency > 40% according to dbSNP entry no. rs2180314). This exemplifies the power of PiMS to probe gene expression in tissues directly at the proteoform level, which is complementary to genomic and transcriptomic predictions.
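The 14-Da shift can be checked against the residue masses of serine and threonine. The short calculation below uses standard monoisotopic residue masses; the comparison with the observed shift is necessarily coarse because the proteoform masses quoted above are rounded to the nearest dalton.

```python
# Standard monoisotopic residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "S": 87.03203,   # serine, C3H5NO2
    "T": 101.04768,  # threonine, C4H7NO2
}


def substitution_shift(original: str, variant: str) -> float:
    """Mass change in Da caused by replacing one residue with another."""
    return RESIDUE_MASS[variant] - RESIDUE_MASS[original]


theoretical_shift = substitution_shift("S", "T")  # one CH2 unit, about 14.016 Da
observed_shift = 25587 - 25573                    # rounded masses quoted in the text
```

The Ser-to-Thr substitution adds one CH2 unit (about 14.016 Da), in agreement with the observed shift. A single methylation would add the same CH2 increment, so intact mass alone does not distinguish the two explanations, which is why the fragment-level localization and the known natural variant are needed for the assignment.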
Fig. 5.
Mass spectrometric detection of gene differences.
(A) Autofluorescence and PiMS images of α GST enzymes of the human kidney section (B, D, and E). Mass-domain PiMS spectra of ~25.6-kDa range (C) shows GSTA1 and GSTA2 enzyme proteoforms. The GFMs of the two proteoforms of GSTA2 from a known biallelic cSNP are shown in (D) and (E) below the PiMS images. The sequence variations of the two cSNP GSTA2 proteoforms are highlighted in the GFMs in red. Scale bars, 1 mm.
DISCUSSION
Imaging methods have boomed in recent years. Highly multiplexed affinity reagent–based methods allow for the detection of more than 50 protein targets in a single assay (, ). The increased throughput of these approaches has provided an opportunity to develop comprehensive maps of proteins in human tissues at a rapid pace (, , ). However, antibody-based imaging approaches do not distinguish different proteoforms of a single protein. The development described in this study addresses this challenge and enables imaging of protein isoforms from different gene family members or allelic cSNPs in biological tissues.
PiMS is able to capture many abundant cytosolic and mitochondrial proteins in cells within one acquisition process, some of which have been previously investigated using antibody-based approaches (e.g., vimentin and GSTA2) (). In an IMT search of the PiMS data against a database containing the 100 most abundant proteins found in bottom-up proteomics of human kidney (i.e., 200 candidate proteoforms, each protein with and without the start methionine), 56 proteoform hits were asserted by PiMS (54 and 46 hits from the other two technical replicates). This provides an opportunity to use PiMS to delve into the major biological pathways in tissues in a spatially resolved manner. In the GO analysis result based on the 169 annotated proteoforms shown in Fig. 2F, a variety of central cellular metabolic pathways were found enriched by PiMS in many kidney cell types.
Notably, we were able to detect and image four key enzymes in glycolysis and 27 subunits of the complexes in the respiratory electron transport chain, which are critical components of adenosine 5′-triphosphate (ATP) metabolic process, the highest enriched pathway.
Meanwhile, we acknowledge that nuclear and membrane proteins in kidney cells were not detected at high abundance in PiMS, suggesting that major sources of bias exist such as solubility, ion suppression, and limitations in detection at very high mass or low abundance. Despite the limitations of PiMS in its current form, these findings were absent in denaturing, global top-down MS experiments (). In fact, a comparison of a global top-down MS and PiMS of human kidney tissues results in only 9 overlapping proteoforms, with 160 unique proteoforms annotated from PiMS (). This lack of overlap is mainly due to the strong bias against the detection of >30-kDa proteins in global top-down MS, and more detailed comparison of biological pathways detectable in PiMS will await future studies with expanded proteome coverage. Clearly, uncovering the many additional proteoforms present in the human proteome will require another major leap forward in proteoform imaging technology (). In this first iteration of the PiMS technology, only a small percentage of all proteoforms were revealed. While this limits the biological conclusions that can be drawn, future iterations of the technology can improve the detection rate of proteoforms, enabling deeper insights into biological function. Future work will optimize sample pretreatment and extraction solvent compositions to expand the detection of proteoforms localized to various subcellular compartments. Furthermore, low-abundance proteoforms may be captured by increasing the total number of ions collected in a single PiMS experiment.
PiMS expands the proteoform detection to 70-kDa mass range with >10× feature capacity compared to multiplexed imaging assays.
More than 50% of the human proteins reside in the mass range between 20 and 70 kDa (). The ability to measure the proteome with a mass coverage of up to 70 kDa opens up opportunities to reveal a myriad of biological pathways and molecular functions (e.g., the oxidative phosphorylation and mitochondrial proteins reported here), which were hidden in prior MS imaging techniques. High-mass proteoforms of >70 kDa with generally lower abundance and increased molecular heterogeneity (e.g., the idea that the larger the protein, the greater the number of high-stoichiometry PTMs) perhaps contribute to size limitations of proteoform imaging in its current form. Furthermore, high-mass proteoforms can experience poor desolvation and high decay rate during MS detection because of large cross sections. Future research effort to detect high-mass proteoforms will focus on these challenges.
Another advantage of whole-proteoform measurement is the robust detection of diverse types of PTMs with low bias and knowledge of their stoichiometry. For example, two proteoforms of GAPDH were identified in the kidney dataset, both missing the start methionine and one containing a dimethylation site at Lys65. While both of these PTMs are known (, ), the dimethylation of Lys65 was shown to be very low in abundance and inconsistently observed (). Our data indicate that the dimethylated proteoform is ~5% of the total GAPDH in the human kidney. The function of the dimethylation is unknown, but its colocalization with the unmethylated proteoform suggests a possible role of the modified GAPDH proteoform in dimerization, which may contribute to its catalytic properties in glycolysis ().
In conclusion, we present the PiMS approach that combines nano-DESI with I2MS technology, which enables the direct imaging and molecular identification of human kidney proteoforms of up to ~70 kDa.
This approach increases observable proteoform masses by nearly 4-fold and resolving power by 10-fold compared to prior work (). The development of PiMS opens up exciting opportunities to infuse proteoform knowledge into the multiomic approaches being evaluated for inclusion into the Human Reference Atlas (). By providing spatial localization of proteoforms to anatomical regions, cell types, and functional tissue units, PiMS promises applications in molecular tissue mapping, biomarker discovery, and disease diagnostics.
MATERIALS AND METHODS
Tissue preparation
Human kidney tissue sections were prepared according to published protocols (). Mouse brain tissue was sectioned at −21°C to a thickness of 12 μm using a CM1850 cryostat (Leica Microsystems, Wetzlar, Germany). Institutional Review Board and/or Institutional Animal Care and Use Committee guidelines were followed with human or animal subjects. Tissue sections were thaw-mounted onto glass microscope slides (IMEB Inc., Tek-Select Gold Series Microscope Slides, clear glass, positive charged) and stored at −80°C before MS imaging analysis.
Human kidney tissue sections were thawed under slight vacuum at room temperature, fixed and desalted via successive immersion in 70%/30%, 90%/10%, and 100%/0% ethanol/H2O solutions for 20 s each, delipidated in 99.8% chloroform for 60 s, and dried under slight vacuum immediately before nano-DESI imaging experiments. These sample preparation steps allow for the in situ precipitation of proteins and removal of lipids, avoiding suppression of protein signals upon ionization into the mass spectrometer (, ).
Nano-DESI ion source
A custom-designed nano-DESI source was used for all data acquisition. The experimental details of nano-DESI MSI have been described elsewhere (, ). Briefly, the nano-DESI probe is composed of a primary [outer diameter (OD), 150 μm; inside diameter (ID), 20 μm] and a nanospray capillary (OD, 150 μm; ID, 40 μm), with the spray side of the nanospray capillary positioned close to the MS inlet. The probe was fabricated using fused silica capillary tubing (Molex, Thief River Falls, MN). A liquid bridge formed at the location where the two capillaries meet is brought into contact with the tissue section for analyte extraction. The liquid bridge is dynamically maintained by solvent propulsion from the primary capillary and instantaneous vacuum aspiration through the nanospray capillary. The extracted analytes are continuously transferred to a mass spectrometer inlet and ionized by ESI. Imaging experiments are performed by moving the sample under the nano-DESI probe in lines. The optimal scan rate is discussed in the next section. The strip step between the line scans was set to 150 μm to avoid overlap between the adjacent line scans. To ensure the stability of the nano-DESI probe during the imaging experiment, we applied a surface tilt angle correction to the tissue sample by defining a three-point plane before the imaging experiment (). All samples were electrosprayed under denaturing conditions in a 60%/39.4% acetonitrile/water and 0.6% acetic acid solution compatible with both protein extraction and ionization. All the experiments were performed in positive ionization mode.
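The three-point tilt correction mentioned above can be illustrated with a short sketch. This is not the authors' control software; it only shows the underlying geometry, assuming the three reference points are given as (x, y, z) stage coordinates in micrometers. The function names `plane_from_points` and `stage_height` are hypothetical.

```python
def plane_from_points(p1, p2, p3):
    """Return (a, b, c) so that z = a*x + b*y + c passes through three (x, y, z) points."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p1, p2, p3
    # Two in-plane vectors starting at p1.
    ux, uy, uz = x2 - x1, y2 - y1, z2 - z1
    vx, vy, vz = x3 - x1, y3 - y1, z3 - z1
    # Plane normal from the cross product u x v.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    if nz == 0:
        raise ValueError("points are collinear or define a vertical plane")
    a = -nx / nz
    b = -ny / nz
    c = z1 - a * x1 - b * y1
    return a, b, c


def stage_height(plane, x, y):
    """Height the sample surface is expected to have at position (x, y)."""
    a, b, c = plane
    return a * x + b * y + c


if __name__ == "__main__":
    # Hypothetical reference points measured on the slide (micrometers).
    plane = plane_from_points((0, 0, 0), (1000, 0, 10), (0, 1000, -5))
    print(stage_height(plane, 500, 500))
```

Evaluating `stage_height` at each (x, y) position along a line scan gives the z offset needed to keep the probe-to-surface distance constant.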
PiMS conditions and data acquisition
PiMS data acquisition was performed in the I2MS mode on a Q-Exactive Plus Orbitrap mass spectrometer (Thermo Fisher Scientific, Bremen, Germany), which has been described previously (fig. S1) (). The source conditions on the mass spectrometer were set as follows: ESI voltage, 3 kV; in-source collision-induced dissociation, 15 eV; S-lens radio frequency level, 70%; capillary temperature, 325°C. In particular, rather than collecting typical ensemble ion MS spectra, ion signals were attenuated down to the individual ion regime by limiting the ion collection time in the C-trap before injection into the Orbitrap analyzer. The MS acquisition rate was set at one scan every 2 s. During PiMS data acquisition, proteoforms were sampled by a nano-DESI probe producing multiply charged ions distributed across multiple charge states. To enable downstream I2MS analysis, most of the ions in one detection period were collected in the individual ion regime, corresponding to a singular ion signal at a defined m/z (or frequency) value. In particular, rastering line scans on tissue was performed at a reduced rate of 2.5 to 4 μm/s, which corresponds to a 5- to 8-μm sampling distance between adjacent pixels (fig. S3). The nano-DESI solvent flow rate was kept at 600 nl/s to allow for efficient extraction and dilution of the proteins. The injection time for a specific set of tissue sections is typically optimized by acquiring line scans on an adjacent section before an imaging experiment, and it may vary from 100 to 500 ms (figs. S4 to S6). For 10-μm-thin sections of human kidney tissue presented in this study, a 300-ms injection time was used for all the sections from the same subject. 
We note that for extremely dominant proteoforms (e.g., hemoglobin subunits in this study), “multiple ion events” were commonly observed.
Additional MS instrument conditions in the PiMS experiment are as follows: The Orbitrap central electrode voltage was adjusted to −1 kV to improve the ion survival rate under denatured conditions. Higher-energy collisional dissociation (HCD) pressure level was kept at 0.2 (ultrahigh vacuum pressure < 2 × 10⁻¹¹ torr) to reduce collision-induced ion decay within the Orbitrap analyzer without substantial losses in trapping efficiency. Additional relevant data acquisition parameters were adjusted as follows: mass range, 400 to 2500 m/z; automatic gain control mode was disabled, and the maximum injection time was held constant at 300 ms; enhanced Fourier transform, off; averaging, 0; microscans, 1. Time-domain data files were acquired at detected ion frequencies and recorded as Selective Temporal Overview of Resonant Ion (STORI) files ().
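The relationship between scan rate, acquisition period, and sampling distance stated in this section is simple arithmetic, sketched below. The 2-s scan period, 2.5 to 4 μm/s scan rates, and 150-μm line step come from the text; the tissue dimensions in the usage example and all function names are hypothetical, and the time estimate ignores any overhead between line scans.

```python
import math


def sampling_distance_um(scan_rate_um_per_s, scan_period_s=2.0):
    """Distance travelled by the sample during one MS scan."""
    return scan_rate_um_per_s * scan_period_s


def scans_per_line(line_length_um, scan_rate_um_per_s, scan_period_s=2.0):
    """Number of MS scans acquired along one line of the given length."""
    step = sampling_distance_um(scan_rate_um_per_s, scan_period_s)
    return math.ceil(line_length_um / step)


def acquisition_time_h(width_um, height_um, scan_rate_um_per_s, line_step_um=150.0):
    """Rough imaging time in hours, ignoring overhead between line scans."""
    n_lines = math.ceil(height_um / line_step_um)
    return n_lines * (width_um / scan_rate_um_per_s) / 3600.0


if __name__ == "__main__":
    for rate in (2.5, 4.0):
        print(rate, "um/s ->", sampling_distance_um(rate), "um per scan")
    # Hypothetical 5 mm x 3 mm tissue region.
    print(round(acquisition_time_h(5000, 3000, 2.5), 1), "h")
```

Under these assumptions, the slower scan rate improves sampling density along the line at the cost of a proportionally longer experiment.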
PiMS data analysis and image generation
Ion images were generated using a MATLAB script developed in-house (available from the authors upon request). Mass-domain spectra were constructed by co-adding all individual ions obtained from the entire tissue section. In specific cases where computation power was limited, charge assignment and image construction were performed in sections with an upper limit of 50 million ions per portion. In the first step, all ion signals were subjected to STORI analysis to filter out decayed and multiple ion events. The neutral masses of the protein ions were calculated by
$$M = z \cdot (m/z) - z \cdot m_{\mathrm{H^+}}$$
where $m_{\mathrm{H^+}} \approx 1.00728$ Da is the mass of the proton (the charge carrier in positive ionization mode). Charge state (z) is obtained from the slope of the induced image current determined by the STORI analysis (). Accurate charge assignment of each ion was statistically evaluated by comparing the slopes of its isotopologs across different charge states from the entire tissue section. In particular, an iterative voting methodology was used to filter out ions with a lower probability score, which allows for the construction of the mass-domain isotopic distribution of a proteoform with statistical confidence. In this step, we used a kernel density estimation approach to convert centroid masses of individual ions to uniform distributions. Accurate masses of the isotopes were obtained from the center of the summed individual ion profiles.
For image generation, ions composing the mass-domain isotopic envelope of a protein were registered back to their spatial origins on the tissue section. A ±10-ppm isotopic mass tolerance was used to select individual ions for image generation. A raw image was first generated using absolute ion counts at different x-y locations. In the kidney PiMS images presented in this study, each pixel was constructed from three adjacent MS scans, corresponding to an area of ~10 μm by 150 μm. The raw image was normalized using a total ion count matrix, which accounts for the fluctuation of sampling conditions at different locations.
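The image-generation steps described in this section can be sketched in a few lines. This is not the authors' in-house MATLAB script; it is a minimal Python illustration that assumes each individual ion is stored as a (row, col, m/z, z) tuple with its charge already assigned, that ions are protonated, and that total-ion-count normalization means dividing the selected ion count by the total ion count in each pixel. All names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da; charge carrier assumed for positive-mode ESI


def neutral_mass(mz, z):
    """Neutral mass of a protonated ion from its m/z and charge state."""
    return z * (mz - PROTON_MASS)


def within_ppm(mass, target, tol_ppm=10.0):
    """True if mass lies within +/- tol_ppm of target."""
    return abs(mass - target) / target * 1e6 <= tol_ppm


def ion_image(ions, isotope_masses, n_rows, n_cols, tol_ppm=10.0):
    """Build a TIC-normalized image for one proteoform.

    ions: iterable of (row, col, mz, z) tuples, one per individual ion.
    isotope_masses: neutral masses of the isotopic peaks of the proteoform.
    """
    counts = [[0] * n_cols for _ in range(n_rows)]
    tic = [[0] * n_cols for _ in range(n_rows)]
    for row, col, mz, z in ions:
        tic[row][col] += 1  # every ion contributes to the total ion count
        mass = neutral_mass(mz, z)
        if any(within_ppm(mass, t, tol_ppm) for t in isotope_masses):
            counts[row][col] += 1  # raw image: absolute ion counts
    # Normalize each pixel by its total ion count; empty pixels stay at 0.
    return [
        [counts[r][c] / tic[r][c] if tic[r][c] else 0.0 for c in range(n_cols)]
        for r in range(n_rows)
    ]


if __name__ == "__main__":
    ions = [(0, 0, 1001.007276, 10), (0, 0, 500.0, 5), (0, 1, 1001.007276, 10)]
    print(ion_image(ions, [10000.0], n_rows=1, n_cols=2))
```

A real pipeline would also need the STORI-based filtering and charge assignment that precede this step, which are outside the scope of the sketch.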
IMT search and GO analysis
The summed mass-domain PiMS spectrum was converted to .mzML format and processed using a custom version of TDValidator (Proteinaceous, Evanston, IL) implemented with an MS1 IMT search function. The PiMS spectrum was shifted by +4 ppm according to the accurate masses of six MS/MS-identified proteoforms in the 20- to 50-kDa mass range. A human protein database constructed from the top 500 most abundant proteins in a bottom-up proteomic study of human kidney tissues was used for the search (table S2). Methionine on/off and monoacetylation were considered as possible proteoform modifications in the database. The IMT search was performed with a ±5-ppm mass tolerance. Additional proteoform matches were curated by spectrum inspection and manual annotation of putative modifications recorded in the Swiss-Prot human proteome database. The final 169 proteoform matches include MS/MS-identified proteoforms and all IMT-identified proteoforms discussed above.
GO analysis was performed using Metascape (https://metascape.org/) (). Specifically, a list of Entrez Gene IDs was retrieved for the 169 identified proteoforms on UniProt and submitted to Metascape for GO analysis. The result contains the top-level GO biological processes.
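The mass-matching logic of the IMT search can be sketched as follows. This does not reproduce TDValidator; it is a minimal illustration that assumes monoisotopic masses, enumerates the four combinations of methionine on/off and monoacetylation for each database entry, applies the +4-ppm calibration shift to the observed masses, and reports candidates within ±5 ppm. The mass deltas are standard monoisotopic values; the function names and the example protein are hypothetical.

```python
MET_RESIDUE = 131.04049  # Da, monoisotopic mass of a methionine residue
ACETYL = 42.010565       # Da, monoisotopic mass shift of acetylation


def candidate_masses(name, full_length_mass):
    """Enumerate Met on/off and monoacetylation variants of one protein."""
    out = {}
    for met_label, met_delta in (("+Met", 0.0), ("-Met", -MET_RESIDUE)):
        for ac_label, ac_delta in (("", 0.0), ("+Ac", ACETYL)):
            out[name + met_label + ac_label] = full_length_mass + met_delta + ac_delta
    return out


def shift_ppm(mass, ppm):
    """Apply a relative calibration shift expressed in ppm."""
    return mass * (1.0 + ppm * 1e-6)


def imt_match(observed_masses, candidates, shift=4.0, tol_ppm=5.0):
    """Return (observed, label, error_ppm) for every match within tolerance."""
    hits = []
    for obs in observed_masses:
        corrected = shift_ppm(obs, shift)
        for label, theo in candidates.items():
            err = (corrected - theo) / theo * 1e6
            if abs(err) <= tol_ppm:
                hits.append((obs, label, err))
    return hits


if __name__ == "__main__":
    cands = candidate_masses("ProteinX", 30000.0)  # hypothetical entry
    observed = [(30000.0 - MET_RESIDUE) * (1 - 4e-6), 12345.0]
    for hit in imt_match(observed, cands):
        print(hit)
```

Because the candidate variants of a single protein differ by tens of daltons, a ±5-ppm window (about 0.15 Da at 30 kDa) separates them cleanly; ambiguity arises mainly between different proteins of near-identical mass, which is why manual curation and MS/MS confirmation remain necessary.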
Top-down proteomic data acquisition and analysis
Targeted MS/MS experiments were performed on a tissue section adjacent to the imaged section using HCD. In the first step, a target proteoform was selected in the mass-domain spectrum. We used the PiMS image of the target proteoform to select a target area on the section where the proteoform abundance is enhanced. For the selected area, the mass-domain PiMS spectrum was convoluted back to the m/z domain, from which a proper isolation window that contains predominantly the target proteoform was selected. A 0.8 m/z isolation window was used for most of the targets; in special cases, a 0.5 to 0.6 m/z window was used to avoid overlapping signal. MS/MS experiments were performed by scanning the nano-DESI probe over the selected region with the selected isolation window at a scan rate of 2.5 to 4 μm/s. MS/MS data acquisition was conducted in the I2MS mode with an Orbitrap detection period of 2 s (HCD pressure setting = 0.5) (). HCD collision energy and injection time were optimized to maximize the population of individual ion fragments. Typical ranges of collision energy and injection time used in this study were 7 to 14 eV and 200 to 1500 ms, respectively. Total data acquisition time for each target varied from 1 to 5 hours.
MS/MS data were first subjected to I2MS processing for fragment ion charge assignment and mass-domain spectrum construction following the same procedure as described above. The mass-domain spectrum was converted to .mzML format and subjected to the MS2 search function implemented in ProSight Native (Proteinaceous, Evanston, IL) to look for possible candidates from the entire Swiss-Prot human protein database (www.uniprot.org/proteomes/UP000005640) (, ). For each search, the top one to five candidates were manually validated using a custom version of TDValidator (Proteinaceous, Evanston, IL) to identify the best matching proteoform.
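Choosing an isolation window starts from knowing where the charge states of the target proteoform fall in the m/z domain. The sketch below lists those positions and builds a window around one of them. It assumes protonated ions and borrows the 400 to 2500 m/z range from the PiMS acquisition settings, which may not be the range used for MS/MS; the function names and the example mass are hypothetical, and the sketch does not check for overlapping signal from other proteoforms.

```python
PROTON_MASS = 1.007276  # Da; charge carrier assumed for positive-mode ESI


def charge_state_mz(neutral_mass, mz_min=400.0, mz_max=2500.0):
    """Map each charge state z to its m/z, keeping those inside the scan range."""
    out = {}
    z = 1
    while True:
        mz = neutral_mass / z + PROTON_MASS
        if mz < mz_min:
            break  # m/z only decreases as z grows, so we can stop here
        if mz <= mz_max:
            out[z] = mz
        z += 1
    return out


def isolation_window(center_mz, width=0.8):
    """Lower and upper m/z bounds of an isolation window of the given width."""
    half = width / 2.0
    return center_mz - half, center_mz + half


if __name__ == "__main__":
    states = charge_state_mz(30000.0)  # hypothetical 30-kDa proteoform
    z = 30
    print(z, states[z], isolation_window(states[z]))
```

In practice the window would be placed on the charge state where the target dominates the local signal, which is why the narrower 0.5 to 0.6 m/z setting is useful in crowded regions.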