Shadi Ferdosi1, Behzad Tangeysh1, Tristan R Brown1, Patrick A Everley1, Michael Figa1, Matthew McLean1, Eltaher M Elgierari1, Xiaoyan Zhao1, Veder J Garcia1, Tianyu Wang1, Matthew E K Chang2, Kateryna Riedesel1, Jessica Chu1, Max Mahoney1, Hongwei Xia1, Evan S O'Brien1, Craig Stolarczyk1, Damian Harris1, Theodore L Platt1, Philip Ma1, Martin Goldberg1, Robert Langer3, Mark R Flory2, Ryan Benz1, Wei Tao4,5, Juan Cruz Cuevas1, Serafim Batzoglou1, John E Blume1, Asim Siddiqui1, Daniel Hornburg1, Omid C Farokhzad1,4,5.
Abstract
Significance
Deep profiling of the plasma proteome at scale has been a challenge for traditional approaches. We achieve superior performance across the dimensions of precision, depth, and throughput using a panel of surface-functionalized superparamagnetic nanoparticles in comparison to conventional workflows for deep proteomics interrogation. Our automated workflow leverages competitive nanoparticle-protein binding equilibria that quantitatively compress the large dynamic range of proteomes to an accessible scale. Using machine learning, we dissect the contribution of individual physicochemical properties of nanoparticles to the composition of protein coronas. Our results suggest that nanoparticle functionalization can be tailored to protein sets. This work demonstrates the feasibility of deep, precise, unbiased plasma proteomics at a scale compatible with large-scale genomics enabling multiomic studies.
Keywords: machine learning; mass spectrometry; nanoparticle; nano–bio interaction; proteomics
Year: 2022 PMID: 35275789 PMCID: PMC8931255 DOI: 10.1073/pnas.2106053119
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. DIA workflow comparison. The five-NP workflow (red) is compared with high-pH fractionation of depleted plasma (19 fractions concatenated into 9; blue), a plasma depletion strategy (green), and neat plasma (purple). (A) Step-by-step comparison of five-NP, high-pH fractionation, depleted, and neat plasma workflows. (B) Median number of protein groups identified by each workflow. Error bars denote SDs of assay replicates. The top dash depicts the number of proteins identified in any of the samples, and the lower dash represents the number of proteins identified in three out of three assay replicates (defined as the complete features). For this comparison, samples belonging to the respective workflows were processed as independent Spectronaut runs. When processed together, the median numbers of protein groups were 1,615, 862, 461, and 375 for five-NP, high-pH, depleted, and neat plasma, respectively. (C) CV of median-normalized peptide intensities filtered for three out of three identifications across assay replicates. Median CV is depicted on each plot. (D) Dynamic range of identified proteins matched with normalized protein intensities from a plasma protein database (22). Protein groups were filtered for complete features. Median log10 intensity of complete features is shown on each boxplot, and the outliers are removed. (E) Percent coverage of the plasma protein database in each workflow (Top) and relative coverage of the plasma protein database by the five-NP method compared with high-pH fractionation (Bottom) over negative protein log10 intensities. The 95% interval is shown in gray. Protein groups were filtered for complete features. (F) UpSet plot showing the protein group overlap between the five-NP and high-pH workflows. Protein groups are filtered for complete features. The workflows included in each bar are shown as colored labels. (G) Comparison of the number of proteins with specified functional annotations covered exclusively by the five-NP workflow (red), exclusively by the high-pH workflow (blue), or by both (overlapped). Protein groups were filtered for complete features. The Venn diagrams are proportional to the number of protein groups. Workflows were processed together using Spectronaut for all analyses except for A, in which each workflow was processed separately. All proteins and peptides were conservatively filtered at 1% protein and peptide FDR.
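The precision metric in panel C (CV of median-normalized intensities, restricted to features identified in three out of three assay replicates) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the dictionary-per-replicate input, and the choice to scale each replicate to the median of the replicate medians are assumptions made only for illustration.

```python
import statistics


def median_normalize(replicates):
    """Scale each replicate so that its median intensity matches a common target.

    `replicates` is a list of dicts mapping a feature ID (peptide or protein
    group) to a raw intensity. The target is the median of the per-replicate
    medians (an assumption; the caption does not specify the reference).
    """
    medians = [statistics.median(r.values()) for r in replicates]
    target = statistics.median(medians)
    return [
        {feature: value * target / m for feature, value in r.items()}
        for r, m in zip(replicates, medians)
    ]


def complete_feature_cv(replicates):
    """Return {feature: CV in percent} for features present in every replicate."""
    normalized = median_normalize(replicates)
    # "Complete features": identified in all assay replicates (three of three).
    complete = set.intersection(*(set(r) for r in normalized))
    cvs = {}
    for feature in complete:
        values = [r[feature] for r in normalized]
        cvs[feature] = 100.0 * statistics.stdev(values) / statistics.mean(values)
    return cvs
```

The median of the returned values corresponds to the kind of summary number annotated on each plot; features missing from any replicate are excluded before the CV is computed.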
Fig. 2. Interrogating the human plasma proteome with a multi-NP panel. (A) Median number of protein groups identified (1% protein, 1% peptide FDR) for neat and depleted plasma, with the respective NPs shown as bar plots. Error bars denote SDs of protein IDs in assay replicates. The lower dash represents the number of proteins identified in three out of three assay replicates (complete features). The top dash depicts the number of proteins identified in any of the samples. (B) The top left triangular area shows the Jaccard index (JI), indicating the degree of overlapping identifications, as the mean JI across assay replicates (left column) and comparing individual NPs, depleted, and neat plasma (filtering for proteins quantified in three out of three assay replicates). Color and box size are scaled by the magnitude of JI. (C) The bottom right triangular area shows the Pearson correlation coefficient (r), indicating correlation of median-normalized log10 intensities, as the mean r across assay replicates (right column) and comparing individual NPs and neat plasma (filtering for proteins quantified in three out of three assay replicates, using median-normalized intensity comparing pairwise common detections). Color and circle size are scaled by the magnitude of r. (D) Assay precision (CV) calculated for proteins quantified in three out of three assay replicates. Protein intensities were median normalized. Inner boxplots report the 25 (lower hinge), 50, and 75% quantiles (upper hinge). Whiskers indicate observations equal to or outside hinge ± 1.5 × interquartile range (IQR). Outliers (beyond 1.5 × IQR) are not plotted. Violin plots display all data points. (E) Gower distance mapped as distance tree for median protein intensities (filtered for three out of three identifications in assay replicates).
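The overlap measure in panel B is the Jaccard index: the size of the intersection of two identification sets divided by the size of their union. A minimal Python sketch follows; the function name and the convention of returning 0.0 for two empty sets are assumptions, not part of the source.

```python
def jaccard_index(ids_a, ids_b):
    """Jaccard index |A ∩ B| / |A ∪ B| for two collections of protein-group IDs."""
    a, b = set(ids_a), set(ids_b)
    union = a | b
    if not union:
        # Convention chosen here: two empty sets have no measurable overlap.
        return 0.0
    return len(a & b) / len(union)
```

For example, two workflows that share two of four distinct protein groups have an index of 0.5.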
Fig. 3. Effect of NP surface functionalization on protein corona composition. (A) NPs are classified based on a variety of physicochemical properties and functional groups including charge, polymer, sugar, aromatic systems, phosphates, amines, hydrophobicity, hydroxyl groups, coordinating property, and initial reaction class. (B) Unsupervised hierarchical clustering of median-normalized log10 protein intensities (1% FDR on protein and peptide level). Assay replicates of NP classes are median averaged. Missing values were filtered and imputed according to Materials and Methods. (C) The 1D annotation enrichment scores (heat map color ramp) for NPs. The 1D score was calculated for UniProt keywords as described in Materials and Methods. Enriched annotations are indicated in red; depleted annotations are indicated in blue. NPs are clustered based on the 1D score distributions. The log2–odds ratios of the NPs characteristic for each cluster are depicted as fingerprint diagrams on the right, with starred results indicating significance (P < 0.05) in Fisher’s exact test. (D) Variance decomposition analysis modeling normalized protein intensities as a function of the NPs’ physicochemical makeup (A). Variance explained by each variable was estimated using a linear mixed effects model and the variancePartition package in R. The explained variance in protein intensities across NPs and the unexplained variance (residuals) are depicted as a density distribution. Variances explained for each protein across the NPs’ reaction class and functional groups are summed (turquoise distribution). (E) NP-specific variance broken down into reaction class and functional groups. (F) Functional groups broken down into the contribution of individual physicochemical properties. (G) Explained variance for functional group “charge” split into high (explained variance >30%), middle (explained variance <25 and >10%), and low (explained variance <10%). The Wilcoxon test was used to determine P values. The y axis depicts the absolute difference between the predicted isoelectric point of each protein and 7.4 (the pH of the assay). The larger that value, the more likely the protein has a net charge in the assay and can be affected by NP charge. Inner boxplots report 25 (lower hinge), 50, and 75% quantiles (upper hinge). Whiskers indicate observations equal to or outside hinge ± 1.5 × IQR. Outliers (beyond 1.5 × IQR) are not plotted. Violin plots capture all data points.
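The quantities in panel G can be sketched in Python as follows. The y-axis value is |pI − 7.4|, and each protein is assigned to a group by the variance explained by "charge". The function names are hypothetical, and returning `None` for values that fall outside the three ranges quoted in the caption (for example, between 25% and 30%) is an assumption, since the caption does not say how such proteins were handled.

```python
ASSAY_PH = 7.4  # pH of the assay, as stated in the caption


def charge_distance(predicted_pi, assay_ph=ASSAY_PH):
    """Absolute difference between a protein's predicted isoelectric point and the assay pH."""
    return abs(predicted_pi - assay_ph)


def charge_variance_group(explained_pct):
    """Assign a protein to the high/middle/low group using the caption's thresholds.

    `explained_pct` is the percent of variance explained by "charge".
    Values not covered by the caption's ranges return None (assumption).
    """
    if explained_pct > 30:
        return "high"
    if 10 < explained_pct < 25:
        return "middle"
    if explained_pct < 10:
        return "low"
    return None
```

A protein with a predicted isoelectric point of 5.4 sits 2.0 pH units from the assay pH, so it is expected to carry a net charge under assay conditions.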