Konstantin Barylyuk, Ludek Koreny, Huiling Ke, Simon Butterworth, Oliver M Crook, Imen Lassadi, Vipul Gupta, Eelco Tromer, Tobias Mourier, Tim J Stevens, Lisa M Breckels, Arnab Pain, Kathryn S Lilley, Ross F Waller.
Abstract
Apicomplexan parasites cause major human disease and food insecurity. They owe their considerable success to highly specialized cell compartments and structures. These adaptations drive their recognition, nondestructive penetration, and elaborate reengineering of the host's cells to promote their growth, dissemination, and the countering of host defenses. The evolution of unique apicomplexan cellular compartments is concomitant with vast proteomic novelty. Consequently, half of apicomplexan proteins are unique and uncharacterized. Here, we determine the steady-state subcellular location of thousands of proteins simultaneously within the globally prevalent apicomplexan parasite Toxoplasma gondii. This provides unprecedented comprehensive molecular definition of these unicellular eukaryotes and their specialized compartments, and these data reveal the spatial organizations of protein expression and function, adaptation to hosts, and the underlying evolutionary trajectories of these pathogens.
Keywords: apicomplexa; evolution; host-pathogen interaction; invasion; organelle; parasitism; plasmodium; proteomics; subcellular; toxoplasma
Year: 2020 PMID: 33053376 PMCID: PMC7670262 DOI: 10.1016/j.chom.2020.09.011
Source DB: PubMed Journal: Cell Host Microbe ISSN: 1931-3128 Impact factor: 21.023
Figure 1. HyperLOPIT Reveals Organelle Protein Ensembles through Measuring Cofractionation Profiles of Proteins
(A) Schematic of T. gondii tachyzoite showing the main subcellular compartments and structures.
(B) Summary of the hyperLOPIT workflow. Cells are mechanically disrupted, the homogenate is fractionated (conditions optimized by western blot, e.g., with markers for rhoptries (RON4), micronemes (MIC2), mitochondria (TOM40), and IMC (GAP45)), and peptides are labeled with unique 10plex tandem mass tags for relative peptide quantitation by tandem mass spectrometry (LC-SPS-MS3).
(C) Abundance-distribution profiles of select subcellular marker proteins measured in the LOPIT2 experiment. Note the similarity with the WB results shown in (B). See Figure S1 for concatenated profiles of all experiments (30plex).
(D) A Venn diagram showing the numbers of unique and shared proteins identified and quantified in all 10 fractions of the three hyperLOPIT experiments.
(E) A 2D-projection of the 30plex quantitative proteomic data (i.e., abundance-distribution profiles) for 3,832 T. gondii proteins shared across three hyperLOPIT datasets. t-distributed stochastic neighbor embedding (t-SNE) was used for dimensionality reduction. Each data point represents an individual protein, and the clustering of proteins reflects the similarity of their abundance distribution profiles.
(F) Protein clusters discovered by the analysis of raw abundance-distribution profiles with HDBSCAN overlaid on the t-SNE projection. Distinct clusters are indicated by color.
(G) Mapping of 718 subcellular marker proteins on the t-SNE projection of T. gondii spatial proteome data.
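As a rough illustration of what "similarity of abundance-distribution profiles" means in (E), the sketch below scales each protein's fraction intensities to sum to 1 and compares the resulting profile shapes by Euclidean distance. The sum normalization, the distance measure, the four-fraction toy data, and all names are assumptions made for illustration; the study itself analyzed 30plex data with t-SNE and HDBSCAN.

```python
import math

def normalise(profile):
    """Scale a protein's fraction intensities so that they sum to 1."""
    total = sum(profile)
    if total == 0:
        raise ValueError("profile has no signal")
    return [x / total for x in profile]

def euclidean(p, q):
    """Euclidean distance between two equally long profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical 4-fraction profiles (the real data are 30plex).
profiles = {
    "protA": [10, 30, 50, 10],
    "protB": [20, 60, 100, 20],  # same shape as protA, twice as abundant
    "protC": [50, 30, 10, 10],   # peaks in a different fraction
}
norm = {name: normalise(p) for name, p in profiles.items()}

d_ab = euclidean(norm["protA"], norm["protB"])  # same shape, so distance is zero
d_ac = euclidean(norm["protA"], norm["protC"])  # different shape, so distance is larger
```

After normalization, proteins that cofractionate end up close together regardless of their absolute abundance, which is the property that the 2D projection and the clustering rely on.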
Figure 2. Validation of HyperLOPIT-Predicted Subcellular Locations
(A) Examples of uncharacterized proteins epitope tagged and detected by immunofluorescence microscopy (magenta) co-located with named marker proteins (green). Cell outlines are indicated (dashed lines). See Figure S2 for all validated proteins. Scale bar, 10 μm.
(B) Optical super-resolution (3D-SIM) images of select proteins (magenta) from (A) with subcellular marker proteins (green). Arrows indicate the posterior-to-anterior cell axis. Scale bar, 1 μm.
Figure 3. Protein Assignment to Known Subcellular Niches by Supervised Bayesian Classification
(A) TAGM-MAP-predicted steady-state locations of proteins (99% probability) superimposed on the t-SNE projection of the 30plex hyperLOPIT data for 3,832 proteins.
(B) The number of proteins assigned to each location. Marker proteins (Mk: previously characterized proteins + verified proteins as in Figures 2 and S2) are indicated in a dark color, newly assigned protein predictions (Pd: at 99% TAGM-MAP probability) in a light color.
(C) Heatmap showing proteins ordered by the TAGM-MAP-assigned class (rows) against joint probabilities of proteins to belong to each of the 26 defined subcellular classes or the outlier component (columns) inferred by TAGM-MCMC. Colorbars on the right show the uncertainty of TAGM-MCMC localization as the 95% equitailed confidence interval of the TAGM-MCMC localization probability (in shades of gray) and the mean Shannon entropy (in shades of red).
(D) A violin plot showing an example TAGM-MCMC distribution of localization probabilities across the 26 subcellular niches. The most probable location predicted by TAGM-MAP and TAGM-MCMC for this protein is PM-integral, but there is also a significant probability of localization to Golgi, consistent with signals seen for proteins that might cycle between multiple compartments.
(E) Fractions of monotopic and polytopic integral membrane proteins (blue and red, respectively) by subcellular class.
(F) Compartment-specific distributions of protein charge (computed pI) are shown as Tukey box plots (legend at right). The probability of class-specific means differing from the dataset average by chance is shown to the right. See also Figures S3 and S4; Tables S4 and S6A.
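The 99% probability cutoff in (A) and (B) and the Shannon entropy in (C) can be illustrated with a small sketch. It is not the TAGM implementation: the function names, the "unknown" label for proteins below the cutoff, and the example probabilities are hypothetical, and only two of the 26 classes are shown.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def assign(class_probs, threshold=0.99):
    """Return the most probable class, or 'unknown' if it falls below the threshold."""
    best = max(class_probs, key=class_probs.get)
    return best if class_probs[best] >= threshold else "unknown"

def mean_entropy(samples):
    """Average the entropy over a list of sampled probability vectors."""
    return sum(shannon_entropy(s) for s in samples) / len(samples)

# Hypothetical localization probabilities for two proteins.
confident = {"PM-integral": 0.995, "Golgi": 0.005}
ambiguous = {"PM-integral": 0.6, "Golgi": 0.4}

label_confident = assign(confident)  # passes the 99% cutoff
label_ambiguous = assign(ambiguous)  # stays unassigned
h_confident = shannon_entropy(confident.values())
h_ambiguous = shannon_entropy(ambiguous.values())
```

A protein whose probability mass is split between two niches, as in the example in (D), has a higher entropy than one assigned almost entirely to a single class, so entropy serves as a per-protein measure of localization uncertainty.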
Figure 4. Correlation of Gene-Expression Patterns within Subcellular Compartments
(A) Schematic of analysis of gene co-expression according to protein location. The distribution of co-expression levels between members of a cluster (blue) is plotted against this distribution between members of the cluster and all other genes (orange).
(B) Gene co-expression levels for select hyperLOPIT clusters measured as Pearson correlations. Cohen's d values are shown above each chart along with effect size descriptors. See also Figure S5; Table S5.
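The comparison described in this figure can be sketched as follows: Pearson correlation measures co-expression between two genes, and Cohen's d measures how far the within-cluster distribution of correlations sits from the cluster-versus-rest distribution. This is a minimal sketch assuming the pooled-standard-deviation form of Cohen's d; the helper names and toy values are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, stdev

def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation (an assumption)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)

# Hypothetical correlation values: gene pairs within one cluster
# versus pairs formed by cluster members and all other genes.
within_cluster = [0.8, 0.7, 0.9]
cluster_vs_rest = [0.1, 0.2, 0.0]
effect = cohens_d(within_cluster, cluster_vs_rest)
```

A large positive d indicates that genes whose proteins share a compartment are more strongly co-expressed with each other than with the rest of the genome.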
Figure 5. Distinction of Properties of Apicomplexan Signal Peptide and Transmembrane Domain Sequences According to Subcellular Compartment
(A) Differences in relative positional abundances of amino acids for signal peptide (SP) sequences of proteins from apicomplexan endomembrane compartments shown as logo plots anchored on the cleavage site (position 0). See also Figure S6; Tables S7A and S7B. Amino acids are colored by physicochemical properties.
(B) Distributions of apicomplexan transmembrane (TM) span length for single-span proteins of different compartments. The length distributions (violin plots) were compared pairwise by the Mann-Whitney U test, and the resulting p values (heatmap) were used to cluster membrane type. See also Table S7C.
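For readers unfamiliar with the pairwise comparison of TM span lengths, the sketch below computes a two-sided Mann-Whitney U test. It uses the normal approximation with a tie correction and no continuity correction, which is an assumption; the paper does not state which variant was used, and the span lengths shown are hypothetical.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test.

    Normal approximation with tie correction, no continuity correction.
    Returns (U statistic for x, p value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1                    # pooled[i:j] is one group of tied values
        t = j - i
        midrank = (i + 1 + j) / 2     # average of the ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:                      # all values identical
        return u, 1.0
    z = (u - mu) / math.sqrt(var)
    return u, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical TM span lengths (residues) for two compartments.
short_spans = [19, 20, 21, 22]
long_spans = [23, 24, 25, 26, 27]
u_stat, p_value = mann_whitney_u(short_spans, long_spans)
```

Running such a test for every pair of compartments yields the matrix of p values that a heatmap like the one in (B) displays.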
Figure 6. T. gondii Subcellular Compartments Show Distinct Distributions of the Functional Redundancy of the Proteomes, Selection Pressure, and Genetic Polymorphism
(A) Compartment-specific distribution of protein functional redundancy expressed as the average gene knockout (KO) phenotype score quantifying the contribution of each T. gondii gene to the parasite fitness during in vitro culture (a negative score indicates relatively indispensable genes; a positive score indicates dispensable genes).
(B) Compartment-specific distributions of evolutionary selection pressures expressed as the protein-average ratio of nonsynonymous and synonymous mutation rates (dN/dS ratio).
(C) Compartment-specific distributions of genetic polymorphism expressed as the density of SNPs per kilobase of gene coding sequence (CDS).
Compartment-specific distributions are shown as Tukey box plots as for Figure 3F.
See also Figure S7; Tables S6B–S6D.
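Two of the quantities in this figure are simple to state in code: SNP density per kilobase of CDS, and the summary statistics behind a Tukey box plot. The sketch below assumes the common Tukey convention (whiskers reach the most extreme values within 1.5 times the interquartile range of the quartiles) and the "inclusive" quartile method; the paper's plotting defaults may differ, and the function names and values are hypothetical.

```python
from statistics import quantiles

def snp_density_per_kb(n_snps, cds_length_bp):
    """Number of SNPs per kilobase of coding sequence."""
    return n_snps / (cds_length_bp / 1000)

def tukey_summary(values):
    """Quartiles, whisker ends, and outliers under the 1.5 x IQR rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }

density = snp_density_per_kb(6, 1500)  # 6 SNPs in a 1.5 kb CDS
summary = tukey_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

Computing one such summary per compartment gives the boxes, whiskers, and outlier points that are compared across subcellular classes.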
Figure 7. T. gondii Subcellular Compartment Proteomes Reveal the Tempo of Compartment Evolution Over Evolutionary Time
A dot plot showing the distribution of significant enrichments for new protein orthologues at twelve phylogenetic distance levels within hyperLOPIT-defined apicomplexan compartment classes. p values (colors) calculated by under-representation hypergeometric test and scaled according to the gene ratio (fraction of novel proteins in a compartment against all novel proteins at a given phylogenetic distance level). Toxo./Ham., Toxoplasma/Hammondia; SAR, stramenopiles/Alveolata/Rhizaria.
See also Tables S8A–S8F.
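The hypergeometric test and the gene ratio named in this legend can be sketched with the standard library alone. Both tails are shown because the choice of tail determines whether the test detects depletion or enrichment. The variable names and the toy counts are hypothetical and do not come from Tables S8A–S8F.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k novel proteins in a compartment of size n, drawn from
    N proteins in total, of which K are novel at this phylogenetic level."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def lower_tail(k, N, K, n):
    """P(X <= k): probability of seeing this few novel proteins or fewer."""
    lowest = max(0, n - (N - K))
    return sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))

def upper_tail(k, N, K, n):
    """P(X >= k): probability of seeing this many novel proteins or more."""
    highest = min(K, n)
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, highest + 1))

def gene_ratio(k, K):
    """Fraction of all novel proteins at this level that fall in the compartment."""
    return k / K

# Hypothetical counts: 10 proteins, 4 novel, a compartment of 5 with 3 novel.
p_lower = lower_tail(3, N=10, K=4, n=5)
p_upper = upper_tail(3, N=10, K=4, n=5)
ratio = gene_ratio(3, 4)
```

In a dot plot of this kind, the p value sets the color of each dot and the gene ratio sets its size, one dot per compartment and phylogenetic distance level.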
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Rabbit polyclonal anti-RON4 | Laboratory of John Boothroyd | N/A |
| Rabbit polyclonal anti-MIC2 | Laboratory of David Sibley | N/A |
| Mouse monoclonal anti-MIC3 (A80D) | Thermo Fisher Scientific | Cat#MA5-18267; RRID: |
| Rabbit polyclonal anti-BiP | Laboratory of Jay Bangs | N/A |
| Rabbit polyclonal anti-TOM40 | Laboratory of Giel van Dooren | N/A |
| Rabbit polyclonal anti-CPN60 | Laboratory of Boris Striepen | N/A |
| Rabbit polyclonal anti-GAP45 | Laboratory of Dominique Soldati-Favre | N/A |
| Mouse monoclonal [TP3] anti-SAG1 | Abcam | Cat#ab8313; RRID: |
| Rabbit polyclonal anti-profilin | Laboratory of Dominique Soldati-Favre | N/A |
| Rabbit polyclonal anti-histone H3 | Abcam | Cat#ab1791; RRID: |
| Rabbit polyclonal anti-GRA1 | Laboratory of Corinne Mercier | N/A |
| Rabbit polyclonal anti-CPL | Laboratory of Vern Carruthers | N/A |
| Rabbit polyclonal anti-CRT | Laboratory of Vern Carruthers | N/A |
| Rabbit polyclonal anti-catalase | Laboratory of Dominique Soldati-Favre | N/A |
| Mouse monoclonal (7E8) anti-ISP1 | Laboratory of Peter Bradley | N/A |
| Mouse monoclonal anti-ROP1 | Laboratory of John Boothroyd | N/A |
| Mouse monoclonal anti-V5 | Invitrogen | Cat#R960-25; RRID: |
| Rat monoclonal anti-HA | Roche | Cat#ROAHAHA; RRID: |
| Peroxidase AffiniPure Goat Anti-Rabbit IgG (H+L) | Jackson Immunoresearch | Cat#111-035-003; RRID: |
| Peroxidase AffiniPure Goat Anti-Mouse IgG (H+L) | Jackson Immunoresearch | Cat#115-035-003; RRID: |
| Peroxidase AffiniPure Donkey Anti-Rat IgG (H+L) | Jackson Immunoresearch | Cat#712-035-153; RRID: |
| Goat anti-Rat IgG (H+L) secondary, Alexa Fluor 594-conjugated | Invitrogen | Cat#A-11007; RRID: |
| Goat anti-Rat IgG (H+L) secondary, Alexa Fluor 488-conjugated | Invitrogen | Cat#A-11006; RRID: |
| Goat anti-Mouse IgG (H+L) secondary, Alexa Fluor 488-conjugated | Invitrogen | Cat#A-11029; RRID: |
| TMT10plex™ Isobaric Label Reagent Set, 3x0.8mg | Thermo Fisher Scientific | Cat#90111 |
| OptiPrep™ Density Gradient Medium | Sigma Aldrich | Cat#D1556 |
| Sequencing Grade Modified Trypsin | Promega | Cat#V5111 |
| Benzonase Nuclease HC | Merck Millipore | Cat#71205-3 |
| SuperSignal West Pico Chemiluminescent Substrate | Thermo Fisher Scientific | Cat#34080 |
| cOmplete™, EDTA-free Protease Inhibitor Cocktail | Sigma Aldrich | Cat#11873580001 |
| cOmplete™, Mini, EDTA-free Protease Inhibitor Cocktail | Sigma Aldrich | Cat#4693159001 |
| BpiI | Thermo Fisher Scientific | Cat#ER1011 |
| BsaI | New England Biolabs | Cat#R3535 |
| Streptavidin, Alexa Fluor™ 594 conjugate | Invitrogen | Cat#S11227 |
| Pierce BCA Protein Quantitation Assay | Thermo Fisher Scientific | Cat#23227 |
| Raw LC-SPS-MS3 data, peptide and protein proteomic identification and quantification results | This paper and PRIDE Archive | PRIDE ID: |
| hyperLOPIT spatial proteome map of | This paper; | |
| Analysed data | This paper | |
| Human: Human Foreskin Fibroblasts | Laboratory of Chris Tonkin | N/A |
| | Laboratory of Boris Striepen | N/A |
| | Laboratory of Boris Striepen | N/A |
| Primers and oligonucleotides for the generation of transgenic | This paper | N/A |
| Plasmid: pGEM T-Easy | Promega | Cat#A1360 |
| Plasmid: pPR2-HA3 | | N/A |
| Plasmids P1-P10 for CRISPR/Cas9-assisted and PCR-mediated genomic tagging in | This study | N/A |
| R | R Core Team, 2020 | |
| RStudio | RStudio Team, 2020 | |
| TAGM-MAP and TAGM-MCMC | ||
| t-SNE | ||
| HDBSCAN | ||
| preprocessCore | Bioconductor | |
| OrthoFinder | ||
| Diamond | ||
| Clustal Omega | ||
| SignalP | ||
| TMHMM | ||
| Phobius | ||
| ImageJ | ||
| softWoRx | Applied Precision | N/A |
| NIS-Elements | Nikon | |
| MassLynx | Waters | |
| XCalibur | Thermo Fisher Scientific | |
| Proteome Discoverer | Thermo Fisher Scientific | |
| Mascot Server | Matrix Science | |
| Interactive interface to the annotated spatial proteome data | This paper | |