Vijay Phanindra Srikanth Kompella, Ian Stansfield, Maria Carmen Romano, Ricardo L Mancera.
Abstract
The cytoplasm is a densely packed environment filled with macromolecules whose diffusion is hindered. Molecular simulation of the diffusion of biomolecules under such macromolecular crowding conditions requires the definition of a simulation cell with a cytoplasmic-like composition. This has been previously done for prokaryotic cells (E. coli) but not for eukaryotic cells such as yeast, a model organism. Yeast proteomics datasets vary widely in terms of cell growth conditions, the technique used to determine protein composition, the reported relative abundance of proteins, and the units in which abundances are reported. We determined that the gene ontology profiles of the most abundant proteins across these datasets are similar, but their abundances vary greatly. To overcome this problem, we chose five mass spectrometry proteomics datasets that fulfilled the following criteria: high internal consistency, consistency with published experimental data, and freedom from GFP-tagging artifacts. Using these datasets, the contents of a simulation cell containing a single 80S ribosome were defined, such that the macromolecular density and the mass ratio of ribosomal-to-cytoplasmic proteins were consistent with experiment and with the chosen datasets. Finally, multiple tRNAs were added, consistent with their experimentally determined number in the yeast cell. The resulting composition can be readily used in molecular simulations representative of yeast cytoplasmic macromolecular crowding conditions to characterize a variety of phenomena, such as protein diffusion, protein-protein interactions and biological processes such as protein translation.
Keywords: macromolecular crowding; molecular dynamics; protein translation; proteomics; yeast
Year: 2019 PMID: 31632983 PMCID: PMC6783697 DOI: 10.3389/fmolb.2019.00097
Source DB: PubMed Journal: Front Mol Biosci ISSN: 2296-889X
Figure 1. Distribution of protein mass per cell (calculated as the product of molecular weight and abundance) plotted as a function of the mass rank of each protein. Proteins in the yeast proteomics dataset were ranked according to their mass, which exhibits a clear exponential decrease as a function of mass rank in the cell. The inset shows the cumulative percentage of mass as a function of rank. The top 200 cytoplasmic proteins account for ~70% of the total cell protein mass.
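The quantities plotted in Figure 1 can be illustrated with a minimal sketch, assuming each protein is described by a molecular weight and a copy number per cell. The function name `rank_by_mass` and the toy values are hypothetical and are not taken from any of the datasets.

```python
from itertools import accumulate


def rank_by_mass(proteins):
    """Rank proteins by total mass per cell and return cumulative mass percentages.

    proteins: dict mapping a protein name to (molecular_weight, copies_per_cell).
    """
    # Mass per cell is the product of molecular weight and abundance.
    masses = {name: mw * copies for name, (mw, copies) in proteins.items()}
    # Rank 1 is the protein contributing the most mass.
    ranked = sorted(masses.items(), key=lambda item: item[1], reverse=True)
    total = sum(masses.values())
    # Cumulative percentage of total mass as a function of rank (the inset).
    cumulative = [100.0 * c / total for c in accumulate(m for _, m in ranked)]
    return ranked, cumulative


# Hypothetical toy input: name -> (molecular weight in Da, copies per cell).
toy = {"A": (50_000, 1_000), "B": (20_000, 500), "C": (100_000, 400)}
ranked, cumulative = rank_by_mass(toy)
for rank, ((name, mass), pct) in enumerate(zip(ranked, cumulative), start=1):
    print(rank, name, mass, round(pct, 1))
```

Reading off the cumulative value at rank 200 in a real dataset would give the kind of figure quoted in the caption (~70%).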
Figure 2. Statistical analyses of proteomics datasets. (A) Pairwise correlations between the ontological profiles obtained for the individual datasets. Correlations were measured using the Pearson correlation coefficient, whose values are color-coded (from the highest correlation in yellow to the lowest correlation in blue). (B) The ontology profile overlap between datasets is quantified using the Jaccard index; the color code is the same as in panel A. In both panels, mass spectrometry-based datasets are indicated in red on the axes and are labeled as LU (Lu et al., 2007), PENG (Peng et al., 2012), KUL (Kulak et al., 2014), LAW (Lawless et al., 2016), LAHT (Lahtvee et al., 2017), DGD (De Godoy et al., 2008), PIC (Picotti et al., 2013), LEE2 (Lee et al., 2011), THAK (Thakur et al., 2011), NAG (Nagaraj et al., 2012), and WEB (Webb et al., 2013); GFP datasets are shown in green on the axes and are labeled as TKA (Tkach et al., 2012), BRE (Breker et al., 2013), DEN (Denervaud et al., 2013), MAZ (Mazumder et al., 2013), CHO (Chong et al., 2015), YOF (Yofe et al., 2016), NEW (Newman et al., 2006), LEE (Lee et al., 2007), and DAV (Davidson et al., 2011); and the TAP-immunoblot dataset is shown in white on the axes and is labeled as GHA (Ghaemmaghami et al., 2003). The top 200 proteins are shown to have a similar gene ontology profile across all of the datasets.
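The two similarity measures named in Figure 2 can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: an ontology profile is assumed here to be a vector of counts per gene ontology term (for the Pearson coefficient) or a set of terms (for the Jaccard index), and the example profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        raise ValueError("Jaccard index is undefined for two empty sets")
    return len(a & b) / len(union)


# Hypothetical profiles: counts per ontology term for two datasets.
profile_1 = [12, 30, 7, 51]
profile_2 = [10, 28, 9, 55]
print(pearson(profile_1, profile_2))

# Hypothetical sets of ontology terms found among the top proteins.
terms_1 = {"translation", "glycolysis", "protein folding"}
terms_2 = {"glycolysis", "protein folding", "ribosome biogenesis"}
print(jaccard(terms_1, terms_2))
```

The Pearson coefficient compares how term frequencies co-vary, whereas the Jaccard index only asks which terms are shared, so the two panels capture complementary notions of similarity.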
Figure 3. Testing of statistical difference between the abundance of ribosomal proteins in each of the datasets. Mass spectrometry-based datasets are shown in red on the axes, GFP datasets are shown in green on the axes and the TAP-immunoblot dataset is shown in white. Ribosomal protein numbers were not reported in the YOF dataset and, therefore, it is not included. The results of t-tests with p > (0.05/190) are colored dark blue and all others are colored light blue. GFP datasets exhibit a high level of consistency. There is also consistency among the first five MS datasets. However, there are no discernible patterns in terms of the growth media, growth phase or protein abundance units.
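The threshold 0.05/190 in Figure 3 is consistent with a Bonferroni-style correction over all pairwise comparisons: 21 datasets are listed in Figure 2, 20 remain once YOF is excluded, and 20 datasets give 190 distinct pairs. The caption does not name the correction, so this reading is an assumption. A minimal sketch of the arithmetic and the color rule, with hypothetical function names and p-values:

```python
from math import comb


def pairwise_threshold(n_datasets, alpha=0.05):
    """Significance level divided by the number of distinct dataset pairs."""
    return alpha / comb(n_datasets, 2)


def cell_color(p_value, threshold):
    """Color rule from the caption: p above the threshold is dark blue."""
    return "dark blue" if p_value > threshold else "light blue"


n_pairs = comb(20, 2)               # 20 datasets after excluding YOF
threshold = pairwise_threshold(20)  # 0.05 / 190

# Hypothetical p-values for two dataset pairs.
print(n_pairs, threshold)
print(cell_color(0.01, threshold))  # not significantly different after correction
print(cell_color(1e-6, threshold))  # significantly different
```

Under this rule a dark blue cell marks a pair of datasets whose ribosomal protein abundances are not significantly different at the corrected level.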