Marlena Siwiak, Piotr Zielenkiewicz.
Abstract
Translation is still poorly characterised at the level of individual proteins, and its role in the regulation of gene expression has been constantly underestimated. To better understand the process of protein synthesis, we developed a comprehensive and quantitative model of translation, characterising protein synthesis separately for individual genes. The main advantage of the model is that, being based on only a few datasets and general assumptions, it allows the calculation of many important translational parameters that are extremely difficult to measure experimentally. In the model, each gene is assigned a set of translational parameters, namely the absolute number of transcripts, ribosome density, mean codon translation time, total transcript translation time, total time required for translation initiation and elongation, translation initiation rate, mean mRNA lifetime, and absolute number of proteins produced by gene transcripts. Most parameters were calculated from a single experimental dataset of genome-wide ribosome profiling. The model was implemented in Saccharomyces cerevisiae, and its results were compared with available data, yielding reasonably good correlations. The calculated coefficients were used to perform a global analysis of translation in yeast, revealing some interesting aspects of the process. We have shown that two commonly used measures of translation efficiency, ribosome density and the number of protein molecules produced, are affected by two distinct factors. High values of both measures are caused, among other factors, by very short translation initiation times; however, the origins of the reduction in initiation time are completely different in the two cases. The model is universal and can be applied to any organism, provided the necessary input data are available. The model allows us to better integrate transcriptomic and proteomic data. Several further applications of the model are discussed, using the yeast system as an example.
Year: 2010 PMID: 20686685 PMCID: PMC2912337 DOI: 10.1371/journal.pcbi.1000865
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Table 1. The translational parameters calculated in the model.
| par | mean | median | sd | min | max | description |
| L | 513.3 | 430.5 | 365.2 | 37 | 4911 | Length of the transcript CDS in codons. |
| x | 7.8 | 2.7 | 28.9 | 0.140 | 591.3 | Absolute number of transcripts in a yeast cell. |
| B | 1.0e+4 | 677 | 7.7e+4 | 0.650 | 2.4e+6 | Total amount of protein molecules produced from transcripts of a particular type. |
| g | 1.1 | 0.8 | 0.9 | 0.003 | 6.6 | Ribosome density in number of ribosomes attached to a transcript per 100 codons. |
| w | 5.6 | 3.1 | 7.3 | 0.010 | 142 | The absolute number of ribosomes on a transcript. |
| P | 5.3e-5 | 3.6e-5 | 5.4e-5 | 1.5e-7 | 6.2e-4 | The translation initiation frequency (the inverse of I). |
| Pz | 2.2e-4 | 7.6e-5 | 8.0e-4 | 3.8e-6 | 1.6e-2 | The relative rate of binding of free ribosomes to the 5′ end of a transcript. |
| Ps | 1.6e-2 | 6.4e-3 | 2.9e-2 | 5.2e-6 | 4.3e-1 | The relative rate of a successful accomplishment of initiation once the ribosome-mRNA complex is formed, normalised by the maximal observed value of Ps, reported for gene YLL040C. |
| T | 2:50 | 2:20 | 3:23 | 0:06 | 113:08 | Total time of translation of one protein molecule from a given transcript (min:sec). |
| I | 0:54 | 0:28 | 3:06 | 0:02 | 111:54 | Total time required for translation initiation (min:sec). |
| E | 1:56 | 1:36 | 1:24 | 0:04 | 17:54 | Total time required for translation elongation of a transcript (min:sec). |
| mean_E | 0.224 | 0.229 | 0.031 | 0.098 | 0.360 | Mean time required for elongation of one codon of a transcript (sec). |
| h | 2:45:51 | 1:31:44 | 3:59:18 | 0:00:19 | 42:27:31 | Estimated half-life of a transcript (h:min:sec). |
| m | 3:59:16 | 2:12:20 | 5:45:13 | 0:00:27 | 61:15:18 | Estimated mean life-time of a transcript (h:min:sec). |
Column descriptions: (1) name of the parameter; (2) mean value; (3) median value; (4) standard deviation; (5) minimal observed value; (6) maximal observed value; and (7) parameter description. For all parameters except B, h, and m, columns 2 to 6 were calculated over the entire dataset of 4,470 yeast genes. For parameters B, h, and m, columns 2 to 6 were calculated over the set of 4,192 genes.
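Several entries in the table above are tied together by simple relations that can be read off the parameter descriptions. The following Python sketch is not the authors' code. It assumes w = g·L/100, T = I + E, mean_E = E/L, and m = h/ln 2 (exponential decay of the transcript), and it assumes that P is expressed per millisecond, which is consistent with the tabulated medians (1/28 s ≈ 3.6e-5 per ms). The function names and the example transcript are hypothetical.

```python
import math


def to_seconds(clock: str) -> int:
    """Convert 'min:sec' or 'h:min:sec' (as printed in the table) to seconds."""
    total = 0
    for part in clock.split(":"):
        total = total * 60 + int(part)
    return total


def derived_parameters(L, g, I_s, E_s, h_s):
    """Relations inferred from the parameter descriptions (assumptions, see lead-in).

    L: CDS length in codons; g: ribosomes per 100 codons;
    I_s, E_s: initiation and elongation times in seconds;
    h_s: transcript half-life in seconds.
    """
    return {
        "w": g * L / 100.0,          # ribosomes attached to the transcript
        "T": I_s + E_s,              # total time per protein molecule (s)
        "P": 1.0 / (I_s * 1000.0),   # initiation frequency, assumed unit: 1/ms
        "mean_E": E_s / L,           # mean elongation time per codon (s)
        "m": h_s / math.log(2),      # mean lifetime, assuming exponential decay
    }


# Hypothetical transcript: 500 codons, 1 ribosome per 100 codons,
# 28 s initiation, 96 s elongation, half-life of 1:31:44.
example = derived_parameters(500, 1.0, 28, 96, to_seconds("1:31:44"))
```

With the median half-life from the table (1:31:44), the last relation gives a mean lifetime of about 2:12:20, which matches the tabulated median of m.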
Figure 1. Translation model of YJL173C.
The bottom plot shows all of the translation initiation events during the mean lifetime of one mRNA molecule. Translation initiations are marked with ribosome-shaped symbols. The orange line indicates the mean lifetime of YJL173C mRNA. The slope of the broken curves depicts the rate of polypeptide chain growth measured at particular codons. The number of curves indicates the number of protein molecules (here 46) produced from one mRNA during its lifetime. The top-right plot shows a magnified view of the translation of the first protein molecule (dark brown curve). Time is measured from the moment the transcript becomes accessible to the translation machinery. The first seconds are spent on translation initiation; elongation begins after about 10 sec. Red dots mark ribosome positions in time (dotted blue lines) and space (dashed blue lines) when subsequent ribosomes attach to the mRNA molecule. The histogram on the left shows the mean translation times of particular codons of the YJL173C sequence. The dashed black line is the mean translation time of one codon of the YJL173C mRNA sequence.
Table 2. Model-determined mRNA and protein abundances versus experimental studies.
| compared datasets | mRNA: common genes | mRNA: adj. R² | mRNA: regression coefficient | protein: common genes | protein: adj. R² | protein: regression coefficient |
| our dataset vs Gygi et al. | 67 | 0.84 | 1.25 | 69 | 0.97 | 1.01 |
| our dataset vs Futcher et al. | 28 | 0.84 | 1.24 | 26 | 0.98 | 0.92 |
| Gygi et al. vs Futcher et al. | 25 | 0.97 | 1.04 | 27 | 0.99 | 0.91 |
| our dataset vs Holstege et al. | 3769 | 0.30 | 0.75 | | | |
The comparison of mRNA and protein abundances obtained in the model (reflected by parameters x and B) with values reported by three independent experimental studies [13], [26], [29]. We performed a simple linear regression through the origin on the log-transformed values. Column descriptions: (common genes) number of genes common to the two compared datasets; (adj. R²) adjusted R² value of the linear regression model; and (regression coefficient) the fitted slope. The third row compares the two experimental studies with each other. All coefficients were statistically significant according to the F-statistic p-values.
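The regression described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes base-10 logarithms and the uncentred definition of R² that statistical packages such as R's lm commonly use for no-intercept models, with the adjustment computed as 1 − (1 − R²)·n/(n − 1) for one fitted parameter. The function name and the input values are hypothetical.

```python
import math


def origin_regression_log(x, y):
    """Fit log10(y) = a * log10(x) without an intercept.

    Returns (a, adjusted R^2). All inputs must be strictly positive.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]

    # Least-squares slope for a line forced through the origin.
    a = sum(p * q for p, q in zip(lx, ly)) / sum(p * p for p in lx)

    rss = sum((q - a * p) ** 2 for p, q in zip(lx, ly))  # residual sum of squares
    tss = sum(q * q for q in ly)                         # uncentred total sum of squares
    n = len(lx)

    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * n / (n - 1)              # one fitted parameter
    return a, adj_r2


# Hypothetical abundances from two datasets for the same three genes.
slope, adj_r2 = origin_regression_log([10, 100, 1000], [10, 1000, 100000])
```

A slope close to 1 means the two datasets agree in scale on the log axis, while the adjusted R² reports how much of the log-space variation the fitted line accounts for.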
Figure 2. Model results vs experimental studies.
The plots compare model parameters x (left) and B (right) with mRNA and protein abundances determined experimentally in two independent studies [13], [26]. The axes were log transformed. Calculated R² values are presented in Table 2. The distributions of the log-fold differences between the mRNA and protein concentrations reported by the model and by the reference studies are presented in Supplementary Figure S1.
Figure 3. Calculated transcript abundance vs experimental studies.
Left plot: comparison of model parameter x with mRNA abundances determined in a high-density oligonucleotide array (HDA) experiment [29]. The axes were log transformed. The calculated R² value for the comparison is presented in Table 2. Right plot: distribution of the log-fold differences between the mRNA concentrations reported by the model and by the reference study.
Figure 4. Correlation of mRNA and protein expression levels.
The plot shows the correlation between mRNA abundance (parameter x) and the number of protein molecules produced from a given gene (parameter B). We performed linear regression through the origin on log-transformed data. The adjusted R² value calculated over the entire dataset (4,192 genes of known B) was 0.59. This means that over 40% (in log space) of the variation in protein abundance cannot be explained by variation in mRNA abundance, suggesting additional, post-transcriptional mechanisms of gene expression regulation.
Table 3. Translational parameters of 20 genes of low transcriptional activity and high protein production rate.
| parameter | median | min | max |
| L | 753.5 | 205 | 1877 |
| x | 3.38 | 1.61 | 4.30 |
| B | 37794 | 25263 | 97900 |
| g | 3.22 | 1.35 | 4.86 |
| w | 20.05 | 8.32 | 63.25 |
| P | 1.7e-4 | 6.9e-5 | 3.3e-4 |
| Pz | 9.4e-5 | 4.5e-5 | 1.2e-4 |
| Ps | 2.9e-2 | 1.0e-2 | 8.3e-2 |
| T | 2:28 | 0:28 | 5:31 |
| I | 0:06 | 0:03 | 0:15 |
| E | 2:23 | 0:25 | 5:26 |
| mean_E | 0.193 | 0.124 | 0.213 |
| m | 24:09:05 | 8:18:33 | 56:26:44 |
The distribution of translational parameter values for the set of 20 genes with high protein production rates (parameter B) and relatively low transcriptional activity (parameter x). Column descriptions: (1) name of the parameter; (2) median value; (3) minimal observed value; and (4) maximal observed value. The units are the same as those presented in Table 1.
Figure 5. Codon optimality vs translation time.
The plot shows the comparison of translation times at 30 °C of individual yeast codons with the codon optimality values calculated by [36]. There is a negative correlation between the optimality value and the translation time of a codon. However, while optimal codons (high optimality values) have only short translation times, non-optimal codons may be translated at both high and low rates. The adjusted R² value obtained from a linear regression through the origin on log-transformed values indicates that translation speed may explain only 15% of the variability in optimality values.
Table 4. Times of tRNA insertion at four different temperatures.
| temp (°C) | cognate insertion time | near-cognate delay | non-cognate delay |
| 20 | 40.0 | 46.3 | 2.2 |
| 24 | 26.6 | 30.7 | 1.5 |
| 30 | 16.1 | 18.7 | 0.9 |
| 37 | 9.1 | 10.5 | 0.5 |
Values of the three time coefficients at four different temperatures: the average time to insert an amino acid from a cognate aa-tRNA, and the average time delays caused by binding attempts by near-cognate and non-cognate tRNAs, respectively. All times are in ms.
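One way to read the table above is to express each time relative to its value at 37 °C. The Python sketch below does only that; it is an illustration, not part of the model. The column order (cognate, near-cognate, non-cognate) is assumed to follow the order in the caption, and the function name is hypothetical.

```python
# Times in ms, copied from the table above.
# Assumed column order: (cognate, near-cognate, non-cognate).
TIMES_MS = {
    20: (40.0, 46.3, 2.2),
    24: (26.6, 30.7, 1.5),
    30: (16.1, 18.7, 0.9),
    37: (9.1, 10.5, 0.5),
}


def slowdown_relative_to(reference_temp=37):
    """Return each time divided by the corresponding time at reference_temp."""
    ref = TIMES_MS[reference_temp]
    return {
        temp: tuple(round(t / r, 2) for t, r in zip(times, ref))
        for temp, times in TIMES_MS.items()
    }


ratios = slowdown_relative_to(37)
```

Within the rounding of the tabulated values, the three ratios at each temperature are nearly equal, which suggests that the three times scale with temperature by a roughly common factor.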