Joao C Guimaraes, Miguel Rocha, Adam P Arkin.
Abstract
The range over which a protein is expressed and its cell-to-cell variability are often thought to be linked to the demand for its activity. Steady-state protein level is determined by multiple mechanisms controlling transcription and translation, many of which are limited by DNA- and RNA-encoded signals that affect initiation, elongation and termination of polymerases and ribosomes. We performed a comprehensive analysis of >100 sequence features to derive a predictive model composed of a minimal non-redundant set of factors explaining 66% of the total variation of protein abundance observed in >800 genes in Escherichia coli. The model suggests that protein abundance is primarily determined by the transcript level (53%) and by effectors of translation elongation (12%), whereas only a small fraction of the variation is explained by translational initiation (1%). Our analyses uncover a previously undescribed sequence determinant affecting translation initiation and suggest that elongation rate is affected by both codon biases and specific amino acid composition. We also show that transcription and translation efficiency may have an effect on expression noise, which is more similar than previously assumed.
Year: 2014 PMID: 24510099 PMCID: PMC4005695 DOI: 10.1093/nar/gku126
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Determinants of protein abundance (PA) in E. coli. (A) Predicted versus experimentally measured protein concentration using a composite model with 16 predictors (R² = 0.66 and cross-validated (CV) R² = 0.65). (B) Aggregated explanation of PA variation by each group of predictors. (C) Regression coefficients for all the predictors in the model. Error bars represent the standard deviation of the regression coefficients based on jackknife variance estimates from a 10-fold CV procedure.
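The error bars in panel (C) rest on a jackknife over cross-validation folds. As a minimal sketch of that idea, assuming a delete-one-fold jackknife and a single-predictor least-squares fit (the paper's model has 16 predictors, and its exact procedure is not given here), the following refits the slope with each fold left out and derives a standard deviation from the spread of the refitted slopes. The function names and the toy data are hypothetical.

```python
import math

def ols_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def jackknife_slope_sd(x, y, k=10):
    """Delete-one-fold jackknife: refit with each of k folds left out.

    Assumption: fold membership is index modulo k; the source does not
    say how genes were assigned to folds.
    """
    n = len(x)
    slopes = []
    for fold in range(k):
        keep = [i for i in range(n) if i % k != fold]
        slopes.append(ols_slope([x[i] for i in keep], [y[i] for i in keep]))
    mean_slope = sum(slopes) / k
    # Grouped jackknife variance: (k - 1) / k * sum of squared deviations.
    variance = (k - 1) / k * sum((s - mean_slope) ** 2 for s in slopes)
    return mean_slope, math.sqrt(variance)

# Toy data (not from the paper): a noisy linear trend.
x = [float(i) for i in range(30)]
y = [2.0 * xi + 1.0 + (0.5 if i % 3 == 0 else -0.25) for i, xi in enumerate(x)]
slope, sd = jackknife_slope_sd(x, y, k=10)
print(f"slope = {slope:.3f}, jackknife SD = {sd:.3f}")
```

With several predictors the same loop applies to each coefficient of the refitted model; only the fitting step changes.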
Factors’ individual correlation
| Variable | Correlation with PA | Partial correlation with PA given mRNA levels |
|---|---|---|
| mRNA level | 0.7262*** | 0 |
| TIR | | |
| 16S:SD (exterior loop ΔG) | 0.1075** | 0.1240*** |
| RBS calculator score | 0.1462*** | 0.1106** |
| Accessibility | 0.0606 | 0.0912* |
| Single-stranded bases | 0.0630 | 0.0884* |
| Folding energy (ΔG) | 0.0635 | 0.0729 |
| CDS | | |
| CAI | 0.5828*** | 0.3526*** |
| ATC | 0.3974*** | 0.2734*** |
| GAA | 0.3215*** | 0.2527*** |
| Ile | 0.1933*** | 0.2319*** |
| Glu | 0.2940*** | 0.2252*** |
List of the top five predictors with the most significant partial Pearson correlation coefficients with PA given the mRNA concentration, for each category of features considered. F-test P-values were adjusted using the false discovery rate (FDR) method (56) to correct for multiple testing: *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001.
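The third column of the table is a first-order partial correlation: the correlation between a feature and PA after the linear effect of mRNA level has been removed from both. A minimal sketch using the standard formula for one control variable is shown below. The variable names and toy values are hypothetical and are not taken from the paper's data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y with the linear effect of z removed from both."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Toy values (hypothetical): the feature and the protein level both track
# mRNA, and share nothing else once mRNA is accounted for.
mrna = [-1.0, -1.0, 1.0, 1.0]
feature = [0.0, -2.0, 2.0, 0.0]
protein = [0.0, -2.0, 0.0, 2.0]

print("raw correlation:    ", round(pearson(feature, protein), 4))
print("partial correlation:", round(partial_corr(feature, protein, mrna), 4))
```

In this toy case the raw correlation is positive only because both variables follow mRNA, so the partial correlation is close to zero. The table shows the opposite pattern for several CDS features, whose correlation with PA remains significant after mRNA is controlled for.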
Figure 2. Transcription and translation efficiency act in a concerted fashion. (A) Individual contribution of mRNA and CDS features for genes expressed at low (<4 molecules per cell on average), medium and high (>54 molecules per cell on average) levels. We observed a concerted contribution of mRNA levels and CDS features to the steady-state PA. (B) Most low-abundance genes tend to be expressed using medium to low levels of mRNA and a low contribution of the CDS features. (C) Genes expressed at medium abundance show a balance between mRNA and CDS contribution, where both factors are most often at average levels. (D) Highly abundant genes require high levels of mRNA and a medium-to-high contribution of CDS features. Heatmap shade indicates the number of genes.
Figure 3. Transcription and translation efficiency affect expression noise. Genes that have noisier expression tend to have less efficient transcription (A) and increased translation efficiency (defined as the aggregated contribution of TIR and CDS sequence features) (B). The genes were subdivided into two groups, low- and high-noise differential genes, according to the lower and upper quartiles of their noise differential levels. High/low-noise differential genes have a higher/lower than expected coefficient of variation given the mean expression. Mann–Whitney test significance: *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001.
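The comparison in Figure 3 can be sketched in two steps: split genes at the lower and upper quartiles of their noise differential, then compare a per-gene quantity between the two groups with a Mann–Whitney test. The sketch below assumes the noise differential values are already computed (how the expected coefficient of variation is obtained is not stated here) and uses a normal approximation for the P-value without tie or continuity correction, which is a simplification for illustration. All names and numbers are hypothetical.

```python
import math
import statistics

def split_by_quartiles(noise_diff, values):
    """Return values for genes in the lower and upper noise-differential quartiles."""
    q1, _, q3 = statistics.quantiles(noise_diff, n=4)
    low = [v for d, v in zip(noise_diff, values) if d <= q1]
    high = [v for d, v in zip(noise_diff, values) if d >= q3]
    return low, high

def mann_whitney_u(a, b):
    """U statistic for sample a and a two-sided normal-approximation P-value.

    Assumption: no tie or continuity correction in the variance term.
    """
    pooled = sorted((v, g) for g, vals in enumerate((a, b)) for v in vals)
    n = len(pooled)
    rank_sum_a = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j for tied values
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p

# Toy values (hypothetical): noise differential and translation efficiency.
noise_diff = [-0.4, -0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3]
efficiency = [0.8, 1.0, 1.1, 1.3, 1.2, 1.5, 1.7, 1.9]

low, high = split_by_quartiles(noise_diff, efficiency)
u, p = mann_whitney_u(low, high)
print(f"low-noise n = {len(low)}, high-noise n = {len(high)}, U = {u}, P = {p:.3f}")
```

For samples as small as this toy set, an exact test is more appropriate than the normal approximation; the approximation is used here only to keep the sketch short.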