Amy L Bauernfeind, Courtney C Babbitt.
Abstract
BACKGROUND: Next generation sequencing methods are the gold standard for evaluating expression of the transcriptome. When determining the biological implications of such studies, the assumption is often made that transcript expression levels correspond to protein levels in a meaningful way. However, the strength of the overall correlation between transcript and protein expression is inconsistent, particularly in brain samples.
Keywords: Chimpanzee; Gene; Proteomics; RNA-Seq
Year: 2017 PMID: 28438116 PMCID: PMC5402646 DOI: 10.1186/s12864-017-3674-x
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Cumulative percentages of R² values. GO categories of biological process (green), molecular function (red) and cellular component (blue) are plotted together. The lighter green, red, and blue lines represent the results of permutation tests that produced 1000 OLS regressions from randomly sampled transcripts and proteins of equivalent categorical sizes of biological process, molecular function, and cellular component, respectively. The observed data from all three GO annotations have distributions with a positive skew, showing that some categories of molecular function and cellular component are more predictive of protein expression than randomly associated transcripts and proteins. The data are plotted with a bin width of 0.05, and the representative lines are Gaussian smoothed. Duplicate categories that contain the same molecules and have the same R² value as another category are not represented.
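The permutation procedure described in this caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`r_squared`, `permuted_r2`), the toy expression values, and the random seeds are hypothetical, and independent random sampling of transcripts and proteins is an assumed reading of "randomly associated". Only the 1000 resamples, the matched category size, and the example R² of 0.51 (a category of size 10 in the table of GO categories) come from the text.

```python
import random

def r_squared(x, y):
    """R^2 of a simple OLS fit of y on x (equal to the squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate case: no variance to explain
    return sxy ** 2 / (sxx * syy)

def permuted_r2(transcripts, proteins, size, n_perm=1000, seed=0):
    """Null R^2 values from randomly associated transcripts and proteins.

    Each permutation draws `size` transcripts and `size` proteins
    independently, so any pairing between them is arbitrary.
    """
    rng = random.Random(seed)
    null = []
    for _ in range(n_perm):
        xs = rng.sample(transcripts, size)
        ys = rng.sample(proteins, size)
        null.append(r_squared(xs, ys))
    return null

# Toy data: hypothetical log-scale expression values, not real measurements.
toy_rng = random.Random(1)
transcripts = [toy_rng.gauss(5, 1) for _ in range(200)]
proteins = [toy_rng.gauss(5, 1) for _ in range(200)]

null = permuted_r2(transcripts, proteins, size=10)

# Share of null R^2 values at least as large as an observed category R^2.
observed = 0.51
p_emp = sum(v >= observed for v in null) / len(null)
```

Matching `size` to the observed category matters because small categories produce high R² values by chance more often than large ones.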
GO categories with the highest predictive value between transcript/protein pairs
| Category | N | R² | p-value | Slope | Slope CI | Intercept | Intercept CI |
|---|---|---|---|---|---|---|---|
| **Biological process** | | | | | | | |
| regulation of protein modification process | 10 | 0.51 | 0.0211 | 0.38 | 0.07 – 0.68 | 4.06 | 3.37 – 4.75 |
| positive regulation of cell communication | 11 | 0.50 | 0.0145 | 0.47 | 0.12 – 0.82 | 3.86 | 3.05 – 4.68 |
| membrane lipid metabolic process | 11 | 0.47 | 0.0192 | 0.75 | 0.15 – 1.35 | 2.99 | 1.52 – 4.46 |
| regulation of phosphate metabolic process | 10 | 0.47 | 0.0290 | 0.38 | 0.05 – 0.7 | 4.08 | 3.34 – 4.82 |
| regulation of synaptic plasticity | 10 | 0.46 | 0.0317 | 0.73 | 0.08 – 1.38 | 3.58 | 1.88 – 5.29 |
| regulation of kinase activity | 15 | 0.44 | 0.0070 | 0.76 | 0.25 – 1.27 | 3.16 | 1.85 – 4.46 |
| protein amino acid phosphorylation | 34 | 0.43 | 0.0000 | 0.56 | 0.33 – 0.79 | 3.8 | 3.27 – 4.33 |
| phosphate metabolic process | 55 | 0.40 | 0.0000 | 0.57 | 0.38 – 0.76 | 3.75 | 3.31 – 4.19 |
| cellular protein complex assembly | 16 | 0.40 | 0.0089 | 0.67 | 0.2 – 1.15 | 4.02 | 2.87 – 5.17 |
| phosphorylation | 41 | 0.38 | 0.0000 | 0.56 | 0.33 – 0.79 | 3.82 | 3.3 – 4.35 |
| endocytosis | 19 | 0.37 | 0.0055 | 0.57 | 0.19 – 0.95 | 4.07 | 3.2 – 4.93 |
| calcium ion transport | 11 | 0.37 | 0.0483 | 0.65 | 0.01 – 1.3 | 3.72 | 2.04 – 5.39 |
| DNA metabolic process | 11 | 0.35 | 0.0548 | 0.6 | -0.02 – 1.21 | 3.77 | 2.48 – 5.05 |
| negative regulation of protein metabolic process | 19 | 0.35 | 0.0081 | 0.45 | 0.13 – 0.76 | 4.08 | 3.23 – 4.92 |
| cell cycle | 14 | 0.35 | 0.0272 | 0.5 | 0.07 – 0.93 | 3.7 | 2.77 – 4.62 |
| positive regulation of cell proliferation | 20 | 0.34 | 0.0068 | 0.41 | 0.13 – 0.69 | 4.02 | 3.50 – 4.54 |
| regulation of hydrolase activity | 18 | 0.34 | 0.0110 | 0.53 | 0.14 – 0.92 | 3.91 | 2.98 – 4.84 |
| regulation of actin filament length | 12 | 0.34 | 0.0477 | 0.48 | 0.01 – 0.96 | 4.18 | 2.87 – 5.49 |
| cell projection organization | 14 | 0.33 | 0.0306 | 0.67 | 0.07 – 1.26 | 3.64 | 2.00 – 5.28 |
| regulation of growth | 12 | 0.33 | 0.0499 | 0.51 | 0 – 1.02 | 3.93 | 2.67 – 5.18 |
| regulation of cell growth | 10 | 0.33 | 0.0824 | 0.48 | -0.08 – 1.04 | 3.89 | 2.55 – 5.24 |
| positive regulation of cellular process | 64 | 0.31 | 0.0000 | 0.44 | 0.27 – 0.61 | 4.16 | 3.77 – 4.54 |
| protein modification process | 65 | 0.30 | 0.0000 | 0.46 | 0.28 – 0.64 | 3.96 | 3.55 – 4.38 |
| **Molecular function** | | | | | | | |
| phosphatase activity | 16 | 0.47 | 0.0036 | 0.97 | 0.37 – 1.56 | 2.67 | 1.24 – 4.11 |
| phosphoprotein phosphatase activity | 12 | 0.41 | 0.0241 | 0.72 | 0.11 – 1.32 | 3.29 | 1.83 – 4.74 |
| protein kinase activity | 33 | 0.37 | 0.0002 | 0.55 | 0.28 – 0.81 | 3.76 | 3.2 – 4.33 |
| protein domain specific binding | 18 | 0.37 | 0.0079 | 0.55 | 0.16 – 0.93 | 3.74 | 2.82 – 4.65 |
| phosphotransferase activity alcohol group as acceptor | 45 | 0.35 | 0.0000 | 0.56 | 0.33 – 0.79 | 3.8 | 3.31 – 4.3 |
| manganese ion binding | 12 | 0.34 | 0.0451 | -0.39 | -0.76 – -0.01 | 5.25 | 4.52 – 5.98 |
| ATPase activity coupled to transmembrane movement of ions phosphorylative mechanism | 17 | 0.34 | 0.0135 | 0.39 | 0.09 – 0.69 | 4.74 | 4.04 – 5.44 |
| GTPase activity | 52 | 0.33 | 0.0000 | 0.48 | 0.29 – 0.68 | 4.28 | 3.8 – 4.75 |
| calmodulin binding | 31 | 0.33 | 0.0008 | 0.52 | 0.24 – 0.81 | 4.11 | 3.4 – 4.83 |
| ligase activity | 20 | 0.31 | 0.0102 | 0.42 | 0.11 – 0.72 | 3.85 | 3.11 – 4.58 |
| **Cellular component** | | | | | | | |
| outer membrane | 10 | 0.66 | 0.0043 | 0.88 | 0.37 – 1.40 | 3.4 | 2.30 – 4.51 |
| membrane coat | 13 | 0.46 | 0.0107 | 0.76 | 0.21 – 1.30 | 3.58 | 2.25 – 4.90 |
| extracellular matrix | 11 | 0.38 | 0.0435 | 0.67 | 0.02 – 1.32 | 3.71 | 2.16 – 5.25 |
| organelle lumen | 10 | 0.37 | 0.0612 | -0.38 | -0.79 – 0.02 | 5.93 | 4.98 – 6.89 |
| mitochondrial membrane part | 21 | 0.36 | 0.0043 | 0.84 | 0.30 – 1.39 | 3.23 | 1.94 – 4.52 |
| cell surface | 12 | 0.30 | 0.0630 | 0.35 | -0.02 – 0.72 | 4.35 | 3.43 – 5.27 |
| membrane-enclosed lumen | 12 | 0.30 | 0.0638 | -0.37 | -0.77 – 0.03 | 6.01 | 5.07 – 6.94 |
Data in bold are the subcategories
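For readers who want to see where the slope, intercept, and R² columns come from, a per-category OLS fit can be sketched as below. This is an illustrative sketch only: the function name `ols_fit`, the category name, and the toy values are hypothetical, and regressing protein on transcript expression (rather than the reverse) is an assumption based on the phrase "predictive of protein expression". Confidence intervals are omitted because they need a t-distribution quantile, which the Python standard library does not provide.

```python
def ols_fit(transcript, protein):
    """Fit protein = intercept + slope * transcript by ordinary least squares.

    Returns (slope, intercept, r2).
    """
    n = len(transcript)
    mx = sum(transcript) / n
    my = sum(protein) / n
    sxx = sum((x - mx) ** 2 for x in transcript)
    sxy = sum((x - mx) * (y - my) for x, y in zip(transcript, protein))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R^2 = 1 - residual sum of squares / total sum of squares
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(transcript, protein))
    ss_tot = sum((y - my) ** 2 for y in protein)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical category with made-up transcript/protein pairs.
categories = {
    "toy_category": (
        [3.1, 4.0, 4.6, 5.2, 5.9, 6.3],   # transcript expression
        [5.0, 5.9, 5.6, 6.4, 6.6, 7.1],   # protein expression
    ),
}

results = {name: ols_fit(x, y) for name, (x, y) in categories.items()}
for name, (slope, intercept, r2) in results.items():
    print(f"{name}: slope={slope:.2f} intercept={intercept:.2f} R2={r2:.2f}")
```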
Fig. 2 Coefficients of determination for GO categories of molecular function and cellular component. R² values are typically quite low, but certain categories (labeled) display greater predictive value than others. These categories have R² values that exceed the mean of the 1000 resampled categories by approximately four standard deviations (two standard deviations above the means for the respective annotations).
Spearman rank correlation coefficients (ρ) between the transcript/protein categorical R² values and possible sources of variation
| | | Biological Process | Molecular Function | Cellular Compartment |
|---|---|---|---|---|
| Category Size | All categories | −0.08 | −0.05 | −0.25** |
| | Categories with < 20 transcript/protein pairs | −0.05 | −0.05 | −0.26 |
| | Categories with ≥ 20 transcript/protein pairs | 0.03 | −0.22* | −0.19 |
| Abundance | Mean gene expression | −0.09 | −0.17 | 0.10 |
| | Mean protein expression | −0.07 | 0.15 | 0.04 |
| Gene Length | All categories | 0.05 | −0.03 | 0.11 |
| Synthesis/degradation | Transcription rate | 0.17* | −0.31** | −0.29* |
| | Translation rate | 0.13 | 0.29** | −0.30** |
| | mRNA half-life | −0.10 | −0.36** | 0.26* |
| | Protein half-life | 0.12 | 0.05 | −0.24 |
*p value of < 0.05
**p value of < 0.01
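A Spearman coefficient of the kind reported in this table is the Pearson correlation of the ranked values. The sketch below shows one way to compute it with average ranks for ties; the function names and the toy category data are hypothetical, and significance testing (the asterisks in the table) is not covered.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical per-category values: R^2 and category size.
toy_r2 = [0.51, 0.47, 0.44, 0.40, 0.33, 0.30]
toy_size = [10, 11, 15, 55, 12, 65]
rho = spearman_rho(toy_r2, toy_size)
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed scale of variables such as category size or half-life.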
Results of multiple regression analyses of R² value (dependent variable) against transcription and translation rates and mRNA and protein half-lives (independent variables)
| | Estimate | Std. error | t-value | p-value |
|---|---|---|---|---|
| **Biological process** | | | | |
| Intercept | 4.23E-02 | 8.07E-02 | 0.52 | 0.60 |
| Transcription rate | 1.09E-02 | 4.06E-03 | 2.69 | <0.01 |
| Translation rate | 8.86E-05 | 5.11E-05 | 1.73 | 0.08 |
| mRNA half-life | -4.32E-03 | 5.60E-03 | −0.77 | 0.44 |
| Protein half-life | 5.61E-04 | 1.97E-04 | 2.84 | <0.01 |
| R² = 0.08, F(4, 207) = 4.315, | | | | |
| **Molecular function** | | | | |
| Intercept | 3.41E-01 | 1.02E-01 | 3.35 | <0.01 |
| Transcription rate | -8.63E-03 | 5.74E-03 | −1.5 | 0.14 |
| Translation rate | 4.12E-05 | 2.89E-05 | 1.43 | 0.16 |
| mRNA half-life | -1.38E-02 | 6.71E-03 | −2.06 | <0.05 |
| Protein half-life | -2.58E-04 | 1.97E-04 | −1.31 | 0.19 |
| R² = 0.13, F(4, 87) = 3.125, | | | | |
| **Cellular component** | | | | |
| Intercept | -1.89E-01 | 1.05E-01 | -1.80 | 0.08 |
| Transcription rate | -2.99E-02 | 8.26E-03 | −3.62 | <0.01 |
| Translation rate | 8.71E-05 | 6.15E-05 | 1.42 | 0.16 |
| mRNA half-life | 3.63E-02 | 6.71E-03 | 5.40 | <0.01 |
| Protein half-life | -8.21E-04 | 2.79E-04 | −2.94 | <0.01 |
| R² = 0.40, F(4, 63) = 10.41, | | | | |
Data in bold are the subcategories
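The structure of these multiple regressions (one response, four predictors plus an intercept) can be sketched with the normal equations as below. This is a minimal sketch under stated assumptions, not the authors' analysis: the function names, the toy predictors, and the seed are hypothetical, and standard errors and p-values are left out. The `f_statistic` helper uses the standard relation between R² and the overall F test; with the rounded R² of 0.40 and 4 and 63 degrees of freedom it gives 10.5, close to the 10.41 reported for the third block.

```python
import random

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def multiple_regression(rows, y):
    """OLS of y on the predictor rows plus an intercept.

    Returns (coefs, r2), where coefs[0] is the intercept.
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(xi[a] * xi[b] for xi in X) for b in range(p)] for a in range(p)]
    xty = [sum(xi[a] * yi for xi, yi in zip(X, y)) for a in range(p)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, xi)) for xi in X]
    my = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return coefs, 1 - ss_res / ss_tot

def f_statistic(r2, k, n):
    """Overall F statistic for k predictors and n observations."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Toy data: 30 hypothetical categories, each with four made-up predictors
# (transcription rate, translation rate, mRNA half-life, protein half-life).
rng = random.Random(0)
rows = [[rng.uniform(0, 10) for _ in range(4)] for _ in range(30)]
y = [0.2 + 0.01 * r[0] - 0.005 * r[2] + rng.gauss(0, 0.05) for r in rows]

coefs, r2 = multiple_regression(rows, y)
f_value = f_statistic(r2, k=4, n=len(y))
```

For real data, an established statistics package would be a safer choice than hand-rolled linear algebra, since it also reports standard errors and handles ill-conditioned predictors.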