Federica Conte, Federico Papa, Paola Paci, Lorenzo Farina.
Abstract
BACKGROUND: Gene expression is the result of the balance between transcription and degradation. Recent experimental findings have shown fine and specific regulation of RNA degradation and the presence of various molecular machinery devoted to this task, such as RNA-binding proteins and non-coding RNAs. A biological process can be studied by measuring time courses of RNA abundance in response to internal and/or external stimuli, using technologies such as microarrays or Next Generation Sequencing. Unfortunately, looking only at transcriptome abundance may not provide insight into its dynamic regulation. By contrast, independent simultaneous measurement of RNA expression and half-lives could provide such additional insight. A computational approach to estimating RNA half-lives from RNA expression time profiles can be a low-cost alternative to experimental measurement, which may also be affected by various artifacts.
Keywords: Bioinformatics; Computational biology; Gene expression time profiles; RNA half-lives
Year: 2022 PMID: 35596139 PMCID: PMC9123730 DOI: 10.1186/s12859-022-04730-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Fig. 1 Illustration of the mathematical model underlying StaRTrEK. Gene pairs used to estimate half-lives are assumed to have a common (up to a scaling factor) promoter activity and different half-life values, which can explain the different shapes of their gene expression time profiles
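As a rough illustration of the idea in Fig. 1, the sketch below assumes simple first-order kinetics, dx/dt = s·u(t) - (ln 2 / h)·x, where u(t) is a shared promoter activity, s a gene-specific scaling factor, and h the half-life. The equation, the Gaussian pulse used for u(t), the function name `simulate_expression`, and all parameter values are assumptions made for illustration; they are not taken from the StaRTrEK paper.

```python
import numpy as np

def simulate_expression(promoter, dt, half_life, scale=1.0, x0=0.0):
    """Forward-Euler integration of dx/dt = scale*u(t) - (ln 2 / half_life)*x.

    Assumed first-order model, for illustration only.
    """
    decay = np.log(2.0) / half_life
    x = np.empty(len(promoter))
    x[0] = x0
    for k in range(1, len(promoter)):
        x[k] = x[k - 1] + dt * (scale * promoter[k - 1] - decay * x[k - 1])
    return x

dt = 0.1
t = np.arange(0.0, 120.0, dt)

# Hypothetical shared promoter activity: a single pulse centred at t = 30.
u = np.exp(-0.5 * ((t - 30.0) / 5.0) ** 2)

# Two genes driven by the same promoter (up to a scaling factor),
# differing only in half-life and scale.
fast = simulate_expression(u, dt, half_life=5.0)
slow = simulate_expression(u, dt, half_life=40.0, scale=2.0)

peak_fast = t[np.argmax(fast)]
peak_slow = t[np.argmax(slow)]
print(f"peak time, short half-life: {peak_fast:.1f}")
print(f"peak time, long half-life:  {peak_slow:.1f}")
```

Under these assumptions, the transcript with the longer half-life peaks later and decays more slowly, even though both profiles are driven by the same promoter activity.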
Performance indices on artificial data of half-lives and expression time-courses with different numbers of time samples (n)
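| Time samples (n) | Regularization parameter (optimal value) | Error threshold (percentile of MSE distribution) | Pearson correlation coefficient | pval | FDR |
|---|---|---|---|---|---|
| | 10 | 3 | 0.89 | | |
| | 10 | 3 | 0.89 | | |
| | 10 | 4 | 0.89 | | |
| | 10 | 2 | 0.86 | | |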
The number of gene expression profiles and the noise amplitude (C) are kept fixed. Legend: the columns report the optimal value of the regularization parameter; the error threshold, expressed as a percentile of the MSE distribution; the Pearson correlation coefficient; the p value (pval); and the false discovery rate (FDR). For each instance of the noise distribution considered, the variability of the quality indices was negligible, so the variance is not reported in the table
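Two of the indices in the legend above, an error threshold expressed as a percentile of the MSE distribution and a Pearson correlation coefficient, can be sketched on synthetic data as follows. The data generation, the helper names `select_by_percentile` and `pearson`, and the choice of the 10th percentile are all hypothetical; this is not the procedure or the data used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: true half-lives, one fitting error (MSE) per estimate,
# and estimates whose noise grows with the fitting error.
true_hl = rng.uniform(5.0, 60.0, size=200)
mse = rng.exponential(1.0, size=200)
estimated_hl = true_hl + rng.normal(0.0, 1.0, size=200) * (1.0 + 10.0 * mse)

def select_by_percentile(errors, percentile):
    """Boolean mask of entries whose error is at or below the given percentile."""
    return errors <= np.percentile(errors, percentile)

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

keep = select_by_percentile(mse, 10)  # keep the 10% best-fitting estimates
rho_all = pearson(true_hl, estimated_hl)
rho_kept = pearson(true_hl[keep], estimated_hl[keep])

print(f"estimates kept: {int(keep.sum())} of {keep.size}")
print(f"Pearson correlation, all estimates:  {rho_all:.2f}")
print(f"Pearson correlation, kept estimates: {rho_kept:.2f}")
```

In this toy setting, discarding estimates with a large fitting error raises the correlation with the true values, at the cost of retaining fewer half-lives.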
Performance indices on artificial data of half-lives and expression time-courses with different noise amplitudes (C)
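| Noise amplitude (C) | Regularization parameter (optimal value) | Error threshold (percentile of MSE distribution) | Pearson correlation coefficient | pval | FDR |
|---|---|---|---|---|---|
| | 10 | 1 | 0.90 | | |
| | 10 | 2 | 0.86 | | |
| | 10 | 2 | 0.86 | | |
| | 10 | 4 | 0.68 | | |
| | 10 | 10 | 0.11 | | |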
The numbers of gene expression profiles and of time samples (n) are kept fixed. Legend: the columns report the optimal value of the regularization parameter; the error threshold, expressed as a percentile of the MSE distribution; the Pearson correlation coefficient; the p value (pval); and the false discovery rate (FDR). For each instance of the noise distribution considered, the variability of the quality indices was negligible, so the variance is not reported in the table
Performance indices on artificial data of half-lives and expression time-courses with different numbers of half-lives to be estimated (m)
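| Half-lives to be estimated (m) | Regularization parameter (optimal value) | Error threshold (percentile of MSE distribution) | Pearson correlation coefficient | pval | FDR |
|---|---|---|---|---|---|
| | 10 | 2 | 0.86 | | |
| | 10 | 3 | 0.82 | | |
| | 10 | 3 | 0.78 | | |
| | 10 | 6 | 0.78 | | |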
The number of time samples (n) and the noise amplitude (C) are kept fixed. Legend: the columns report the optimal value of the regularization parameter; the error threshold, expressed as a percentile of the MSE distribution; the Pearson correlation coefficient; the p value (pval); and the false discovery rate (FDR). For each instance of the noise distribution considered, the variability of the quality indices was negligible, so the variance is not reported in the table
Fig. 2 StaRTrEK algorithm validation using experimental data. StaRTrEK results obtained for the yeast DNA damage (A), yeast oxidative stress (B), and malaria IDC (C) datasets. For each dataset, the left panel shows the criterion for the selection of the error threshold, whereas the right panel shows the scatterplot of RNA half-lives estimated by the StaRTrEK algorithm versus experimentally measured values
Summary of performance indices on experimental data. Legend: the columns report the optimal value of the regularization parameter; the error threshold, expressed as a percentile of the MSE distribution; the Pearson correlation coefficient; the p value (pval); and the false discovery rate (FDR)
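| Dataset | Regularization parameter (optimal value) | Error threshold (percentile of MSE distribution) | Pearson correlation coefficient | pval | FDR |
|---|---|---|---|---|---|
| DNA damage | 7 | 3 | 0.68 | | |
| Oxidative stress | 14 | 31 | 0.54 | | |
| Malaria IDC | 15 | 9 | 0.59 | | |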
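The table reports a p value alongside each Pearson correlation coefficient, but this record does not say which test was used. One generic way to attach a p value to a correlation between estimated and measured half-lives is a permutation test, sketched below. The function name `permutation_pvalue`, the synthetic data, and the number of permutations are assumptions for illustration only.

```python
import numpy as np

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p value for the Pearson correlation of x and y.

    Generic test, shown for illustration; not necessarily the test
    used to produce the values in the table.
    """
    rng = np.random.default_rng(seed)
    observed = abs(np.corrcoef(x, y)[0, 1])
    count = 0
    for _ in range(n_perm):
        # Shuffling y breaks any pairing between the two variables.
        r = np.corrcoef(x, rng.permutation(y))[0, 1]
        if abs(r) >= observed:
            count += 1
    # Add-one correction so the estimate is never exactly zero.
    return (count + 1) / (n_perm + 1)

# Hypothetical measured and estimated half-lives (arbitrary units).
rng = np.random.default_rng(1)
measured = rng.uniform(5.0, 60.0, size=50)
estimated = measured + rng.normal(0.0, 10.0, size=50)

rho = float(np.corrcoef(measured, estimated)[0, 1])
pval = permutation_pvalue(measured, estimated)
print(f"Pearson correlation: {rho:.2f}, permutation p value: {pval:.4f}")
```

The smallest p value this test can return is 1 / (n_perm + 1), so more permutations are needed to resolve very small p values.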