Anjali Garg, Neelja Singhal, Ravindra Kumar, Manish Kumar.
Abstract
Recent evidence suggests that localizing mRNAs near the subcellular compartment of their translated proteins is a robust cellular mechanism that optimizes protein expression post-transcriptionally. Retention of mRNA in the nucleus can regulate the amount of protein translated from each mRNA, allowing tight temporal regulation of translation or buffering of protein levels against bursty transcription. mRNA localization also performs a variety of additional roles, such as long-distance signaling, facilitating the assembly of protein complexes, and coordinating developmental processes. Here, we describe a novel machine-learning based tool, mRNALoc, that predicts five subcellular locations of eukaryotic mRNAs from cDNA/mRNA sequences. During five-fold cross-validation, the maximum overall accuracy was 65.19, 75.36, 67.10, 99.70 and 73.59% for the extracellular region, endoplasmic reticulum, cytoplasm, mitochondria, and nucleus, respectively. Assessment on independent datasets gave prediction accuracies of 58.10, 69.23, 64.55, 96.88 and 69.35% for the extracellular region, endoplasmic reticulum, cytoplasm, mitochondria, and nucleus, respectively. The corresponding AUC values were 0.76, 0.75, 0.70, 0.98 and 0.74. The mRNALoc standalone software and web server are freely available for academic use under the GNU GPL at http://proteininformatics.org/mkumar/mrnaloc.
Year: 2020 PMID: 32421834 PMCID: PMC7319581 DOI: 10.1093/nar/gkaa385
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overall schema of mRNALoc. mRNALoc predicts five subcellular locations: mitochondria, cytoplasm, nucleus, endoplasmic reticulum and extracellular region. It first removes query sequences that contain non-standard nucleotides, then generates combined features from the pseudo K-tuple nucleotide composition, which are used as input for Support Vector Machine (SVM) prediction.
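The first two steps of the schema in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the mRNALoc implementation: it uses plain relative k-mer frequencies for k = 2, 3, 4 and 5 as a stand-in for the pseudo K-tuple nucleotide composition, it assumes that U is mapped to T before filtering, and all function names are hypothetical. The resulting vectors would then be passed to an SVM classifier, which is omitted here.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
K_VALUES = (2, 3, 4, 5)  # hybrid k-mer sizes (2+3+4+5)


def normalize(seq):
    """Upper-case the sequence and map U to T (an assumption of this sketch)."""
    return seq.upper().replace("U", "T")


def has_only_standard_nucleotides(seq):
    """True if the sequence is non-empty and contains only A, C, G and T."""
    return bool(seq) and set(seq) <= set(NUCLEOTIDES)


def kmer_frequencies(seq, k):
    """Relative frequency of every possible k-mer, in lexicographic order."""
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = max(len(seq) - k + 1, 0)
    for i in range(windows):
        counts[seq[i:i + k]] += 1
    total = max(windows, 1)  # avoid division by zero for very short sequences
    return [counts[m] / total for m in kmers]


def hybrid_features(seq):
    """Concatenate the k-mer frequency vectors for all k in K_VALUES."""
    features = []
    for k in K_VALUES:
        features.extend(kmer_frequencies(seq, k))
    return features


def featurize(query):
    """Drop sequences with non-standard nucleotides, then featurize the rest.

    `query` maps a sequence name to its raw sequence string.
    """
    kept = {}
    for name, raw in query.items():
        seq = normalize(raw)
        if has_only_standard_nucleotides(seq):
            kept[name] = hybrid_features(seq)
    return kept
```

Under these assumptions each retained sequence yields a vector of 4^2 + 4^3 + 4^4 + 4^5 = 1360 values, and a query such as `{"a": "ACGUACGUAC", "b": "ACGNNA"}` keeps only sequence `a`, because `b` contains the non-standard symbol N.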
Performance of the SVM-based classifiers (mRNALoc) for mRNA subcellular localization using the hybrid K-mer feature (2+3+4+5), during five-fold cross-validation and on independent data
| Location | Sen (%) | Spe (%) | ACC (%) | MCC | THR | AUC |
|---|---|---|---|---|---|---|
| Five-fold cross-validation | | | | | | |
| Extracellular region | 62.67 | 65.34 | 65.19 | 0.14 | −0.20 | 0.69 |
| Endoplasmic reticulum | 74.09 | 75.49 | 75.36 | 0.32 | 0.40 | 0.81 |
| Cytoplasm | 66.69 | 67.41 | 67.10 | 0.34 | 0.40 | 0.69 |
| Mitochondria | 96.28 | 99.79 | 99.70 | 0.95 | 0.10 | 0.98 |
| Nucleus | 74.17 | 73.22 | 73.59 | 0.47 | 0.40 | 0.76 |
| Independent dataset | | | | | | |
| Extracellular region | 81.38 | 56.67 | 58.10 | 0.18 | −0.20 | 0.76 |
| Endoplasmic reticulum | 75.10 | 68.60 | 69.23 | 0.27 | 0.40 | 0.75 |
| Cytoplasm | 73.26 | 58.06 | 64.55 | 0.31 | 0.40 | 0.70 |
| Mitochondria | 87.32 | 97.16 | 96.88 | 0.63 | 0.10 | 0.98 |
| Nucleus | 50.20 | 81.62 | 69.35 | 0.34 | 0.40 | 0.74 |
Sen: sensitivity, Spe: specificity, ACC: accuracy, MCC: Matthews correlation coefficient, THR: threshold, and AUC: area under the ROC curve.
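For readers less familiar with these measures, the sketch below computes Sen, Spe, ACC and MCC from the four cells of a binary confusion matrix using their standard definitions. The function name is hypothetical, and the counts in the usage note are invented for illustration; they are not taken from the mRNALoc datasets.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary metrics; Sen, Spe and ACC are returned as percentages."""
    positives = tp + fn
    negatives = tn + fp
    total = positives + negatives
    sen = 100.0 * tp / positives if positives else 0.0
    spe = 100.0 * tn / negatives if negatives else 0.0
    acc = 100.0 * (tp + tn) / total if total else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sen": sen, "Spe": spe, "ACC": acc, "MCC": mcc}
```

For example, `binary_metrics(tp=8, fn=2, tn=85, fp=5)` describes a one-versus-rest problem with few positives. Because accuracy is dominated by the much larger negative class, ACC can be high while MCC stays noticeably lower, which is why the two columns are reported side by side.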
Comparative evaluation of mRNALoc and iLoc-mRNA on human mRNA sequences. No human mRNA was available for the extracellular region or mitochondria, so these two locations were not included in the evaluation
| Location | Number of human mRNA sequences | mRNALoc: true positive | mRNALoc: false negative | iLoc-mRNA: true positive | iLoc-mRNA: false negative |
|---|---|---|---|---|---|
| Cytoplasm | 50 | 35 | 15 | 18 | 32 |
| Endoplasmic reticulum | 50 | 34 | 16 | 37 | 13 |
| Extracellular region | 0 | 0 | 0 | 0 | 0 |
| Mitochondria | 0 | 0 | 0 | 0 | 0 |
| Nucleus | 50 | 33 | 17 | 13 | 37 |
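The counts in this table can be turned into per-location sensitivity, TP / (TP + FN), with a short calculation. The sketch below uses only the true-positive and false-negative counts listed above for the three evaluated locations; the variable and function names are hypothetical.

```python
# (true positive, false negative) counts copied from the comparison table
RESULTS = {
    "Cytoplasm": {"mRNALoc": (35, 15), "iLoc-mRNA": (18, 32)},
    "Endoplasmic reticulum": {"mRNALoc": (34, 16), "iLoc-mRNA": (37, 13)},
    "Nucleus": {"mRNALoc": (33, 17), "iLoc-mRNA": (13, 37)},
}


def sensitivity_percent(tp, fn):
    """Percentage of sequences at a location that were predicted correctly."""
    return 100.0 * tp / (tp + fn)


def summarize(results):
    """Return per-tool sensitivity for each location plus a pooled 'Overall'."""
    summary = {}
    for location, tools in results.items():
        for tool, (tp, fn) in tools.items():
            per_tool = summary.setdefault(tool, {"_tp": 0, "_fn": 0})
            per_tool[location] = sensitivity_percent(tp, fn)
            per_tool["_tp"] += tp
            per_tool["_fn"] += fn
    for per_tool in summary.values():
        # Pool the counts across locations, then drop the running totals
        per_tool["Overall"] = sensitivity_percent(per_tool.pop("_tp"), per_tool.pop("_fn"))
    return summary


if __name__ == "__main__":
    for tool, values in summarize(RESULTS).items():
        print(tool, {k: round(v, 1) for k, v in values.items()})
```

On these counts mRNALoc recovers 70, 68 and 66% of the cytoplasm, endoplasmic reticulum and nucleus sequences, against 36, 74 and 26% for iLoc-mRNA, so iLoc-mRNA is ahead only for the endoplasmic reticulum.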
Figure 2. Screenshots of the mRNALoc web server.