Ning Zhang, R S P Rao, Fernanda Salvato, Jesper F Havelund, Ian M Møller, Jay J Thelen, Dong Xu.
Abstract
Targeting and translocation of proteins to the appropriate subcellular compartments are crucial for cell organization and function. Newly synthesized proteins are transported to mitochondria with the assistance of complex targeting sequences containing either an N-terminal pre-sequence or a multitude of internal signals. Compared with experimental approaches, computational predictions provide an efficient way to infer subcellular localization of a protein. However, it is still challenging to predict plant mitochondrially localized proteins accurately due to various limitations. Consequently, the performance of current tools can be improved with new data and new machine-learning methods. We present MU-LOC, a novel computational approach for large-scale prediction of plant mitochondrial proteins. We collected a comprehensive dataset of plant subcellular localization, extracted features including amino acid composition, protein position weight matrix, and gene co-expression information, and trained predictors using deep neural network and support vector machine. Benchmarked on two independent datasets, MU-LOC achieved substantial improvements over six state-of-the-art tools for plant mitochondrial targeting prediction. In addition, MU-LOC has the advantage of predicting plant mitochondrial proteins either possessing or lacking N-terminal pre-sequences. We applied MU-LOC to predict candidate mitochondrial proteins for the whole proteome of Arabidopsis and potato. MU-LOC is publicly available at http://mu-loc.org.
Keywords: deep neural network; gene co-expression; machine learning; mitochondrial targeting; position weight matrix; support vector machine
Year: 2018 PMID: 29875778 PMCID: PMC5974146 DOI: 10.3389/fpls.2018.00634
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
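Of the features named in the abstract, amino acid composition is the simplest to illustrate. The following is a minimal sketch, assuming the feature is the relative frequency of each of the 20 standard amino acids in a sequence; it is not MU-LOC's actual feature pipeline, and the function name `amino_acid_composition` and the toy sequence are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (fixed order = fixed feature order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list of relative amino acid frequencies.

    Residues outside the 20 standard letters (e.g. X, B, Z) are ignored.
    This is an illustrative assumption, not the rule used by MU-LOC.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    toy_sequence = "MAAL"  # hypothetical 4-residue sequence, for illustration only
    features = amino_acid_composition(toy_sequence)
    for aa, freq in zip(AMINO_ACIDS, features):
        if freq > 0:
            print(aa, freq)
```

A fixed-length vector like this can be fed to any classifier regardless of protein length, which is one reason composition features are common in localization predictors.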
Plant mitochondrial protein data collected in this study.
| Class | Data source | Species | Number of proteins | Reference |
|---|---|---|---|---|
| Positive (1,104) | Literature | | 1,060 | |
| | Literature | | 541 | |
| | Literature | | 327 | |
| | PPDB | | 460 | |
| | PPDB | | 666 | |
| | SUBA3 | | 1,196 | |
| | Uniprot/Swiss-Prot | Multiple plants^a | 1,547 | UniprotKB |
| Negative (5,809) | Uniprot/Swiss-Prot | Multiple plants^a | 27,966 | UniprotKB |
Performance comparison of MU-LOC with existing tools for general plant mitochondrial targeting prediction (independent testing set 1).
| Tool | Parameter | Specificity | Sensitivity | Accuracy | Precision | F1 score | MCC |
|---|---|---|---|---|---|---|---|
| MU-LOC(DNN) | Default | 0.88 | | | 0.833 | | |
| MU-LOC(SVM) | Default | 0.88 | 0.57 | 0.725 | 0.826 | 0.675 | 0.473 |
| TargetP | Default | 0.94 | 0.41 | 0.675 | 0.872 | 0.558 | 0.413 |
| Predotar | Default | | 0.33 | 0.645 | | 0.482 | 0.373 |
| YLoc | YLoc-LowRes | 0.95 | 0.28 | 0.615 | 0.848 | 0.421 | 0.310 |
| MitoProt II | Probability > 0.8^a | 0.89 | 0.33 | 0.610 | 0.750 | 0.458 | 0.266 |
| MitoFates | Default | | 0.24 | 0.600 | 0.857 | 0.375 | 0.288 |
| LOCALIZER | Default | 0.90 | 0.27 | 0.585 | 0.730 | 0.394 | 0.219 |
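The six scores in the table all follow from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `binary_metrics` is hypothetical, and the counts in the example (57 true positives, 43 false negatives, 88 true negatives, 12 false positives) are illustrative assumptions, chosen because they reproduce the MU-LOC(SVM) row after rounding. The actual test-set size is not stated in this section.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute the table's six scores from confusion-matrix counts."""
    specificity = tn / (tn + fp)          # true negative rate
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "specificity": specificity,
        "sensitivity": sensitivity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts (assumed, not taken from the paper).
    scores = binary_metrics(tp=57, fp=12, tn=88, fn=43)
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells, it is less flattered than accuracy by a tool that rarely predicts the positive class, which is why tools with high specificity but low sensitivity still score low on it.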
Performance comparison of MU-LOC with existing tools for predicting plant mitochondrial proteins with N-terminal pre-sequences (independent testing set 2).
| Tool | Parameter | Specificity | Sensitivity | Accuracy | Precision | MCC |
|---|---|---|---|---|---|---|
| MU-LOC(DNN) | Default | 0.964 | | 0.937 | 0.682 | 0.652 |
| MU-LOC(SVM) | Default | | 0.662 | | | |
| TargetP | | 0.891 | 0.646 | 0.867 | 0.396 | 0.440 |
| Predotar | | 0.944 | 0.600 | 0.910 | 0.542 | 0.520 |
| YLoc | | 0.940 | 0.462 | 0.893 | 0.462 | 0.400 |
| MitoProt II | Probability > 0.8^a | 0.842 | 0.600 | 0.817 | 0.295 | 0.329 |
| MitoFates | Default | 0.966 | 0.615 | 0.931 | 0.667 | 0.602 |
| LOCALIZER | | 0.952 | 0.600 | 0.917 | 0.582 | 0.540 |