
MitoMiner v3.1, an update on the mitochondrial proteomics database.

Anthony C Smith, Alan J Robinson.

Abstract

Mitochondrial proteins remain the subject of intense research interest due to their implication in an increasing number of different conditions including mitochondrial and metabolic disease, cancer, and neuromuscular degenerative and age-related disorders. However, the mitochondrial proteome has yet to be accurately and comprehensively defined, despite many studies. To support mitochondrial research, we developed MitoMiner (http://mitominer.mrc-mbu.cam.ac.uk), a freely accessible mitochondrial proteomics database. MitoMiner integrates different types of subcellular localisation evidence with protein information from public resources, and so provides a comprehensive central resource for data on mitochondrial protein localisation. Here we report important updates to the database including the addition of subcellular immunofluorescent staining results from the Human Protein Atlas, computational predictions of mitochondrial targeting sequences, and additional large-scale mass-spectrometry and GFP tagging data sets. This evidence is shared across the 12 species in MitoMiner (now including Schizosaccharomyces pombe) by homology mapping. MitoMiner provides multiple ways of querying the data including simple text searches, predefined queries and custom queries created using the interactive QueryBuilder. For remote programmatic access, APIs are available for several programming languages. This combination of data and flexible querying makes MitoMiner a unique platform to investigate mitochondrial proteins, with application in mitochondrial research and prioritising candidate mitochondrial disease genes.
© The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2015        PMID: 26432830      PMCID: PMC4702766          DOI: 10.1093/nar/gkv1001

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Mitochondria are involved in a diverse range of cellular processes including metabolism, energy production, signalling, cell growth and apoptosis. They are mobile organelles constantly fusing, dividing and replicating, and have tissue-specific roles such as ammonia detoxification in liver. It is therefore unsurprising that these organelles are associated with a wide spectrum of metabolic, degenerative and age-related human diseases as well as cancer. This has generated considerable interest in mitochondria from a wide range of researchers. However, much of the mitochondrial proteome has yet to be conclusively identified, which hinders investigations into the role of the organelle. Many different approaches have been used to address this problem, but each has limitations and no single technique provides full coverage of the mitochondrial proteome. Numerous mass spectrometry experiments have identified proteins in purified fractions of mitochondria, but a proportion of these proteins are cellular contaminants, and the results are limited to identifying proteins expressed in the tissue type examined. Further, it is challenging to extract and cross-reference results from these studies, as the data are usually published as supplementary tables with varying identifiers. A different approach uses GFP tagging to identify mitochondrial proteins. However, the tag can interfere with translocation of the protein into mitochondria. In addition, the approach is time-consuming and technically challenging in mammals, and so many of these data sets originate from yeast, although yeast mitochondria are functionally distinct from those of higher eukaryotes. Computational methods have focussed on predicting subcellular targeting motifs in the N-termini of protein sequences (1–3). However, many known mitochondrial proteins lack a targeting sequence, whereas many other proteins are predicted to have one but are experimentally found not to localise to the organelle.
The Gene Ontology provides literature-based annotation of proteins, including subcellular localisation (4). However, this is an indivisible combination of annotation for well-characterised proteins whose mitochondrial localisation has been conclusively determined, and annotation derived from (often only single) large-scale localisation studies that include many false positives. The most recent effort has been from the Human Protein Atlas (5), which used antibodies to immunofluorescently stain proteins and localise them by microscopy. However, this approach may suffer from cross-reactivity and staining failures. Thus, cross-referencing between these different evidence types would be useful to independently verify candidates and reduce false positive rates, and was the premise for the first version of MitoMiner (6), which then only included mass spectrometry and GFP tagging data from 33 studies with Gene Ontology annotation. We have now updated MitoMiner to include the new localisation evidence from the Human Protein Atlas and mitochondrial targeting sequence predictions, and have expanded the number of experimental studies to 58. Homology information from HomoloGene (7) allows this evidence to be shared across the 12 species in MitoMiner (Homo sapiens, Mus musculus, Rattus norvegicus, Bos taurus, Drosophila melanogaster, Arabidopsis thaliana, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Plasmodium falciparum, Neurospora crassa, Tetrahymena thermophila and Giardia lamblia). MitoMiner has a complementary role of giving a biological context for candidate mitochondrial proteins by integrating information from other public resources. This provides a useful and flexible starting point for many analyses, such as assessing and prioritising candidates generated from ‘omics data sets or exome sequencing of mitochondrial disease patients.
The integrated information includes annotation from UniProt (8) and the Gene Ontology (9), metabolic pathway data from KEGG (10), disease information from OMIM (11) and (new in the latest version) tissue and cancer expression from the Human Protein Atlas (5) and InterPro protein domain information (12). To query these data, MitoMiner provides a powerful and flexible user interface, allowing everything from simple text searches to complicated queries with multiple constraints spanning any of the included data types (see previous publications for a detailed description (6,13)). Users can also run queries on uploaded lists of proteins, or use a pre-existing list such as the widely-respected MitoCarta inventory of mitochondrial proteins (14).
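The cross-referencing of evidence types and the sharing of evidence across homologs described in this introduction can be illustrated with a minimal Python sketch. The gene identifiers, evidence labels and homology group below are hypothetical and are not taken from the MitoMiner schema; the sketch only shows how counting independent evidence types per gene separates well-supported candidates from those backed by a single line of evidence.

```python
from collections import defaultdict

# Hypothetical evidence records: (gene identifier, evidence type).
EVIDENCE = [
    ("geneA_human", "mass_spectrometry"),
    ("geneA_human", "gfp"),
    ("geneA_human", "hpa_immunofluorescence"),
    ("geneB_human", "targeting_sequence"),
    ("geneB_mouse", "mass_spectrometry"),
    ("geneC_human", "mass_spectrometry"),
]

# Hypothetical homology groups (in the spirit of HomoloGene clusters).
HOMOLOGY_GROUPS = [
    {"geneB_human", "geneB_mouse"},
]


def evidence_by_gene(records):
    """Collect the set of distinct evidence types reported for each gene."""
    evidence = defaultdict(set)
    for gene, evidence_type in records:
        evidence[gene].add(evidence_type)
    return dict(evidence)


def share_across_homologs(evidence, groups):
    """Give every member of a homology group the union of the group's evidence."""
    shared = {gene: set(types) for gene, types in evidence.items()}
    for group in groups:
        pooled = set()
        for gene in group:
            pooled |= evidence.get(gene, set())
        for gene in group:
            shared[gene] = set(pooled)
    return shared


def candidates(evidence, min_types=2):
    """Return genes supported by at least `min_types` independent evidence types."""
    return sorted(g for g, types in evidence.items() if len(types) >= min_types)


direct = evidence_by_gene(EVIDENCE)
pooled = share_across_homologs(direct, HOMOLOGY_GROUPS)
print(candidates(direct))  # only genes with two or more direct evidence types
print(candidates(pooled))  # homologs now also count each other's evidence
```

In this toy data only geneA_human passes on direct evidence, while pooling evidence within the homology group brings both geneB entries over the threshold.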

SOFTWARE IMPLEMENTATION AND DATA IMPORT

To minimise development time and reduce legacy issues, MitoMiner was built using the InterMine open source data warehouse system, updated to version 1.2.2 (15). The InterMine core model is the basis for the database structure and describes types of biological data including genes, proteins, publications and hierarchical gene ontology terms. To model data types specific to MitoMiner—such as mass spectrometry and GFP tagging data sets, metabolic pathway data and homology mappings—bespoke tables were created that extend the database structure. Data were imported by using either InterMine-provided data loaders, or custom Perl scripts to convert raw data files to InterMine compatible XML data files. These scripts were designed so data updates require minimal manual intervention and so ease database maintenance. The MitoMiner data sources are updated on a 9–12 month basis.
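The conversion step described above (raw data files to XML for loading) can be sketched as follows. This is a Python illustration rather than the Perl scripts actually used, and it rests on assumptions: the `<items>`/`<item>`/`<attribute>` layout is modelled on the InterMine items XML format and should be checked against the InterMine documentation, while the class name `MassSpecEntry`, the column names and the data values are hypothetical placeholders.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical raw data: one mass spectrometry identification per row.
RAW_TSV = (
    "accession\ttissue\tpubMedId\n"
    "P00001\tliver\t00000001\n"
    "P00002\theart\t00000001\n"
)


def tsv_to_items_xml(tsv_text, item_class="MassSpecEntry"):
    """Convert tab-separated rows into an <items> document, one <item> per row."""
    root = ET.Element("items")
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for index, row in enumerate(reader, start=1):
        # Each row becomes one item with a simple sequential identifier.
        item = ET.SubElement(root, "item", {"id": f"0_{index}", "class": item_class})
        # Each column becomes one name/value attribute of the item.
        for name, value in row.items():
            ET.SubElement(item, "attribute", {"name": name, "value": value})
    return ET.tostring(root, encoding="unicode")


xml_text = tsv_to_items_xml(RAW_TSV)
print(xml_text)
```

Keeping the conversion as a pure function of the input file is one way to make repeated data updates require little manual intervention.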

UPDATES TO DATA SOURCES

Addition of new mass-spectrometry and GFP data sets

Since the last publication (13) we have increased the number of large-scale mass spectrometry and GFP tagging studies in MitoMiner from 46 to 58 (16–27). Every data entry in MitoMiner has full provenance of its originating study and for mass spectrometry includes the experimental techniques used for purification, separation and identification, to show how the authors reduced contaminants. All entries of existing data sets were remapped to UniProt to remove obsolete and redundant UniProt protein identifiers. The total number of data entries in MitoMiner by species is shown in Table 1.
Table 1.

Summary of mitochondrial proteomics studies in MitoMiner

Species          Number of      Number of data entries (a)    Number of genes with
                 publications   Mass spectrometry   GFP       experimental evidence (b)
H. sapiens       15             4903                144       1839
M. musculus      12             17577               52        3076
B. taurus        1              28                  0         30
R. norvegicus    9              3398                0         1836
D. melanogaster  1              43                  0         42
S. cerevisiae    11             3193                1257      1291
S. pombe         1              0                   430       432
A. thaliana      5              953                 0         483
N. crassa        1              290                 0         232
T. thermophila   1              310                 0         294
G. lamblia       1              993                 0         641

(a) The number of unique data entries from mass spectrometry or GFP tagging mitochondrial localization studies.

(b) The number of unique Ensembl gene identifiers. Does not include mitochondrial evidence from homologs.

Addition of mitochondrial targeting sequence predictions

Many programs have been developed to predict subcellular targeting motifs in protein sequences. All these programs have web services to scan individual sequences, but with a large number of candidates this is cumbersome and hinders comparison with other localisation evidence. Therefore, in this update MitoMiner now includes the results from three popular mitochondrial target sequence prediction programs: iPSORT (1), TargetP (2) and MITOPROT (3). For each program, MitoMiner stores the prediction score for every protein in the proteome of the 12 species included, which allows different score thresholds for each program to be used in queries. The number of proteins predicted to have a mitochondrial targeting sequence, by species, is shown in Table 2.
Table 2.

Summary of mitochondrial targeting sequence predictions in different proteomes

Number of genes encoding proteins with a predicted mitochondrial targeting sequence:

Species          iPSORT (a)   MitoProt (b)   TargetP (b)   Total
H. sapiens       3940         1886           387           4716
M. musculus      3052         1526           363           3679
B. taurus        2312         1235           267           2911
R. norvegicus    2617         1350           297           3220
D. melanogaster  1654         916            187           1990
S. cerevisiae    991          585            120           1182
S. pombe         684          389            81            822
A. thaliana      4871         2281           927           6323
N. crassa        1039         571            268           1161
T. thermophila   1678         827            36            2133
G. lamblia       909          282            60            1023

(a) With a score of 1.0 (scoring is binary).

(b) With a score equal to or greater than 0.9.
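Because a score is stored for every protein and program, thresholds can be chosen at query time. The following minimal Python sketch shows the idea using the cut-offs from the Table 2 footnotes. The protein names and scores are hypothetical, the sketch counts proteins rather than genes, and reading the Total column as "predicted by at least one program" is an interpretation rather than something stated in the text.

```python
# Hypothetical prediction scores per protein; real values would come from MitoMiner.
SCORES = {
    "protein_1": {"iPSORT": 1.0, "MitoProt": 0.97, "TargetP": 0.95},
    "protein_2": {"iPSORT": 1.0, "MitoProt": 0.42, "TargetP": 0.10},
    "protein_3": {"iPSORT": 0.0, "MitoProt": 0.91, "TargetP": 0.30},
    "protein_4": {"iPSORT": 0.0, "MitoProt": 0.15, "TargetP": 0.05},
}

# Thresholds matching the Table 2 footnotes (iPSORT scoring is binary).
THRESHOLDS = {"iPSORT": 1.0, "MitoProt": 0.9, "TargetP": 0.9}


def predicted_by_program(scores, thresholds):
    """Map each program to the set of proteins scoring at or above its threshold."""
    return {
        program: {p for p, s in scores.items() if s[program] >= cutoff}
        for program, cutoff in thresholds.items()
    }


def summarise(scores, thresholds):
    """Per-program counts plus the number of proteins predicted by any program."""
    hits = predicted_by_program(scores, thresholds)
    counts = {program: len(proteins) for program, proteins in hits.items()}
    counts["Total"] = len(set().union(*hits.values()))
    return counts


print(summarise(SCORES, THRESHOLDS))
# A stricter MitoProt cut-off changes both the MitoProt count and the union.
print(summarise(SCORES, {**THRESHOLDS, "MitoProt": 0.95}))
```

Raising a single threshold shrinks the union as well as that program's own count, which is why storing raw scores is more flexible than storing a fixed yes/no call.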

Addition of data from the Human Protein Atlas

The most important new type of large-scale subcellular localisation data comes from immunofluorescent staining and microscopy conducted by the Human Protein Atlas (HPA) (5). For each protein with HPA data we incorporated the original Ensembl gene identifier, main subcellular location reported, any other subcellular locations, expression type (whether localisation has been confirmed with multiple antibodies) and reliability (whether the location agrees with UniProt annotation). To provide more biological context for protein entries, we also incorporated the HPA immunohistochemical expression results from 59 different tissues and 20 cancer types. For tissue expression we included tissue name, tissue group, cell type, expression type, expression level and reliability. To aid interpretation of these data we used an InterMine graphical summary to provide the results in an easily understandable format (Figure 1). For cancer expression we included the original Ensembl gene identifier, tumour type, number of patient samples with a particular level of expression (strong, moderate, weak or negative) and expression type.
Figure 1.

Graphical summary of Human Protein Atlas tissue expression data for a mitochondrial protein in MitoMiner.


Other improvements

To improve the searchability of MitoMiner for gene-based queries and analyses (such as in identifying mitochondrial genes amongst variants found in exome sequencing), we expanded gene information to include HUGO gene symbol, Ensembl identifier, Ensembl gene description, chromosome, NCBI gene identifier and model organism specific gene identifiers (e.g. from Mouse Genome Database, Rat Genome Database and Saccharomyces Genome Database). To improve metabolic analyses for systems biology applications, KEGG reaction entries were expanded to include the reaction's estimated change in Gibbs free energy (ΔG) (28), the reaction directionality defined by KEGG, and the reaction equation using KEGG compound identifiers. Protein entries now include InterPro domain information (29), enabling queries for subsets of (novel) mitochondrial proteins with particular functions, e.g. RNA binding. Remote programmatic access via the Application Programming Interface (API) was improved with the updated version of the InterMine software and includes client libraries for Ruby in addition to Perl, Python and Java. Lastly, the documentation, tutorials and user guides have been extensively updated.
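As a rough illustration of remote programmatic access, the Python sketch below builds an XML query and a results URL without contacting any server. It rests on assumptions: the `/query/results` endpoint with `query` and `format` parameters, and the `<query>`/`<constraint>` XML layout, follow the general form of InterMine web services and should be checked against the InterMine documentation; the service root and the `Protein.*` paths are hypothetical and are not confirmed by the text. In practice the client libraries mentioned above would normally be used instead of hand-built URLs.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed service root; the exact path on the MitoMiner server is not given in the text.
SERVICE_ROOT = "http://mitominer.mrc-mbu.cam.ac.uk/service"


def build_query_xml(views, constraints, model="genomic"):
    """Serialise a query as XML: output columns in `view`, one <constraint> per filter."""
    query = ET.Element("query", {"model": model, "view": " ".join(views)})
    for path, op, value in constraints:
        ET.SubElement(query, "constraint", {"path": path, "op": op, "value": value})
    return ET.tostring(query, encoding="unicode")


def build_results_url(service_root, query_xml, fmt="tab"):
    """Compose a GET URL for the query results endpoint (the URL is not fetched here)."""
    params = urlencode({"query": query_xml, "format": fmt})
    return f"{service_root}/query/results?{params}"


# Hypothetical query: accession and name of all human proteins.
query_xml = build_query_xml(
    views=["Protein.primaryAccession", "Protein.name"],
    constraints=[("Protein.organism.name", "=", "Homo sapiens")],
)
url = build_results_url(SERVICE_ROOT, query_xml)
print(query_xml)
print(url)
```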

AVAILABILITY

MitoMiner is freely available at the Medical Research Council Mitochondrial Biology Unit website (http://mitominer.mrc-mbu.cam.ac.uk/). The main website is accompanied by a full set of support pages including FAQs, user guides, examples and tutorials (http://mitominer.mrc-mbu.cam.ac.uk/support).
  28 in total

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  Extensive feature detection of N-terminal protein sorting signals.

Authors:  Hideo Bannai; Yoshinori Tamada; Osamu Maruyama; Kenta Nakai; Satoru Miyano
Journal:  Bioinformatics       Date:  2002-02       Impact factor: 6.937

3.  Defining the mitochondrial proteomes from five rat organs in a physiologically significant context using 2D blue-native/SDS-PAGE.

Authors:  Nicole H Reifschneider; Sataro Goto; Hideko Nakamoto; Ryoya Takahashi; Michiru Sugawa; Norbert A Dencher; Frank Krause
Journal:  J Proteome Res       Date:  2006-05       Impact factor: 4.466

4.  Locating proteins in the cell using TargetP, SignalP and related tools.

Authors:  Olof Emanuelsson; Søren Brunak; Gunnar von Heijne; Henrik Nielsen
Journal:  Nat Protoc       Date:  2007       Impact factor: 13.491

5.  Computational method to predict mitochondrially imported proteins and their targeting sequences.

Authors:  M G Claros; P Vincens
Journal:  Eur J Biochem       Date:  1996-11-01

6.  Proteomics. Tissue-based map of the human proteome.

Authors:  Mathias Uhlén; Linn Fagerberg; Björn M Hallström; Cecilia Lindskog; Per Oksvold; Adil Mardinoglu; Åsa Sivertsson; Caroline Kampf; Evelina Sjöstedt; Anna Asplund; IngMarie Olsson; Karolina Edlund; Emma Lundberg; Sanjay Navani; Cristina Al-Khalili Szigyarto; Jacob Odeberg; Dijana Djureinovic; Jenny Ottosson Takanen; Sophia Hober; Tove Alm; Per-Henrik Edqvist; Holger Berling; Hanna Tegel; Jan Mulder; Johan Rockberg; Peter Nilsson; Jochen M Schwenk; Marica Hamsten; Kalle von Feilitzen; Mattias Forsberg; Lukas Persson; Fredric Johansson; Martin Zwahlen; Gunnar von Heijne; Jens Nielsen; Fredrik Pontén
Journal:  Science       Date:  2015-01-23       Impact factor: 47.728

7.  ORFeome cloning and global analysis of protein localization in the fission yeast Schizosaccharomyces pombe.

Authors:  Akihisa Matsuyama; Ritsuko Arai; Yoko Yashiroda; Atsuko Shirai; Ayako Kamata; Shigeko Sekido; Yumiko Kobayashi; Atsushi Hashimoto; Makiko Hamamoto; Yasushi Hiraoka; Sueharu Horinouchi; Minoru Yoshida
Journal:  Nat Biotechnol       Date:  2006-06-25       Impact factor: 54.908

8.  Global survey of human T leukemic cells by integrating proteomics and transcriptomics profiling.

Authors:  Linfeng Wu; Sun-Il Hwang; Karim Rezaul; Long J Lu; Viveka Mayya; Mark Gerstein; Jimmy K Eng; Deborah H Lundgren; David K Han
Journal:  Mol Cell Proteomics       Date:  2007-05-21       Impact factor: 5.911

9.  A novel single-cell screening platform reveals proteome plasticity during yeast stress responses.

Authors:  Michal Breker; Melissa Gymrek; Maya Schuldiner
Journal:  J Cell Biol       Date:  2013-03-18       Impact factor: 10.539

10.  Identification of novel proteins affected by rotenone in mitochondria of dopaminergic cells.

Authors:  Jinghua Jin; Jeanne Davis; David Zhu; Daniel T Kashima; Marc Leroueil; Catherine Pan; Kathleen S Montine; Jing Zhang
Journal:  BMC Neurosci       Date:  2007-08-16       Impact factor: 3.288
