Wataru Iwasaki, Tsukasa Fukunaga, Ryota Isagozawa, Koichiro Yamada, Yasunobu Maeda, Takashi P Satoh, Tetsuya Sado, Kohji Mabuchi, Hirohiko Takeshima, Masaki Miya, Mutsumi Nishida.
Abstract
MitoFish is a database of fish mitochondrial genomes (mitogenomes) that includes powerful and precise de novo annotations for mitogenome sequences. Fish occupy an important position in the evolution of vertebrates and the ecology of the hydrosphere, and mitogenomic sequence data have served as a rich source of information for resolving fish phylogenies and identifying new fish species. The importance of a mitogenomic database continues to grow at a rapid pace as massive amounts of mitogenomic data are generated with the advent of new sequencing technologies. A severe bottleneck seems likely to occur with regard to mitogenome annotation because of the overwhelming pace of data accumulation and the intrinsic difficulties in annotating sequences with degenerating transfer RNA structures, divergent start/stop codons of the coding elements, and the overlapping of adjacent elements. To ease this data backlog, we developed an annotation pipeline named MitoAnnotator. MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately 5 min; thus, it is readily applicable to data sets of dozens of sequences. MitoFish also contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses. For users who need more information on the taxonomy, habitats, phenotypes, or life cycles of fish, MitoFish provides links to related databases. MitoFish and MitoAnnotator are freely available at http://mitofish.aori.u-tokyo.ac.jp/ (last accessed August 28, 2013); all of the data can be batch downloaded, and the annotation pipeline can be used via a web interface.
Keywords: database; fish; genome annotator; high-throughput sequencing; mitochondrion; phylogenetics
Year: 2013 PMID: 23955518 PMCID: PMC3808866 DOI: 10.1093/molbev/mst141
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Comparison Table of Mitogenomic Databases.
| Database | Taxonomic Coverage | Sequence Data Type | Availability of Re-annotation Pipeline | Update Frequency / Last Update | Reference |
|---|---|---|---|---|---|
| GOBASE | Eukaryotes | Complete + partial | — | June 2010 | |
| MamMiBase | Mammals | Protein coding genes only | — | June 2010 | |
| METAMiGA | Metazoans | Complete | — | Daily | |
| MitoZoa | Metazoans, excluding placozoans | Complete + nearly complete | Semiautomatic | December 2011 | |
| MitoFish | Fish (vertebrates, excluding tetrapods) | Complete + partial | Fully automatic | Monthly | |

a. The last update dates were checked on 25 June 2013.
Fig. 1. MitoFish home page. A vertical menu bar on the right-hand side allows users to access the main functions of MitoFish. The fish species/taxonomy search and sequence similarity searches can also be performed directly from the home page.
Fig. 2. Mitogenome page of individual species. The mitogenome page of each species includes a picture of the fish and a visual representation of the annotated circular mitogenome to aid visual recognition. Users can download mitogenomic sequences and the associated annotation data from the links. Information on sample vouchers and registration institutions is also provided. To facilitate further analysis, taxonomic information and links to external databases are comprehensively summarized.
Fig. 3. Overview of the MitoAnnotator pipeline. Please refer to the main text and figure 4 for the details of each procedure.
Fig. 4. Workflows to determine the coordinates of protein-coding genes. Workflows to determine the end position (A) and the start position (B) of protein-coding genes are presented.
Numbers of Genomes Whose Automatic Annotations Were Inconsistent with Annotations Performed by Experts for 42 Mitogenomes.
| Category of inconsistent annotations | MitoAnnotator | MITOS |
|---|---|---|
| Annotation of additional genes | 0 | 3 |
| Different start positions of protein-coding genes | 0 | 42 |
| Different stop positions of protein-coding genes | 0 | 42 |
| Different start positions of tRNA genes | 0 | 0 |
| Different stop positions of tRNA genes | 0 | 3 |
a. We excluded start/stop positions of rRNA genes from this comparison table because the annotation of rRNA genes is intrinsically difficult as described in the text.
b. Each of the three additional genes predicted by MITOS was a second protein-coding gene copy located in the D-loop of each mitogenome. These genes were very short (the 105-bp ATP8 gene of Nesiarchus nasutus, the 324-bp ND6 gene of Kali indica, and the 438-bp ND2 gene of Diplospinus multistriatus) and are likely to be misannotations.
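
The per-category counts in the table above are numbers of genomes, not numbers of genes. A minimal sketch of how such a tally could be computed is given below, assuming each annotation set is held as a dictionary mapping a genome ID to its genes and their (start, stop) coordinates. The data layout, the function name `count_inconsistent_genomes`, and the toy coordinates are all hypothetical; the sketch also does not separate protein-coding from tRNA genes as the table does.

```python
def count_inconsistent_genomes(expert, automatic):
    """Count genomes whose automatic annotation disagrees with the expert one.

    Both arguments map genome_id -> {gene_name: (start, stop)}.
    A genome is counted at most once per category.
    Hypothetical helper for illustration only.
    """
    counts = {"additional_genes": 0, "different_start": 0, "different_stop": 0}
    for genome_id, expert_genes in expert.items():
        auto_genes = automatic.get(genome_id, {})
        shared = expert_genes.keys() & auto_genes.keys()
        # Genes predicted automatically but absent from the expert annotation.
        if auto_genes.keys() - expert_genes.keys():
            counts["additional_genes"] += 1
        if any(expert_genes[g][0] != auto_genes[g][0] for g in shared):
            counts["different_start"] += 1
        if any(expert_genes[g][1] != auto_genes[g][1] for g in shared):
            counts["different_stop"] += 1
    return counts


# Invented toy data, not taken from the study.
expert = {
    "genome1": {"ND1": (100, 1074), "COI": (2000, 3550)},
    "genome2": {"ND1": (100, 1074)},
}
automatic = {
    "genome1": {"ND1": (100, 1074), "COI": (2003, 3550)},
    "genome2": {"ND1": (100, 1071), "ND6": (5000, 5300)},
}
print(count_inconsistent_genomes(expert, automatic))
```

Counting per genome rather than per gene means that one mismatched gene is enough to put a genome in a category.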