Taxallnomy: an extension of NCBI Taxonomy that produces a hierarchically complete taxonomic tree.

Tetsu Sakamoto, J Miguel Ortega.

Abstract

BACKGROUND: NCBI Taxonomy is the main taxonomic source for several bioinformatics tools and databases, since all organisms with sequence accessions deposited in INSDC are organized in its hierarchical structure. Despite the extensive use of this data source, representing it as a table would make its information easier to use when processing bioinformatics data. Because some taxonomic ranks are missing in some lineages, building such a table requires an algorithm that proposes provisional names for every taxonomic rank.
RESULTS: To address this issue, we developed an algorithm that takes the tree structure from NCBI Taxonomy and generates a hierarchically complete taxonomic table while maintaining compatibility with the original tree. The algorithm attempts to assign a taxonomic rank to an existing clade or "no rank" node when possible, using its name as part of the created rank name (e.g. Ord_Ornithischia), and interpolates parent nodes when needed (e.g. Cla_of_Ornithischia); both examples come from the lineage of the dinosaur Brachylophosaurus. The new hierarchical structure was named Taxallnomy because it contains names for all taxonomic ranks. It has 41 hierarchical levels, corresponding to the 41 taxonomic ranks currently found in the NCBI Taxonomy database. From Taxallnomy, users can obtain the complete 41-node taxonomic lineage of every taxon available in the NCBI Taxonomy database without compromising the original tree information. In this work, we demonstrate its applicability by embedding taxonomic information of a specified rank into a phylogenetic tree and by producing metagenomics profiles.
CONCLUSION: Taxallnomy is applicable to any bioinformatics analysis that depends on information from NCBI Taxonomy. Taxallnomy is updated periodically, and users can also generate it locally with a distributed Perl script that takes NCBI Taxonomy as input. All Taxallnomy resources are available at http://bioinfo.icb.ufmg.br/taxallnomy .
© 2021. The Author(s).
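
The rank-completion idea described in the RESULTS paragraph can be illustrated with a minimal Python sketch. It is not the authors' Perl implementation: the function name `complete_lineage`, the five-rank subset, the abbreviation table, and the truncated toy lineage are all assumptions made for illustration, and the rule "promote unranked nodes into the deepest free rank slots" is a simplification chosen only so that the two names quoted in the abstract are reproduced.

```python
# Toy subset of ranks (the real table has 41); abbreviations are assumed.
RANKS = ["phylum", "class", "order", "family", "genus"]
ABBR = {"phylum": "Phy", "class": "Cla", "order": "Ord",
        "family": "Fam", "genus": "Gen"}


def complete_lineage(lineage):
    """Fill every rank slot for one root-to-leaf lineage.

    lineage: list of (name, rank) pairs; rank is in RANKS or "no rank".
    Returns a dict mapping each rank in RANKS to a name (or None if a
    slot lies below the deepest ranked node, which this sketch ignores).
    """
    base = [None] * len(RANKS)   # original node name occupying each slot
    label = [None] * len(RANKS)  # name reported in the completed table
    pending = []                 # unranked nodes since the last ranked node
    prev = -1                    # slot index of the last ranked node

    for name, rank in lineage:
        if rank not in ABBR:
            pending.append(name)
            continue
        idx = RANKS.index(rank)
        free = list(range(prev + 1, idx))
        # Promote unranked nodes into the deepest free slots, keeping order.
        for slot, node in zip(reversed(free), reversed(pending)):
            base[slot] = node
            label[slot] = f"{ABBR[RANKS[slot]]}_{node}"
        base[idx] = label[idx] = name
        pending = []
        prev = idx

    # Interpolate the remaining gaps as parents of the nearest node below.
    child = None
    for slot in range(len(RANKS) - 1, -1, -1):
        if base[slot] is not None:
            child = base[slot]
        elif child is not None:
            label[slot] = f"{ABBR[RANKS[slot]]}_of_{child}"
    return dict(zip(RANKS, label))


# Truncated, illustrative lineage (not the full NCBI lineage).
toy = [("Chordata", "phylum"), ("Ornithischia", "no rank"),
       ("Hadrosauridae", "family"), ("Brachylophosaurus", "genus")]
row = complete_lineage(toy)
print(row["class"], row["order"])
```

In this toy lineage the unranked node Ornithischia is promoted to the order slot, and the still-empty class slot is interpolated as its parent, so each taxon ends up with one name per rank and can be stored as a table row.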

Keywords:  Linnaean system; NCBI Taxonomy; No rank; Taxonomic lineage; Taxonomic rank

Year:  2021        PMID: 34325658     DOI: 10.1186/s12859-021-04304-3

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169

