Adrian M Altenhoff, Natasha M Glover, Clément-Marie Train, Klara Kaleb, Alex Warwick Vesztrocy, David Dylus, Tarcisio M de Farias, Karina Zile, Charles Stevenson, Jiao Long, Henning Redestig, Gaston H Gonnet, Christophe Dessimoz.
Abstract
The Orthologous Matrix (OMA) is a leading resource to relate genes across many species from all of life. In this update paper, we review the recent algorithmic improvements in the OMA pipeline, describe increases in species coverage (particularly in plants and early-branching eukaryotes) and introduce several new features in the OMA web browser. Notable improvements include: (i) a scalable, interactive viewer for hierarchical orthologous groups; (ii) protein domain annotations and domain-based links between orthologous groups; (iii) functionality to retrieve phylogenetic marker genes for a subset of species of interest; (iv) a new synteny dot plot viewer; and (v) an overhaul of the programmatic access (REST API and semantic web), which will facilitate incorporation of OMA analyses in computational pipelines and integration with other bioinformatic resources. OMA can be freely accessed at https://omabrowser.org.
Year: 2018 PMID: 29106550 PMCID: PMC5753216 DOI: 10.1093/nar/gkx1019
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Distribution of the 2085 species contained in the October 2017 OMA release. The number of genomes in each taxonomic rank is conveyed as the angle of the relevant sector, and the average number of proteins is conveyed as its height in a square-root scale. Colors are automatically selected to contrast the different domains of life, and within them the different sister clades.
Figure 2. New interactive HOG viewer. An excerpt of the NOX family at the deuterostome level (left) and at the vertebrate level (right). The tree depicts relationships between species, squares depict genes (human NOX1, NOX2 and NOX3 genes are highlighted in color) and HOGs are delineated by vertical black lines.
Figure 3. The domain architecture view of a HOG. Information about the HOG (top) is followed by a table of other HOGs that share at least one domain with the HOG of interest. Deepest level: the last common ancestor of the species represented in a HOG; HOG size: the number of genes in a HOG; Representative Domain Architecture: the architecture that is characteristic of most of the proteins in a HOG; Prevalence: the percentage of the proteins in a HOG that have this domain architecture; Similarity: the number of domains shared between this HOG and the HOG of interest (including duplicated domains). The table can be sorted by any of the attributes.
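The Prevalence and Similarity columns described in this caption can be illustrated with a small sketch. It is not the OMA implementation: the function names and the toy domain labels are hypothetical, and reading "including duplicated domains" as a multiset intersection is an assumption.

```python
from collections import Counter


def representative_architecture(architectures):
    """Return (architecture, prevalence in percent) for the most common one.

    `architectures` holds one tuple of domain names per protein in a HOG.
    """
    counts = Counter(architectures)
    arch, n = counts.most_common(1)[0]
    return arch, 100.0 * n / len(architectures)


def domain_similarity(arch_a, arch_b):
    """Count shared domains, keeping duplicates (assumed: multiset intersection)."""
    shared = Counter(arch_a) & Counter(arch_b)
    return sum(shared.values())


# Hypothetical HOG with four proteins and made-up domain labels.
hog = [("A", "B", "B"), ("A", "B", "B"), ("A", "B"), ("A", "B", "B")]
arch, prevalence = representative_architecture(hog)

# Representative architecture of another, equally hypothetical HOG.
other = ("B", "B", "C")
similarity = domain_similarity(arch, other)
```

With these toy inputs the representative architecture is shared by three of the four proteins, and the two architectures have two copies of domain "B" in common.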
Figure 4. New dotplot synteny viewer, which enables users to identify gene order conservation between chromosomes as diagonal segments (main view in panel A). Inversions are visible as diagonal flips, which can be nested (panel B). Tandem duplications on one or the other chromosome are visible as vertical or horizontal lines and, if both are present, as blocks (panel C). To focus on a subset of the data according to sequence divergence, the user can restrict the displayed points to a chosen range of the evolutionary distance distribution. Points can be selected by the user, in which case more details are provided in a table (panel D), including links to the local synteny viewer (panel E).
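To make the reading of the dot plot concrete, the sketch below groups ortholog pairs into runs of conserved or inverted gene order. It is an illustration only, not how the OMA viewer works: it assumes each point is a pair of gene order indices (one per chromosome) rather than genomic coordinates, and the function name and `min_len` threshold are hypothetical.

```python
def collinear_runs(points, min_len=3):
    """Group ortholog pairs into runs of conserved or inverted gene order.

    `points` is a list of (index_on_chr_a, index_on_chr_b) pairs.
    Returns (direction, run) tuples: +1 for a diagonal segment,
    -1 for a flipped (inverted) segment.
    """
    runs = []
    current, direction = [], 0
    for p in sorted(points):
        if current:
            dx = p[0] - current[-1][0]
            dy = p[1] - current[-1][1]
            # A step continues a run only if both genes are adjacent neighbours.
            step = dy if dx == 1 and dy in (1, -1) else 0
            if step != 0 and direction in (0, step):
                current.append(p)
                direction = step
                continue
            if len(current) >= min_len:
                runs.append((direction, current))
        current, direction = [p], 0
    if len(current) >= min_len:
        runs.append((direction, current))
    return runs


# Toy data: a conserved block, then an inverted block, then a lone point.
pairs = [(0, 0), (1, 1), (2, 2), (3, 6), (4, 5), (5, 4), (6, 7)]
runs = collinear_runs(pairs)
```

On the toy data the first three pairs form a diagonal run and the next three form an inverted run. The isolated last point is dropped because it is shorter than `min_len`.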
Figure 5. Example of a SPARQL query to programmatically retrieve pairwise orthologs involving the sequence LATCH00597. Sample queries are provided in the right column of the page, accessible at http://sparql.omabrowser.org.