Adrian M Altenhoff, Javier Garrayo-Ventas, Salvatore Cosentino, David Emms, Natasha M Glover, Ana Hernández-Plaza, Yannis Nevers, Vicky Sundesha, Damian Szklarczyk, José M Fernández, Laia Codó, The Quest For Orthologs Consortium, Josep Ll Gelpi, Jaime Huerta-Cepas, Wataru Iwasaki, Steven Kelly, Odile Lecompte, Matthieu Muffato, Maria J Martin, Salvador Capella-Gutierrez, Paul D Thomas, Erik Sonnhammer, Christophe Dessimoz.
Abstract
The identification of orthologs (genes in different species that descended from the same gene in their last common ancestor) is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficulties: the need to compare them on a common input dataset, the absence of ground truth, and the computational cost of calling orthologs. To address this, the Quest for Orthologs consortium maintains a reference set of proteomes and provides a web server for continuous orthology benchmarking (http://orthology.benchmarkservice.org). Furthermore, consensus ortholog calls derived from public benchmark submissions are provided on the Alliance of Genome Resources website, the joint portal of NIH-funded model organism databases.
Year: 2020 PMID: 32374845 PMCID: PMC7319555 DOI: 10.1093/nar/gkaa308
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of the https://orthology.benchmarkservice.org website. Benchmarks are now computed using the OpenEBench cloud-based platform from ELIXIR.
Figure 2. Functionality of the OpenEBench platform for ortholog benchmarking: 1) Upload the data to evaluate, an OrthoXML file containing the participant's ortholog predictions. 2) Select the benchmarking event from those available in the Virtual Research Environment. 3) Set the parameters for the benchmarking run. 4) Compare the results of the new predictor against those of the other participants using the available visualizers.
Figure 3. Excerpts from the public benchmark results on the 2018 QfO reference dataset. (A, B) The choice of recall measure (x-axis) can have a big impact on the generalized species tree discordance test. (C) In the Gene Ontology conservation benchmark, nearly all methods lie on the ‘Pareto frontier’ (dotted line). (D) In the reference gene tree benchmark based on SwissTree, most methods have a similar precision (y-axis) but vary considerably in recall (x-axis). Error bars indicate 95% confidence intervals. Note that the axis ranges have been chosen to optimise the separation of the data points. As such, they do not show proportional changes in accuracy measures, so careful interpretation is required. The full results, with interactive viewing options (selection of methods included, choice of precision and recall measures, display of full axis range, etc.), are accessible at https://orthology.benchmarkservice.org.
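The ‘Pareto frontier’ mentioned for panel (C) can be illustrated with a minimal sketch, assuming each method is summarised by two scores where higher is better (here labelled recall and precision). The method names, the score values, and the function `pareto_frontier` are hypothetical and are not taken from the benchmark service; the sketch only shows the general idea that a method is on the frontier when no other method is at least as good on both scores and strictly better on one.

```python
from typing import Dict, List, Tuple


def pareto_frontier(scores: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the names of non-dominated methods, ordered by the first score.

    `scores` maps a method name to a (recall, precision) pair; both values
    are assumed to be "higher is better".
    """
    frontier = []
    for name, (recall, precision) in scores.items():
        # A method is dominated if some other method is at least as good
        # on both scores and strictly better on at least one of them.
        dominated = any(
            (r >= recall and p >= precision) and (r > recall or p > precision)
            for other, (r, p) in scores.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return sorted(frontier, key=lambda n: scores[n][0])


# Hypothetical scores, made up purely for illustration.
example_scores = {
    "method_a": (0.30, 0.95),
    "method_b": (0.50, 0.90),
    "method_c": (0.45, 0.85),  # worse than method_b on both scores
    "method_d": (0.70, 0.80),
}

frontier = pareto_frontier(example_scores)
print(frontier)  # ['method_a', 'method_b', 'method_d']
```

In this made-up example only `method_c` falls off the frontier, because `method_b` beats it on both scores; the remaining methods each represent a different trade-off between the two measures.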
Figure 4. Consensus orthology call for the gene RPB7 in S. cerevisiae, available from https://alliancegenome.org.