Riccardo Dondi (1), Manuel Lafond (2), Nadia El-Mabrouk (3).
1. Dipartimento di Lettere, Filosofia, Comunicazione, Università degli Studi di Bergamo, Via Donizetti 3, 24129 Bergamo, Italy.
2. Department of Mathematics and Statistics, University of Ottawa, Ottawa, Canada.
3. Département d'informatique et de recherche opérationnelle, Université de Montréal, Quebec, Canada.
Abstract
BACKGROUND: Given a gene family, the relations between genes (orthology or paralogy) are represented by a relation graph, in which edges connect pairs of orthologous genes and "missing" edges represent paralogs. While a gene tree directly induces a relation graph, the converse is not always true. Indeed, a relation graph is not necessarily "satisfiable", i.e. it does not necessarily correspond to a gene tree. Even a satisfiable graph may not be "consistent", i.e. the corresponding tree may not represent a true history in agreement with a species tree. Previous studies have addressed the problem of correcting a relation graph for satisfiability and consistency. Here we consider the weighted version of the problem, where a degree of confidence is assigned to each orthology or paralogy relation. We also consider a maximization variant of the unweighted version of the problem.

RESULTS: We provide complexity and algorithmic results for the approximation of the considered problems. We show that minimizing the correction of a weighted graph does not admit a constant-factor approximation algorithm assuming the Unique Games Conjecture, and we give an n-approximation algorithm, n being the number of vertices in the graph. We also provide polynomial-time approximation schemes for the maximization variant on unweighted graphs.

CONCLUSIONS: We provided complexity and algorithmic results for variants of the problem of correcting a relation graph for satisfiability and consistency. For the maximization variants we were able to design polynomial-time approximation schemes, while for the weighted minimization variants we were able to provide the first inapproximability results.
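To make the relation-graph terminology concrete, here is a minimal Python sketch of how a gene tree induces a relation graph. It is an illustration under stated assumptions, not the authors' algorithm. The nested-tuple tree encoding, the labels "S" (speciation) and "D" (duplication), and the function names `leaves`, `relation_graph` and `has_induced_p4` are all hypothetical choices made for this example. The sketch uses the standard convention that two genes are orthologs when their lowest common ancestor is a speciation. It also relies on a characterization not stated in the abstract: a relation graph is satisfiable exactly when it contains no induced path on four vertices (i.e. it is a cograph).

```python
from itertools import permutations

# Hypothetical encoding: a leaf is a gene name (str); an internal node is
# (label, left, right) with label "S" (speciation) or "D" (duplication).

def leaves(tree):
    """Return the list of gene names below a node."""
    if isinstance(tree, str):
        return [tree]
    _, left, right = tree
    return leaves(left) + leaves(right)

def relation_graph(tree):
    """Return the orthology edges induced by a labeled gene tree.

    Two genes are joined by an edge when their lowest common ancestor
    is a speciation node; all other pairs are paralogs (missing edges).
    """
    if isinstance(tree, str):
        return set()
    label, left, right = tree
    edges = relation_graph(left) | relation_graph(right)
    if label == "S":
        for a in leaves(left):
            for b in leaves(right):
                edges.add(frozenset((a, b)))
    return edges

def has_induced_p4(vertices, edges):
    """Brute-force test for an induced path a-b-c-d (small graphs only)."""
    def adj(u, v):
        return frozenset((u, v)) in edges
    for a, b, c, d in permutations(vertices, 4):
        if (adj(a, b) and adj(b, c) and adj(c, d)
                and not adj(a, c) and not adj(a, d) and not adj(b, d)):
            return True
    return False

# Genes a1 and a2 arise from a duplication; both are orthologous to b1.
tree = ("S", ("D", "a1", "a2"), "b1")
graph = relation_graph(tree)
```

A graph produced by `relation_graph` never contains an induced four-vertex path, whereas a relation graph that is itself a path on four genes does, so under the characterization above no gene tree can induce it. Correcting such a graph means changing some of its relations, which is the setting the abstract's weighted and unweighted variants address.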
Keywords:
Approximation algorithms; Gene tree; Orthology; Paralogy; Species tree