PhylomeDB V5: an expanding repository for genome-wide catalogues of annotated gene phylogenies.

Diego Fuentes^1,2, Manuel Molina^1,2, Uciel Chorostecki^1,2, Salvador Capella-Gutiérrez^1, Marina Marcet-Houben^1,2, Toni Gabaldón^1,2,3.

Abstract

PhylomeDB is a unique knowledge base providing public access to minable and browsable catalogues of pre-computed genome-wide collections of annotated sequences, alignments and phylogenies (i.e. phylomes) of homologous genes, as well as to their corresponding phylogeny-based orthology and paralogy relationships. In addition, PhylomeDB trees and alignments can be downloaded for further processing to detect and date gene duplication events, infer past events of inter-species hybridization and horizontal gene transfer, as well as to uncover footprints of selection, introgression, gene conversion, or other relevant evolutionary processes in the genes and organisms of interest. Here, we describe the latest evolution of PhylomeDB (version 5). This new version includes a newly implemented web interface and several new functionalities such as optimized searching procedures, the possibility to create user-defined phylome collections, and a fully redesigned data structure. This release also represents a significant core data expansion, with the database providing access to 534 phylomes, comprising over 8 million trees, and homology relationships for genes in over 6000 species. This makes PhylomeDB the largest and most comprehensive public repository of gene phylogenies. PhylomeDB is available at http://www.phylomedb.org.

Year: 2022 PMID: 34718760 PMCID: PMC8728271 DOI: 10.1093/nar/gkab966

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

The evolutionary history of a group of homologous genes (i.e. a gene family) is best established through phylogenetic analysis, which results in representations of homology relationships between the genes' sequences (i.e. multiple sequence alignments), and of the inferred pattern of their past divergence (i.e. a phylogenetic tree) (1). Phylogenetic trees and multiple sequence alignments can be analyzed to delineate homology, orthology and paralogy relationships between genes or domains (2). Distinguishing orthologs (i.e. homologous genes that diverged through speciation events) from paralogs (i.e. homologous genes that diverged through duplication events) has important implications for comparing genomes across species, predicting the function of newly annotated genes, or reconstructing species relationships (3). In addition, gene trees serve to uncover relevant past evolutionary events such as gene duplication, loss or horizontal transfer, or to reveal footprints of purifying and directional selection, among others. Moreover, when genome-wide collections of gene trees (i.e. phylomes) are analyzed collectively rather than individually, evolutionary events at the genome and organismal level can be inferred, such as past hybridizations and polyploidizations (4–6). Similarly, by searching for specific patterns among gene trees in a phylome, the set of genes undergoing duplication or positive selection in a specific lineage can be identified, which in turn can serve to derive hypotheses about the genomic changes underlying a given evolutionary innovation (7,8). These and other applications underscore the usefulness of genome-wide collections of phylogenetic trees. 
However, building large collections of phylogenetic trees is computationally costly and complex in design, and many researchers with modest expertise in bioinformatics or phylogenetic reconstruction methods benefit from precomputed gene phylogenies present in public repositories, such as Ensembl, PANTHER, eggNOG or PhylomeDB (9–12). PhylomeDB (Phylome database) is a knowledge base of evolutionary relationships between protein-coding genes, represented in the form of annotated phylogenetic trees and multiple sequence alignments, which are reconstructed through standardized, state-of-the-art phylogenetic pipelines. One unique feature of PhylomeDB is that its phylogenetic reconstruction pipeline uses a gene-centric approach in which each gene encoded in the genome of interest (seed genome) is sequentially used as a seed in a phylogenetic reconstruction pipeline, which reproduces the procedures a phylogeneticist would perform to reconstruct the evolutionary history of a given gene (see below for the description of the pipeline). This gene-centric approach circumvents several of the problems associated with defining gene families. By nature, gene families are inherently hierarchical, and diversify in complex ways due to gene duplication and loss (3). Alternative approaches define families by clustering a network of pairwise relations to identify densely connected sub-networks, which cannot represent the actual hierarchy present in the data (2). PhylomeDB's gene-centric approach bypasses this step and therefore results in a collection of evolutionary histories, each one taken from the perspective of a single gene. This collection fully covers the genes encoded in the seed genome and is partially redundant, with a given evolutionary event (i.e. a speciation or a gene duplication) likely captured in several trees. 
Such redundancy, in turn, allows the use of consistency-based approaches in downstream evolutionary analyses, such as the detection of duplications (13), and the inference of orthology and paralogy relationships (14). Beyond reconstructing comprehensive collections of gene trees and alignments, which can be browsed and mined, PhylomeDB annotates them with functional and evolutionary information. Firstly, protein sequences are annotated with respect to their protein domain composition using HMMER searches (15) against protein domains in PFAM (16), and with GO functions through UniProt crosslinking (17). Secondly, phylogenetic trees are automatically processed (see below) to provide a root and annotate internal nodes, which are labelled as speciation or duplication events according to the species-overlap algorithm (2). In addition, sequences at terminal nodes are annotated for taxonomy, gene and protein name, as well as any other annotations provided by source databases. Finally, PhylomeDB provides orthology and paralogy calls based on the most up-to-date release of consistency-based predictions from the MetaPhOrs server (14), which extends homology relationships to over 6000 species. Phylogenetic trees, alignments, sequences and other associated information can be searched, browsed and displayed interactively. In addition, entire collections (phylomes) and relevant tables (i.e. orthology and paralogy) can be downloaded through an FTP server for bulk downstream processing. Since its first release in 2006 (18), PhylomeDB has been continuously expanded and enriched with new features, always in accordance with recommendations and standards set by the Quest for Orthologs consortium (19), of which PhylomeDB is a founding and active member. Here we describe the major changes in the current release (version 5) of this resource.
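The species-overlap labelling of internal nodes mentioned above can be illustrated with a minimal sketch. It assumes the usual formulation of the rule: a node is called a duplication when the species sets found under its child branches share at least one species, and a speciation otherwise. The `Node` class, the `label_events` function and the species codes are hypothetical names introduced for illustration only; this is not the PhylomeDB implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """Minimal rooted gene-tree node (hypothetical); leaves carry a species code."""
    species: Optional[str] = None                 # set only on leaves
    children: List["Node"] = field(default_factory=list)
    event: Optional[str] = None                   # "D" = duplication, "S" = speciation


def label_events(node: Node) -> Set[str]:
    """Label every internal node below `node` and return the species set under it."""
    if not node.children:
        return {node.species}
    seen: Set[str] = set()
    overlap = False
    for child in node.children:
        child_species = label_events(child)
        if seen & child_species:                  # a species occurs on both sides
            overlap = True
        seen |= child_species
    node.event = "D" if overlap else "S"
    return seen


# Toy gene tree: ((Hsa, Mmu), (Hsa, Sce)). Hsa appears on both sides of the root.
tree = Node(children=[
    Node(children=[Node("Hsa"), Node("Mmu")]),
    Node(children=[Node("Hsa"), Node("Sce")]),
])
label_events(tree)
print(tree.event, [child.event for child in tree.children])  # D ['S', 'S']
```

In this toy tree the root is labelled as a duplication because both child branches contain a sequence from the same species, while the two inner nodes are speciations.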

PHYLOMEDB CORE IMPROVEMENTS

Changes to the back end

The PhylomeDB database is currently hosted at the Barcelona Supercomputing Center (BSC-CNS), which is the third institution hosting this resource. Moving such a resource through different institutions and computer systems is not ideal, but reflects the reality of our research group and underscores the commitment of the PhylomeDB team to maintain this resource. The current version of PhylomeDB runs on a virtual machine with Ubuntu Linux and 16 GB of memory, where most tasks related to database and web interface operations are carried out. Other computations, such as the reconstruction of new phylomes or the annotation of trees and alignments, are run on the Mare Nostrum supercomputer, one of the most powerful supercomputers in Europe. The current back end has been completely updated, with significant software package updates including Apache (version 2.4.18), PHP (version 7.0.15), MariaDB (version 10.2.22), ETE (version 3.1.1) and Python (version 3.7.4). Given the significant expansion of the database, changes to the sequence search procedure were necessary to reduce response times. Firstly, sequence searches now use the faster Diamond v2.0.9.147 (20) instead of BLAST (21). Secondly, the search jobs are run asynchronously on the Mare Nostrum supercomputer, thereby avoiding virtual machine memory overload.
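The general idea of handing a Diamond search to a background job can be sketched as follows. This is only an illustration under stated assumptions: a local thread pool stands in for the supercomputer job submission, whose details are not described here, and the file names, the `diamond_command` and `submit_search` helpers and the e-value default are hypothetical. The Diamond options used (`blastp`, `--query`, `--db`, `--out`, `--outfmt 6`, `--evalue`) are standard command-line options of the tool.

```python
import shlex
import subprocess
import sys
from concurrent.futures import Future, ThreadPoolExecutor
from typing import List


def diamond_command(query_fasta: str, database: str, out_tsv: str,
                    evalue: float = 1e-5) -> List[str]:
    """Build a Diamond protein search command (tabular output, format 6)."""
    return [
        "diamond", "blastp",
        "--query", query_fasta,
        "--db", database,
        "--out", out_tsv,
        "--outfmt", "6",
        "--evalue", str(evalue),
    ]


def submit_search(pool: ThreadPoolExecutor, command: List[str]) -> Future:
    """Run `command` in the background so the caller is not blocked."""
    return pool.submit(subprocess.run, command, check=True, capture_output=True)


# Hypothetical file names; the command is only printed here, not executed.
print(shlex.join(diamond_command("query.fasta", "phylomedb.dmnd", "hits.tsv")))

# Demonstrate the asynchronous hand-off with a harmless stand-in command.
with ThreadPoolExecutor(max_workers=2) as pool:
    future = submit_search(pool, [sys.executable, "-c", "print('search finished')"])
    # Other requests could be served here; the job is collected when ready.
    result = future.result()
print(result.stdout.decode().strip())
```

The design point is that the front end only submits the job and collects the result later, so a large search does not occupy the memory of the machine serving the web interface.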

Newly designed web interface and improved functionalities

This upgrade of PhylomeDB includes a newly designed web interface, which has been implemented using Drupal (version 7.80), SQLite (version 3.7.3) and several JavaScript libraries, including jQuery (version 1.11.0). The tracking of page visits is provided by Google Analytics. The help and frequently asked questions sections have been improved based on users' feedback. To facilitate navigation and access to relevant data, the information on gene trees, orthology predictions and alignments has been combined into a single entry page (see Figure 1), and the FTP section has been completely restructured to improve data accessibility, including direct links to the relevant FTP sections from the corresponding parts of the entry pages. New search functionalities include the ability to restrict searches to subsets of the database (i.e. specific phylomes or phylome collections, according to the user's preferences). Search results tables can be sorted and filtered. Although phylome collections (i.e. subsets of related phylomes) have been available since version 4, this new release includes the possibility of creating user-defined collections. This gives registered users (registration is free) the opportunity to create their own collections comprising the phylomes of interest to them (i.e. those covering their organisms in focus), which streamlines searches and simplifies navigation. Additional efforts have been made to improve the navigation layout for tablets and mobile devices.

Figure 1.

(A) Example of the search functionality in PhylomeDB. Users can choose among four different search approaches, shown in different tabs: Search by gene (left-most tab), in which users can search for a gene tree by PhylomeID, gene name or external identifiers; Search by sequence (middle-left), in which users can provide a protein sequence that is used for a similarity search against sequences in PhylomeDB; Search by phylome (middle-right), in which users can search for specific phylomes among those publicly available; and Collection search (right-most), which allows users to restrict gene and phylome searches to specific collections. (B) Example of the integrated tree visualization showing the gene SHP1 from the yeast Saccharomyces cerevisiae. Several items can be distinguished. The top panel (I) allows the user to switch among available trees, including the ones containing the target protein sequence as seed as well as the ones in which that sequence is present but is not the seed (i.e. collateral trees). The tool panel (II) above the tree has multiple elements: it can open a drop-down list of tree features to interact with during the tree visualization, generate a hard link to the tree, download the tree image in PNG format or download the orthology relationships within the tree in OrthoXML format (48). The tree features pop-up (III) allows the user to change the number of attributes displayed in the image. The search pop-up (IV) allows highlighting specific nodes that match the query term for different categories such as the species name. In addition, clicking on the nodes and leaves generates a pop-up menu with multiple options such as collapsing nodes, switching sister branches, rerooting, and more. There is also a domain and sequence panel (V) in which PFAM motifs are represented by different shapes, lengths and colors. They can be clicked, and a direct link redirects the user to the original PFAM entry. Finally, the tree legend (VI) indicates the rooting strategy followed in this tree and the classification of the other evolutionary events.

Phylogenetic reconstruction, analysis pipelines and benchmarking

At the core of PhylomeDB lies a fully automated phylogenetic reconstruction pipeline, which is regularly updated to keep up with recent developments in relevant software tools. In designing this pipeline, we prioritized accuracy over speed. Each phylome is reconstructed over a set of species with available proteomes (here we refer to proteomes as the complete catalogue of sequences of protein-coding genes encoded in a species' genome), of which one species (the one in focus) acts as the seed. The PhylomeDB V5 pipeline includes several software updates and technical implementations and proceeds as follows. For each (seed) gene in the seed species, a BLAST search against a database containing all phylome proteomes is launched to retrieve a set of proteins with a significant similarity (e-value < 1e-05, continuous overlap over 50% of the query sequence). The number of hits used in further processing is limited to the closest 150 homologs, unless specified otherwise in the phylome description page. A multiple sequence alignment is then constructed using a consistency-based approach. First, the set of homologous protein sequences is aligned with three different programs: MUSCLE v3.8.1551 (22), MAFFT v7.407 (23) and KALIGN v2.04 (24). These programs are run with the sequences in forward and reverse orientation, resulting in six different alignments that are combined into a consensus alignment using M-Coffee v12.0 (25). This alignment is then trimmed with trimAl v1.4.rev15 (26) using a consistency cut-off of 0.1667 and a gap score cut-off of 0.1. The final, trimmed alignment is used to reconstruct a maximum likelihood tree using IQ-Tree v1.6.9 (27) under different models (the DCmut, JTTDCMut, LG, WAG and VT models are explored by default, unless specified otherwise). The final maximum likelihood tree is reconstructed using the best model selected based on the Bayesian information criterion (28). 
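The homolog selection thresholds quoted above (e-value below 1e-05, overlap over 50% of the query, at most 150 hits) can be illustrated with a short sketch. It assumes that overlap is measured as the aligned span on the query divided by the query length, and that the "closest" homologs are those with the lowest e-values; both are assumptions made for illustration. The `Hit` record and `select_homologs` function are hypothetical names, not part of the PhylomeDB pipeline.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One search hit for a seed protein (hypothetical record layout)."""
    subject_id: str
    evalue: float
    query_start: int   # 1-based, inclusive
    query_end: int     # 1-based, inclusive


def select_homologs(hits: List[Hit], query_length: int,
                    max_evalue: float = 1e-5, min_overlap: float = 0.5,
                    max_hits: int = 150) -> List[Hit]:
    """Keep significant hits covering enough of the query, best e-values first."""
    kept = [
        hit for hit in hits
        if hit.evalue < max_evalue
        and (hit.query_end - hit.query_start + 1) / query_length > min_overlap
    ]
    kept.sort(key=lambda hit: hit.evalue)   # assumption: closest = lowest e-value
    return kept[:max_hits]


hits = [
    Hit("A", 1e-50, 1, 200),    # passes both filters
    Hit("B", 1e-3, 1, 200),     # e-value too high
    Hit("C", 1e-20, 10, 80),    # covers only 71 of 200 query residues
    Hit("D", 1e-30, 20, 180),   # passes both filters
]
selected = select_homologs(hits, query_length=200)
print([hit.subject_id for hit in selected])  # ['A', 'D']
```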
Finally, partition support is calculated using rapid bootstrap (1000 repetitions), as implemented in IQ-Tree. This procedure is iteratively run over all genes of the seed species until a phylome is completed. Several automated pipelines are run over entire proteomes or phylomes to precompute relevant features. These include rooting of the trees (either by out-group taxon rooting or midpoint rooting, depending on the phylome), and annotation of internal nodes as speciation or duplication events, as inferred by the species overlap algorithm. The methodology for orthology and paralogy inference has been assessed using the Quest for Orthologs (QFO) 2020 benchmarking service (29). This benchmark is based on a set of 78 selected species, covering the entire tree of life. Given that the size of the test dataset largely exceeds the standard size of phylomes in PhylomeDB (12–20 species), we reconstructed 155 phylomes (totalling 1 566 697 trees), to provide a phylogenetic coverage over the species in the benchmark similar to that obtained in a standard phylome analysis. This allowed us first to verify that the changes in the tree reconstruction pipeline did not negatively impact the results, and then to explore different parameters used during orthology prediction. For instance, we studied the effect of using different tree rooting methods, such as taxon out-group rooting, midpoint rooting, reconciliation-based rooting as implemented in Treerecs (30), or rooting at the most distant sequence from the seed. Our results indicated that the four rooting methods provided similar results with respect to orthology predictions for the seed, and therefore we did not alter the rooting algorithm implemented in PhylomeDB (outgroup-based rooting, see Figure 2A and B). In addition, we studied the effects of applying a consistency-based approach as implemented in MetaPhOrs (14), or of restricting orthology calls to those that include a sequence used as a seed. 
Our results (Figure 2C and D) show that these two filters can serve to improve the accuracy of orthology prediction (measured as the ability of selected orthologs to reconstruct an assumed species tree, y-axis), at the cost of a reduced number of predicted orthologs. Overall, PhylomeDB produced results in line with other methods, which vary along a similar accuracy versus coverage trade-off trend.
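The two filters discussed above (a consistency score threshold and a restriction to pairs involving the seed) can be sketched as follows. The sketch assumes that the consistency score of a gene pair is the fraction of trees containing both genes in which the pair is called orthologous; this definition, the input record layout and the function names are assumptions made for illustration rather than the MetaPhOrs implementation.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Optional, Set, Tuple

Pair = FrozenSet[str]


def consistency_scores(calls: Iterable[Tuple[str, str, bool]]) -> Dict[Pair, float]:
    """Score each gene pair from per-tree calls.

    `calls` holds one (gene_a, gene_b, is_ortholog) record for every tree
    in which both genes are present.
    """
    support: Dict[Pair, int] = defaultdict(int)
    total: Dict[Pair, int] = defaultdict(int)
    for gene_a, gene_b, is_ortholog in calls:
        pair = frozenset((gene_a, gene_b))    # order of the two genes is irrelevant
        total[pair] += 1
        support[pair] += int(is_ortholog)
    return {pair: support[pair] / total[pair] for pair in total}


def filter_pairs(scores: Dict[Pair, float], seed: Optional[str] = None,
                 min_score: float = 0.5) -> Set[Pair]:
    """Keep pairs scoring above `min_score`, optionally only those with `seed`."""
    return {
        pair for pair, score in scores.items()
        if score > min_score and (seed is None or seed in pair)
    }


calls = [
    ("seed", "x", True), ("x", "seed", True), ("seed", "x", False),  # 2 of 3 trees
    ("seed", "y", True), ("seed", "y", False),                       # 1 of 2 trees
    ("x", "y", True),                                                # 1 of 1 tree
]
scores = consistency_scores(calls)
print(sorted(sorted(pair) for pair in filter_pairs(scores)))
# [['seed', 'x'], ['x', 'y']]
print(sorted(sorted(pair) for pair in filter_pairs(scores, seed="seed")))
# [['seed', 'x']]
```

Tightening either filter shrinks the returned set, which mirrors the accuracy versus coverage trade-off described in the text.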

Figure 2.

Plots depicting the results obtained in two test sets of the QFO 2020 benchmark when comparing different approaches to orthology prediction based on phylome information. The graphs shown here correspond to the generalized species tree discordance test (G_STD2) run on the set of fungal (graphs A and C) and vertebrate (graphs B and D) orthology predictions. This test compares a gene tree reconstructed from the submitted orthology predictions to a pre-computed, binary species tree. The x-axis shows the number of completed gene trees obtained from the submitted orthologs, and the y-axis shows the Robinson-Foulds (RF) measure, which counts the bipartitions that differ between the species tree and the gene tree, normalized by the total number of bipartitions in both trees. Graphs A and B compare results obtained using four different rooting methods: in pink, rooting to the farthest sequence from an outgroup taxon (oldest); in orange, rooting to the leaf that is located farthest from the seed; in cyan, rooting based on minimizing the reconciliation cost; and in purple, midpoint rooting. Graphs C and D compare different ways to filter orthology predictions: in green, all orthologous pairs found; in brown, all orthologous pairs with a consistency score above 0.5; in yellow, all possible orthologous pairs involving the seed protein; and in blue, all orthologous pairs involving the seed and with a consistency score above 0.5. Grey dots represent results obtained by other methods and were extracted from the QFO public results 2020 (https://orthology.benchmarkservice.org/proxy/). The size of the dots is relative to the number of orthologs in the dataset. The square in graph D indicates which datasets are found in that region, as they overlap.

Dataset expansion and community project support

PhylomeDB v5 provides evolutionary computations for >23 million proteins (compared to ∼10 million in the previous release), over 8 million trees (as opposed to 1.5 million in v4) and 534 public phylomes (roughly a 4-fold increase since 2014). In addition, integration with MetaPhOrs (14) provides consistency-based orthology and paralogy relationships and expands homology relationships in PhylomeDB to over 6000 species. PhylomeDB provides support in the annotation and analysis of newly sequenced genomes, some as part of large-scale initiatives such as i5K (31), 1KFG (1000 Fungal Genomes) (http://1000.fungalgenomes.org/), or ERGA (https://www.erga-biodiversity.eu/). These are community-driven projects that partner with PhylomeDB to perform a phylome reconstruction coupled with the annotation and initial analyses of a newly sequenced genome. In these projects, PhylomeDB analyses have proven useful to, among others, (i) identify potential errors in gene annotation based on comparative analyses (split genes, transposable elements, etc.), (ii) provide functional annotation based on annotated functions of orthologs and (iii) identify major genomic changes in the relevant lineages. Finally, the resulting organism-focused phylome constitutes a valuable resource for the research community working on this species (see Table 1 for a representative list of community-driven projects in PhylomeDB).

Table 1.

Representative projects where PhylomeDB has been coupled to annotation and first analysis of newly sequenced genomes

Species (common name)	PhylomeDB ID	Reference
Plants
Olea europaea (Olive tree)	215–222	(6)
Nicotiana benthamiana (Benth, a close relative of tobacco)	817	(32)
Beta vulgaris (Sugar beet)	152	(33)
Phaseolus vulgaris (Common bean)	8–11	(34)
Solanum commersonii (Wild potato)	147	(35)
Vertebrates
48 bird species	225–230	(36)
Scophthalmus maximus (Turbot)	18	(37)
Panthera onca (Jaguar)	583 and 584	(38)
Lynx pardinus (Iberian lynx)	277 and 278	(39)
Other (invertebrate) animals
Cinara cedri (Cedar aphid)	701–706	(5)
Polistes canadensis (Red paper wasp) and other eusocial insects	134–136	(40)
Strigamia maritima (Centipede)	177	(41)
Daktulosphaira vitifoliae (Grape phylloxera)	196	(42)
Mytilus galloprovincialis (Mediterranean mussel)	787	(43)
Fungi
Penicillium expansum (Blue mold)	279–283	(44)
Phycomyces blakesleeanus and Mucor circinelloides	252–255	(45)
Geotrichum candidum	233–236	(46)
Candida subhashii	777	(47)

The first column indicates the name of the species of interest for the project, the second column lists the phylome IDs of the phylomes reconstructed as part of the project, and the third column gives the reference to the publication.

DATA AVAILABILITY

PhylomeDB is freely available, without registration, at http://phylomedb.org/.

47 in total

1. The Solanum commersonii Genome Sequence Provides Insights into Adaptation to Stress Conditions and Genome Evolution of Wild Potato Relatives.

Authors: Riccardo Aversano; Felice Contaldi; Maria Raffaella Ercolano; Valentina Grosso; Massimo Iorizzo; Filippo Tatino; Luciano Xumerle; Alessandra Dal Molin; Carla Avanzato; Alberto Ferrarini; Massimo Delledonne; Walter Sanseverino; Riccardo Aiese Cigliano; Salvador Capella-Gutierrez; Toni Gabaldón; Luigi Frusciante; James M Bradeen; Domenico Carputo
Journal: Plant Cell Date: 2015-04-14 Impact factor: 11.277

Review 2. Functional and evolutionary implications of gene orthology.

Authors: Toni Gabaldón; Eugene V Koonin
Journal: Nat Rev Genet Date: 2013-04-04 Impact factor: 53.242

3. The GOA database: gene Ontology annotation updates for 2015.

Authors: Rachael P Huntley; Tony Sawford; Prudence Mutowo-Meullenet; Aleksandra Shypitsyna; Carlos Bonilla; Maria J Martin; Claire O'Donovan
Journal: Nucleic Acids Res Date: 2014-11-06 Impact factor: 19.160

4. Genome and transcriptome analysis of the Mesoamerican common bean and the role of gene duplications in establishing tissue and temporal specialization of genes.

Authors: Anna Vlasova; Salvador Capella-Gutiérrez; Martha Rendón-Anaya; Miguel Hernández-Oñate; André E Minoche; Ionas Erb; Francisco Câmara; Pablo Prieto-Barja; André Corvelo; Walter Sanseverino; Gastón Westergaard; Juliane C Dohm; Georgios J Pappas; Soledad Saburido-Alvarez; Darek Kedra; Irene Gonzalez; Luca Cozzuto; Jessica Gómez-Garrido; María A Aguilar-Morón; Nuria Andreu; O Mario Aguilar; Jordi Garcia-Mas; Maik Zehnsdorf; Martín P Vázquez; Alfonso Delgado-Salinas; Luis Delaye; Ernesto Lowy; Alejandro Mentaberry; Rosana P Vianello-Brondani; José Luís García; Tyler Alioto; Federico Sánchez; Heinz Himmelbauer; Marta Santalla; Cedric Notredame; Toni Gabaldón; Alfredo Herrera-Estrella; Roderic Guigó
Journal: Genome Biol Date: 2016-02-25 Impact factor: 13.583

5. Genome-wide signatures of complex introgression and adaptive evolution in the big cats.

Authors: Henrique V Figueiró; Gang Li; Fernanda J Trindade; Juliana Assis; Fabiano Pais; Gabriel Fernandes; Sarah H D Santos; Graham M Hughes; Aleksey Komissarov; Agostinho Antunes; Cristine S Trinca; Maíra R Rodrigues; Tyler Linderoth; Ke Bi; Leandro Silveira; Fernando C C Azevedo; Daniel Kantek; Emiliano Ramalho; Ricardo A Brassaloti; Priscilla M S Villela; Adauto L V Nunes; Rodrigo H F Teixeira; Ronaldo G Morato; Damian Loska; Patricia Saragüeta; Toni Gabaldón; Emma C Teeling; Stephen J O'Brien; Rasmus Nielsen; Luiz L Coutinho; Guilherme Oliveira; William J Murphy; Eduardo Eizirik
Journal: Sci Adv Date: 2017-07-19 Impact factor: 14.136

6. Ten Years of Collaborative Progress in the Quest for Orthologs.

Authors: Benjamin Linard; Ingo Ebersberger; Shawn E McGlynn; Natasha Glover; Tomohiro Mochizuki; Mateus Patricio; Odile Lecompte; Yannis Nevers; Paul D Thomas; Toni Gabaldón; Erik Sonnhammer; Christophe Dessimoz; Ikuo Uchiyama
Journal: Mol Biol Evol Date: 2021-07-29 Impact factor: 16.240

7. Whole-genome analyses resolve early branches in the tree of life of modern birds.

Authors: Erich D Jarvis; Siavash Mirarab; Andre J Aberer; Bo Li; Peter Houde; Cai Li; Simon Y W Ho; Brant C Faircloth; Benoit Nabholz; Jason T Howard; Alexander Suh; Claudia C Weber; Rute R da Fonseca; Jianwen Li; Fang Zhang; Hui Li; Long Zhou; Nitish Narula; Liang Liu; Ganesh Ganapathy; Bastien Boussau; Md Shamsuzzoha Bayzid; Volodymyr Zavidovych; Sankar Subramanian; Toni Gabaldón; Salvador Capella-Gutiérrez; Jaime Huerta-Cepas; Bhanu Rekepalli; Kasper Munch; Mikkel Schierup; Bent Lindow; Wesley C Warren; David Ray; Richard E Green; Michael W Bruford; Xiangjiang Zhan; Andrew Dixon; Shengbin Li; Ning Li; Yinhua Huang; Elizabeth P Derryberry; Mads Frost Bertelsen; Frederick H Sheldon; Robb T Brumfield; Claudio V Mello; Peter V Lovell; Morgan Wirthlin; Maria Paula Cruz Schneider; Francisco Prosdocimi; José Alfredo Samaniego; Amhed Missael Vargas Velazquez; Alonzo Alfaro-Núñez; Paula F Campos; Bent Petersen; Thomas Sicheritz-Ponten; An Pas; Tom Bailey; Paul Scofield; Michael Bunce; David M Lambert; Qi Zhou; Polina Perelman; Amy C Driskell; Beth Shapiro; Zijun Xiong; Yongli Zeng; Shiping Liu; Zhenyu Li; Binghang Liu; Kui Wu; Jin Xiao; Xiong Yinqi; Qiuemei Zheng; Yong Zhang; Huanming Yang; Jian Wang; Linnea Smeds; Frank E Rheindt; Michael Braun; Jon Fjeldsa; Ludovic Orlando; F Keith Barker; Knud Andreas Jønsson; Warren Johnson; Klaus-Peter Koepfli; Stephen O'Brien; David Haussler; Oliver A Ryder; Carsten Rahbek; Eske Willerslev; Gary R Graves; Travis C Glenn; John McCormack; Dave Burt; Hans Ellegren; Per Alström; Scott V Edwards; Alexandros Stamatakis; David P Mindell; Joel Cracraft; Edward L Braun; Tandy Warnow; Wang Jun; M Thomas P Gilbert; Guojie Zhang
Journal: Science Date: 2014-12-12 Impact factor: 47.728

8. PhylomeDB: a database for genome-wide collections of gene phylogenies.

Authors: Jaime Huerta-Cepas; Anibal Bueno; Joaquín Dopazo; Toni Gabaldón
Journal: Nucleic Acids Res Date: 2007-10-25 Impact factor: 16.971

9. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima.

Authors: Ariel D Chipman; David E K Ferrier; Carlo Brena; Jiaxin Qu; Daniel S T Hughes; Reinhard Schröder; Montserrat Torres-Oliva; Nadia Znassi; Huaiyang Jiang; Francisca C Almeida; Claudio R Alonso; Zivkos Apostolou; Peshtewani Aqrawi; Wallace Arthur; Jennifer C J Barna; Kerstin P Blankenburg; Daniela Brites; Salvador Capella-Gutiérrez; Marcus Coyle; Peter K Dearden; Louis Du Pasquier; Elizabeth J Duncan; Dieter Ebert; Cornelius Eibner; Galina Erikson; Peter D Evans; Cassandra G Extavour; Liezl Francisco; Toni Gabaldón; William J Gillis; Elizabeth A Goodwin-Horn; Jack E Green; Sam Griffiths-Jones; Cornelis J P Grimmelikhuijzen; Sai Gubbala; Roderic Guigó; Yi Han; Frank Hauser; Paul Havlak; Luke Hayden; Sophie Helbing; Michael Holder; Jerome H L Hui; Julia P Hunn; Vera S Hunnekuhl; LaRonda Jackson; Mehwish Javaid; Shalini N Jhangiani; Francis M Jiggins; Tamsin E Jones; Tobias S Kaiser; Divya Kalra; Nathan J Kenny; Viktoriya Korchina; Christie L Kovar; F Bernhard Kraus; François Lapraz; Sandra L Lee; Jie Lv; Christigale Mandapat; Gerard Manning; Marco Mariotti; Robert Mata; Tittu Mathew; Tobias Neumann; Irene Newsham; Dinh N Ngo; Maria Ninova; Geoffrey Okwuonu; Fiona Ongeri; William J Palmer; Shobha Patil; Pedro Patraquim; Christopher Pham; Ling-Ling Pu; Nicholas H Putman; Catherine Rabouille; Olivia Mendivil Ramos; Adelaide C Rhodes; Helen E Robertson; Hugh M Robertson; Matthew Ronshaugen; Julio Rozas; Nehad Saada; Alejandro Sánchez-Gracia; Steven E Scherer; Andrew M Schurko; Kenneth W Siggens; DeNard Simmons; Anna Stief; Eckart Stolle; Maximilian J Telford; Kristin Tessmar-Raible; Rebecca Thornton; Maurijn van der Zee; Arndt von Haeseler; James M Williams; Judith H Willis; Yuanqing Wu; Xiaoyan Zou; Daniel Lawson; Donna M Muzny; Kim C Worley; Richard A Gibbs; Michael Akam; Stephen Richards
Journal: PLoS Biol Date: 2014-11-25 Impact factor: 8.029

10. Parental origin of the allotetraploid tobacco Nicotiana benthamiana.

Authors: Matteo Schiavinato; Marina Marcet-Houben; Juliane C Dohm; Toni Gabaldón; Heinz Himmelbauer
Journal: Plant J Date: 2020-01-13 Impact factor: 7.091
