
The Gypsy Database (GyDB) of mobile genetic elements: release 2.0.

Carlos Llorens, Ricardo Futami, Laura Covelli, Laura Domínguez-Escribá, Jose M Viu, Daniel Tamarit, Jose Aguilar-Rodríguez, Miguel Vicente-Ripolles, Gonzalo Fuster, Guillermo P Bernet, Florian Maumus, Alfonso Munoz-Pomer, Jose M Sempere, Amparo Latorre, Andres Moya.

Abstract

This article introduces the second release of the Gypsy Database of Mobile Genetic Elements (GyDB 2.0): a research project devoted to the evolutionary dynamics of viruses and transposable elements based on their phylogenetic classification (per lineage and protein domain). The Gypsy Database (GyDB) is a long-term project that is continuously progressing and that, owing to the high molecular diversity of mobile elements, must be completed in several stages. GyDB 2.0 has been equipped with a wiki to allow other researchers to participate in the project. The current database stage and scope are long terminal repeat (LTR) retroelements and their relatives. GyDB 2.0 is an update based on the analysis of Ty3/Gypsy, Retroviridae, Ty1/Copia and Bel/Pao LTR retroelements and the Caulimoviridae pararetroviruses of plants. Among other features, in terms of the aforementioned topics, this update adds: (i) a variety of descriptions and reviews distributed across multiple web pages; (ii) protein-based phylogenies, where phylogenetic levels are assigned to distinct classified elements; (iii) a collection of multiple alignments, lineage-specific hidden Markov models and consensus sequences, called the GyDB collection; (iv) updated RefSeq databases and BLAST and HMM servers to facilitate sequence characterization of new LTR retroelement and caulimovirus queries; and (v) a bibliographic server. GyDB 2.0 is available at http://gydb.org.

Year:  2010        PMID: 21036865      PMCID: PMC3013669          DOI: 10.1093/nar/gkq1061

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Mobile genetic elements (MGEs) are ubiquitous, autonomous genetic units that often constitute a significant part of their host genomes. It is commonly accepted that mobile DNA elements are powerful vectors for disease and evolution, from which distinct host genes have evolved during the history of life (1,2). The emergence of viruses and MGEs, and the role they have subsequently played in the history of life, are exciting topics that require further investigation. In this respect, researchers aim to discern relevant aspects of the molecular changes responsible for various characteristics in organisms related to horizontal transfer, infection and disease. Among the distinct initiatives launched with the aim of investigating the diversity of MGEs (see for example 3–5) was the Gypsy Database (GyDB) of MGEs (6), a research project devoted to the evolutionary dynamics of viruses and MGEs (and their related host proteins), which was launched in 2008. The GyDB project is a database established within an evolutionary context of classification, where each piece of research delivers a conclusion that leads on to the next goal. A share of our efforts is dedicated to the interpretation of analyses, paying particular attention to non-redundant elements displaying a certain degree of distance and investigating how they can be collectively aligned or related, in terms of protein domain architecture, with other lineages and elements. Because of the molecular diversity of viruses and MGEs, the GyDB is a long-term project that is arranged as a continuously progressing database and must be completed in stages. The current database stage and scope is retroviruses and retrotransposons with long terminal repeats (LTR retroelements) and their relatives.
Following the outline of the earlier release (the study of Ty3/Gypsy and Retroviridae LTR retroelements), this article presents the GyDB update based on the phylogenetic evaluation of the most representative LTR retroelement families and the plant caulimoviruses. This update, called GyDB 2.0, is available at http://gydb.org and includes sequence phylogenetic classification in addition to significant bioinformatic improvements. In particular, the new infrastructure implements a wiki management system constructed with the aim of promoting a world-wide community of researchers collaborating in the analysis and classification of MGEs and viruses inhabiting (or circulating in) living organisms.

THE UPDATE: NEW FEATURES

GyDB 2.0 consists of 1234 web pages addressing the phylogenetic study of Ty3/Gypsy, Retroviridae, Ty1/Copia and Bel/Pao LTR retroelements. Caulimoviruses (Caulimoviridae) are formally plant DNA pararetroviruses, but they were considered in GyDB 2.0 owing to their relationship with LTR retroelements based on the common gag/coat and pol regions [for more details, see (7) and references therein]. Table 1 summarizes the topics addressed in this update, as well as the servers and database sections it offers. The sequences on which GyDB 2.0 is based were retrieved from GenBank (8) and the methodologies employed were the same as those described earlier in references (6,7,9). At GyDB we evaluate the phylogenetic signal of distinct classified elements and create hidden Markov model (HMM) profiles (10) per lineage and protein domain. In addition, the project is concerned with the evolutionary relationships between MGEs and their host genomes, based on the analysis of common protein families. In this regard, GyDB 2.0 focuses on two protein superfamilies that include protein products commonly encoded by LTR retroelements and their host genomes: the chromodomain superfamily (11) and clan AA of aspartic peptidases (12,13). This second release is accompanied by bibliographic data-mining from PubMed databases hosted at the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/) to document up-to-date information regarding the distinct classified elements.
Table 1. GyDB 2.0 new features: topics and contents

Systems | Families | Lineages | Elements | Protein domains | Accessory proteins | LTRs
LTR retroelements | Ty3/Gypsy | 34 | 96 | 8 | 1 | Yes
LTR retroelements | Ty1/Copia | 19 | 69 | 8 | | Yes
LTR retroelements | Retroviridae | 8 | 50 | 8 | 41 | Yes
LTR retroelements | Bel/Pao | 5 | 23 | 7 | | Yes
LTR retroelements | Caulimoviridae (a) | 6 | 30 | 10 | 27 | No
Related families | Clan AA | 35 | 323 | 1 | | No
Related families | Chromodomains | 2 | 123 | 1 | | No

(a) We included caulimoviruses in the second release in view of their relationship with LTR retroelements based on the common gag/coat and pol region.
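
As an illustration of the bibliographic data-mining from PubMed mentioned above, the following minimal Python sketch builds a query URL for the public NCBI E-utilities `esearch` endpoint. It is not the GyDB pipeline: the function name `build_pubmed_query`, the search-term wording and the example element are assumptions made for this example only. The script only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(element_name, family, retmax=20):
    """Return an esearch URL that lists PubMed IDs mentioning an element.

    The query wording is a hypothetical example, not the GyDB search strategy.
    """
    # Restrict both terms to the title/abstract fields of PubMed records.
    term = f'"{element_name}"[Title/Abstract] AND "{family}"[Title/Abstract]'
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Example: literature on a Ty3/Gypsy element (names chosen for illustration).
url = build_pubmed_query("Athila", "Ty3/Gypsy")
print(url)
```

Fetching the resulting URL with any HTTP client would return the matching PubMed identifiers, which could then be attached to the corresponding element entry.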

DATABASE ORGANIZATION

GyDB 2.0 is deployed over a Linux-MySQL-Apache-PHP (LAMP) stack, with additional Ajax programming to minimize the server responses sent to client browsers. The design is similar to that of the previous release but implements various changes to the web interface. As shown in Figure 1, the database is organized around two major menus: a top menu and a side menu.

The top menu allows access to the three servers:
(i) BLAST server: implements a BLAST search powered by the NCBI BLAST package (14), allowing protein and DNA comparisons with the GENOMES, LTRs and CORES databases. These databases collect the full-length genomes, the LTR sequences and all the protein sequences on which the second release is based, respectively.
(ii) HMM server: implements the HMMER3 package (http://hmmer.janelia.org) and allows protein comparisons against a database of lineage-specific protein domain HMM profiles created for this update. This server also provides comparisons between HMM profiles and the aforementioned CORES database.
(iii) LITERATURE server: allows users to search for bibliography of interest on the topic.

An additional new tool in GyDB 2.0 is its wiki, powered by the MediaWiki content management system (http://www.mediawiki.org/). This tool has been implemented to allow other users to participate in the project by editing or creating topics. Access to this wiki is free but requires registration. The rationale behind this choice is twofold: first, edits are registered by date and author in order to credit contributions; second, we have programmed a revision mechanism to review all changes constructively before making them public. The top menu includes three sections to log in and manage the distinct wiki resources. Finally, to the right of the top menu, GyDB 2.0 includes a text field to search the whole project under two modes (detailed in Figure 1).

The side menu divides the distinct GyDB sections into three major demarcations (emphasized with boxes in Figure 1). The first collects sections associated with the systematics applied at GyDB. The second provides information concerning the domains typically observed in the genomic structure of the elements we classify. The third demarcation offers free access to distinct databases, which are organized into three sections:
(i) Trees and Networks: consists of the collection of inferred phylogenetic trees based on distinct protein domains encoded by the classified elements, or based on their concatenation (when they are parts of polyproteins). Notably, inferred pol polyprotein phylogenies based on the concatenation of the protease, reverse transcriptase, RNaseH and integrase domains are the major criterion for assigning phylogenetic levels at GyDB 2.0 [results introduced in (7)]. Phylogenetic trees provide links to the corresponding element pages at GyDB 2.0: clicking any element name in any tree opens the entry assigned to that element. These tree image maps were created using Phylograph 1.0 (15). This section includes the clan AA reference database (CAARD) of ancestral maximum likelihood (ML) reconstructions (13), which has been implemented and is maintained at GyDB.
(ii) GyDB collection (16): the repository of multiple alignments, HMMs and majority rule consensus (MRC) sequences offered at GyDB 2.0. When a deposited alignment, profile or MRC sequence is associated with a journal publication, its entry in the collection includes citation information.
(iii) REF SEQ DATABASES: the repository for downloading the databases (GENOMES, CORES and LTRs) implemented in the BLAST server.

Finally, a variety of links to other database initiatives relevant to the topic are included in the side menu.

Figure 1. GyDB 2.0 organization and implementation.
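
To make the notion of a majority rule consensus (MRC) sequence more concrete, the following Python sketch derives a consensus from a small multiple alignment. It is a minimal illustration under stated assumptions, not the procedure used to build the GyDB collection: the 50% threshold, the use of `X` for columns without a majority, the skipping of gap-majority columns and the toy alignment itself are all choices made for this example.

```python
from collections import Counter


def majority_rule_consensus(alignment, threshold=0.5, unknown="X", gap="-"):
    """Return a consensus string for equal-length aligned sequences.

    A symbol is emitted for a column only if it occurs in more than
    `threshold` of the sequences; otherwise `unknown` is emitted.
    Columns whose majority symbol is a gap are skipped.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*alignment):  # iterate over alignment columns
        symbol, count = Counter(column).most_common(1)[0]
        if count / len(alignment) > threshold:
            if symbol != gap:
                consensus.append(symbol)
        else:
            consensus.append(unknown)
    return "".join(consensus)


# Invented toy alignment (not GyDB data).
toy_alignment = [
    "DTGAD-S",
    "DTGAE-S",
    "DSGAD-T",
    "DTGSDKT",
]
print(majority_rule_consensus(toy_alignment))  # prints DTGADX
```

In this toy case the sixth column is dominated by gaps and is dropped, while the last column has no strict majority and is reported as `X`.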

FUTURE PERSPECTIVES

Sequencing projects constantly deliver new types of MGEs [for example (17–22)]; hence the classification of non-redundant elements based on their phylogenetic signal is an open issue at GyDB and results in the preparation of new sections. For example, we are committed to improving the understanding of the diversity and evolutionary dynamics of MGEs in eukaryotic and prokaryotic organisms. Regarding eukaryotic LTR retroelements (the current database scope), the sequence repertoire at GyDB will be supplemented with representative elements retrieved from recently sequenced marine secondary endosymbionts, including the brown alga Ectocarpus siliculosus (heterokont) and the coccolithophore Emiliania huxleyi (haptophyte). In terms of other research topics in preparation, one concerns the construction of a server devoted to the study of the complete set of MGEs and repeats (the mobilome) of biological genomes. This server will be introduced with two forthcoming publications focusing on the LTR retroelements and their related transposases of the pea aphid Acyrthosiphon pisum genome [see (23)]. At the technical level, we are exploring the application of formal grammars and machine learning algorithms to automate, as far as possible, the management and classification of the sequence data. We are also committed to developing solutions for other non-trivial difficulties that arise with the growing size of the databases. Viruses and MGEs usually show different rates of evolution and high variability depending on the evaluated protein or region. Therefore, we aim to implement more than one method of phylogenetic reconstruction to offer the user different perspectives based on different methods (or the opportunity to upload updated phylogenies via the wiki).
On the other hand, the traditional view of the origin and evolution of biological systems is that they are usually monophyletic, but such an assumption has been challenged by increasing evidence suggesting that natural evolution can frequently proceed by gradual and vertical means, in addition to distinct modular, saltatory and reticulate events (24–36). In this respect, we are investigating appropriate protocols to combine phylogenetic inference with new tendencies in network biology [see also (7)].

FUNDING

Centro de Desarrollo Tecnológico Industrial (CDTI) (grant IDI-20100007, partial); Empresa Nacional de Innovación, S.A (ENISA) (17092008, partial); IMPIVA (IMIDTA/2009/118 and IMDTA/2010/740, partial); European Regional Development Fund (ERDF); Ministerio de Ciencia e Innovación (MICINN) (Torres-Quevedo grants PTQ-09-01-00020, PTQ-09-01-00670 and PTQ-10-03552, partial). Funding for open access charge: University of Valencia.

Conflict of interest statement. None declared.
  31 in total

Review 1.  Repbase Update, a database of eukaryotic repetitive elements.

Authors:  J Jurka; V V Kapitonov; A Pavlicek; P Klonowski; O Kohany; J Walichiewicz
Journal:  Cytogenet Genome Res       Date:  2005       Impact factor: 1.636

Review 2.  Modern genomes with retro-look: retrotransposed elements, retroposition and the origin of new genes.

Authors:  J-N Volff; J Brosius
Journal:  Genome Dyn       Date:  2007

3.  Novel clades of chromodomain-containing Gypsy LTR retrotransposons from mosses (Bryophyta).

Authors:  Olga Novikova; Vladimir Mayorov; Georgiy Smyshlyaev; Michail Fursov; Linda Adkison; Olga Pisarenko; Alexander Blinov
Journal:  Plant J       Date:  2008-08-11       Impact factor: 6.417

4.  Gypsy endogenous retrovirus maintains potential infectivity in several species of Drosophilids.

Authors:  Jose V Llorens; Jonathan B Clark; Isabel Martínez-Garay; Sirena Soriano; Rosa de Frutos; María J Martínez-Sebastián
Journal:  BMC Evol Biol       Date:  2008-10-31       Impact factor: 3.260

5.  Relationships of gag-pol diversity between Ty3/Gypsy and Retroviridae LTR retroelements and the three kings hypothesis.

Authors:  Carlos Llorens; Mario A Fares; Andres Moya
Journal:  BMC Evol Biol       Date:  2008-10-08       Impact factor: 3.260

6.  GenBank.

Authors:  Dennis A Benson; Ilene Karsch-Mizrachi; David J Lipman; James Ostell; Eric W Sayers
Journal:  Nucleic Acids Res       Date:  2008-10-21       Impact factor: 16.971

7.  Bioinformatic flowchart and database to investigate the origins and diversity of clan AA peptidases.

Authors:  Carlos Llorens; Ricardo Futami; Gabriel Renaud; Andrés Moya
Journal:  Biol Direct       Date:  2009-01-27       Impact factor: 4.540

8.  The Gypsy Database (GyDB) of mobile genetic elements.

Authors:  C Lloréns; R Futami; D Bezemer; A Moya
Journal:  Nucleic Acids Res       Date:  2007-09-25       Impact factor: 16.971

9.  How Athila retrotransposons survive in the Arabidopsis genome.

Authors:  Antonio Marco; Ignacio Marín
Journal:  BMC Genomics       Date:  2008-05-14       Impact factor: 3.969

10.  PwRn1, a novel Ty3/gypsy-like retrotransposon of Paragonimus westermani: molecular characters and its differentially preserved mobile potential according to host chromosomal polyploidy.

Authors:  Young-An Bae; Jong-Sook Ahn; Seon-Hee Kim; Mun-Gan Rhyu; Yoon Kong; Seung-Yull Cho
Journal:  BMC Genomics       Date:  2008-10-14       Impact factor: 3.969
