The European Nucleotide Archive in 2019.

Clara Amid, Blaise T F Alako, Vishnukumar Balavenkataraman Kadhirvelu, Tony Burdett, Josephine Burgin, Jun Fan, Peter W Harrison, Sam Holt, Abdulrahman Hussein, Eugene Ivanov, Suran Jayathilaka, Simon Kay, Thomas Keane, Rasko Leinonen, Xin Liu, Josue Martinez-Villacorta, Annalisa Milano, Amir Pakseresht, Nadim Rahman, Jeena Rajan, Kethi Reddy, Edward Richards, Dmitriy Smirnov, Alexey Sokolov, Senthilnathan Vijayaraja, Guy Cochrane.

Abstract

The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.
© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2020        PMID: 31722421      PMCID: PMC7145635          DOI: 10.1093/nar/gkz1063

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

For the last 37 years, since the European Molecular Biology Laboratory (EMBL) launched the first EMBL nucleotide sequence database library, major advances in sequencing and archiving technologies have led to a broad range of nucleotide sequences that build the content of today’s European Nucleotide Archive (ENA). The spectrum extends from raw reads to assembled and annotated sequences and related data types. Having a broad user profile, the ENA offers both general support for the world’s sequence data operations and specific thematic collaborative data coordination (see the ‘Data Coordination Services’ section). As a founding partner in the International Nucleotide Sequence Database Collaboration (INSDC, www.insdc.org) (1), the ENA represents a globally comprehensive nucleotide data resource, contributes towards data standards and moves forward with technological advances in sequencing. As an ELIXIR (https://elixir-europe.org/) Core Data Resource (https://elixir-europe.org/platforms/data/core-data-resources), the ENA has a mission to contribute towards the FAIR guiding principles for data management and discovery (2). This mission is achieved by various means: public data stored in the ENA are ‘findable’ through search tools covering both programmatic and interactive options to provide maximum flexibility for ENA users. Public data are also ‘accessible’ both directly through the ENA and globally through the INSDC exchange. ‘Interoperability’ is provided through structured data and metadata formats that are validated at the time of reporting. Finally, ‘reusability’ is supported through promotion of data sharing and clear terms of use (https://www.ebi.ac.uk/about/terms-of-use). An important tool in assisting users with FAIR compliance for their datasets, the ENA reaches high levels of compliance for most of its content and strives to improve its services further for greater compliance and user value.
Throughout 2019, we have continued to provide services to our user base and have developed in selected key areas. In this article, we focus on data submissions, the introduction of new data classes and metadata standards, and the ENA’s expanded data coordination portfolio linked with these services. Last, but not least, we highlight the new ENA Browser as one of the year’s significant new offerings.

ENA CONTENT AND DEPOSITION SERVICES

In 2019, we have continued to operate our open services for user support, submissions, archiving, presentation and discovery of nucleotide sequence data. Table 1 lists ENA services and their entry points.
Table 1. ENA services and the respective entry points

Services | Service entry points | Purpose of service | Link to service
User support | Support form | Contact and feedback to Helpdesk | https://www.ebi.ac.uk/ena/browser/support
User support | Support documentation | Submission, update and discovery guidelines and FAQs | https://ena-docs.readthedocs.io/en/latest/
Data submission | Submission tools | Provision of various submission tools | https://www.ebi.ac.uk/ena/browser/submit
Data access | ENA Browser | Provision of various search tools | https://www.ebi.ac.uk/ena/browser/search

During the past year, we have supported substantial data growth and delivered major new components. The Webin framework has continued to provide for ENA’s deposition services, with a few recently applied changes towards simplification and streamlining on both the submitter and ENA Helpdesk support sides. While metadata registration services (studies and samples) are still supported by interactive and programmatic Webin, a Command Line Interface (Webin-CLI) introduced in 2018 (3) has become ENA’s primary submission tool for genomes and transcriptomes, and also supports reads and annotated sequences (https://ena-docs.readthedocs.io/en/latest/submit/general-guide.html). Webin-CLI is provided in the form of a standalone executable JAR file, which can be downloaded from https://github.com/enasequence/webin-cli/releases and run from a UNIX terminal or Windows command prompt. Its major advantage is that it supports pre-submission validation. The Webin submission interfaces have provided support to several thousand active data submitters from numerous countries over the last year, covering 419 490 direct submissions to the ENA in 12 months, comprising 5700 studies, around 620 000 samples, 493 000 runs and 197 000 (meta)genome assemblies. Figure 1 shows the data growth of total content in ENA, which includes the extensive data exchange with the INSDC partners.
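As a rough illustration of the validate-then-submit workflow described above, the following Python sketch assembles a Webin-CLI command line without running it. The option names (-context, -manifest, -userName, -password, -validate, -submit) are taken from the Webin-CLI documentation as we understand it and should be checked against the release in use; the JAR file name, manifest path and account name are hypothetical placeholders.

```python
import shlex


def webin_cli_args(jar, context, manifest, user, password, submit=False):
    """Return the argument list for one Webin-CLI run; nothing is executed here."""
    args = [
        "java", "-jar", jar,
        "-context", context,    # assumed values: e.g. "genome", "transcriptome", "reads"
        "-manifest", manifest,  # manifest file describing the submission
        "-userName", user,
        "-password", password,
    ]
    # Validate first; switch to -submit only once validation has passed.
    args.append("-submit" if submit else "-validate")
    return args


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    cmd = webin_cli_args("webin-cli.jar", "genome", "manifest.txt",
                         "Webin-00000", "<password>")
    print(shlex.join(cmd))
```

The resulting list could be passed to subprocess.run once the placeholders have been replaced with real values.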
Figure 1.

Data growth of total content, by assembled/annotated sequences and reads.

SELECTED DEVELOPMENTS IN 2019

New data types

The ENA has continued to adapt rapidly and in an agile way to emerging community requirements. We have added support for new sequencing platforms and experiment types and built services around diverse new analysis data types, including 10x reads (https://www.10xgenomics.com/) and metagenome assemblies. In recent years, the ENA has concentrated this extensibility in its analysis objects. Examples of new analysis types in the last year include the new assembly classes and taxonomic reference data detailed below.

Assemblies

In response to the growth of metagenomics, ENA introduced new analysis classes for primary metagenome, binned metagenome, metagenome-assembled genome (MAG) and single-cell amplified genome, and has implemented accompanying community metadata standards (4). The new analysis types offer an opportunity to explore a new generation of assembly submission and storage. This is achieved by separating high-volume primary and binned metagenomes, which are difficult to handle in traditional flat files, from other assembly types such as MAGs or isolate genome assemblies. The new and separate analysis types also allow better indexing of the different data groups, enabling improved search and presentation of the data. To ease support for our (meta)genomic assembly submitters, we have added comprehensive documentation describing the new assembly model (https://ena-docs.readthedocs.io/en/latest/submit/assembly.html; https://ena-docs.readthedocs.io/en/latest/submit/assembly/metagenome.html). Figure 2 shows the cumulative number of assemblies submitted to ENA by type.
Figure 2.

Cumulative number of assemblies submitted to ENA classed by type.

Taxonomic reference datasets

In environmental sequencing (e.g. metabarcoding), there is a need to map unknown sequences to taxonomically classified and curated reference sequences. Sets of these reference sequences are typically derived from ENA sequences that are cleaned up (e.g. trimmed and contamination-screened) and mapped to improve and correct taxonomy. Reference datasets are produced by groups (e.g. SILVA: https://www.arb-silva.de/; ITSoneDB: http://itsonedb.cloud.ba.infn.it/; UNITE: https://unite.ut.ee/) (5-7) that consume ENA, add value through such curation processes and make their data available to tool and service providers. With this new analysis class, we support this community data flow.

Standards

The ENA has continued working with communities to develop and deploy data standards, with a main focus on metagenomics this year. In collaboration with the Genomic Standards Consortium (https://press3.mcs.anl.gov/gensc/), we have deployed three new sample checklists that can be found under the ‘Environmental checklists’ group in Webin: MIMAGs, for metagenome-assembled genomes (4); MISAGs, for environmental single-cell amplified genomes (4); and MIUVIG, for environmental/uncultivated virus genomes (8). In addition to the above, we have also deployed ENA binned metagenome sample checklists to support all levels of assemblies derived from a biome, with corresponding documentation (https://ena-docs.readthedocs.io/en/latest/faq/metagenomes.html; https://ena-docs.readthedocs.io/en/latest/submit/assembly/metagenome.html). With these, the ENA offers 21 environmental sample checklists, from which submitters select according to the biome from which the sequenced sample is derived. Furthermore, there are four checklists for marine samples, seven pathogen-related sample metadata checklists and a number of project-specific checklists, for example one for patient-derived xenograft models or patient samples and one developed for the Global Microbial Identifier Proficiency Test (https://www.globalmicrobialidentifier.org/workgroups/about-the-gmi-proficiency-tests). The complete list of the ENA checklists, including the required fields for each checklist, can be browsed in the new ENA Browser (https://www.ebi.ac.uk/ena/browser/checklists).

Data coordination services

We have continued to provide specific data coordination support for our collaborating partners in projects and initiatives across a broad range of scientific areas, expanding our portfolio of collaborations over the last year. Working closely with our partners, we provide support in data sharing, analysis, archiving, search and presentation services, often through dedicated search and discovery portal application program interfaces (APIs) and/or graphical user interfaces. This service is valuable to all ENA end users because of its direct link to setting standards and improving the quality and richness of content. The expansion of the analysis types and addition of standards support for metagenomic assembly data described above (see the ‘New Data Types’ and ‘Standards’ sections) have resulted from a data coordination service; this work improves search and discoverability of all assembly types, and in particular metagenomic assemblies, which are growing in number. The ENA Rulespace (https://www.ebi.ac.uk/ena/browser/rulespace) is a further example of a service that has been developed to provide improved search and synchronization tools for ENA. Its development was driven specifically by the need to serve custom views of eukaryote diversity-related content. The Rulespace service enables the creation and management of user-defined rules, and metadata relating to these rules, that can be shared with other interested parties and that are used to define searches on services such as the ENA Discovery API (see also under the ‘ENA Browser’ section).
Our current portfolio includes partners from pathogen surveillance and outbreak genomics using the COMPARE data hub system (9) and the Pathogen Portal (https://www.ebi.ac.uk/ena/pathogens/home), livestock functional genomics under the FAANG collaboration (10), metagenomics communities through the Metagenome Exchange (https://www.ebi.ac.uk/ena/registry/metagenome/api/) and MGnify (11) projects, stem cell data through HipSci (12), marine projects such as Tara Oceans (https://www.ebi.ac.uk/ena/about/tara-oceans-assemblies) (13) and Ocean Sampling Day (14) and microbial eukaryote biodiversity projects such as UniEuk (15). A list of our current collaborations and their descriptions can be found at https://www.ebi.ac.uk/ena/browser/about/data_coordination.

The new ENA Browser

A particular focus for the year has been the development of the new ENA Browser (https://www.ebi.ac.uk/ena/browser/home). It features a completely new, modern technology stack (Angular: https://angular.io/; Material: https://material.angular.io/; MongoDB: https://www.mongodb.com/; Vertica: https://www.vertica.com/; Oracle: https://www.oracle.com/; Spring Boot: https://spring.io/projects/spring-boot), a move to microservices for improved maintainability, a complete review and modernization of all previous browser features, a streamlined and simplified user experience and the addition of key new features that improve data discovery and access. The streamlined design focuses each data view on the most important information for the user; this has the potential to improve the user experience, make navigation more intuitive and promote easy access to the underlying data. For example, the new homepage features quick access buttons to key site sections, a redesigned tab and page structure, and search boxes for both direct access by accession and free text search (Figure 3).
Figure 3.

The new ENA Browser, showing its streamlined landing page.

Search has been overhauled in the new browser with improvements to existing search interfaces and addition of new features. We offer five distinct search interfaces: free text search (simple keyword search), sequence similarity search (BLAST search), sequence version archive search (find non-current sequence versions), cross-reference search (search our extensive array of cross-references and extended annotations from an increasing number of external databases and resources) and a new advanced search service. Advanced search enables the guided construction of complex queries using a range of predefined filters, combined with autocompletion assistance for many fields (Figure 4), with the interface constructing the query language on the user’s behalf. Users can refine the results output using inclusion and exclusion by accession. As the query language is the same as for our API interfaces, the browser features a ‘copy to cURL command’ button so that a query constructed in the browser can easily be used programmatically. cURL is a widely used command line tool for web address-based API interactions, such as those with the ENA APIs, and allows for the transfer of data and files. For example, an advanced search query for all human raw reads in ENA, copied to a cURL command to run the same search programmatically, reads: ‘curl -X POST -H "Content-Type: application/x-www-form-urlencoded" -d "result=read_run&query=tax_eq(9606)&format=tsv" https://www.ebi.ac.uk/ena/portal/api/search’.
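For readers who prefer a scripting language to cURL, the same POST request can be sketched in Python using only the standard library. The endpoint and the form fields (result, query, format) are those of the cURL command above; the function names are our own, and the sketch separates building the request from sending it so that the request can be inspected first.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

PORTAL_SEARCH_URL = "https://www.ebi.ac.uk/ena/portal/api/search"


def build_search_request(result, query, fmt="tsv"):
    """Build (but do not send) a form-encoded POST request for the search endpoint."""
    body = urlencode({"result": result, "query": query, "format": fmt})
    return Request(
        PORTAL_SEARCH_URL,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def run_search(request, timeout=60):
    """Send the request and return the response body as text (needs network access)."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    req = build_search_request("read_run", "tax_eq(9606)")
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
    # run_search(req) would perform the query; a search for all human reads
    # is expected to return a very large result.
```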
Figure 4.

Advanced search query interface for constructing complex searches, for example geographical boundaries.

Rulespace is a completely new feature of the ENA Browser that allows users to save advanced search queries to their own account, to re-run the same query as new data emerge and to share the query with collaborators to enable work on identical datasets. An authenticated management interface (Figure 5) enables a user to edit, run or share any previously saved rule queries, each of which has a user-provided title and description to aid identification. The queries are particularly powerful when the ‘Last updated’ field is included, as it allows users to return to Rulespace repeatedly to obtain the records updated since a given date, for example the last time they ran the query. This service is particularly useful for consortia and projects that wish to generate a saved custom ENA advanced search and distribute it to all of their members. Rulespace can also be managed programmatically through its API (https://www.ebi.ac.uk/ena/rulespace/api/), which enables creation, management and exploitation of the saved custom queries.
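The incremental pattern described above, re-running a saved query for records updated since the previous run, can be sketched as a small query-building helper. This is an illustration only: the field name last_updated, the AND operator, the >= comparison and the use of parentheses for grouping are assumptions about the advanced search query syntax and should be checked against the query that the browser itself constructs.

```python
from datetime import date


def updated_since(base_query, last_run):
    """Restrict a saved query to records updated on or after the date of the last run.

    The field name 'last_updated' and the AND / >= syntax are assumptions.
    """
    # Parentheses keep any OR inside the saved query from binding to the date filter.
    return f"({base_query}) AND last_updated>={last_run.isoformat()}"


if __name__ == "__main__":
    saved_rule = "tax_eq(9606)"                    # query from the example above
    print(updated_since(saved_rule, date(2019, 6, 1)))
```

Storing the date of each run alongside the saved rule would let a consortium member fetch only the records that changed since their previous retrieval.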
Figure 5.

Rulespace interface for managing saved user advanced queries.

The new browser sits upon a new public ENA Browser API (https://www.ebi.ac.uk/ena/browser/api/) that was released early in 2019 and serves direct programmatic access. It provides a significant improvement in stability and performance over the previous programmatic data access, which was integrated with the old browser and thus subject to file system performance bottlenecks. The Browser API is focused on fast sequence retrieval by accession, and works in tandem with the ENA Portal (Discovery) API (https://www.ebi.ac.uk/ena/portal/api/), which supports powerful search across metadata fields. Together, these APIs provide an integrated programmatic search and retrieval service. Each of the APIs has a Swagger interface to assist with query construction, configurable outputs and pre-publication authenticated data access. A Swagger user interface (https://swagger.io) helps users consume our new APIs by providing easily navigable documentation and a clear overview of the available endpoints and, by enabling test queries, assisting with the design of commands to consume our data and services. With the new MongoDB-backed deployment of the APIs, we have significant flexibility for future scalability, with an easily adaptable database schema design and the option of sharding over an increasing number of machines. This allows us to respond more easily to future changes in metadata, data types and technologies, and to distribute ever more complex queries from an increasing user base over a scalable system.
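The division of labour between the two APIs, search in the Portal API followed by retrieval by accession from the Browser API, can be sketched as follows. The sketch assumes that a tab-separated search result carries an ‘accession’ column (the column name depends on the result type requested) and that records are retrieved from a path of the form /fasta/{accession} under the Browser API base URL, as listed in its Swagger interface; the accessions in the sample are hypothetical.

```python
import csv
import io

BROWSER_API = "https://www.ebi.ac.uk/ena/browser/api"


def accessions_from_tsv(tsv_text, column="accession"):
    """Extract one column of accessions from a tab-separated search result."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row[column] for row in reader if row.get(column)]


def retrieval_url(accession, fmt="fasta"):
    """Build the retrieval URL for one record (path layout is an assumption)."""
    return f"{BROWSER_API}/{fmt}/{accession}"


if __name__ == "__main__":
    # Hypothetical search output; real results come from the Portal API.
    sample = "accession\tdescription\nAB000001\texample one\nAB000002\texample two\n"
    for acc in accessions_from_tsv(sample):
        print(retrieval_url(acc))
```

Keeping search and retrieval as separate steps mirrors the design described above: metadata queries go to the search-oriented service, while bulk sequence download is served by the endpoint optimized for access by accession.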
  15 in total

1.  Functional Annotation of Animal Genomes (FAANG): Current Achievements and Roadmap.

Authors:  Elisabetta Giuffra; Christopher K Tuggle
Journal:  Annu Rev Anim Biosci       Date:  2018-11-14       Impact factor: 8.923

2.  The ocean sampling day consortium.

Authors:  Anna Kopf; Mesude Bicak; Renzo Kottmann; Julia Schnetzer; Ivaylo Kostadinov; Katja Lehmann; Antonio Fernandez-Guerra; Christian Jeanthon; Eyal Rahav; Matthias Ullrich; Antje Wichels; Gunnar Gerdts; Paraskevi Polymenakou; Giorgos Kotoulas; Rania Siam; Rehab Z Abdallah; Eva C Sonnenschein; Thierry Cariou; Fergal O'Gara; Stephen Jackson; Sandi Orlic; Michael Steinke; Julia Busch; Bernardo Duarte; Isabel Caçador; João Canning-Clode; Oleksandra Bobrova; Viggo Marteinsson; Eyjolfur Reynisson; Clara Magalhães Loureiro; Gian Marco Luna; Grazia Marina Quero; Carolin R Löscher; Anke Kremp; Marie E DeLorenzo; Lise Øvreås; Jennifer Tolman; Julie LaRoche; Antonella Penna; Marc Frischer; Timothy Davis; Barker Katherine; Christopher P Meyer; Sandra Ramos; Catarina Magalhães; Florence Jude-Lemeilleur; Ma Leopoldina Aguirre-Macedo; Shiao Wang; Nicole Poulton; Scott Jones; Rachel Collin; Jed A Fuhrman; Pascal Conan; Cecilia Alonso; Noga Stambler; Kelly Goodwin; Michael M Yakimov; Federico Baltar; Levente Bodrossy; Jodie Van De Kamp; Dion Mf Frampton; Martin Ostrowski; Paul Van Ruth; Paul Malthouse; Simon Claus; Klaas Deneudt; Jonas Mortelmans; Sophie Pitois; David Wallom; Ian Salter; Rodrigo Costa; Declan C Schroeder; Mahrous M Kandil; Valentina Amaral; Florencia Biancalana; Rafael Santana; Maria Luiza Pedrotti; Takashi Yoshida; Hiroyuki Ogata; Tim Ingleton; Kate Munnik; Naiara Rodriguez-Ezpeleta; Veronique Berteaux-Lecellier; Patricia Wecker; Ibon Cancio; Daniel Vaulot; Christina Bienhold; Hassan Ghazal; Bouchra Chaouni; Soumya Essayeh; Sara Ettamimi; El Houcine Zaid; Noureddine Boukhatem; Abderrahim Bouali; Rajaa Chahboune; Said Barrijal; Mohammed Timinouni; Fatima El Otmani; Mohamed Bennani; Marianna Mea; Nadezhda Todorova; Ventzislav Karamfilov; Petra Ten Hoopen; Guy Cochrane; Stephane L'Haridon; Kemal Can Bizsel; Alessandro Vezzi; Federico M Lauro; Patrick Martin; Rachelle M Jensen; Jamie Hinks; Susan Gebbels; Riccardo Rosselli; Fabio De Pascale; Riccardo Schiavon; 
Antonina Dos Santos; Emilie Villar; Stéphane Pesant; Bruno Cataletto; Francesca Malfatti; Ranjith Edirisinghe; Jorge A Herrera Silveira; Michele Barbier; Valentina Turk; Tinkara Tinta; Wayne J Fuller; Ilkay Salihoglu; Nedime Serakinci; Mahmut Cerkez Ergoren; Eileen Bresnan; Juan Iriberri; Paul Anders Fronth Nyhus; Edvardsen Bente; Hans Erik Karlsen; Peter N Golyshin; Josep M Gasol; Snejana Moncheva; Nina Dzhembekova; Zackary Johnson; Christopher David Sinigalliano; Maribeth Louise Gidley; Adriana Zingone; Roberto Danovaro; George Tsiamis; Melody S Clark; Ana Cristina Costa; Monia El Bour; Ana M Martins; R Eric Collins; Anne-Lise Ducluzeau; Jonathan Martinez; Mark J Costello; Linda A Amaral-Zettler; Jack A Gilbert; Neil Davies; Dawn Field; Frank Oliver Glöckner
Journal:  Gigascience       Date:  2015-06-19       Impact factor: 6.524

3.  ITSoneDB: a comprehensive collection of eukaryotic ribosomal RNA Internal Transcribed Spacer 1 (ITS1) sequences.

Authors:  Monica Santamaria; Bruno Fosso; Flavio Licciulli; Bachir Balech; Ilaria Larini; Giorgio Grillo; Giorgio De Caro; Sabino Liuni; Graziano Pesole
Journal:  Nucleic Acids Res       Date:  2018-01-04       Impact factor: 16.971

4.  Common genetic variation drives molecular heterogeneity in human iPSCs.

Authors:  Helena Kilpinen; Angela Goncalves; Andreas Leha; Vackar Afzal; Kaur Alasoo; Sofie Ashford; Sendu Bala; Dalila Bensaddek; Francesco Paolo Casale; Oliver J Culley; Petr Danecek; Adam Faulconbridge; Peter W Harrison; Annie Kathuria; Davis McCarthy; Shane A McCarthy; Ruta Meleckyte; Yasin Memari; Nathalie Moens; Filipa Soares; Alice Mann; Ian Streeter; Chukwuma A Agu; Alex Alderton; Rachel Nelson; Sarah Harper; Minal Patel; Alistair White; Sharad R Patel; Laura Clarke; Reena Halai; Christopher M Kirton; Anja Kolb-Kokocinski; Philip Beales; Ewan Birney; Davide Danovi; Angus I Lamond; Willem H Ouwehand; Ludovic Vallier; Fiona M Watt; Richard Durbin; Oliver Stegle; Daniel J Gaffney
Journal:  Nature       Date:  2017-05-10       Impact factor: 49.962

Review 5.  UniEuk: Time to Speak a Common Language in Protistology!

Authors:  Cédric Berney; Andreea Ciuprina; Sara Bender; Juliet Brodie; Virginia Edgcomb; Eunsoo Kim; Jeena Rajan; Laura Wegener Parfrey; Sina Adl; Stéphane Audic; David Bass; David A Caron; Guy Cochrane; Lucas Czech; Micah Dunthorn; Stefan Geisen; Frank Oliver Glöckner; Frédéric Mahé; Christian Quast; Jonathan Z Kaye; Alastair G B Simpson; Alexandros Stamatakis; Javier Del Campo; Pelin Yilmaz; Colomban de Vargas
Journal:  J Eukaryot Microbiol       Date:  2017-04-21       Impact factor: 3.346

6.  Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea.

Authors:  Robert M Bowers; Nikos C Kyrpides; Ramunas Stepanauskas; Miranda Harmon-Smith; Devin Doud; T B K Reddy; Frederik Schulz; Jessica Jarett; Adam R Rivers; Emiley A Eloe-Fadrosh; Susannah G Tringe; Natalia N Ivanova; Alex Copeland; Alicia Clum; Eric D Becraft; Rex R Malmstrom; Bruce Birren; Mircea Podar; Peer Bork; George M Weinstock; George M Garrity; Jeremy A Dodsworth; Shibu Yooseph; Granger Sutton; Frank O Glöckner; Jack A Gilbert; William C Nelson; Steven J Hallam; Sean P Jungbluth; Thijs J G Ettema; Scott Tighe; Konstantinos T Konstantinidis; Wen-Tso Liu; Brett J Baker; Thomas Rattei; Jonathan A Eisen; Brian Hedlund; Katherine D McMahon; Noah Fierer; Rob Knight; Rob Finn; Guy Cochrane; Ilene Karsch-Mizrachi; Gene W Tyson; Christian Rinke; Alla Lapidus; Folker Meyer; Pelin Yilmaz; Donovan H Parks; A M Eren; Lynn Schriml; Jillian F Banfield; Philip Hugenholtz; Tanja Woyke
Journal:  Nat Biotechnol       Date:  2017-08-08       Impact factor: 54.908

7.  The European Nucleotide Archive in 2018.

Authors:  Peter W Harrison; Blaise Alako; Clara Amid; Ana Cerdeño-Tárraga; Iain Cleland; Sam Holt; Abdulrahman Hussein; Suran Jayathilaka; Simon Kay; Thomas Keane; Rasko Leinonen; Xin Liu; Josué Martínez-Villacorta; Annalisa Milano; Nima Pakseresht; Jeena Rajan; Kethi Reddy; Edward Richards; Marc Rosello; Nicole Silvester; Dmitriy Smirnov; Ana-Luisa Toribio; Senthilnathan Vijayaraja; Guy Cochrane
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 16.971

8.  Minimum Information about an Uncultivated Virus Genome (MIUViG).

Authors:  Simon Roux; Evelien M Adriaenssens; Bas E Dutilh; Eugene V Koonin; Andrew M Kropinski; Mart Krupovic; Jens H Kuhn; Rob Lavigne; J Rodney Brister; Arvind Varsani; Clara Amid; Ramy K Aziz; Seth R Bordenstein; Peer Bork; Mya Breitbart; Guy R Cochrane; Rebecca A Daly; Christelle Desnues; Melissa B Duhaime; Joanne B Emerson; François Enault; Jed A Fuhrman; Pascal Hingamp; Philip Hugenholtz; Bonnie L Hurwitz; Natalia N Ivanova; Jessica M Labonté; Kyung-Bum Lee; Rex R Malmstrom; Manuel Martinez-Garcia; Ilene Karsch Mizrachi; Hiroyuki Ogata; David Páez-Espino; Marie-Agnès Petit; Catherine Putonti; Thomas Rattei; Alejandro Reyes; Francisco Rodriguez-Valera; Karyna Rosario; Lynn Schriml; Frederik Schulz; Grieg F Steward; Matthew B Sullivan; Shinichi Sunagawa; Curtis A Suttle; Ben Temperton; Susannah G Tringe; Rebecca Vega Thurber; Nicole S Webster; Katrine L Whiteson; Steven W Wilhelm; K Eric Wommack; Tanja Woyke; Kelly C Wrighton; Pelin Yilmaz; Takashi Yoshida; Mark J Young; Natalya Yutin; Lisa Zeigler Allen; Nikos C Kyrpides; Emiley A Eloe-Fadrosh
Journal:  Nat Biotechnol       Date:  2018-12-17       Impact factor: 54.908

9.  The SILVA and "All-species Living Tree Project (LTP)" taxonomic frameworks.

Authors:  Pelin Yilmaz; Laura Wegener Parfrey; Pablo Yarza; Jan Gerken; Elmar Pruesse; Christian Quast; Timmy Schweer; Jörg Peplies; Wolfgang Ludwig; Frank Oliver Glöckner
Journal:  Nucleic Acids Res       Date:  2013-11-28       Impact factor: 16.971

10.  EBI Metagenomics in 2017: enriching the analysis of microbial communities, from sequence reads to assemblies.

Authors:  Alex L Mitchell; Maxim Scheremetjew; Hubert Denise; Simon Potter; Aleksandra Tarkowska; Matloob Qureshi; Gustavo A Salazar; Sebastien Pesseat; Miguel A Boland; Fiona M I Hunter; Petra Ten Hoopen; Blaise Alako; Clara Amid; Darren J Wilkinson; Thomas P Curtis; Guy Cochrane; Robert D Finn
Journal:  Nucleic Acids Res       Date:  2018-01-04       Impact factor: 16.971

