The European Nucleotide Archive in 2019.

Clara Amid¹, Blaise T F Alako¹, Vishnukumar Balavenkataraman Kadhirvelu¹, Tony Burdett¹, Josephine Burgin¹, Jun Fan¹, Peter W Harrison¹, Sam Holt¹, Abdulrahman Hussein¹, Eugene Ivanov¹, Suran Jayathilaka¹, Simon Kay¹, Thomas Keane¹, Rasko Leinonen¹, Xin Liu¹, Josue Martinez-Villacorta¹, Annalisa Milano¹, Amir Pakseresht¹, Nadim Rahman¹, Jeena Rajan¹, Kethi Reddy¹, Edward Richards¹, Dmitriy Smirnov¹, Alexey Sokolov¹, Senthilnathan Vijayaraja¹, Guy Cochrane¹.

Abstract

The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.

Year: 2020 PMID: 31722421 PMCID: PMC7145635 DOI: 10.1093/nar/gkz1063

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

For the last 37 years, since the European Molecular Biology Laboratory (EMBL) launched the first EMBL nucleotide sequence database library, major advances in sequencing and archiving technologies have led to a broad range of nucleotide sequences that build the content of today’s European Nucleotide Archive (ENA). The spectrum extends from raw reads to assembled and annotated sequences and related data types. Having a broad user profile, the ENA offers both general support for the world’s sequence data operations and specific thematic collaborative data coordination (see the ‘Data Coordination Services’ section). As a founding partner in the International Nucleotide Sequence Database Collaboration (INSDC, www.insdc.org) (1), ENA represents a globally comprehensive nucleotide data resource, contributes towards data standards and moves forward with technological advances in sequencing. As an ELIXIR (https://elixir-europe.org/) Core Data Resource (https://elixir-europe.org/platforms/data/core-data-resources), the ENA has a mission to contribute towards the FAIR guiding principles for data management and discovery (2). This mission is achieved by various means: public data stored in the ENA are ‘findable’ through various search tools covering both programmatic and interactive options to provide maximum flexibility for ENA users. Public data are also ‘accessible’ both directly through the ENA and globally through the INSDC exchange. ‘Interoperability’ is provided through structured data and metadata formats that are validated at the time of reporting. Finally, ‘reusability’ is supported through promotion of data sharing and clear terms of use (https://www.ebi.ac.uk/about/terms-of-use). An important tool in assisting users with FAIR compliance for their datasets, the ENA reaches high levels of compliance for most of its content and strives to improve its services further for greater compliance and user value.
Throughout 2019, we have continued to provide services to our user base and have developed in selected key areas. In this article, we focus on data submissions, the introduction of new data classes and metadata standards, the ENA’s expanded data coordination portfolio linked with these services and last, but not least, we highlight the new ENA Browser as one of the year’s significant new offerings.

ENA CONTENT AND DEPOSITION SERVICES

In 2019, we have continued to operate our open services for user support, submissions, archiving, presentation and discovery of nucleotide sequence data. Table 1 lists ENA services and their entry points.

Table 1.

ENA services and the respective entry points

Services	Service entry points	Purpose of service	Link to service
User support	Support form	Contact and feedback to Helpdesk	https://www.ebi.ac.uk/ena/browser/support
	Support documentation	Submission, update and discovery guidelines and FAQs	https://ena-docs.readthedocs.io/en/latest/
Data submission	Submission tools	Provision of various submission tools	https://www.ebi.ac.uk/ena/browser/submit
Data access	ENA Browser	Provision of various search tools	https://www.ebi.ac.uk/ena/browser/search

During the past year, we have supported substantial data growth and delivered major new components. The Webin framework has continued to underpin ENA’s deposition services, with a few recent changes that simplify and streamline the process for both submitters and the ENA Helpdesk. While metadata registration services (studies and samples) are still supported by interactive and programmatic Webin, the Command Line Interface (Webin-CLI) introduced in 2018 (3) has become ENA’s primary submission tool for genomes and transcriptomes, and also supports reads and annotated sequences (https://ena-docs.readthedocs.io/en/latest/submit/general-guide.html). Webin-CLI is provided as a standalone executable JAR file, which can be downloaded from https://github.com/enasequence/webin-cli/releases and run from a UNIX terminal or Windows command prompt; its major advantage is support for pre-submission validation. The Webin submission interfaces have provided support to several thousand active data submitters from numerous countries over the last year, covering 419 490 direct submissions to the ENA in 12 months, comprising 5700 studies, around 620 000 samples, 493 000 runs and 197 000 (meta)genome assemblies. Figure 1 shows the data growth of total content in ENA, which includes the extensive data exchange with the INSDC partners.
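As an illustration of the pre-submission validation step described above, the following minimal Python sketch assembles a Webin-CLI command line. The flag names (`-context`, `-manifest`, `-userName`, `-password`, `-validate`, `-submit`) are assumptions based on commonly documented Webin-CLI usage and should be checked against the help output of the release in use; the JAR file name, manifest path and account name are hypothetical placeholders.

```python
import shlex
import subprocess


def build_webin_cli_command(jar_path, context, manifest, username, password,
                            submit=False):
    """Return the argument list for one Webin-CLI run.

    Flag names are assumptions; verify them against the Webin-CLI
    release you have downloaded. With submit=False only validation is
    requested, so nothing is sent for archiving.
    """
    command = [
        "java", "-jar", jar_path,
        "-context", context,      # e.g. "genome" or "reads"
        "-manifest", manifest,    # manifest file describing the submission
        "-userName", username,
        "-password", password,
        "-validate",              # pre-submission validation
    ]
    if submit:
        command.append("-submit")
    return command


def run_webin_cli(command):
    """Run the command and return its exit code (requires Java and the JAR)."""
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Hypothetical placeholders; nothing is executed here, the command is only printed.
    cmd = build_webin_cli_command("webin-cli.jar", "genome", "manifest.txt",
                                  "Webin-00000", "********")
    print(shlex.join(cmd))
```

Validating first and adding the submit flag only once validation passes mirrors the advantage highlighted in the text: errors are caught locally before any data are sent.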

Figure 1.

Data growth of total content, by assembled/annotated sequences and reads.

SELECTED DEVELOPMENTS IN 2019

New data types

The ENA has continued to adapt rapidly and in an agile way to emerging community requirements. We have added support for new sequencing platforms and experiment types and built services around diverse new analysis data types, including 10x reads (https://www.10xgenomics.com/) and metagenome assemblies. In recent years, ENA has concentrated its extensibility in its analysis objects. Examples of analysis types added in the last year include the new assembly classes and taxonomic reference data detailed below.

Assemblies

In response to a growing metagenomics world, ENA introduced new analysis classes for primary metagenome, binned metagenome, metagenome-assembled genome (MAG) and single-cell amplified genome, and has implemented accompanying community metadata standards (4). The new analysis types offer an opportunity to explore a new generation of assembly submission and storage. This is achieved by separating high-volume primary and binned metagenomes, which are difficult to handle in traditional flat files, from other assemblies such as MAGs or isolate genome assemblies. The new and separate analysis types also allow better indexing of the different data groups, enabling improved search and presentation of the data. To ease support for our (meta)genomic assembly submitters, we have added comprehensive documentation describing the new assembly model (https://ena-docs.readthedocs.io/en/latest/submit/assembly.html; https://ena-docs.readthedocs.io/en/latest/submit/assembly/metagenome.html). Figure 2 shows the cumulative number of assemblies submitted to ENA by type.

Figure 2.

Cumulative number of assemblies submitted to ENA classed by type.

Taxonomic reference datasets

In environmental sequencing (e.g. metabarcoding), there is a need to map unknown sequences to taxonomically classified and curated reference sequences. Sets of these reference sequences are typically derived from ENA sequences that are cleaned up (e.g. trimmed and contamination-screened) and mapped to improve and correct taxonomy. Reference datasets are produced by groups (e.g. SILVA: https://www.arb-silva.de/; ITSoneDB: http://itsonedb.cloud.ba.infn.it/; UNITE: https://unite.ut.ee/) (5-7) that consume ENA, add value through such curation processes and make their data available to tool and service providers. With this new analysis class, we support this community data flow.

Standards

The ENA has continued working with communities to develop and deploy data standards, with a main focus on metagenomics this year. In collaboration with the Genomic Standards Consortium (https://press3.mcs.anl.gov/gensc/), we have deployed three new sample checklists that can be found under the ‘Environmental checklists’ group in Webin: MIMAGs, for metagenome-assembled genomes (4); MISAGs, for environmental single-cell amplified genomes (4); MIUVIG, for environmental/uncultivated virus genomes (8). In addition to the above, we have also deployed ENA binned metagenome sample checklists to support all levels of assemblies derived from a biome, with corresponding documentation (https://ena-docs.readthedocs.io/en/latest/faq/metagenomes.html; https://ena-docs.readthedocs.io/en/latest/submit/assembly/metagenome.html). With these, the ENA offers 21 environmental sample checklists, from which submitters select according to the biome from which the sequenced sample is derived. Furthermore, there are four checklists for marine samples, seven pathogen-related sample metadata checklists and a number of project-specific checklists, for example one for patient-derived xenograft models or patient samples and one developed for the Global Microbial Identifier Proficiency Test (https://www.globalmicrobialidentifier.org/workgroups/about-the-gmi-proficiency-tests). The complete list of ENA checklists, including the required fields for each checklist, can be browsed in the new ENA Browser (https://www.ebi.ac.uk/ena/browser/checklists).

Data coordination services

We have continued to provide specific data coordination support for our collaborating partners in projects and initiatives across a broad range of scientific areas, expanding our portfolio of collaborations over the last year. Working closely with our partners, we provide support in data sharing, analysis, archiving, search and presentation services, often through dedicated search and discovery portal application program interfaces (APIs) and/or graphical user interfaces. This service is extremely valuable to all ENA end users because of its direct link to setting standards and improving the quality and richness of content. The expansion of the analysis types and addition of standards support for metagenomic assembly data described above (see the ‘New Data Types’ and ‘Standards’ sections) have resulted from a data coordination service; this work improves search and discoverability of all assembly types, and in particular metagenomic assemblies that are growing in number. The ENA Rulespace (https://www.ebi.ac.uk/ena/browser/rulespace) is a further example of a service developed to provide improved search and synchronization tools for ENA. This service was driven specifically by the need to serve custom views of eukaryote diversity-related content. The Rulespace service enables the creation and management of user-defined rules, and metadata relating to these rules, that can be shared with other interested parties and that are used to define searches on services such as the ENA Discovery API (see also under the ‘ENA Browser’ section).
Our current portfolio includes partners from pathogen surveillance and outbreak genomics using the COMPARE data hub system (9) and the Pathogen Portal (https://www.ebi.ac.uk/ena/pathogens/home), livestock functional genomics under the FAANG collaboration (10), metagenomics communities through the Metagenome Exchange (https://www.ebi.ac.uk/ena/registry/metagenome/api/) and MGnify (11) projects, stem cell data through HipSci (12), marine projects such as Tara Oceans (https://www.ebi.ac.uk/ena/about/tara-oceans-assemblies) (13) and Ocean Sampling Day (14) and microbial eukaryote biodiversity projects such as UniEuk (15). A list of our current collaborations and their descriptions can be found at https://www.ebi.ac.uk/ena/browser/about/data_coordination.

The new ENA Browser

A particular focus for the year has been the development of the new ENA Browser (https://www.ebi.ac.uk/ena/browser/home). This features a completely new modern technology stack (Angular: https://angular.io/; Material: https://material.angular.io/; MongoDB: https://www.mongodb.com/; Vertica: https://www.vertica.com/; Oracle: https://www.oracle.com/; Spring Boot: https://spring.io/projects/spring-boot), a move to microservices for improved maintainability, a complete review and modernization of all previous browser features, a streamlined and simplified user experience and the addition of key new features that improve data discovery and access. The streamlined design focuses each data view on the most important information for the user; this has potential to boost the user experience, make navigation more intuitive and promote easy access to the underlying data. For example, the new homepage features quick access buttons to key site sections, a redesigned tab and page structure, and both a direct accession access and free text search boxes (Figure 3).

Figure 3.

The new ENA Browser, showing its streamlined landing page.

Search has been overhauled in the new browser with improvements to existing search interfaces and addition of new features. We offer five distinct search interfaces: free text search (simple keyword search), sequence similarity search (BLAST search), sequence version archive search (find non-current sequence versions), cross-reference search (search our extensive array of cross-references and extended annotations from an increasing number of external databases and resources) and a new advanced search service. Advanced search enables the guided construction of complex queries using a range of predefined filters, combined with autocompletion assistance for many fields (Figure 4), with the interface constructing the query language on the user’s behalf. Users can refine the results output using inclusion and exclusion by accession. As the query language is the same as for our APIs, the browser features a ‘copy to cURL command’ button so that a query constructed in the browser can easily be used programmatically. cURL is a widely used command line tool for web address-based API interactions, such as those with the ENA APIs, and allows for the transfer of data and files. For example, an advanced search query for all human raw reads in ENA, copied to a cURL command to run the same search programmatically, reads: ‘curl -X POST -H "Content-Type: application/x-www-form-urlencoded" -d "result=read_run&query=tax_eq(9606)&format=tsv" https://www.ebi.ac.uk/ena/portal/api/search’.
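The cURL command above can be reproduced in other environments. The following Python sketch, which uses only the standard library, builds the same form-encoded POST request; the endpoint, parameter names and query are taken from the cURL example, while the function names are illustrative and not part of any ENA client library.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

PORTAL_SEARCH_URL = "https://www.ebi.ac.uk/ena/portal/api/search"


def build_search_request(result, query, fmt="tsv"):
    """Build a POST request equivalent to the cURL example in the text."""
    body = urlencode({"result": result, "query": query, "format": fmt})
    return Request(
        PORTAL_SEARCH_URL,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def run_search(result, query, fmt="tsv", timeout=60):
    """Send the request and return the response body as text (needs network)."""
    request = build_search_request(result, query, fmt)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only the request is built and printed here; no network call is made.
    req = build_search_request("read_run", "tax_eq(9606)")
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

Because a query for all human raw reads can return a very large table, `run_search` is best tried first with a narrower query.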

Figure 4.

Advanced search query interface for constructing complex searches, for example geographical boundaries.

Rulespace is a completely new feature of the ENA Browser that allows users to save advanced search queries to their own account, to re-run the same query as new data emerge and also to share the query with collaborators to enable work on identical datasets. An authenticated management interface (Figure 5) enables a user to edit, run or share any previously saved rule query, each of which has a user-provided title and description to aid identification. The queries are particularly powerful when the ‘Last updated’ field is included, as it allows users to return to Rulespace repeatedly to obtain the records updated since a given date, for example the date on which they last ran the query. This service is particularly useful for consortia and projects that wish to generate a saved custom ENA advanced search and distribute it to all of their members. Additionally, Rulespace can be managed programmatically through its API (https://www.ebi.ac.uk/ena/rulespace/api/), which enables creation, management and exploitation of the saved custom queries.
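The incremental use of the ‘Last updated’ field can be sketched as a small query-building helper. This is an illustration under stated assumptions, not the Rulespace implementation: the field name `last_updated`, the `>=` date comparison, the `AND` operator and grouping with parentheses are assumed here to be valid in the ENA query language and should be confirmed in the advanced search interface.

```python
from datetime import date


def updated_since(base_query, since, field="last_updated"):
    """Restrict a saved query to records updated on or after `since`.

    The field name and operators are assumptions about the ENA query
    language; the base query is wrapped in parentheses so that any OR
    terms inside it are not affected by the added AND clause.
    """
    return f"({base_query}) AND {field}>={since.isoformat()}"


if __name__ == "__main__":
    # Hypothetical example: human records updated since the previous run.
    last_run = date(2019, 6, 1)
    print(updated_since("tax_eq(9606)", last_run))
```

Storing the date of each run and passing it as `since` on the next one gives the "records updated since I last looked" behaviour described in the text.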

Figure 5.

Rulespace interface for managing saved user advanced queries.

The new browser sits upon a new public ENA Browser API (https://www.ebi.ac.uk/ena/browser/api/) that was released early in 2019 and serves direct programmatic access. It provides a significant improvement in stability and performance over the previous programmatic data access, which was integrated with the old browser and thus subject to file system performance bottlenecks. The Browser API focuses on fast sequence retrieval by accession and works in tandem with the ENA Portal (Discovery) API (https://www.ebi.ac.uk/ena/portal/api/), which supports powerful search across metadata fields. Together, these APIs provide an integrated programmatic search and retrieval service. Each of the APIs has a Swagger interface to assist with query construction, configurable outputs and pre-publication authenticated data access. A Swagger user interface (https://swagger.io) helps users consume our new APIs by providing easily navigable documentation and a clear overview of the available endpoints, and by enabling test queries that assist with the design of commands to consume our data and services. With the new MongoDB-backed deployment of the APIs, we have significant flexibility for future scalability, with an easily adaptable database schema design and the option of sharding over an increasing number of machines. This allows us to respond more easily to future changes in metadata, data types and technologies, and to distribute ever more complex queries from an increasing user base over a scalable system.
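The division of labour between the two APIs (search with the Portal API, then retrieve by accession with the Browser API) can be sketched as follows. The base URLs come from the text; the `fields` parameter, the `sequence` result type in the usage example, the `fasta/{accession}` path and the assumption that the first TSV column holds the accession are illustrative assumptions to be verified against the Swagger documentation of each API.

```python
from urllib.parse import quote, urlencode

BROWSER_API = "https://www.ebi.ac.uk/ena/browser/api"
PORTAL_API = "https://www.ebi.ac.uk/ena/portal/api"


def portal_search_url(result, query, fields=("accession",)):
    """URL for a metadata search that returns a TSV table (assumed parameters)."""
    params = urlencode({
        "result": result,
        "query": query,
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{PORTAL_API}/search?{params}"


def accessions_from_tsv(tsv_text):
    """Extract the first column of a TSV table, skipping the header row."""
    rows = [line for line in tsv_text.splitlines() if line.strip()]
    return [row.split("\t")[0] for row in rows[1:]]


def browser_record_url(accession, record_format="fasta"):
    """URL for retrieving one record by accession (assumed path layout)."""
    return f"{BROWSER_API}/{record_format}/{quote(accession, safe='')}"


if __name__ == "__main__":
    # Hypothetical search result standing in for a real Portal API response.
    sample = "accession\tdescription\nAB000001\texample one\nAB000002\texample two\n"
    print(portal_search_url("sequence", "tax_eq(9606)"))
    for acc in accessions_from_tsv(sample):
        print(browser_record_url(acc))
```

Keeping search and retrieval as separate steps follows the design described above: metadata queries go to the Portal API, and only the selected accessions are fetched from the Browser API.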

15 in total

1. Functional Annotation of Animal Genomes (FAANG): Current Achievements and Roadmap.

Authors: Elisabetta Giuffra; Christopher K Tuggle
Journal: Annu Rev Anim Biosci Date: 2018-11-14 Impact factor: 8.923

2. The ocean sampling day consortium.

Authors: Anna Kopf; Mesude Bicak; Renzo Kottmann; Julia Schnetzer; Ivaylo Kostadinov; Katja Lehmann; Antonio Fernandez-Guerra; Christian Jeanthon; Eyal Rahav; Matthias Ullrich; Antje Wichels; Gunnar Gerdts; Paraskevi Polymenakou; Giorgos Kotoulas; Rania Siam; Rehab Z Abdallah; Eva C Sonnenschein; Thierry Cariou; Fergal O'Gara; Stephen Jackson; Sandi Orlic; Michael Steinke; Julia Busch; Bernardo Duarte; Isabel Caçador; João Canning-Clode; Oleksandra Bobrova; Viggo Marteinsson; Eyjolfur Reynisson; Clara Magalhães Loureiro; Gian Marco Luna; Grazia Marina Quero; Carolin R Löscher; Anke Kremp; Marie E DeLorenzo; Lise Øvreås; Jennifer Tolman; Julie LaRoche; Antonella Penna; Marc Frischer; Timothy Davis; Barker Katherine; Christopher P Meyer; Sandra Ramos; Catarina Magalhães; Florence Jude-Lemeilleur; Ma Leopoldina Aguirre-Macedo; Shiao Wang; Nicole Poulton; Scott Jones; Rachel Collin; Jed A Fuhrman; Pascal Conan; Cecilia Alonso; Noga Stambler; Kelly Goodwin; Michael M Yakimov; Federico Baltar; Levente Bodrossy; Jodie Van De Kamp; Dion Mf Frampton; Martin Ostrowski; Paul Van Ruth; Paul Malthouse; Simon Claus; Klaas Deneudt; Jonas Mortelmans; Sophie Pitois; David Wallom; Ian Salter; Rodrigo Costa; Declan C Schroeder; Mahrous M Kandil; Valentina Amaral; Florencia Biancalana; Rafael Santana; Maria Luiza Pedrotti; Takashi Yoshida; Hiroyuki Ogata; Tim Ingleton; Kate Munnik; Naiara Rodriguez-Ezpeleta; Veronique Berteaux-Lecellier; Patricia Wecker; Ibon Cancio; Daniel Vaulot; Christina Bienhold; Hassan Ghazal; Bouchra Chaouni; Soumya Essayeh; Sara Ettamimi; El Houcine Zaid; Noureddine Boukhatem; Abderrahim Bouali; Rajaa Chahboune; Said Barrijal; Mohammed Timinouni; Fatima El Otmani; Mohamed Bennani; Marianna Mea; Nadezhda Todorova; Ventzislav Karamfilov; Petra Ten Hoopen; Guy Cochrane; Stephane L'Haridon; Kemal Can Bizsel; Alessandro Vezzi; Federico M Lauro; Patrick Martin; Rachelle M Jensen; Jamie Hinks; Susan Gebbels; Riccardo Rosselli; Fabio De Pascale; Riccardo Schiavon; 
Antonina Dos Santos; Emilie Villar; Stéphane Pesant; Bruno Cataletto; Francesca Malfatti; Ranjith Edirisinghe; Jorge A Herrera Silveira; Michele Barbier; Valentina Turk; Tinkara Tinta; Wayne J Fuller; Ilkay Salihoglu; Nedime Serakinci; Mahmut Cerkez Ergoren; Eileen Bresnan; Juan Iriberri; Paul Anders Fronth Nyhus; Edvardsen Bente; Hans Erik Karlsen; Peter N Golyshin; Josep M Gasol; Snejana Moncheva; Nina Dzhembekova; Zackary Johnson; Christopher David Sinigalliano; Maribeth Louise Gidley; Adriana Zingone; Roberto Danovaro; George Tsiamis; Melody S Clark; Ana Cristina Costa; Monia El Bour; Ana M Martins; R Eric Collins; Anne-Lise Ducluzeau; Jonathan Martinez; Mark J Costello; Linda A Amaral-Zettler; Jack A Gilbert; Neil Davies; Dawn Field; Frank Oliver Glöckner
Journal: Gigascience Date: 2015-06-19 Impact factor: 6.524

3. ITSoneDB: a comprehensive collection of eukaryotic ribosomal RNA Internal Transcribed Spacer 1 (ITS1) sequences.

Authors: Monica Santamaria; Bruno Fosso; Flavio Licciulli; Bachir Balech; Ilaria Larini; Giorgio Grillo; Giorgio De Caro; Sabino Liuni; Graziano Pesole
Journal: Nucleic Acids Res Date: 2018-01-04 Impact factor: 16.971

4. Common genetic variation drives molecular heterogeneity in human iPSCs.

Authors: Helena Kilpinen; Angela Goncalves; Andreas Leha; Vackar Afzal; Kaur Alasoo; Sofie Ashford; Sendu Bala; Dalila Bensaddek; Francesco Paolo Casale; Oliver J Culley; Petr Danecek; Adam Faulconbridge; Peter W Harrison; Annie Kathuria; Davis McCarthy; Shane A McCarthy; Ruta Meleckyte; Yasin Memari; Nathalie Moens; Filipa Soares; Alice Mann; Ian Streeter; Chukwuma A Agu; Alex Alderton; Rachel Nelson; Sarah Harper; Minal Patel; Alistair White; Sharad R Patel; Laura Clarke; Reena Halai; Christopher M Kirton; Anja Kolb-Kokocinski; Philip Beales; Ewan Birney; Davide Danovi; Angus I Lamond; Willem H Ouwehand; Ludovic Vallier; Fiona M Watt; Richard Durbin; Oliver Stegle; Daniel J Gaffney
Journal: Nature Date: 2017-05-10 Impact factor: 49.962

Review 5. UniEuk: Time to Speak a Common Language in Protistology!

Authors: Cédric Berney; Andreea Ciuprina; Sara Bender; Juliet Brodie; Virginia Edgcomb; Eunsoo Kim; Jeena Rajan; Laura Wegener Parfrey; Sina Adl; Stéphane Audic; David Bass; David A Caron; Guy Cochrane; Lucas Czech; Micah Dunthorn; Stefan Geisen; Frank Oliver Glöckner; Frédéric Mahé; Christian Quast; Jonathan Z Kaye; Alastair G B Simpson; Alexandros Stamatakis; Javier Del Campo; Pelin Yilmaz; Colomban de Vargas
Journal: J Eukaryot Microbiol Date: 2017-04-21 Impact factor: 3.346

6. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea.

Authors: Robert M Bowers; Nikos C Kyrpides; Ramunas Stepanauskas; Miranda Harmon-Smith; Devin Doud; T B K Reddy; Frederik Schulz; Jessica Jarett; Adam R Rivers; Emiley A Eloe-Fadrosh; Susannah G Tringe; Natalia N Ivanova; Alex Copeland; Alicia Clum; Eric D Becraft; Rex R Malmstrom; Bruce Birren; Mircea Podar; Peer Bork; George M Weinstock; George M Garrity; Jeremy A Dodsworth; Shibu Yooseph; Granger Sutton; Frank O Glöckner; Jack A Gilbert; William C Nelson; Steven J Hallam; Sean P Jungbluth; Thijs J G Ettema; Scott Tighe; Konstantinos T Konstantinidis; Wen-Tso Liu; Brett J Baker; Thomas Rattei; Jonathan A Eisen; Brian Hedlund; Katherine D McMahon; Noah Fierer; Rob Knight; Rob Finn; Guy Cochrane; Ilene Karsch-Mizrachi; Gene W Tyson; Christian Rinke; Alla Lapidus; Folker Meyer; Pelin Yilmaz; Donovan H Parks; A M Eren; Lynn Schriml; Jillian F Banfield; Philip Hugenholtz; Tanja Woyke
Journal: Nat Biotechnol Date: 2017-08-08 Impact factor: 54.908

7. The European Nucleotide Archive in 2018.

Authors: Peter W Harrison; Blaise Alako; Clara Amid; Ana Cerdeño-Tárraga; Iain Cleland; Sam Holt; Abdulrahman Hussein; Suran Jayathilaka; Simon Kay; Thomas Keane; Rasko Leinonen; Xin Liu; Josué Martínez-Villacorta; Annalisa Milano; Nima Pakseresht; Jeena Rajan; Kethi Reddy; Edward Richards; Marc Rosello; Nicole Silvester; Dmitriy Smirnov; Ana-Luisa Toribio; Senthilnathan Vijayaraja; Guy Cochrane
Journal: Nucleic Acids Res Date: 2019-01-08 Impact factor: 16.971

8. Minimum Information about an Uncultivated Virus Genome (MIUViG).

Authors: Simon Roux; Evelien M Adriaenssens; Bas E Dutilh; Eugene V Koonin; Andrew M Kropinski; Mart Krupovic; Jens H Kuhn; Rob Lavigne; J Rodney Brister; Arvind Varsani; Clara Amid; Ramy K Aziz; Seth R Bordenstein; Peer Bork; Mya Breitbart; Guy R Cochrane; Rebecca A Daly; Christelle Desnues; Melissa B Duhaime; Joanne B Emerson; François Enault; Jed A Fuhrman; Pascal Hingamp; Philip Hugenholtz; Bonnie L Hurwitz; Natalia N Ivanova; Jessica M Labonté; Kyung-Bum Lee; Rex R Malmstrom; Manuel Martinez-Garcia; Ilene Karsch Mizrachi; Hiroyuki Ogata; David Páez-Espino; Marie-Agnès Petit; Catherine Putonti; Thomas Rattei; Alejandro Reyes; Francisco Rodriguez-Valera; Karyna Rosario; Lynn Schriml; Frederik Schulz; Grieg F Steward; Matthew B Sullivan; Shinichi Sunagawa; Curtis A Suttle; Ben Temperton; Susannah G Tringe; Rebecca Vega Thurber; Nicole S Webster; Katrine L Whiteson; Steven W Wilhelm; K Eric Wommack; Tanja Woyke; Kelly C Wrighton; Pelin Yilmaz; Takashi Yoshida; Mark J Young; Natalya Yutin; Lisa Zeigler Allen; Nikos C Kyrpides; Emiley A Eloe-Fadrosh
Journal: Nat Biotechnol Date: 2018-12-17 Impact factor: 54.908

9. The SILVA and "All-species Living Tree Project (LTP)" taxonomic frameworks.

Authors: Pelin Yilmaz; Laura Wegener Parfrey; Pablo Yarza; Jan Gerken; Elmar Pruesse; Christian Quast; Timmy Schweer; Jörg Peplies; Wolfgang Ludwig; Frank Oliver Glöckner
Journal: Nucleic Acids Res Date: 2013-11-28 Impact factor: 16.971

10. EBI Metagenomics in 2017: enriching the analysis of microbial communities, from sequence reads to assemblies.

Authors: Alex L Mitchell; Maxim Scheremetjew; Hubert Denise; Simon Potter; Aleksandra Tarkowska; Matloob Qureshi; Gustavo A Salazar; Sebastien Pesseat; Miguel A Boland; Fiona M I Hunter; Petra Ten Hoopen; Blaise Alako; Clara Amid; Darren J Wilkinson; Thomas P Curtis; Guy Cochrane; Robert D Finn
Journal: Nucleic Acids Res Date: 2018-01-04 Impact factor: 16.971
