Nicole Silvester, Blaise Alako, Clara Amid, Ana Cerdeño-Tarrága, Laura Clarke, Iain Cleland, Peter W Harrison, Suran Jayathilaka, Simon Kay, Thomas Keane, Rasko Leinonen, Xin Liu, Josué Martínez-Villacorta, Manuela Menchi, Kethi Reddy, Nima Pakseresht, Jeena Rajan, Marc Rossello, Dmitriy Smirnov, Ana L Toribio, Daniel Vaughan, Vadim Zalunin, Guy Cochrane.
Abstract
For 35 years the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) has been responsible for making the world's public sequencing data available to the scientific community. Advances in sequencing technology have driven exponential growth in the volume of data to be processed and stored and a substantial broadening of the user community. Here, we outline ENA services and content in 2017 and provide insight into a selection of current key areas of development in ENA driven by challenges arising from the above growth.
Year: 2018 PMID: 29140475 PMCID: PMC5753375 DOI: 10.1093/nar/gkx1125
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Number of WGS sets accessioned each month at ENA.
Figure 2. Total number of public sequences contained in each release of ENA’s assembled and annotated sequence database from June 1982 to June 2017. From June 2004, when WGS sequences were first included in the release, the WGS proportion is also illustrated.
Figure 3. Recent growth in the Sequence Version Archive database. This is given as the compressed data size.
Main simplifications and changes in use of the new data discovery API compared to ENA’s advanced search
| Parameter | Change |
|---|---|
| query | This is now optional. If this parameter is not supplied, the full result set for the selected data type will be returned. |
| domain | This parameter is no longer needed/supported. |
| offset | This has been changed to the true offset and represents the number of records to skip rather than the number of the record from which to start the result page. |
| limit | The default remains at 100 000 records, but this can now be set to 0 to fetch all records for the search. |
| length | This parameter is no longer needed/supported. |
| format | Only metadata reports are currently supported, so the meaning of this field has changed: it now selects whether the search report is downloaded in TSV (default) or JSON format. |
| dataPortal | The API introduces the concept of data portals to support additional services to ENA. The current selection is: ena, pathogen, faang and metagenome. |
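
The parameter changes in the table can be illustrated with a minimal Python sketch that only builds a request URL and does not contact any server. Several details are assumptions, not taken from the text above: the endpoint in `BASE_URL`, the name of the `result` parameter used to select the data type, the lowercase spellings `tsv` and `json`, and the 1-based record numbering of the old advanced search that `legacy_start_to_offset` relies on. Both function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint; the table above does not state the URL of the API.
BASE_URL = "https://www.ebi.ac.uk/ena/portal/api/search"

# Data portals and report formats listed in the table above.
DATA_PORTALS = ("ena", "pathogen", "faang", "metagenome")
FORMATS = ("tsv", "json")  # lowercase spelling is an assumption


def legacy_start_to_offset(start_record):
    """Convert an old-style start record number into a true offset.

    Assumes the old advanced search numbered records from 1, so starting
    at record N means skipping N - 1 records.
    """
    if start_record < 1:
        raise ValueError("start_record must be 1 or greater")
    return start_record - 1


def build_search_url(result, query=None, offset=0, limit=100_000,
                     fmt="tsv", data_portal="ena"):
    """Build a search URL from the parameters described in the table.

    `result` (the selected data type) is an assumed parameter name.
    `domain` and `length` are deliberately absent: they are no longer supported.
    """
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {FORMATS}")
    if data_portal not in DATA_PORTALS:
        raise ValueError(f"dataPortal must be one of {DATA_PORTALS}")
    if offset < 0 or limit < 0:
        raise ValueError("offset and limit must not be negative")

    params = {"result": result}
    if query:
        # query is optional; omitting it requests the full result set
        params["query"] = query
    params["offset"] = offset      # number of records to skip
    params["limit"] = limit        # 0 requests all records for the search
    params["format"] = fmt
    params["dataPortal"] = data_portal
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Fetch every record of a hypothetical data type from the pathogen portal.
    print(build_search_url("read_run", limit=0, fmt="json",
                           data_portal="pathogen"))
```

Keeping URL construction separate from the HTTP call makes the parameter handling easy to check offline. The resulting string can then be passed to any HTTP client.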