Jon Ison1, Kristoffer Rapacki2, Hervé Ménager3, Matúš Kalaš4, Emil Rydza2, Piotr Chmura2, Christian Anthon5, Niall Beard6, Karel Berka7, Dan Bolser8, Tim Booth9, Anthony Bretaudeau10, Jan Brezovsky11, Rita Casadio12, Gianni Cesareni13, Frederik Coppens14, Michael Cornell15, Gianmauro Cuccuru16, Kristian Davidsen2, Gianluca Della Vedova17, Tunca Dogan18, Olivia Doppelt-Azeroual3, Laura Emery8, Elisabeth Gasteiger19, Thomas Gatter20, Tatyana Goldberg21, Marie Grosjean22, Björn Grüning23, Manuela Helmer-Citterich24, Hans Ienasescu25, Vassilios Ioannidis19, Martin Closter Jespersen2, Rafael Jimenez8, Nick Juty8, Peter Juvan26, Maximilian Koch8, Camille Laibe8, Jing-Woei Li27, Luana Licata13, Fabien Mareuil3, Ivan Mičetić28, Rune Møllegaard Friborg29, Sebastien Moretti30, Chris Morris31, Steffen Möller32, Aleksandra Nenadic6, Hedi Peterson33, Giuseppe Profiti12, Peter Rice34, Paolo Romano35, Paola Roncaglia8, Rabie Saidi18, Andrea Schafferhans21, Veit Schwämmle36, Callum Smith37, Maria Maddalena Sperotto2, Heinz Stockinger19, Radka Svobodová Vařeková38, Silvio C E Tosatto28, Victor de la Torre39, Paolo Uva16, Allegra Via40, Guy Yachdav21, Federico Zambelli41, Gert Vriend42, Burkhard Rost21, Helen Parkinson8, Peter Løngreen2, Søren Brunak43.
Abstract
Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR (the European infrastructure for biological information), that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.
Year: 2015 PMID: 26538599 PMCID: PMC4702812 DOI: 10.1093/nar/gkv1116
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. EDAM concepts. EDAM includes four main sub-ontologies defining common concepts within bioinformatics: topics, operations, data (including identifiers, the fifth sub-ontology) and data formats. EDAM provides the core scientific concepts for describing registry entries.
Mandatory resource information. Of the 55 fields of information defined in biotoolsXSD (the resource description model used by the registry), 10 are mandatory and provide a minimum standard for annotation of registered resources.
| Field | Description | Format |
|---|---|---|
| Name | The canonical name of the resource | Text |
| Homepage | Resource homepage | URL |
| Description (1 only) | Short textual description of the resource | Text |
| Resource type (1 or more) | Basic resource type | ‘Database’, ‘Tool’, ‘Service’, ‘Workflow’, ‘Platform’, ‘Container’, ‘Library’ or ‘Other’ |
| Interface type (1 or more) | Resource interface type | ‘Command-line’, ‘Web UI’, ‘Desktop GUI’, ‘SOAP WS’, ‘HTTP WS’, ‘API’ or ‘QL’ |
| Topics (1 or more) | General scientific domain(s) the software serves, e.g. ‘Proteomics’. | URI of EDAM Topic |
| Functions (1 or more) | Function(s) the resource performs, e.g. ‘Gene regulatory network prediction’ | URI of EDAM Operation |
| Input types (0 or more) | Type of data: primary input(s), e.g. ‘Protein sequences’ | URI of EDAM Data |
| Output types (0 or more) | Type of data: primary output(s), e.g. ‘Protein sequence alignment’ | URI of EDAM Data |
| Contact (1 or more) | Primary contact, e.g. a person, helpdesk, or mailing list | Email or URL |
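The minimum standard in the table above can be expressed as a small validation routine. The following is a minimal sketch only, not the registry's actual biotoolsXSD schema or validator: the dictionary keys (`name`, `resource_type`, and so on), the regular expressions, and the example entry are hypothetical, and exactly one Name and one Homepage are assumed because the table gives no cardinality for them. The EDAM URI pattern (`http://edamontology.org/<branch>_NNNN`) is likewise an assumption about how concept URIs are written.

```python
import re

# Controlled vocabularies copied from the mandatory-field table.
RESOURCE_TYPES = {"Database", "Tool", "Service", "Workflow",
                  "Platform", "Container", "Library", "Other"}
INTERFACE_TYPES = {"Command-line", "Web UI", "Desktop GUI",
                   "SOAP WS", "HTTP WS", "API", "QL"}


def is_text(value):
    return isinstance(value, str) and value.strip() != ""


def is_url(value):
    return isinstance(value, str) and re.match(r"^https?://\S+$", value) is not None


def is_email(value):
    return (isinstance(value, str)
            and re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", value) is not None)


def edam_uri(branch):
    """Build a checker for an assumed EDAM URI shape, e.g. .../topic_0121."""
    pattern = re.compile(rf"^http://edamontology\.org/{branch}_\d{{4}}$")
    return lambda value: isinstance(value, str) and pattern.match(value) is not None


# Hypothetical key -> (minimum count, maximum count or None, value checker).
RULES = {
    "name": (1, 1, is_text),           # cardinality assumed
    "homepage": (1, 1, is_url),        # cardinality assumed
    "description": (1, 1, is_text),
    "resource_type": (1, None, lambda v: v in RESOURCE_TYPES),
    "interface_type": (1, None, lambda v: v in INTERFACE_TYPES),
    "topics": (1, None, edam_uri("topic")),
    "functions": (1, None, edam_uri("operation")),
    "input_types": (0, None, edam_uri("data")),
    "output_types": (0, None, edam_uri("data")),
    "contact": (1, None, lambda v: is_email(v) or is_url(v)),
}


def validate(entry):
    """Return a list of problems; an empty list means the entry passes."""
    errors = []
    for key, (minimum, maximum, check) in RULES.items():
        value = entry.get(key, [])
        values = value if isinstance(value, list) else [value]
        if len(values) < minimum:
            errors.append(f"{key}: expected at least {minimum}, got {len(values)}")
        if maximum is not None and len(values) > maximum:
            errors.append(f"{key}: expected at most {maximum}, got {len(values)}")
        for item in values:
            if not check(item):
                errors.append(f"{key}: invalid value {item!r}")
    return errors


# Hypothetical entry; the EDAM identifiers are illustrative placeholders.
example_entry = {
    "name": "ExampleTool",
    "homepage": "https://example.org/exampletool",
    "description": "Illustrative entry used to exercise the validator.",
    "resource_type": ["Tool"],
    "interface_type": ["Web UI", "Command-line"],
    "topics": ["http://edamontology.org/topic_0121"],
    "functions": ["http://edamontology.org/operation_2437"],
    "input_types": [],
    "output_types": [],
    "contact": ["helpdesk@example.org"],
}
```

Calling `validate(example_entry)` returns an empty list, while an entry that lacks a contact or uses a resource type outside the controlled list yields one message per problem. Returning every problem at once, rather than stopping at the first failure, suits a curation workflow where an annotator corrects an entry in a single pass.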
Content of the Tools and Data Services Registry. The registry includes 1633 accessions with a total of 36,428 annotations. The table gives a breakdown of the types of resources and their interfaces, and the number of scientific annotations made using EDAM
| #entries or #annotations | |
|---|---|
| #entries, with breakdown by resource type | |
| #annotations, with breakdown by type | |
| #entries, with breakdown by interface type | |
| #EDAM annotations, with breakdown by type | |
Resource providers. A non-exhaustive list of collections that have contributed or will contribute to the registry. The list includes a cross-section of bioinformatics service providers including other catalogues such as SEQwiki and BioCatalogue
| Name, URL, Short description |
|---|
| A collection of on-line prediction services from CBS-DTU. The resource contains 75 tools for gene finding and splice sites, post-translational protein modification, immunological features, protein function and structure, protein sorting, genomic epidemiology and more. The tools can be used via interactive input forms, with many available as software packages and SOAP Web services. |
| The data resource catalogue is a collection of metadata on bioinformatics Web-based data resources. The catalog contains over 600 resources including bioinformatics and biomedical databases, ontologies, taxonomies and catalogues. |
| BiBiServ is a collection of bioinformatics tools that emerged from the research at Bielefeld University. It contains over 40 mainly analysis and utility tools, including RNA structure prediction, metagenomics, genome rearrangement, alignments, evolutionary relationships, primer design and suffix trees. These are available as interactive web applications, HTTP Web services and downloadable software. |
| A collection of over 20 web services, databases and software packages from The Bioinformatics Centre at The University of Copenhagen. The resource covers sequence and structure analysis, prediction and modeling, gene regulation, population genetics and more. |
| ELIXIR-CZ Services collection |
| The Czech Bioinformatics Services resource is provided by members of the ELIXIR CZ node. It contains over 30 bioinformatics tools and databases, ranging from analysis of sequence, topology and structure of nucleic acids and proteins to genomics, proteomics and benchmarks for small molecule interactions. The databases can be accessed via web GUIs while tools are available as web, standalone and command-line applications. |
| Orange data mining suite is an open source data visualization and analysis software for data mining through visual programming or Python scripting. It consists of over 100 components for machine learning and add-ons for bioinformatics and text mining. |
| GoMapMan is a database of gene functional annotations in the plant sciences based on the plant-specific MapMan ontology. |
| A collection of services provided by research institutions that are members of the ELIXIR Italian node. The resource includes databases and analysis tools developed and maintained by Italian bioinformatics groups and institutions. |
| A collection of 60 bioinformatics tools from the University of Padova. It includes databases for structural bioinformatics and genome sequences as well as tools for sequence analysis, phylogenetics, structure analysis, cheminformatics and network analysis. |
| A collection of 22 predictors for subcellular localization, disease-related mutations and protein sequence annotation from the Bologna Biocomputing Group. Most tools are accessible using a Web UI while some offer a command line interface. |
| A collection of 19 resources and tools for structural bioinformatics, immunoinformatics and genomics from the Sapienza University Biocomputing Group. |
| A collection of 7 databases and portals linking gene products physically and functionally. Data from all databases and portals can be searched, visualized and downloaded through Web UIs. |
| A collection of tools dedicated to the analysis of protein structures, the identification of structure motifs and the comparison of RNA secondary structure. |
| Online services and open source software, mainly for NGS and EST analysis or to infer evolutionary histories in tumors. The majority of tools are used via a command line interface while the rest offer a graphical interface. |
| A collection of 300 bioinformatics tools covering various topics such as sequence analysis and phylogeny, integrated in an online workbench. The suite is a combination of tools developed at the Institut Pasteur and/or tools used by it, for research and education. |
| A collection of 260 bioinformatics tools, mainly dedicated to NGS analysis, and integrated into the Galaxy instance available at the Institut Pasteur. This instance is only available to Pasteur researchers and collaborators. |
| A collection of tools dedicated to the analyses of NGS data along with bioinformatics genomic databases hosted by GenOuest. Most tools can be used via command line, while the databases and some of the tools are available through a web interface. |
| The French Institute of Bioinformatics (IFB) is a national service infrastructure in bioinformatics that gathers together the bioinformatics platforms of the main French research organizations, CNRS, INRA, INRIA, CEA and INSERM, as well as CIRAD, the Pasteur and Curie Institutes and the French universities. IFB's principal mission is to provide basic services and resources in bioinformatics for scientists and engineers working in the life sciences. IFB is the French node of ELIXIR. |
| A collection of tools developed at Loschmidt Laboratories for protein design, engineering and analysis. The tools are mostly available via a web interface or as command-line applications. |
| The core of the |
| The catalog of the Spanish National Bioinformatics Institute. INB Services develops and provides software tools and web servers for the global life sciences research community. |
| A collection of tools developed at the department of Plant Systems Biology (VIB, Gent University). The tools cover topics such as comparative genomics, network analysis, genome prediction, annotation and visualization. The tools are available as web UI or command line applications. |
| Biocomputing infrastructure to primarily support analysis of data produced by the CRS4 NGS facility. The system integrates hundreds of tools into a web-based traceability framework that can handle the whole transformation process from raw data to downstream analysis. |
| A collection of computational tools for macromolecular X-ray crystallography, and other biophysical techniques. |
| A collection of computational tools for structural biology from Instruct. Instruct is a pan-European research infrastructure in structural biology, making high-end technologies and methods available to users. |
| Estonian bioinformatics services, tools and databases provided by ELIXIR-Estonia contain almost 20 tools and databases for several high-throughput analyses, enrichment analysis, network dissection, primer design approaches, as well as data visualisation applications. The resources are mainly available as interactive web applications and R packages. |
| ExPASy is the SIB bioinformatics resources portal which provides access to scientific databases and software tools (i.e. resources) in different areas of life sciences including proteomics, genomics, phylogeny, systems biology, population genetics, transcriptomics. |
| The SEQanswers wiki (SEQwiki) is a wiki database that is actively edited and updated by the members of the SEQanswers community ( |
| The resource is devoted to the management and distribution of information on human and animal cell lines and other biological resources. The tools are usually available as a web interface or as REST and SOAP Web Services. |
| The BioCatalogue is a curated catalogue of 369 life science Web Services. Users and curators register metadata about Web Services. Web Services in the catalogue can be either SOAP or REST APIs. |
| Collection of tools and services developed and maintained at the University of Southern Denmark currently comprising 13 applications. Covered topics are cluster validation, proteomics, pathway and network processing, and omics analyses. |
| A list of over 30 tools, including web applications and Web services, provided by the universities in Norway affiliated with ELIXIR-NO. |
| A portfolio of bioinformatics tools to facilitate scientific discovery within the life sciences, provided by EMBL-EBI. |
| A collection of resources to perform data analysis using Gene Ontology (GO). Includes tools developed by GO Consortium members as well as some third-party resources. |
| EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community. The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Also, as extensive libraries are provided with the package, it is a platform to allow other scientists to develop and release software in true open source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole. EMBOSS breaks the historical trend towards commercial software packages. |
| A versatile molecular modelling package that specializes in working with proteins and the molecules in their environment, such as water, ligands and nucleic acids. |
| Bio-Linux is an Ubuntu Linux-based distribution that adds more than 250 bioinformatics packages, providing around 50 graphical applications and several hundred command line tools, as well as the Galaxy environment for browser-based data analysis and workflow construction. |
| Debian Med is a project that aims at developing Debian into an operating system that is particularly well fit for the requirements for medical and biological research. The goal of Debian Med is a complete free and open-source system for all tasks in life-scientific research. To achieve this goal Debian Med integrates applicable software into Debian. |
| A collection of bioinformatics tools for the prediction and analysis of aspects of protein structure and function, provided by the Rost lab at the Technical University of Munich and Columbia University in New York. |
| A registry and open-source library of JavaScript components to visualise biological data. |
Figure 2. ELIXIR registry query user interface. The query interface (https://bio.tools) provides features to search the registry, choose which fields of information are shown, and filter and sort the results by various attributes.