Jon Ison (1), Hans Ienasescu (2), Emil Rydza (3), Piotr Chmura (3), Kristoffer Rapacki (4), Alban Gaignard (1,5), Veit Schwämmle (6), Jacques van Helden (1,7), Matúš Kalaš (8), Hervé Ménager (1,9).
1. CNRS, UMS 3601, Institut Français de Bioinformatique, IFB-core, 2 rue Gaston Crémieux, F-91000 Evry, France.
2. National Life Science Supercomputing Center, Technical University of Denmark, Building 208, DK-2800 Kongens Lyngby, Denmark.
3. Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, 2200 København, Denmark.
4. Department of Health Technology, Ørsteds Plads, Building 345C, DK-2800 Kongens Lyngby, Denmark.
5. L'institut du Thorax, INSERM, CNRS, University of Nantes, 44007 Nantes, France.
6. Department of Biochemistry and Molecular Biology and VILLUM Center for Bioanalytical Sciences, University of Southern Denmark, Campusvej 55, 5230 Odense, Denmark.
7. Département de Biologie, Aix-Marseille Université (AMU), 3 place Victor Hugo, 13003 Marseille, France.
8. Computational Biology Unit, Department of Informatics, University of Bergen, N-5008 Bergen, Norway.
9. Hub de Bioinformatique et Biostatistique-Département Biologie Computationnelle, Institut Pasteur, USR 3756, CNRS, Paris 75015, France.
Abstract
BACKGROUND: Life scientists routinely face massive and heterogeneous data analysis tasks and must find and access the most suitable databases or software in a jungle of web-accessible resources. The diversity of information used to describe life-scientific digital resources presents an obstacle to their utilization. Although several standardization efforts are emerging, no information schema has been sufficiently detailed to enable uniform semantic and syntactic description (and cataloguing) of bioinformatics resources.
FINDINGS: Here we describe biotoolsSchema, a formalized information model that balances the conciseness needed for rapid adoption against the provision of rich technical information and scientific context. biotoolsSchema results from a series of community-driven workshops and is deployed in the bio.tools registry, providing the scientific community with >17,000 machine-readable and human-understandable descriptions of software and other digital life-science resources. We compare our approach to related initiatives and provide alignments to foster interoperability and reusability.
CONCLUSIONS: biotoolsSchema supports the formalized, rigorous, and consistent specification of the syntax and semantics of bioinformatics resources, and enables cataloguing efforts such as bio.tools that help scientists to find, comprehend, and compare resources. The use of biotoolsSchema in bio.tools promotes the FAIRness of research software, a key element of open and reproducible development in the data-intensive sciences.