Sandra Karcher, Egon L Willighagen, John Rumble, Friederike Ehrhart, Chris T Evelo, Martin Fritts, Sharon Gaheen, Stacey L Harper, Mark D Hoover, Nina Jeliazkova, Nastassja Lewinski, Richard L Marchese Robinson, Karmann C Mills, Axel P Mustad, Dennis G Thomas, Georgia Tsiliki, Christine Ogilvie Hendren.
Abstract
Many groups within the broad field of nanoinformatics are already developing data repositories and analytical tools driven by their individual organizational goals. Integrating these data resources across disciplines and with non-nanotechnology resources can support multiple objectives by enabling the reuse of the same information. Integration can also serve as the impetus for novel scientific discoveries by providing the framework to support deeper data analyses. This article discusses current data integration practices in nanoinformatics and in comparable mature fields, and nanotechnology-specific challenges impacting data integration. Based on results from a nanoinformatics-community-wide survey, recommendations for achieving integration of existing operational nanotechnology resources are presented. Nanotechnology-specific data integration challenges, if effectively resolved, can foster the application and validation of nanotechnology within and across disciplines. This paper is one of a series of articles by the Nanomaterial Data Curation Initiative that address data issues such as data curation workflows, data completeness and quality, curator responsibilities, and metadata.
Keywords: Data integration; Databases; Nanoinformatics; Nanomaterials; Nanotechnology; Web services
Year: 2018 PMID: 30246165 PMCID: PMC6145474 DOI: 10.1016/j.impact.2017.11.002
Source DB: PubMed Journal: NanoImpact ISSN: 2452-0748
Fig. 1. Examples of use cases that can be addressed and might mutually benefit from data integration.
Integration capabilities of responding nanoinformatics resources.
| Nanotechnology resource | Integration capabilities |
|---|---|
| caNanoLab | Provides REST-based Web Services supporting general sample search and |
| CEINT NIKC (NanoInformatics Knowledge Commons) | Integration within the CEINT NIKC resource is achieved by cross-training lead |
| Center for Safety of Substances and Products, National Institute | Does not provide any Web Services. |
| DECHEMA | The DaNa project has been providing the Web Service for the NANORA project |
| eNanoMapper Database | There is a REST-based API and nanomaterials have URIs allowing a linked data |
| Nanomaterial Registry Websites | Integration with the Registry is achieved on a case-by-case basis. Future |
| Nanoparticle Information Library | Integration with the NIL is achieved on a case-by-case basis. |
Fig. 2. Technical and operational challenges impacting data integration.
Common web services that nanoinformatics stakeholders reported in the survey as needed to support integration of nanomaterial data in the biomedical nanotechnology and nanosafety domains.
| Web service method | Description |
|---|---|
| Creation of an identifier | Creates a Universally Unique Identifier (UUID) for any entity such as a material, characterization, protocol, or publication |
| Characterization retrieval | Retrieves characterizations for a material by material type and characterization type (e.g. size) and returns characterization data in JSON and XML |
| Get data by DOI | Returns (pointers to) entries in the database with information about or from a specific publication |
| Get data by PubMed ID | Returns (pointers to) entries in the database with information about or from a specific publication |
| Get identifier | Retrieves a UUID for any entity such as a material, characterization, protocol, or publication |
| Get ISA-TAB-Nano file | Retrieves ISA-TAB-Nano files associated with a publication (DOI, PubMed) |
| Get investigation | Retrieves an investigation associated with a specific disease and/or nanomaterial type and returns an investigation in JSON or XML format; the JSON |
| Get protocol | Retrieves protocols by protocol type (e.g. in vitro) and returns a protocol document and list of materials characterized with the protocol if requested; |
| Get publication | Retrieves publications associated with a material, characterization, and/or protocol, and returns a DOI, PubMed ID, and/or URL to the publication |
| Get study | Retrieves a study associated with a specific assay type and/or nanomaterial type and returns a study in JSON or XML format; the JSON and XML |
| Search by chemistry | Retrieves nanomaterials based on chemical structure or chemical similarity. Supports a function such as: “Find the most similar structure in database |
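Three of the methods listed above (creation of an identifier, get identifier, and characterization retrieval) can be illustrated with a minimal in-memory sketch. It is not the interface of any surveyed resource: the class `NanoRegistry`, its method names, and the record fields are all hypothetical and chosen only for illustration. Reusing an existing UUID when the same entity is registered twice is likewise a design assumption, not something the survey specifies.

```python
import json
import uuid
import xml.etree.ElementTree as ET


class NanoRegistry:
    """Hypothetical in-memory stand-in for a nanomaterial data resource."""

    def __init__(self):
        self._ids = {}                # (entity_type, local_name) -> UUID string
        self._characterizations = []  # list of characterization records

    def create_identifier(self, entity_type, local_name):
        """Create a UUID for an entity (material, characterization, protocol,
        or publication); return the existing one if it was already created."""
        key = (entity_type, local_name)
        if key not in self._ids:
            self._ids[key] = str(uuid.uuid4())
        return self._ids[key]

    def get_identifier(self, entity_type, local_name):
        """Return the UUID previously created for an entity, or None."""
        return self._ids.get((entity_type, local_name))

    def add_characterization(self, material_type, characterization_type, value, unit):
        """Store one characterization record (fields are assumptions)."""
        self._characterizations.append({
            "material_type": material_type,
            "characterization_type": characterization_type,
            "value": value,
            "unit": unit,
        })

    def get_characterizations(self, material_type, characterization_type, fmt="json"):
        """Retrieve records by material type and characterization type,
        serialized as JSON or XML."""
        records = [
            r for r in self._characterizations
            if r["material_type"] == material_type
            and r["characterization_type"] == characterization_type
        ]
        if fmt == "json":
            return json.dumps(records)
        if fmt == "xml":
            root = ET.Element("characterizations")
            for record in records:
                item = ET.SubElement(root, "characterization")
                for field, field_value in record.items():
                    ET.SubElement(item, field).text = str(field_value)
            return ET.tostring(root, encoding="unicode")
        raise ValueError(f"unsupported format: {fmt}")


registry = NanoRegistry()
material_id = registry.create_identifier("material", "sample-1")
registry.add_characterization("metal oxide", "size", 25.0, "nm")
print(material_id)
print(registry.get_characterizations("metal oxide", "size"))
print(registry.get_characterizations("metal oxide", "size", fmt="xml"))
```

In a real deployment these methods would sit behind web service endpoints; the sketch only shows the shape of the requests and responses the table describes.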
Non-nanotechnology resources needed to support use case driving data integration.
| Non-nanotechnology resource | Description or example |
|---|---|
| Life Sciences and Chemistry Databases | Life science and chemistry databases in general, containing information about human biology (both experimental data, as well as |
| Image archives | The National Biomedical Imaging Archive (NBIA) ( |
| Image Contrast Agent Repository | The Molecular Imaging and Contrast Agent Database (MICAD) ( |
| Model Organisms Repository | The Mouse Genome Informatics (MGI) ( |
| Publication Sources | PubMed LinkOut or publication vendors to link nanomaterial data to nanomaterial publications; an example of this is the |
| Clinical Trials Management Systems (CTMS) | OpenClinica ( |
| Genomic Data/Biomarker Repositories | Repositories such as the NCI Genomic Data Commons ( |
| Chemical and Agent Repositories | Repositories such as PubChem, ChemSpider, ChEBI, and vendor repositories like Sigma Aldrich to obtain information on chemicals |
| Modeling tools | Modeling and simulation tools as well as 3D structural modeling tools. Integrating with modeling and simulation tools will assist in |
| Analysis and visualization tools | Includes various tools such as R ( |
| Ontology/Taxonomy Resources | To obtain an up-to-date database of ontologies in a table-type format so that one can easily review them. This includes resources |
Levels of equivalence. The equivalence strengths are meant to indicate how data are intended to be combined and do not specify why they should be linked in that manner.
| Equivalence | Semantic equivalence | Description | Example |
|---|---|---|---|
| Strong | Web Ontology Language | Two nanomaterials that share the same properties: all | A nanomaterial reported in a journal article for which |
| Moderate | Simple Knowledge | Two nanomaterials are said to be the “same” only for a | Two nanomaterials from the same production batch, in |
| Weak | SKOS “related match” | Two nanomaterials are merely linked together, with an | Two nanomaterials from the Joint Research Centre - Health, |
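The three equivalence strengths can be expressed as linked-data statements, as in the minimal sketch below. The table names SKOS "related match" for the weak level; the cells for the strong and moderate levels are truncated here, so the use of `owl:sameAs` and `skos:closeMatch` for them is an assumption. The `Equivalence` enum, the `link` helper, and the `example.org` IRIs are hypothetical.

```python
from enum import Enum

OWL = "http://www.w3.org/2002/07/owl#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


class Equivalence(Enum):
    """Equivalence strength mapped to a linking predicate IRI."""
    STRONG = OWL + "sameAs"         # assumption: predicate not visible in the truncated table cell
    MODERATE = SKOS + "closeMatch"  # assumption: predicate not visible in the truncated table cell
    WEAK = SKOS + "relatedMatch"    # named in the table as SKOS "related match"


def link(subject_iri, object_iri, level):
    """Return one N-Triples statement linking two nanomaterial records."""
    return f"<{subject_iri}> <{level.value}> <{object_iri}> ."


# Placeholder IRIs for two records held by different resources.
record_a = "https://example.org/resource-a/material/1"
record_b = "https://example.org/resource-b/material/42"

for level in Equivalence:
    print(level.name, link(record_a, record_b, level))
```

Keeping the strength as an explicit value makes it possible for a consumer to decide which links to follow, for instance merging data only across strong links while merely displaying weak ones.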
Fig. 3. Roadmap of recommendations for achieving data integration across nanomaterial and non-nanomaterial repositories.
Summary of stakeholder responses to upload, download, and mapping questions: Does the nanomaterial data resource provide the following?
| Nanomaterial | Uploading, downloading, or | Definitions of the | Controlled | Nanomaterial identifier | Integration with |
|---|---|---|---|---|---|
| caNanoLab | Web-based forms for | Extensive | Uses NPO and the | Uses a pattern containing | caNanoLab |
| CEINT | Mapping directly from NBI | Under development | Uses ontologies such | Nanomaterial associated to | Includes some |
| CSSP/NIPHE, | Commonly uses the OCHEM | Provides a list a | Uses field headings as | Identifier assigned based on | No |
| DECHEMA | No | Relational model | Uses the scientific | Not a central issue of the | No |
| eNanoMapper | Extends the OpenTox platform | Overview of the data | Uses the | Uses a substance UUID[ | Not currently |
| Nanomaterial | Export for physico-chemical | Nanomaterial | Uses a controlled | Uses unique numeric IDs[ | Not currently |
| Nanoparticle | Accomplished on a case-by- | Provided as drop- | Uses the NPO as well | Unique NIL entry numbers are | The NIL |
a The caNanoLab Design document (https://github.com/NCIP/cananolab/tree/master/docs/design) includes the object model which represents class names and attributes associated with the data model. All class names and attributes are maintained in the NCI caDSR (https://cdebrowser.nci.nih.gov/CDEBrowser/). Concepts are defined in the NCI Thesaurus (http://ncit.nci.nih.gov/ncitbrowser/pages/home.jsf?version=15.05d). caNanoLab also provides a user-friendly glossary (https://wiki.nci.nih.gov/display/caNanoLab/caNanoLab+Glossary).
b caNanoLab integrates with PubMed and ScienceDirect for access to publications, Elsevier for linking caNanoLab data to publications, PubChem for chemical information, The Collaboratory for Structural Nanobiology - CSN (https://ncifrederick.cancer.gov/dsitp/abcc/abcc-groups/simulation-and-modeling/collaboratory-for-structural-nanobiology/) for displaying 3D models of specific nanomaterials, and Nanotechnology Characterization Laboratory (NCL, http://ncl.cancer.gov/) assay cascade and JoVE (https://www.jove.com/) for nanotechnology protocols.
c DECHEMA has a very diverse target group, ranging from interested laypeople and stakeholders to other scientists; wording is adjusted to tell a comprehensive story without confusing lay readers or losing scientific correctness.
d eNanoMapper is based on semantic web technologies including referenceable Internationalized Resource Identifiers (IRIs) and the Resource Description Framework (RDF). The substance UUID does not reflect the uniqueness of the material structure, but is an identifier of the material in the database. The substances (materials) are described with their composition (e.g. core, shell, and functionalization) and are linked to the chemical structures of their components. These can be used to decide if the nanomaterials are the same or similar.
e The NPO has been mapped to the Nanomaterial Registry and it was determined that approximately 8–10 terms used by the Registry are not yet part of the breadth of the NPO.
f It is the intent of the Nanomaterial Registry not to judge equivalence between any two nanomaterials from different data resources, as the characterization results can be wildly different based on sample medium and characterization protocol.
g The NIL integrates with the NIOSH Pocket Guide to Chemical Hazards (https://www.cdc.gov/niosh/npg/default.html) and with the Registry of Toxic Effects of Chemical Substances (RTECS, http://www.cdc.gov/niosh/rtecs). The NIL web resource is currently hosted, administered, and maintained outside of the CDC/NIOSH website by Oregon State University in conjunction with its program to characterize nanomaterials.
Web Services provided by caNanoLab (https://cananolab.nci.nih.gov/caNanoLab/#/).
| Search type | Possible search criteria |
|---|---|
| Protocol | Protocol name |
| Sample | Specific sample, composition, and/or characterization |
| Publication | Sample name. Nanomaterial characteristics |