Christine Ogilvie Hendren, Christina M Powers, Mark D Hoover, Stacey L Harper.
Abstract
The Nanomaterial Data Curation Initiative (NDCI), a project of the National Cancer Informatics Program Nanotechnology Working Group (NCIP NanoWG), explores the critical aspect of data curation within the development of informatics approaches to understanding nanomaterial behavior. Data repositories and tools for integrating and interrogating complex nanomaterial datasets are gaining widespread interest, with multiple projects now appearing in the US and the EU. Even in these early stages of development, a single common aspect shared across all nanoinformatics resources is that data must be curated into them. Through exploration of sub-topics related to all activities necessary to enable, execute, and improve the curation process, the NDCI will provide a substantive analysis of nanomaterial data curation itself, as well as a platform for multiple other important discussions to advance the field of nanoinformatics. This article outlines the NDCI project and lays the foundation for a series of papers on nanomaterial data curation. The NDCI purpose is to: 1) present and evaluate the current state of nanomaterial data curation across the field on multiple specific data curation topics, 2) propose ways to leverage and advance progress for both individual efforts and the nanomaterial data community as a whole, and 3) provide opportunities for similar publication series on the details of the interactive needs and workflows of data customers, data creators, and data analysts. Initial responses from stakeholder liaisons throughout the nanoinformatics community reveal a shared view that it will be critical to focus on integration of datasets with specific orientation toward the purposes for which the individual resources were created, as well as the purpose for integrating multiple resources. 
Early acknowledgement and undertaking of complex topics such as uncertainty, reproducibility, and interoperability is proposed as an important path to addressing key challenges within the nanomaterial community, such as reducing collateral negative impacts and decreasing the time from development to market for this new class of technologies.
Keywords: curation; data integration; interoperability; nanoinformatics; nanomaterials
Year: 2015 PMID: 26425427 PMCID: PMC4578388 DOI: 10.3762/bjnano.6.179
Source DB: PubMed Journal: Beilstein J Nanotechnol ISSN: 2190-4286 Impact factor: 3.649
NDCI curation sub-topics.
| # | sub-topic area | planned focus |
|---|---|---|
| 1 | curation workflows | Addresses workflow aspects such as curation protocols for consuming data from primary literature as well as data transfers between repositories or between data customers and data consumers. Discusses mechanisms for both primary curation of data into repositories and interoperable sharing between resources. |
| 2 | data completeness and quality | Includes discussion of both data quality and data completeness. Completeness is a measure of the raw data, assays, processed data, or derived data. What are different ways data completeness could be defined, and are these completeness criteria shaped by the goals for the data being curated? |
| 3 | curation responsibilities | Covers curation responsibilities, including established and developing roles and the division of curation labor, and explores the real challenges associated with quantity vs. quality of data entries. Curation training and performance expectations will also be addressed, as will the roles of non-curators in defining the curation process (e.g., how might data “customers”, such as peer-reviewed journals, influence the process). |
| 4 | integration between databases and datasets | How do we define and operationalize integration between databases and datasets? What level of interoperability is required to support data integration in a way that supports various goals for comparison and analysis? |
| 5 | metadata | The way metadata are handled within a database and within data records is critical to every other nanotechnology data curation topic listed. |
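
Sub-topic 2 asks whether completeness criteria are shaped by the goals for the data being curated. One possible reading can be sketched as follows: completeness is scored as the fraction of goal-specific required fields that a record actually fills. This is a minimal illustration only; the goal names, field names, and the record are hypothetical and are not criteria proposed by the NDCI.

```python
from typing import Any, Mapping, Sequence

# Hypothetical goal-specific requirements; these lists are illustrative
# assumptions, not criteria taken from the NDCI.
REQUIRED_BY_GOAL = {
    "archive": ["raw_data", "assay_protocol"],
    "modeling": ["raw_data", "assay_protocol", "processed_data", "derived_data"],
}


def is_filled(value: Any) -> bool:
    """Treat None and empty strings/containers as missing; 0 counts as a value."""
    if value is None:
        return False
    if isinstance(value, (str, list, tuple, dict, set)):
        return len(value) > 0
    return True


def completeness(record: Mapping[str, Any], required: Sequence[str]) -> float:
    """Return the fraction of required fields that are present and non-empty."""
    if not required:
        raise ValueError("required must list at least one field")
    filled = sum(1 for name in required if is_filled(record.get(name)))
    return filled / len(required)


# Hypothetical curated record: derived data have not been supplied.
record = {
    "raw_data": "dls_run_017.csv",
    "assay_protocol": "DLS, 25 C, triplicate",
    "processed_data": {"hydrodynamic_diameter_nm": 118.4},
    "derived_data": None,
}

archive_score = completeness(record, REQUIRED_BY_GOAL["archive"])    # 2 of 2 fields
modeling_score = completeness(record, REQUIRED_BY_GOAL["modeling"])  # 3 of 4 fields
```

Under these assumptions the same record is fully complete for the archiving goal but only three-quarters complete for the modeling goal, which is the goal dependence the sub-topic raises.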
Liaison question #1.
| liaison | affiliation | scope of data curation effort |
|---|---|---|
| Bill Zamboni | UNC | My research program at UNC is involved in the profiling and translational development of nanoparticle agents. The program focuses on evaluating the pharmacokinetics (PK) and pharmacodynamics (PD) of nanoparticle agents in preclinical models and in patients. Specifically, we are involved in evaluating the factors that alter the function of the mononuclear phagocyte system (MPS), which in turn alters the PK and PD of nanoparticle agents in preclinical models and in patients. We have developed phenotypic probes of MPS function that predict the PK and PD of nanoparticles in animals and patients. |
| Christoph Steinbach | DaNa database NanoRA | The goal of our project is to provide impartial information and real knowledge on safety aspects of (manmade) nanomaterials. DaNa is the acronym for DAtabase NAnomaterials, but today we prefer talking about our Knowledgebase Nanomaterials, and that describes our goals very well: We try to separate publications that are suitable for assessing safety aspects of nanomaterials from those that are not. So we try to collect not just arbitrary data but scientifically proven knowledge. The need to perform this kind of assessment is documented, e.g., in a publication by Hristozov et al. [ |
| Marina (Nina) Vance | Nanotechnology Consumer Products Inventory | Our curation effort is centered on the nanotechnology Consumer Products Inventory (CPI). The CPI was developed by the Woodrow Wilson International Center for Scholars in 2005 and it is currently the most comprehensive listing of consumer products that contain or claim to contain nanomaterials. The main goal of the CPI is to document the way in which nanotechnology is entering the consumer market. Specifically, we want to provide the science and regulatory communities, as well as consumers, with current and accurate information about nano-enabled consumer products and the nanomaterials they contain. |
| Christine Ogilvie Hendren | CEINT NIKC (Center for Environmental Implications of NanoTechnology NanoInformatics Knowledge Commons) | Our curation effort is centered around interrogating the data gathered from across the Center for Environmental Implications of Nanotechnology along with comparative literature from throughout the field external to the center. Though our controlled material sourcing has created a rich integrated dataset as a starting point, we have a wide range of data types and fields, representing our focus on complex environmental interactions and transformations as well as impacts across a biological continuum and including ecosystem-wide measures. Our central research goals driving the data integration process are to 1) probe mechanistic relationships between material and system properties and their combined effects on nanomaterial fate and effect in the environment, 2) organize our disparate data to provide directional guidance to risk assessors even prior to achieving goal 1, and 3) test our hypothesis that amassing data on a small number of semi-empirical functional assay measurements will allow us to further goals 1 and 2. Beyond supporting CEINT mission-focused research questions, two key goals of our data integration project are to build a cyberinfrastructure that captures the data in a way that enables reproducibility and quality control down the road, and ultimately to develop associated tools that involve researchers in self-curation of their data so they can shorten the curation timeline and realize the benefits of analyzing their data together with other comparable datasets. |
| Julio Cesar Facelli | NanoSifter (University of Utah) | The purpose of the NanoSifter project here at the University of Utah is to create a natural language processing (NLP) tool capable of extracting data associated with nanoparticle properties directly from the primary literature. Currently, the tool can extract data associated with hydrodynamic diameter, particle diameter, molecular weight, zeta potential, cytotoxicity, IC50, cell viability, encapsulation efficiency, loading efficiency, and transfection efficiency. We plan to expand the information that NanoSifter can extract, while also improving the precision, recall, and F-measure of this tool. |
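
The NanoSifter response names precision, recall, and F-measure as the quantities to improve. As a rough illustration of how these standard metrics are computed for an extraction tool, the sketch below compares a set of extracted (paper, property, value) triples against a hand-curated reference set. The function name, the triple format, and all example values are assumptions made for illustration; this is not NanoSifter's code or its evaluation data, and the F-measure shown is the balanced F1 variant.

```python
from typing import Dict, Hashable, Set


def extraction_scores(extracted: Set[Hashable], gold: Set[Hashable]) -> Dict[str, float]:
    """Compute precision, recall, and balanced F-measure (F1) for an extraction run.

    extracted: items the tool produced; gold: items a human curator recorded.
    """
    true_positives = len(extracted & gold)
    # Precision: share of extracted items that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of reference items that the tool found.
    recall = true_positives / len(gold) if gold else 0.0
    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical reference annotations (paper id, property, reported value).
gold = {
    ("paper_a", "zeta potential", "-30 mV"),
    ("paper_a", "hydrodynamic diameter", "120 nm"),
    ("paper_b", "IC50", "5 uM"),
    ("paper_b", "cell viability", "80 %"),
}

# Hypothetical tool output: two correct triples and one with a wrong value.
extracted = {
    ("paper_a", "zeta potential", "-30 mV"),
    ("paper_b", "IC50", "5 uM"),
    ("paper_b", "cell viability", "8 %"),
}

scores = extraction_scores(extracted, gold)
```

In this made-up example two of three extracted triples are correct (precision 2/3) and two of four reference triples are recovered (recall 1/2), giving an F1 of 4/7.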
Liaison question #2.
| liaison | affiliation | major challenges to curation of nanomaterial data |
|---|---|---|
| Bill Zamboni | UNC | The complexity and high variability of MPS function in animal models and patients, which result in high PK and PD variability of nanoparticles. |
| Christoph Steinbach | DaNa database NanoRA | We think we are taking care of one of the most important challenges in nanomaterials data curation: separating valid from invalid data. In this regard, the major challenge is to gain information on the identity of a nanomaterial in a given study, which involves a careful physical-chemical characterization of the nanomaterial. Most of the data we consider invalid lacks information on material properties, which also hampers comparability of studies. |
| Marina (Nina) Vance | Nanotechnology Consumer Products Inventory | One major challenge we face is a general lack of support from the nanotechnology industry. Secrecy is inherent to the product development strategy of most companies, which makes it very difficult to provide a detailed characterization of industrial nanomaterials. A potential contributing factor to this problem, which applies specifically to the CPI, is a fear that association with the CPI may negatively affect the image of the consumer products. |
| Christine Ogilvie Hendren | CEINT NIKC (Center for Environmental Implications of NanoTechnology NanoInformatics Knowledge Commons) | Absence of established data-sharing protocols for existing measurement techniques (not to mention those that are currently being developed). |
| Julio Cesar Facelli | NanoSifter (University of Utah) | In my opinion, there are a number of major challenges in nanoscience/nanotechnology data curation. The first is developing standards and protocols for reporting data in the literature that the nanoscience/nanotechnology community actually adheres to. There are so many different ways that properties of nanoparticles can be reported in the literature, which makes the retrieval of such information quite cumbersome. Another major challenge is further development of the nanoparticle ontology (NPO) to add more functionality, metadata, and relationships to the ontology. |
Liaison question #3.
| liaison | affiliation | data deemed necessary for nanomaterial comparison |
|---|---|---|
| Bill Zamboni | UNC | The need to be able to evaluate encapsulated/conjugated and released drug as part of formulation development and as part of in vivo PK studies. |
| Christoph Steinbach | DaNa database NanoRA | A very good question which is extremely hard to answer: What does “same material” mean, not only from the informational point of view but also from the other side, the definition of “same material”? Which set of parameters do you need? Even changing only the size or shape of a particle can produce totally different behavior. We have developed a set of criteria (see http://www.nanopartikel.info/files/methodik/DaNa-Literature-Criteria-Checklist_Methodology.pdf) which must be fulfilled before we accept a certain publication as “knowledge” in the meaning described in the answer to the first question. Here we also describe the material characterization criteria. In fact we are absolutely aware that this does not ultimately guarantee that we are always talking about the “same” material, but for our purposes it’s enough. We think that a lot of further research is necessary to determine the right “same material” parameters. |
| Marina (Nina) Vance | Nanotechnology Consumer Products Inventory | Within the CPI, it is very difficult to determine if a nanomaterial present in two or more products is, in fact, the same. We can group nanomaterials of the same composition together, but without a detailed description from the manufacturer, that would be impossible. In order to directly compare nanomaterials within consumer products, we would need, at the very least, the following: composition, shape, size, composition of coatings, and crystallinity. |
| Christine Ogilvie Hendren | CEINT NIKC (Center for Environmental Implications of NanoTechnology NanoInformatics Knowledge Commons) | This depends on the level of granularity in the comparison. We believe that in order to support comparison and analysis in support of our research goals (elucidate mechanisms governing nanomaterial behavior and translate this into forecasts of risk), what is absolutely required are intrinsic characteristics of the nanomaterial, the surrounding system characteristics (e.g., whether the system is lab-controlled, environmental media, or a biological system), and system-dependent or "extrinsic" material characteristics. Only when all of these aspects, and the corresponding metadata describing preparation and testing protocols, are consistently reported can we know that direct comparison of two datasets is possible. |
| Julio Cesar Facelli | NanoSifter (University of Utah) | The data (information) most necessary to directly compare nanomaterials and determine whether they are the same material are the molecular descriptors and biochemical activity of the nanomaterials. The molecular descriptors (e.g., molecular weight, hydrodynamic diameter) and biochemical activity (e.g., cytotoxicity, cell viability, transfection efficiency) of the nanomaterials can be used by data mining and machine learning methods to compare materials and determine their similarity if the materials are discrete compounds. If the materials are not discrete compounds (e.g., polymers), properties such as molecular weight distribution and polydispersity will be the ones to assess for comparison of materials. |
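
Several responses above stress that two materials cannot be judged "the same" unless a minimum set of descriptors is reported for both. The sketch below illustrates that idea using the five descriptors listed in the CPI response (composition, shape, size, composition of coatings, crystallinity). The class name, field names, units, and the size tolerance are all hypothetical choices for illustration. A result of `True` means only that two records agree on these descriptors; as the DaNa response cautions, that does not guarantee the materials are truly identical.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class MaterialDescription:
    """Minimum descriptors from the CPI response; None means 'not reported'.

    An uncoated material should be recorded explicitly (e.g. "none"), so that
    a missing coating entry is not confused with an absent coating.
    """
    composition: Optional[str] = None
    shape: Optional[str] = None
    size_nm: Optional[float] = None
    coating_composition: Optional[str] = None
    crystallinity: Optional[str] = None


def missing_descriptors(material: MaterialDescription) -> List[str]:
    """List the descriptor names that were not reported for this material."""
    return [f.name for f in fields(material) if getattr(material, f.name) is None]


def descriptors_match(
    a: MaterialDescription,
    b: MaterialDescription,
    size_tolerance_nm: float = 1.0,  # assumed tolerance, not from the source
) -> Optional[bool]:
    """Return None if either record is incomplete, else whether descriptors agree."""
    if missing_descriptors(a) or missing_descriptors(b):
        return None  # not enough information to compare directly
    return (
        a.composition == b.composition
        and a.shape == b.shape
        and abs(a.size_nm - b.size_nm) <= size_tolerance_nm
        and a.coating_composition == b.coating_composition
        and a.crystallinity == b.crystallinity
    )


# Hypothetical product entries.
fully_described = MaterialDescription("Ag", "sphere", 20.0, "citrate", "fcc")
similar_batch = MaterialDescription("Ag", "sphere", 20.5, "citrate", "fcc")
label_only = MaterialDescription(composition="Ag")

same_descriptors = descriptors_match(fully_described, similar_batch)
undetermined = descriptors_match(fully_described, label_only)
```

Returning `None` rather than `False` for incomplete records keeps "cannot be determined" distinct from "known to differ", which mirrors the situation described for products that disclose composition only.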