Ge Peng, Jeffrey L. Privette, Curt Tilmes, Sky Bristol, Tom Maycock, John J. Bates, Scott Hausman, Otis Brown, Edward J. Kearns.
Abstract
Scientific data stewardship is an important part of long-term preservation and the use/reuse of digital research data. It is critical for ensuring trustworthiness of data, products, and services, which is important for decision-making. Recent U.S. federal government directives and scientific organization guidelines have levied specific requirements, increasing the need for a more formal approach to ensuring that stewardship activities support compliance verification and reporting. However, many science data centers lack an integrated, systematic, and holistic framework to support such efforts. The current business- and process-oriented stewardship frameworks are too costly and lengthy for most data centers to implement. They often do not explicitly address the federal stewardship requirements and/or the uniqueness of geospatial data. This work proposes a data-centric conceptual enterprise framework for managing stewardship activities, based on the philosophy behind the Plan-Do-Check-Act (PDCA) cycle, a proven industrial concept. This framework, which includes the application of maturity assessment models, allows for quantitative evaluation of how organizations manage their stewardship activities and supports informed decision-making for continual improvement towards full compliance with federal, agency, and user requirements.
Keywords: Domain Stewards; Enterprise Framework; Information Management; Maturity Matrix; Open Data; PDCA-cycle; Research Data; Scientific Data Stewardship
Year: 2018 PMID: 33101400 PMCID: PMC7580807 DOI: 10.5334/dsj-2018-015
Source DB: PubMed Journal: Data Sci J ISSN: 1683-1470
Figure 1: Diagram of the pathway from raw material to informed decisions, adapted from Figure 1.1 of Mosely et al. (2009) with permission.
Examples of non-functional requirements (NFRs) for federally funded digital scientific data and our mapping to the information quality dimensions defined by Lee et al. (2002) and Ramapriyan et al. (2017).
| NFRs | Description | Dimension based on Lee et al. (2002) | Dimension based on Ramapriyan et al. (2017) |
|---|---|---|---|
| Accessibility | The quality or fact of being accessible | Accessibility | Stewardship; Service |
| Accuracy | The quality or fact of being correct | Intrinsic | Science |
| Availability | The quality or fact of being available | Accessibility | Product |
| Completeness | The quality or fact of being complete | Contextual | Product |
| Findability | The quality or fact of being findable | N/A | Stewardship; Service |
| Integrity | The quality or fact of being intact | Intrinsic | Product; Stewardship; Service |
| Interoperability | The quality or fact of being interoperable | Representational | Product; Stewardship; Service |
| Objectivity | The quality or fact of being objective | Intrinsic | Science |
| Preservability | The quality or fact of being preservable | N/A | Stewardship |
| Reproducibility | The quality or fact of being reproducible | N/A | Product; Stewardship |
| Representativeness | The quality or fact of being representative | Representational | Product; Stewardship |
| Security | The quality or fact of being secure | Accessibility | Stewardship; Service |
| Sustainability | The quality or fact of being sustainable | N/A | Product; Stewardship; Service |
| Timeliness | The quality or fact of being done at a useful time | Contextual | Product; Service |
| Traceability | The quality or fact of being traceable | N/A | Product; Stewardship; Service |
| Transparency | The quality or fact of being transparent | N/A | Product; Stewardship |
| Usability | The quality or fact of being easy to understand and use; being usable | Representational | Product; Stewardship; Service |
| Utility | The quality or fact of being useful | Intrinsic | Product |
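The two-scheme mapping in the table above lends itself to a machine-readable form, which the paper's emphasis on machine end-users suggests. The following sketch encodes a few rows as a lookup structure; the dictionary layout, key names, and `dimensions_for` helper are illustrative assumptions, not part of the framework itself.

```python
# Illustrative encoding of selected NFR-to-dimension mappings from the
# table above. `None` stands in for the table's "N/A" entries.
NFR_DIMENSIONS = {
    "Accessibility": {"lee2002": "Accessibility",
                      "ramapriyan2017": ["Stewardship", "Service"]},
    "Accuracy": {"lee2002": "Intrinsic", "ramapriyan2017": ["Science"]},
    "Findability": {"lee2002": None,
                    "ramapriyan2017": ["Stewardship", "Service"]},
    "Timeliness": {"lee2002": "Contextual",
                   "ramapriyan2017": ["Product", "Service"]},
}

def dimensions_for(nfr, scheme="ramapriyan2017"):
    """Return the quality dimension(s) for an NFR under a given scheme,
    or None if the NFR is not in the mapping or has no dimension."""
    entry = NFR_DIMENSIONS.get(nfr)
    return entry[scheme] if entry else None
```

Encoding the table as data rather than prose would let a data center check declared NFRs against both dimension schemes programmatically.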
Figure 2: Conceptual diagram of the proposed data-centric, enterprise scientific data stewardship framework. The staggered pyramid on the left represents the interconnection between federal regulations, mandatory controls, recommendations, and instructions. The MM-tags beneath the pyramid represent quality assessments through the entire data product life cycle. The text on the right represents each step of the PDCA cycle and a summary of high-level outcomes.
A summary of the PDCA cycle as defined by Nayab and Richter (2013) and as adapted by the ESDSF.
| The PDCA Cycle based on Nayab and Richter (2013) | The PDCA Cycle as adapted by the ESDSF |
|---|---|
| Plan/Define | Integrated non-functional requirements from federal directives, agency policies, organizational strategy, and user requirements (referred to collectively as the requirements) are defined and documented. (They may also be referred to as “mission parameters”.) |
| | Functional areas, controls, and standards necessary for compliance with the requirements are defined and documented. (These constitute the required changes.) |
| | They are communicated within the organization across different entities. |
| Do/Create | The guidelines, processes, procedures, and best practices that enable compliance with the requirements are created, documented, and implemented. |
| | They are communicated within the organization across different entities to ensure consistency and efficiency. |
| Check/Assess | The results of implemented processes and procedures are checked using consistent assessment models based on community best practices, yielding quantifiable evaluation results. |
| | The results are captured and presented in ways suitable to both human and machine end-users. |
| | Areas for improvement are identified, with a roadmap forward based on where they are and where they need to be. |
| Act/Improve | Steps are taken based on the roadmap to improve current processes, procedures, and practices, circling back to the Do/Create stage if necessary. |
| | If the requirements need to be updated, circle back to the Plan/Define stage. |
| | The processes, procedures, and practices are standardized within the organization once the desired maturity for the requirements is achieved. Monitoring is in place to trigger a new PDCA improvement cycle if a new requirement or a new area for improvement is identified. |
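The stage transitions described above — advance through Plan, Do, Check, Act, with Act circling back to Plan when requirements change or to Do when rework is needed — can be sketched as a small state machine. The `Stage` enum and `next_stage` function below are illustrative names, not part of the ESDSF itself.

```python
from enum import Enum

class Stage(Enum):
    PLAN = "Plan/Define"
    DO = "Do/Create"
    CHECK = "Check/Assess"
    ACT = "Act/Improve"

def next_stage(current, requirements_changed=False, rework_needed=False):
    """Advance the PDCA cycle one step.

    From Act/Improve: circle back to Plan/Define if the requirements
    need updating, to Do/Create if processes need rework, otherwise
    begin a new improvement cycle at Plan/Define (as triggered by
    monitoring).
    """
    order = [Stage.PLAN, Stage.DO, Stage.CHECK, Stage.ACT]
    if current is Stage.ACT:
        if requirements_changed:
            return Stage.PLAN
        if rework_needed:
            return Stage.DO
        return Stage.PLAN  # monitoring triggers a new cycle
    return order[order.index(current) + 1]
```

A sketch like this makes explicit that the cycle is not strictly linear: the Act/Improve stage has three possible exits, as the table describes.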
Figure 3: Diagram of an integrated team of stewards from multiple fields serving as a centralized knowledge and communication hub for effective long-term scientific data stewardship. SMEs denote domain subject matter experts. The concept of this diagram is based on Peng et al. (2016a).
Roles, Knowledge, and Capability: Provided or Required (with input from Chisholm 2014).
| Role | Minimum Knowledge Required | Minimum Responsibility or Capability Provided |
|---|---|---|
| Point of contact (POC) | Basic, very limited knowledge in a particular subject | Serving as a focal point of information concerning an activity or program; limited knowledge input |
| Expert | Highly skilled with extensive knowledge in a particular subject | POC + good subject knowledge input |
| Subject matter expert (SME) | Extensive knowledge and expertise in a specific domain | POC + extensive subject or domain knowledge input |
| Steward | Extensive knowledge and expertise in a specific domain and general knowledge in other relevant domains, e.g., science/business and technology | SME + effective trans-disciplinary communication + a mindset of caring for and improving others’ assets + promoting good stewardship practices |
Maturity Assessment Categories and Descriptions.
| Category Number | Description |
|---|---|
| Category 1 | No assessment done. |
| Category 2 | Self-assessment—preliminary evaluation carried out by an individual for internal or personal use; abiding by a non-disclosure agreement. |
| Category 3 | Internal assessment—complete evaluation carried out by a non-certified entity (person, group, or institution) and reviewed internally, with the assessment results (ratings and justifications) made publicly available for transparency. |
| Category 4 | Independent assessment—Category 3 + reviewed by an independent entity that has expertise in the maturity model used for the evaluation. |
| Category 5 | Certified assessment—Category 4 + reviewed and certified by an established authoritative entity. Maturity update frequency is defined and implemented. |
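Because each category strictly subsumes the one below it (Category 4 is Category 3 plus independent review, and so on), the five categories form an ordered scale. The sketch below models that ordering so an organization could, hypothetically, test whether an assessment meets a required rigor level; the class, member, and function names are illustrative, not from the paper.

```python
from enum import IntEnum

class AssessmentCategory(IntEnum):
    """Ordered rigor levels for maturity assessments (Categories 1-5)."""
    NONE = 1         # no assessment done
    SELF = 2         # preliminary; internal/personal use only
    INTERNAL = 3     # complete; internally reviewed; results public
    INDEPENDENT = 4  # Category 3 + independent expert review
    CERTIFIED = 5    # Category 4 + certified by an authoritative entity

def meets_minimum(category, required=AssessmentCategory.INDEPENDENT):
    """Return True if an assessment meets or exceeds a required rigor
    level, relying on the strict subsumption between categories."""
    return category >= required
```

Using `IntEnum` lets the subsumption relationship fall out of ordinary integer comparison, matching the "Category N + …" structure of the table.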