Ben Gordon1, Jake Barrett2, Clara Fennessy2, Caroline Cake2, Adam Milward3, Courtney Irwin3, Monica Jones2,4, Neil Sebire2.
Abstract
OBJECTIVES: The value of healthcare data is increasingly recognised, as is the need to improve health dataset utility. There is no established mechanism for evaluating healthcare dataset utility, making it difficult to evaluate the effectiveness of activities that improve the data. This study describes the method for generating a proposed framework for evaluating and communicating healthcare dataset utility for given research areas, and for involving the user community in its development.
Keywords: BMJ health informatics; health care sector; information management; information science; information systems
Year: 2021 PMID: 33980500 PMCID: PMC8117992 DOI: 10.1136/bmjhci-2020-100303
Source DB: PubMed Journal: BMJ Health Care Inform ISSN: 2632-1009
Figure 1. Bar chart showing breakdown of interviewees by sector.
Figure 2. Bar chart showing breakdown of survey respondents by sector.
Table showing the number of interview respondents commenting on each item from the original framework, to gain an understanding of end users' views of the initial framework categories
| Category from initial framework | Dimension | Definition | Times mentioned in interviews |
| Description | Metadata completeness | Level of metadata completed | 2 |
| | Metadata quality | Richness of metadata completion, including within required formats and quality of qualitative fields | 5 |
| Characteristics and Service | Data source | The modality or source of data (eg, Electronic Health Record, study specific) | 14 |
| | Data model | The data model or schema used by the dataset (eg, Observational Medical Outcomes Partnership (OMOP), Informatics for Integrating Biology and the Bedside (i2b2)) | 16 |
| | Data dictionary | Provided documented data dictionary and terminologies | 9 |
| | Provenance | The original source or jurisdiction of the dataset | |
| | Usage restrictions | Permission to use the data for different purposes (eg, commercial licences, consent, expiry) | 6 |
| | Format | The technical presentation of the data format (eg, Digital Imaging and Communications in Medicine (DICOM) images vs Portable Network Graphics (PNG)) | 3 |
| | Timeliness | How quickly the data can be provided, in a useful timescale | 7 |
| | FAIRness of the data | Extent to which the data are findable, accessible, interoperable and reusable | 0 |
| | Phenome | Extent and description of included patients/conditions (links with Phenome work re standards) | 2 |
| Scale | Coverage | Number of individuals, data points, lab tests, images, etc included in the dataset | 9 |
| | Duration | Length of time to which the data relate | 3 |
| | Depth | Amount of information available per individual (eg, number of fields/records, types of data) | 3 |
| Quality | Completeness | The proportion of data entries that should be populated and are populated (and the inverse: the proportion that should not be populated and are not) | 11 |
| | Missing data handling | Description of missing value handling and default values | 4 |
| | Consistency/uniformity | Data are presented in the required format and in a similar way, for example, field types, date formats | 1 |
| | Uniqueness | Lack of duplication | 3 |
| | Validity | Data are valid based on acceptable ‘rules’, for example, age between 0 and 120, no pregnancy in male patients, physiological readings within normal ranges | 7 |
| | Accuracy/verification | The extent to which the data reflect the ‘real world’, for example, level of certainty that fields are accurate | 6 |
| | ‘Usefulness’ | Qualitative, subjective measure by user (eg, Net Promoter Score/star rating) | 12 |
| Added value | Linkage/mapping | Ability to link with other datasets | 10 |
| | Transformations/derivations | Level of derived data and descriptions, manual versus Natural Language Processing, etc | 1 |
| | Accuracy/verification | Level of manual verification/sampling | 0 |
| | Annotation | Additional fields added to provide further information, including phenotyping | 1 |
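Several of the Quality dimensions in the table (completeness, uniqueness, validity) can be read as simple proportions over a dataset. The following is a minimal sketch of that reading, not the scoring method used in the framework: the toy records, the field names and the helper functions `completeness`, `uniqueness` and `validity` are all hypothetical, and the two validity rules are taken from the examples in the table (age between 0 and 120, no pregnancy in male patients).

```python
# Hypothetical toy records; field names and values are illustrative only.
records = [
    {"patient_id": "p1", "age": 34, "sex": "F", "pregnant": True},
    {"patient_id": "p2", "age": 130, "sex": "M", "pregnant": False},
    {"patient_id": "p2", "age": 130, "sex": "M", "pregnant": False},
    {"patient_id": "p3", "age": None, "sex": "M", "pregnant": True},
]


def completeness(rows, field):
    """Share of rows in which `field` is populated (not None)."""
    return sum(r[field] is not None for r in rows) / len(rows)


def uniqueness(rows):
    """Share of rows that are distinct, i.e. not exact duplicates."""
    distinct = {tuple(sorted(r.items())) for r in rows}
    return len(distinct) / len(rows)


def is_valid(row):
    """Apply the two example rules from the table to a single row.

    Assumption: a missing age does not fail the range rule here,
    because missing values are already counted under completeness.
    """
    age_ok = row["age"] is None or 0 <= row["age"] <= 120
    pregnancy_ok = not (row["sex"] == "M" and row["pregnant"])
    return age_ok and pregnancy_ok


def validity(rows):
    """Share of rows that pass every rule in `is_valid`."""
    return sum(is_valid(r) for r in rows) / len(rows)


print(completeness(records, "age"))  # 3 of 4 ages are populated
print(uniqueness(records))           # 3 distinct rows out of 4
print(validity(records))             # only the first row passes both rules
```

Treating a missing value as neither valid nor invalid is one possible design choice; a stricter variant could count missing values as rule failures, which would lower the validity score without changing completeness.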
Figure 3. Final version of the proposed data utility framework based on data user feedback.