Jennifer C Goldsack1, Andrea Coravos1,2,3, Jessie P Bakker1,4, Brinnae Bent5, Ariel V Dowling6, Cheryl Fitzer-Attas7, Alan Godfrey8, Job G Godino9, Ninad Gujar10,11, Elena Izmailova1,12, Christine Manta1,2, Barry Peterson13, Benjamin Vandendriessche14,15, William A Wood16, Ke Will Wang5, Jessilyn Dunn5,17.
Abstract
Digital medicine is an interdisciplinary field, drawing together stakeholders with expertise in engineering, manufacturing, clinical science, data science, biostatistics, regulatory science, ethics, patient advocacy, and healthcare policy, to name a few. Although this diversity is undoubtedly valuable, it can lead to confusion regarding terminology and best practices. There are many instances, as we detail in this paper, where a single term is used by different groups to mean different things, as well as cases where multiple terms are used to describe essentially the same concept. Our intent is to clarify core terminology and best practices for the evaluation of Biometric Monitoring Technologies (BioMeTs), without unnecessarily introducing new terms. We focus on the evaluation of BioMeTs as fit-for-purpose for use in clinical trials. However, our intent is for this framework to be instructional to all users of digital measurement tools, regardless of setting or intended use. We propose and describe a three-component framework intended to provide a foundational evaluation framework for BioMeTs. This framework includes (1) verification, (2) analytical validation, and (3) clinical validation. We aim for this common vocabulary to enable more effective communication and collaboration, generate a common and meaningful evidence base for BioMeTs, and improve the accessibility of the digital medicine field.
Keywords: Research data; Scientific community
Year: 2020 PMID: 32337371 PMCID: PMC7156507 DOI: 10.1038/s41746-020-0260-4
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
Existing definitions of V&V or similar concepts in a selection of reference and guidance documents from disciplines contributing to digital medicine.
| Source of guidance document | IEEE (2016) | BEST (2018) | CTTI (2018) | SaMD (2017) | FDA (2002) | NASEM (2017) |
|---|---|---|---|---|---|---|
| Intended audience for document | System, software, and hardware suppliers, acquirers, developers, maintainers, V&V practitioners, operators, users, and managers in both the supplier and acquirer organizations | Broad stakeholder group (e.g., regulators, medical product manufacturers, patients) | Biotech & pharmaceutical sponsors, contract research organizations (CROs) and outsourced electronic service vendors, such as mobile technology manufacturers | International Regulatory Community | • Persons subject to the medical device quality system regulation • Persons responsible for the design, development, or production of medical device software • Persons responsible for the design, development, production, or procurement of automated tools used for the design, development, or manufacture of medical devices or software tools used to implement the quality system itself • FDA investigators • FDA compliance officers • FDA scientific reviewers | Multi-stakeholder community engaged in genetic and diagnostic testing |
| Are terms V&V defined? | | | | | | |
| Verification | Yes | No | Yes | In prerequisite documents | Yes | No |
| Validation | Yes (does not split out analytical vs. clinical) | Yes (splits out analytical vs. clinical) | Yes (refers to analytical validation only) | Yes (splits out analytical vs. clinical validation; also includes clinical association/scientific validity) | Yes | Yes (splits out analytic vs clinical validation; also includes clinical utility) |
| What’s the context of V&V definitions? | Provides standards for V&V of software, hardware, and systems | Gives definitions & examples of biomarkers and surrogate endpoints; additional focus on COA (clinical outcome assessment)-specific validation (e.g., construct, content & criterion) | Advancing the use of mobile technologies for data capture & improved clinical trials | Describes an approach for planning the process for clinical evaluation of a SaMD (software with a medical purpose) | Describes how provisions of the medical device quality system regulation apply to software and the FDA’s approach to evaluating a software validation system | Developed in the context of providing recommendations to advance the development of an adequate evidence base for genetic tests to improve patient care and treatment. Uses the CDC’s ACCE model of 44 targeted questions |
| What’s missing from V&V definitions? | Data processing algorithm; Clinical validation | Data processing algorithm | Relationship of digital metric to a meaningful clinical state or experience; Clinical care applications | Hardware (decoupled from software); View of full data supply chain | Hardware (decoupled from software); View of full data supply chain; Clinical validation | Sensor hardware |
Fig. 1: The stages of V3 for a BioMeT: Verification, analytical validation, and clinical validation of BioMeTs is a multi-step process.
Fig. 2: The “raw” data dilemma: defining sample-level data in the data supply chain in a uniaxial MEMS accelerometer.
Acceleration results in physical displacement of a structure equivalent to a spring and proof mass, which in turn changes electrical properties that can be captured by electrical-property sensors. Electrical signals are then converted from analog to digital, and stored and transmitted via the microprocessor on a wristband or mobile device. Through BLE, data are then processed and compressed multiple times for transmission and storage on mobile devices or in cloud storage. This figure summarizes the steps of data collection and manipulation into a daily step count metric and illustrates that “raw” data could refer to different stages of the data collection and manipulation process, each with a different meaning. For more details of the data types and technologies involved in each step, please refer to Supplementary Table 2. Two arrows are highlighted with asterisks, signifying the steps in the data supply chain where the “raw data dilemma” usually occurs. The primary and processed digital signals marked by these asterisks are what we define as “sample-level data.”
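One stage of the data supply chain described above, turning sample-level acceleration magnitudes into a step count, can be sketched in code. This is a minimal, illustrative sketch only: the threshold, refractory gap, and peak rule are assumptions for demonstration, not any device's actual on-board algorithm.

```python
def count_steps(magnitudes, threshold=1.2, min_gap=5):
    """Count peaks in an acceleration-magnitude series (in g).

    A 'step' is registered when a local maximum exceeds `threshold`
    and at least `min_gap` samples have passed since the last step,
    approximating a refractory period between strides. All parameter
    values here are illustrative placeholders.
    """
    steps = 0
    last_step = -min_gap  # allow a step at the start of the series
    for i in range(1, len(magnitudes) - 1):
        is_peak = (magnitudes[i] > threshold
                   and magnitudes[i] >= magnitudes[i - 1]
                   and magnitudes[i] >= magnitudes[i + 1])
        if is_peak and i - last_step >= min_gap:
            steps += 1
            last_step = i
    return steps

# Synthetic walking-like signal: ~1 g baseline with periodic spikes.
signal = [1.0, 1.0, 1.5, 1.0, 1.0, 1.0, 1.0, 1.6, 1.0, 1.0,
          1.0, 1.0, 1.4, 1.0, 1.0]
print(count_steps(signal))  # three spikes spaced 5 samples apart -> 3
```

Note that the output of such an algorithm (a processed step count) sits several transformations downstream of the sample-level data, which is exactly why "raw" is ambiguous without specifying the stage.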
Summary of verification.
| Who? | Engineers, data & computer scientists |
| What? | Generation and preliminary processing of sample-level data |
| When? | Prior to testing the technology in human subjects |
| Where? | At the bench |
| Why? | To evaluate the performance of a sensor technology (1) against pre-specified criteria and (2) to demonstrate that the sample-level data generated is correct within the limits of the pre-specified conditions. |
Verification in practice.
| Documentation you can expect | Manufacturer should provide evidence of their BioMeT’s: • Performance specifications for the integrated hardware • Output data specifications • Overview of software system tests • Limitations to the verification testing (e.g., specific known items that were not tested during verification) |
| Clinical users’ questions answered by verification | Is the performance of this BioMeT and each of its components sufficient to generate sample-level data of acceptable quality such that it can be used as an input to generate the processed data and downstream clinical measurement that I am interested in? |
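A bench verification check of the kind summarized above can be sketched as a comparison of sensor output against a reference instrument under pre-specified acceptance criteria. The tolerance and pass rate below are hypothetical placeholders, not criteria drawn from any standard.

```python
def verify_samples(sensor, reference, tolerance=0.05, min_pass_rate=0.95):
    """Return (pass_rate, passed) for a pairwise sample comparison.

    Each sensor sample must fall within `tolerance` (absolute units) of
    the corresponding reference sample; verification passes when the
    fraction of in-tolerance samples meets `min_pass_rate`. Both
    thresholds are illustrative and would be pre-specified in practice.
    """
    if len(sensor) != len(reference):
        raise ValueError("sensor and reference must be sampled in lockstep")
    within = sum(abs(s - r) <= tolerance for s, r in zip(sensor, reference))
    pass_rate = within / len(sensor)
    return pass_rate, pass_rate >= min_pass_rate

sensor_out = [0.98, 1.02, 1.00, 1.07, 0.99]
bench_ref  = [1.00, 1.00, 1.00, 1.00, 1.00]
rate, ok = verify_samples(sensor_out, bench_ref)
print(rate, ok)  # 0.8 False: one sample (1.07) exceeds the 0.05 tolerance
```

This mirrors the "why" of verification: sample-level data are judged correct only within the limits of pre-specified conditions and criteria.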
Summary of analytical validation.
| Who? | Engineers, data scientists/analysts/statisticians, physiologists, behavioral scientists, and clinical researchers |
| What? | Protocol for data capture from a human participant. Algorithms applied to sample-level data to yield measurements that are indicative of clinical concepts. |
| When? | First use in human subjects. |
| Where? | Research or clinical laboratories. |
| Why? | To evaluate the performance of the algorithm, and its ability to measure, detect, or predict physiological or behavioral metrics. |
Analytical validation in practice.
| Documentation you can expect | Description of analytical validation studies conducted according to the requirements of Good Clinical Practice (GCP). This description can be in any one or more of the following forms: • Internal documentation • Regulatory submission (510(k)) • White paper • Published journal article The documentation should provide, for every algorithmic output in the system: • Description of the output metric • Overview of how the metric was calculated, including specific details where possible • Which reference standard was used as the comparator to validate the metric • Results from a direct comparison between the calculated metric and the reference standard, including statistical analysis methods • Description of the human subjects population, experimental conditions, and protocol used in the aforementioned direct comparison testing If this validation testing was undertaken as part of a clinical trial with human subjects, then documentation from the Institutional Review Board (IRB) or Ethics Committee (EC) should also be provided. |
| Clinical users’ questions answered by analytical validation | Can an algorithm acceptably measure, detect, or predict the presence or absence of a phenotype or clinical condition when that algorithm is applied to sample-level data captured by a verified sensor in accordance with a specific data collection protocol in a particular population? |
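The direct comparison between a calculated metric and a reference standard described above can be sketched with simple agreement statistics. This is a hedged sketch: mean bias, mean absolute error, and RMSE stand in for the fuller statistical analysis (e.g., Bland-Altman plots) a real analytical validation study would report, and the paired heart-rate values are synthetic.

```python
import math

def agreement_stats(metric, reference):
    """Mean bias, mean absolute error, and RMSE between paired measurements."""
    diffs = [m - r for m, r in zip(metric, reference)]
    n = len(diffs)
    bias = sum(diffs) / n                            # systematic offset
    mae = sum(abs(d) for d in diffs) / n             # average error magnitude
    rmse = math.sqrt(sum(d * d for d in diffs) / n)  # penalizes large errors
    return bias, mae, rmse

# Hypothetical device-derived heart rate vs. a reference standard (bpm).
device_hr    = [61.0, 72.0, 80.0, 95.0]
reference_hr = [60.0, 70.0, 82.0, 94.0]
bias, mae, rmse = agreement_stats(device_hr, reference_hr)
print(round(bias, 2), round(mae, 2))  # 0.5 1.5
```

Whether a given bias or error is acceptable depends on the intended use of the metric, which is why the documentation must also describe the population and conditions under which the comparison was made.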
Summary of clinical validation.
| Who? | Clinical teams planning to use and generate scientific evidence based on the BioMeT in a stated context of use (which includes specifying the patient population). |
| What? | Well-designed clinical study protocols with appropriate inclusion/exclusion criteria, measurements, and outcomes to ensure assessment of content validity. |
| When? | After both verification of the data generated by the BioMeT and analytical validation of the data collection protocol and data processing by software algorithms are complete. |
| Where? | In the environment where the digital tool will be used. This will likely include data captured outside of the clinical or research laboratory environment during participants’ activities of daily living. |
| Why? | To evaluate whether the BioMeT acceptably identifies, measures, or predicts a meaningful clinical, biological, physical, functional state, or experience in the specified (1) population and (2) context of use. |
Clinical validation in practice.
| Documentation you can expect | Documentation of studies should include one or more of: • Clinical study report (CSR) • Regulatory submission (FDA or EMA) • White paper • Published conference proceeding • Published journal article Protocols and study reports should also be made publicly available. The Institutional Review Boards (IRBs) or Ethics Committees (ECs) documentation for the study should also be provided. |
| Questions answered by clinical validation | Can a BioMeT-derived measurement that has undergone verification and analytical validation steps be used to answer a specific clinical question? |
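When clinical validation asks whether a BioMeT-derived measurement can answer a specific clinical question, the analysis often reduces to comparing the tool's binary calls against the clinical reference diagnosis. The sketch below, on synthetic data, shows the sensitivity/specificity computation such a study might report; it is illustrative, not any specific study's analysis.

```python
def sens_spec(predicted, actual):
    """Sensitivity and specificity of binary predictions vs. diagnoses."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)          # condition present, detected
    tn = sum(not p and not a for p, a in pairs)  # condition absent, ruled out
    fp = sum(p and not a for p, a in pairs)      # false alarm
    fn = sum(not p and a for p, a in pairs)      # missed case
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Hypothetical BioMeT calls vs. clinical reference diagnoses.
biomet_calls = [True, True, False, False, True, False]
diagnoses    = [True, True, True,  False, False, False]
sens, spec = sens_spec(biomet_calls, diagnoses)
print(sens, spec)  # both 2/3 here: tp=2, fn=1, tn=2, fp=1
```

Crucially, these numbers are only meaningful for the population and context of use in which the data were collected, which is the core distinction between clinical and analytical validation.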
Questions that verification, analytical validation, and clinical validation answer in example use cases.
| Example use cases | Questions answered by VERIFICATION | Questions answered by ANALYTICAL VALIDATION | Questions answered by CLINICAL VALIDATION |
|---|---|---|---|
| Heart rate variability (HRV) from a commercial chest strap | Is the raw data from the ECG sensor on the commercial chest strap accurate, precise, and consistent? Are the processed RR intervals from the ECG sensor and post-processing on-board algorithms accurate, with low errors? | Does the HRV measured from the commercial chest strap ECG sensor provide clinical-grade accuracy of HRV (compared with a traditional ECG and Kubios clinical-grade software)? Does HRV from the commercial chest strap meet standards set by the HRV Task Force? Does HRV analysis meet the needs of users of the commercial chest strap (high accuracy under daily activities and during movement)? | Can heart rate variability identify the presence of autism spectrum disorder in 8-year-old children? |
| Gait speed from a commercial accelerometer | Is the accelerometer sensor accurate and precise within predetermined uncertainty? Is the accelerometer sensor raw data uniform and consistent? | Do the accelerometer sensor and processing algorithms provide clinical-grade accuracy of gait speed (compared to the clinical automatic timing system used for gait speed analysis)? | Can gait speed predict the onset of dementia in older adult patients? |
| Arrhythmia detection | Is the heart rate sensor (optical heart rate or ECG) accurate, precise, and consistent? Does the post-processing algorithm for arrhythmia detection provide high sensitivity and specificity with low errors? | Does the arrhythmia detector (sensor and algorithms) meet the standards set by the FDA Class II Special Controls Guidance Document: Arrhythmia Detector and Alarm? | Does the product acceptably detect atrial fibrillation (AF) in adults? |
| Closed-loop continuous glucose monitor (CGM)/glucose pump systems | Is the CGM sensor accurate, precise, and consistent with low errors? Is the pump system accurate, precise, and consistent with low errors? Does the closed-loop feedback algorithm provide timely, accurate feedback from the CGM to the pump consistent with FDA Considerations for Closed-Loop Controlled Medical Devices? | Does the closed-loop CGM/pump system provide similar accuracy when compared with the current standard (a system with multiple devices and manual calibration throughout the day)? Do the closed-loop system components (CGM, pump, and feedback algorithm) meet specifications set by the FDA Regulatory Considerations for Physiological Closed-Loop Controlled Medical Devices Used for Automated Critical Care? | Does this hybrid closed-loop system acceptably monitor glucose and automatically adjust the delivery of long-acting or basal insulin based on the user’s glucose reading in the pre-specified context of use and patient population? |
| Cuffless blood pressure (CBP) monitoring | Is the sensor used for CBP monitoring accurate, precise, and consistent with low errors? Is the algorithm used for determining BP accurate, precise, and consistent with low errors? | Does CBP monitoring provide clinical-grade accuracy (when compared to a traditional cuff BP monitor)? Does the CBP device meet the standards for wearable devices issued by the Institute of Electrical and Electronics Engineers (IEEE 1708-2014)? | Do parameters of in-clinic blood pressure monitoring still apply to ambulatory/remotely captured blood pressure when considering the use of blood pressure as a prognostic biomarker for cardiovascular outcomes? |
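The HRV use case above involves deriving a metric from processed RR intervals. As a concrete illustration, here is a minimal sketch of one common time-domain HRV metric, RMSSD (root mean square of successive RR-interval differences); real analyses, such as those following the HRV Task Force standards, also involve artifact handling and a wider set of metrics.

```python
import math

def rmssd(rr_ms):
    """RMSSD from a list of RR intervals in milliseconds.

    Computes successive differences between adjacent RR intervals,
    then returns the root mean square of those differences.
    """
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical processed RR intervals (ms) from a chest-strap ECG.
rr = [800.0, 810.0, 790.0, 805.0]
print(round(rmssd(rr), 1))
```

Note how this metric depends entirely on the quality of the upstream RR-interval detection, which is why verification of the sensor and validation of the RR-detection algorithm precede any clinical claim built on HRV.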
Fig. 3: V3 in practice: The verification, analytical validation, and clinical validation process in the real world.
Fig. 4: The role of the different disciplinary experts in the V3 process: Verification, analytical validation, and clinical validation processes are typically conducted by experts across disciplines and domains.
Illustrative examples of consequences where V3 evaluation does not occur.
| Illustrative examples | Consequences |
|---|---|
| Cuffless blood pressure measurement | If the software for blood pressure estimation through a cuffless wearable was not carefully verified and validated, inaccurate blood pressure estimations used in clinical decisions may result in misdiagnosis and improper treatment that can result in patient harm. |
| Heart rate monitoring | Inaccurate heart rate monitoring could lead to improper conclusions about a patient’s risk for life-threatening cardiac events. Either over- or under-treatment in this scenario would likely result in patient harm and misallocation of health resources. |
| Tapping on a smartphone to measure dementia | A BioMeT designed to detect dementia based on tapping patterns on a smartphone could diagnose dementia in a healthy person if an older smartphone is used with a newer operating system, because the resulting delays and irregular tapping patterns are misinterpreted by the BioMeT. This example was witnessed firsthand by a member of our team. |