| Literature DB >> 33523400 |
Kristof Huysentruyt1, Oeystein Kjoersvik2, Pawel Dobracki3, Elizabeth Savage4, Ellen Mishalov5, Mark Cherry6, Eileen Leonard7, Robert Taylor8, Bhavin Patel9, Danielle Abatemarco7.
Abstract
Pharmacovigilance is the science of monitoring the effects of medicinal products to identify and evaluate potential adverse reactions and provide necessary and timely risk mitigation measures. Intelligent automation technologies have a strong potential to automate routine work and to balance resource use across safety risk management and other pharmacovigilance activities. While emerging technologies such as artificial intelligence (AI) show great promise for improving pharmacovigilance with their capability to learn based on data inputs, existing validation guidelines should be augmented to verify intelligent automation systems. While the underlying validation requirements largely remain the same, additional activities tailored to intelligent automation are needed to document evidence that the system is fit for purpose. We propose three categories of intelligent automation systems, ranging from rule-based systems to dynamic AI-based systems, and each category needs a unique validation approach. We expand on the existing good automated manufacturing practices, which outline a risk-based approach to artificially intelligent static systems. Our framework provides pharmacovigilance professionals with the knowledge to lead technology implementations within their organizations with considerations given to the building, implementation, validation, and maintenance of assistive technology systems. Successful pharmacovigilance professionals will play an increasingly active role in bridging the gap between business operations and technical advancements to ensure inspection readiness and compliance with global regulatory authorities.
Year: 2021 PMID: 33523400 PMCID: PMC7892696 DOI: 10.1007/s40264-020-01030-2
Source DB: PubMed Journal: Drug Saf ISSN: 0114-5916 Impact factor: 5.606
Classification of automated pharmacovigilance systems
| Classification | Definition | Status/validation framework |
|---|---|---|
| Rule-based static systems | Automation is achieved via static rules designed to obtain the desired outcome. Examples include expedited reporting rule configuration, auto coding, and robotic process automation for case intake | Status: established. Validation framework: exists today |
| AI-based static systems | System configuration includes components that are AI informed but subsequently "frozen," i.e., systems based on AI or ML that do not adapt in production (after "go-live"); these are also called "locked" models. Re-training of the model is not applied automatically and is limited to the occurrence of events/triggers that require modification (e.g., the output needs change, or the training data set is expanded to improve quality). Examples include an auto-translation system for source documents and an ML-based model for causality assessment of individual case safety reports | Status: emergent. Validation framework: existing frameworks can be extended to cover these systems |
| AI-based dynamic systems | System configuration includes components that are AI informed and can adjust their behavior based on data after initial implementation in production, using a defined learning process. These are like AI-based static systems but are continually updated once in production, based on a set cadence or trigger, to include new source data; these systems are sometimes referred to as online algorithms | Status: eventual. Validation framework: a more thorough review of validation frameworks will be needed to cover these systems |
AI artificial intelligence, ML machine learning
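The static/dynamic distinction in the table can be made concrete with a toy sketch. The model below is purely illustrative (a threshold classifier invented for this example, not anything from the paper): a "locked" model is trained once and frozen at go-live, while a dynamic model folds newly labelled production cases back into its fit.

```python
# Toy contrast between an AI-based static ("locked") system and an AI-based
# dynamic system. Both flag a case as "serious" when its score exceeds a learned
# threshold: the midpoint between the mean scores of the serious and
# non-serious training cases. All names and numbers are illustrative.

class LockedModel:
    """AI-based static system: trained once, then frozen at go-live."""

    def fit(self, scores, labels):
        serious = [s for s, y in zip(scores, labels) if y]
        non_serious = [s for s, y in zip(scores, labels) if not y]
        self.threshold = (sum(serious) / len(serious)
                          + sum(non_serious) / len(non_serious)) / 2
        return self

    def predict(self, score):
        return score > self.threshold


class OnlineModel(LockedModel):
    """AI-based dynamic system: keeps learning from production data."""

    def fit(self, scores, labels):
        self._scores, self._labels = list(scores), list(labels)
        return super().fit(self._scores, self._labels)

    def update(self, score, label):
        # A dynamic system folds newly labelled production cases back into
        # the model on a defined cadence or trigger.
        self._scores.append(score)
        self._labels.append(label)
        super().fit(self._scores, self._labels)


scores, labels = [1, 2, 8, 9], [False, False, True, True]
locked = LockedModel().fit(scores, labels)  # threshold frozen at 5.0
online = OnlineModel().fit(scores, labels)
online.update(4, True)  # a production case shifts the online threshold to 4.25
```

After the production update, the two systems disagree on a score of 4.5: the locked model still answers from its go-live threshold, while the online model has adapted. This behavioral divergence is exactly why the table assigns the two categories different validation frameworks.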
Fig. 1A General Approach for Achieving Compliance and Fitness for Intended Use (ISPE). Source: Figure 4.1, GAMP 5: A Risk-Based Approach to Compliant GxP Computerized Systems, © Copyright ISPE 2008 [7]. All rights reserved. https://www.ISPE.org.
Used with permission from ISPE (ISPE GAMP® 5, Figure 4.1)
Fig. 2Overlay of the US FDA's total product lifecycle approach on artificial intelligence/machine learning workflow.
Source: US FDA Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD), Discussion Paper and Request for Feedback, 2019 [13]
Fig. 3Proposed ISPE GAMP® 5 methodology for validating artificial intelligence (AI)-based static systems in pharmacovigilance
Validation considerations for artificial intelligence-based static systems
| Area | Potential challenges when validating AI-based static systems using existing frameworks | Considerations for extending existing frameworks for AI-based static systems | Documentation considerations |
|---|---|---|---|
| Planning | The validation approach and strategy require careful planning | Gather knowledge and create the overall approach for validating an AI-based static system in the organization (master strategy, validation framework, etc.); the overall approach should refer to examples, best practices, templates, etc. Define performance thresholds and fallback plans/BCPs. The activity scope will vary based on whether you are validating a new AI-based static system or adding AI functionality within an existing validated system | The master strategy for AI-based static systems should include guidelines for the objective, scope, key activities, BCP/fallback plan, responsibilities, and acceptance criteria |
| Risk management | New and unknown types of errors and risk areas may not be considered in existing validation frameworks: data and environments; knowledge and experience; complexity; availability; trust; regulations | New types of risk may need to be considered, or existing risk types may be emphasized; at a minimum, these risks need to be assessed individually for each system and mitigated appropriately. Relevant risks include: the quality of data (potential biases, completeness, sources, etc.); inaccuracies in the model due to production data changing over time ("model drift"); conditions within which the model works and those in which it does not (i.e., the operating envelope); systemic rather than random errors in fully automated systems, potentially introducing a bias; risks related to a system not reaching 100% accuracy, which may require additional assessment; external vendor/supplier involvement; critical activity, downtime, or any other implications of a deficiency in the model; and the required level of transparency/explainability of decisions versus the required accuracy. Existing risk identification and mitigation methodologies, such as failure modes, effects, and criticality analysis, could still be used | New risk categories can be incorporated into existing risk documentation deliverables (e.g., vendor risk assessment, functional risk assessment). A separate deliverable focusing on data risk assessment may be introduced or incorporated within existing documents |
| Requirements and specifications | Performance metrics must be proactively defined with consideration given to the system's intended use | An acceptable performance level may be driven by the intended use of the system. Define the intended use and criticality of each individual component of the system (note: the system could be a combination of any automated and manual activities). Important risk-based considerations may be the impact (and quality) of the targeted automation step(s) on the overall outcome of the process and the intended degree of automation. A good understanding is needed of where the system is effective and where it is not (i.e., consider the criticality of misses). Specific requirements may need to be included for exception flagging with respective confidence scores to facilitate performance verification during validation | The desired performance level can be defined using different metrics and is not necessarily limited to a level of accuracy. The most appropriate measure may consist of a combination of criteria (e.g., overfitting to manage false positives vs. false negatives when predicting case validity) |
| Data selection | AI-based static systems are dependent on data selection, whereas existing validation frameworks are dependent on process | The methodology for data collection, selection, and cleaning should consider the following: collection of data representative of the process, ensuring correct distribution among the data classes, categories, or labels; assignment of "ground truth"; data quality, completeness, and lifetime accuracy; close collaboration between pharmacovigilance professionals and technical experts; to allow models to handle new categories in production, ensuring a small sample of the training set contains instances with an "unknown" category; keeping data used for training and validation of the model independent of and separate from the test data; frequent updates of the test data set to ensure it does not become stale and remains representative of the sample population; ensuring the test data set receives the same data pre-processing treatments applied to the training data; and ensuring the model does not rely on personal identification or other irrelevant fields when making predictions | Data selection and preparation approach, covering: audit and sequestration of training/validation and test data; collection protocols; reference standard determination; quality assurance; data pre-processing |
| (Model build, train and test, model validation.) Note: "model validation" refers to the definition used in ML; see footnote | Training, selection, and validation of an AI model is a new and delicate process | Training AI models requires a combination of AI/ML expertise and a deep understanding of the data and related process. Best practices for model training (i.e., good ML practices) need to be integrated into the validation framework. Utilize pre-trained models and transfer learning where possible, as these methods are proven to be less sensitive to implicit biases in the data. Utilize techniques that lower potential bias, e.g., regularization, dropout, and cross-validation. Define the methodology to assess model performance using an independent team to avoid bias (e.g., blinding). Study the basic metrics (including precision, recall, accuracy, and F-score) to define model performance | A test planning, strategy, and evaluation plan, including: model selection (e.g., protocol, rationale, and experimentation for the choice of model); adherence to good ML practices; validation protocol (methodology, splitting strategy) |
| | It is not usually possible to list and test all potential scenarios that are handled by such a model | A tailored approach to testing (e.g., a statistical and risk-based blinding methodology and test environment(s)) is typically necessary. Test model performance on incomplete, missing, delayed, corrupt, noisy, invalid, and unknown data points | A test planning, strategy, and evaluation plan, including: test planning and strategy (methodology and test environment(s)); evaluation plan (metrics definition, acceptance thresholds); versioned performance records; specific test data inclusion |
| | Transparency is needed in how the AI model arrived at a certain decision | Careful model design should be implemented to maintain a clear understanding of the model in any instance. As part of the design phase, consideration should be given to the required level of transparency and explainability concerning how the AI model arrived at a decision. There can be a trade-off, in that more complex AI models involving deep learning may have a higher degree of accuracy but are inherently less transparent in how a decision was made; avoid the "black box" effect | Ensure complete documentation of the AI model at each version. Document the ML architecture, hyper-parameters, and parameters |
| Acceptance testing | The integration of an AI model into the entire system (user interface, dashboards, reports, interfaces) may be a challenging activity | Specialized resources trained in both testing and the business area are essential to verify acceptance and the intended use of the entire system. The AI model can work as intended, but the acceptance level of testing should verify its integral performance. Planning of the respective levels of integration testing and acceptance testing may require advanced test techniques (e.g., unit testing, integration, test automation, security, usability). Consider the latest approaches (e.g., the US FDA's CSA). Test model roll-back capability in case of an error in production | Test strategy and planning should consider: validation of the AI model vs. entire-system acceptance testing; test types and techniques; risk-based testing (CSA); model rollbacks |
| | The impact of external intervention on monitoring the performance of the model must be addressed | Quality control activities and measurements of model performance must be inserted at points at which the findings will be representative of the model alone, not of the model combined with potential external intervention. Monitoring approaches for AI-based static systems: running the model concurrently with the historical approach allows for comparison, adjudication, and identification of potential performance issues | Test strategy and planning should consider: validation of the AI model vs. entire-system acceptance testing; test types and techniques; risk-based testing (CSA); model rollbacks |
| System deployment and monitoring | Although training data should be representative of the real world, they cannot be exhaustive, may be limited, may exclude a rare scenario, or may not be representative of all situations | A pre-defined, robust monitoring plan with periodic quality control measures is recommended. Strong consideration must be given to the perceived level of risk and the potential impact of performance degradation on the quality and integrity of the entire system | Monitoring plan with pre-established: periodicity; performance metrics; acceptable quality standards and thresholds; investigation measures to understand and isolate the root cause of performance degradation; a retraining plan with defined triggers for retraining (e.g., suboptimal performance for three straight periods); criteria for implementation of optimized models into production |
| | Post-deployment monitoring and early detection of model performance degradation are critical to maintaining expectations from a regulatory and business perspective | Various circumstances can trigger model changes: quality standards/thresholds not being met, an opportunity to create new ground truth that improves model performance, a change in data input, or a change in business or regulatory requirements. The monitoring plan should include the thresholds and criteria for (1) when thresholds are not met and (2) when to conduct a root cause or impact assessment | |
| | Appropriately challenge performance; understand model capabilities and limitations to avoid the potential bias of assuming accuracy, which can result in reliance on a model with suboptimal performance | When evaluating a potentially missed scenario, it is important to understand whether the scenario was included within the scope of the training material or is entirely new. The model may occasionally require retraining, and the reasons for retraining should be documented as they arise | |
| Change management | Version control of an AI model is based on specific training data and is a part of the entire (also version-controlled) system | Consider a retrained model to be a new model, independent of previous versions. All model versions should be recorded and documented for reproducibility and audit purposes | The change should be documented, including the objectives of the change; models and features should be version controlled |
| | | Apply software change management principles when there is a need for end-to-end system change; this should be separate from changes to the ML subcomponents | All versions should be recorded and documented for reproducibility and audit purposes. Track model parameters, training data set, and algorithm details for each model version |
| | | Consider using tools for building AI models that allow for version control (i.e., model management platforms) | The quality plan must be evaluated for potential changes (e.g., changes in parameters or configuration) that may require modification over time |
Specific document requirements will depend on system complexity and intended use. In machine learning, model validation is the process in which a trained model is evaluated with a testing data set. The main purpose of using the testing data set is to test the generalization ability of a trained model [16]. Visit TransCelerate's website for more detail [17].
AI artificial intelligence, BCPs business continuity plans, CSA computerized system assurance, ML machine learning
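The footnote's definition of model validation can be sketched in a few lines. The helper below is illustrative only (the labels stand in for hypothetical individual-case-safety-report validity calls): it evaluates predictions against a sequestered test set using the basic metrics the table names, i.e., precision, recall, accuracy, and F-score.

```python
# Minimal sketch of evaluating a trained model on a held-out test set.
# All names and numbers are illustrative, not from the paper.

def evaluate(y_true, y_pred):
    """Compare test-set labels with model predictions and return basic metrics."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f_score": f_score}

# Ground-truth labels vs. model predictions for eight hypothetical validity calls
metrics = evaluate([1, 1, 1, 1, 0, 0, 0, 0],
                   [1, 1, 1, 0, 1, 0, 0, 0])
```

As the table's requirements row notes, the right acceptance criterion is rarely a single number: a case-intake model that must not miss valid cases would weight recall (the criticality of misses) more heavily than raw accuracy.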
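The deployment-and-monitoring row's example retraining trigger (suboptimal performance for three straight periods) reduces to a small piece of logic. The function below is a sketch under assumed details: the metric, the 0.90 threshold default, and the period scores are all illustrative.

```python
# Sketch of a monitoring-plan retraining trigger: retraining is due when a
# periodic quality metric stays below the acceptable threshold for three
# consecutive monitoring periods. Threshold and scores are assumed values.

def retraining_due(period_scores, threshold=0.90, consecutive=3):
    """Return True once `consecutive` successive periods fall below `threshold`."""
    run = 0
    for score in period_scores:
        run = run + 1 if score < threshold else 0  # reset on any good period
        if run >= consecutive:
            return True
    return False
```

Three sub-threshold periods in a row (e.g., `[0.95, 0.89, 0.88, 0.87]`) trip the trigger, while a recovery in between (e.g., `[0.89, 0.88, 0.95, 0.89, 0.88]`) resets the count, which is what distinguishes sustained model drift from a one-off dip worth only a root-cause review.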
| With the widespread adoption of technology in the pharmacovigilance space, pharmacovigilance professionals must understand and guide the building, validation, and maintenance of artificial intelligence (AI)-based pharmacovigilance systems. This is essential to ensure inspection readiness as we integrate automation systems across the pharmacovigilance value chain. |
| Intelligent automation systems can be grouped into three categories: rule-based static systems, AI-based static systems, and AI-based dynamic systems. Validation frameworks currently exist for rule-based static systems but not for AI-based systems. |
| We propose validation considerations for a risk-based approach to compliant GxP AI-based static systems, which can be implemented within the good automated manufacturing practices framework. |