
Ensuring machine learning for healthcare works for all.

Liam G McCoy1,2, John D Banja3, Marzyeh Ghassemi4,5,6, Leo Anthony Celi7,8,9.   

Abstract

Keywords:  health care; medical informatics; patient care

Year:  2020        PMID: 33234535      PMCID: PMC7689076          DOI: 10.1136/bmjhci-2020-100237

Source DB:  PubMed          Journal:  BMJ Health Care Inform        ISSN: 2632-1009



Introduction

Machine learning, data science and artificial intelligence (AI) technology in healthcare (herein collectively referred to as machine learning for healthcare (MLHC)) is positioned to have substantial positive impacts on healthcare, enhancing progress in both the acquisition of healthcare knowledge and the implementation of this knowledge in numerous clinical contexts. However, concerns have been identified regarding these technologies' potential for negative impacts,1–7 in particular that they may damage health equity by either introducing novel biases or uncritically reproducing and magnifying existing systemic disparities. These concerns have led to a growth of scholarship at the intersection of ethics, AI and healthcare,1–7 as well as significant restrictions on the use of patient data for MLHC research.8 9 Unfortunately, modern healthcare is already rife with treatments that fail to stand up to evidentiary scrutiny,10 while the evidence behind their use is riddled with biases that further deepen health inequities.11 Against this backdrop, it becomes clear that urgent and substantial change is needed, and that MLHC offers one of the most promising avenues toward achieving it. Ethical concerns regarding the impact of this technology should be addressed and made foundational to the development of MLHC in meaningful ways. However, those concerns must not be allowed to constrain the field in a manner that perpetuates the structural inequalities that presently exist. Through the conceptual lens of MLHC, this paper explores flaws in healthcare's current approaches to evidence, and the ways in which insufficient evidence and bias combine to produce ineffective and even harmful care. We examine the potential for data science and AI technologies to address some of these issues, and we tackle commonly raised ethical concerns in this space.
Ultimately, we provide a series of recommendations for reform in policies around MLHC that will facilitate the development of systems providing a public benefit for all.

Bias and insufficiency of evidence in healthcare

Many common interventions in healthcare are performed without good evidence to support them. A 2012 National Academy of Medicine report noted that high-quality evidence is lacking or even non-existent for many clinical domains,12 and a similar investigation by the UK's National Institute for Health and Care Excellence and the BMJ found that 50% of current treatments have unknown effectiveness, 10% remain in use despite being ineffective or even harmful, and only 40% have some evidence of effectiveness.13 As Prasad et al have found, studies that contradict previous research and lead to 'medical reversal' changes to practice standards are common—comprising up to 40% of papers evaluating the current standard of care in the New England Journal of Medicine from 2001 to 2010,14 and many papers in JAMA and The Lancet.10 It is clear that many interventions with insufficient evidence continue to be adopted and propagated on the basis of expert opinion, typically backed by professional societies.
Even when prospective randomised controlled trials are performed, they are subject to numerous opportunities for bias—and even outright conflicts of interest—which can impact the quality and transferability of results.15 16 The burdens of medicine's failures in evidentiary quality and applicability are not borne equally.11 17–19 The historical and ongoing omission of certain groups from research, including women and underserved populations, has skewed our understanding of health and disease.11 Concerns regarding the generation of algorithms on racially biased datasets17 are unfortunately far from new, but represent a continuation of a long-standing history of minority groups being under-represented or entirely unrepresented in foundational clinical research.11 18 The Framingham study, for example, generated its cardiovascular risk scores from an overwhelmingly white and male population, and these scores have subsequently proven inaccurate when uncritically applied to black populations.19 Similarly, women have been and continue to be heavily under-represented in clinical trials.11 20 21 These problems extend to the global health context as well, as the trials used to inform clinical practice guidelines around the world tend to be conducted on a demographically restricted group of patients in high-income countries (mainly white males in the USA).11 These issues are compounded by structural biases in medical education,22 and by the biases of the healthcare providers tasked with interpreting and implementing this medical knowledge in the clinical context.23

Can MLHC help, or will it harm?

The question is whether MLHC will help to remedy these shortcomings or exacerbate them. Models that are trained uncritically on databases embedded with societal biases and disparities will end up learning, amplifying and propagating those biases and disparities under the guise of algorithmic pseudo-objectivity.2 17 24 25 Similarly, gaps in quality of care will be widened by the development and use of tools that are beneficial only to a certain population—such as a melanoma detection algorithm trained on a dataset containing mostly images of light-toned skin.26 Concerns also exist around patient privacy and the safeguarding of sensitive data (particularly for vulnerable groups such as HIV-positive patients).27 Finally, there are structural concerns related to the possibility that the information technology prerequisites for implementing MLHC will be available only to already privileged groups.5 7 Yet, as recent scholarship has indicated, the potential for MLHC to counter biases in healthcare is considerable.3 28 Data science methods can be used to audit healthcare datasets and processes, deriving insights and exposing implicit biases so they might be directly investigated and addressed.1 3 29 While much has been made of the 'black box' characteristics of AI, it may be argued that human decision making in general is no more explainable.30 31 This is particularly true in the context of the sort of implicit gender and racial biases that influence physicians' decisions but are unlikely to be consciously admitted.23 As checklist studies in healthcare have demonstrated,32 it may be possible to reduce these biases through the use of standardised prompts and clinical decision support tools that move clinical decisions closer to the data—and further from biasing subjective evaluations.
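The kind of dataset and process audit described above can be illustrated with a minimal sketch: disaggregating a model's error rate by demographic subgroup to surface disparities that an aggregate metric would hide. The records, group labels and figures below are entirely hypothetical, and a real audit would of course use richer metrics and properly governed data.

```python
# Minimal subgroup-audit sketch: compare a classifier's error rates
# across demographic groups (all data and names are hypothetical).
from collections import defaultdict

def subgroup_error_rates(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    counts = defaultdict(lambda: [0, 0])  # group -> [errors, total]
    for group, y_true, y_pred in records:
        counts[group][0] += int(y_true != y_pred)
        counts[group][1] += 1
    return {g: errs / total for g, (errs, total) in counts.items()}

# Hypothetical predictions from a screening model
records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0), ("group_b", 0, 1),
]
rates = subgroup_error_rates(records)
gap = max(rates.values()) - min(rates.values())
print(rates)                        # per-group error rates
print(f"error-rate gap: {gap:.2f}")  # a large gap flags potential bias
```

In this toy example the overall error rate looks moderate, but the disaggregated view shows that errors fall almost entirely on one group, which is exactly the sort of disparity such an audit is meant to expose.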
At the structural level, there is hope that AI will drive down the costs of care, increasing access for groups that have been traditionally underserved, and enabling greater levels of patient autonomy for self-management.4 5 Further, MLHC technologies may be able to address issues of disparity in the clinical research pipeline.33 Improvements in the use and analysis of electronic health records and mobile health technology herald the possibility of mobilising massive amounts of healthcare data from across domestic and global populations. The prospect of using ‘big data’ (ie, large and comprehensive datasets involving many patient records) that better represents all patients for health research may hold promise for counteracting issues of evidentiary insufficiency and limitations. As shown by the ‘All of Us’ programme, biological information database initiatives can be specifically tailored toward the active inclusion of traditionally under-represented groups.34 Recent progress in the ability to emulate a ‘target trial’ when no real trial exists may even enable scientists to regularly obtain real-world evidence and evolve insights about the effectiveness of treatments in groups absent from initial clinical trials.35

Ensuring MLHC works for all

Despite this potential, MLHC is far from a magical solution, and should not be seen as such. Embracing it must not subsequently lead to the neglect of the role played by other structural factors such as economic inequities36 and implicit physician bias.23 No simple set of data-focused technical interventions alone can effectively deal with complex sociopolitical environments and structural inequity,37 and simple 'race correction' methods can be deeply problematic.38 The potential for 'big data' synthetic clinical trials, for example, must come as a supplement to, and not a replacement for, efforts to improve the diversity of clinical trial recruitment. Similarly, issues of structural bias must be acknowledged and addressed at all levels of the MLHC development pipeline,17 39 from assessing the quality of the input data to ensuring adequate funding for the information technology needed to implement MLHC in underserved areas. If MLHC is to be successful at reducing health disparities, it must reflect this function in its form. The troubling lack of diversity both in the field of AI40 and in biomedical research generally41 raises concerns about the perpetuation of biased perspectives in development, and the historical and ongoing flaws of healthcare and its research communities have led to distrust among minority communities.42 The onus is on the MLHC community to rebuild this trust and embrace structural reform. Inclusion and active empowerment of members of marginalised communities is essential, and concepts around individual or collective data ownership and sovereignty43 deserve further exploration. At the same time, we must not forget the biases exerted by the status quo, which cannot be allowed to slow the progress necessary to address these problems. Problems evolving from the systematic exclusion of vulnerable populations from research will not be solved by the continued exclusion of these populations.
While work certainly must be done to ensure that minoritised patients do not need to be saved from MLHC research, work must also be done to remedy disparities and improve outcomes for minoritised patients through MLHC research. The vigorous discussions surrounding ethical issues in MLHC must be translated into active efforts to construct the field from the ground up. Both the field itself and the outputs it creates must be ethical and equitable at their core, with these concerns rendered structurally integral rather than addressed post hoc. An emphasis is already growing throughout the field on the establishment of codes of conduct44 and practical procedures6 33 for the ethical and equitable implementation of AI in healthcare. As outlined in table 1, we identify a number of critical areas of emphasis in the development of MLHC that foster this vision. Just as the potential for problematic bias in MLHC has no single cause, the onus for achieving these recommendations does not fall on any single actor in the MLHC space. Open collaboration between universities, technology companies, ministries of health, regulators, patient advocates and individual clinicians and data scientists will be essential to its success.
Table 1

Areas of emphasis for ensuring machine learning for healthcare (MLHC) works for all

Area of emphasis / Recommendations

Ensure MLHC is equitable by design
- Develop pipelines for the promotion of diverse teams in all aspects of MLHC
- Ensure the inclusion of data from a broad range of groups, in a broad range of contexts
- Incorporate global partners to ensure health data science promotes global health equity

Encourage public and open MLHC research
- Fund both direct MLHC research and research into ethical aspects of MLHC
- Harmonise ethical oversight between public and private research domains

Ensure adequate access to health information technology (IT) infrastructure
- Ensure all are included in the datasets by funding health data gathering infrastructure in underserved communities
- Develop MLHC products with an awareness of the broad range of health IT contexts for deployment

Ensure MLHC is clinically effective and impactful
- Ensure the presence of multidisciplinary teams that represent both clinical and data science perspectives
- Promote pathways for interdisciplinary training
- Hold MLHC innovations to the same standards as other healthcare interventions, including requirements for prospective validation and clear demonstration of impact

Audit MLHC on ethical metrics
- Mandate assessments of the performance of novel MLHC technology for impacts on marginalised and intersectional groups
- Record the data necessary to perform these audits in an ongoing fashion

Mandate transparency in data collection, analysis and usage
- Build patient trust by ensuring that protocols for the collection, analysis and usage of data are transparent and open

Promote inclusive and interoperable data policy
- Ensure the existence of clear and ethical methods for sharing data between different sources while protecting patient rights and privacy
- Improve the standardisation of medical data generation and labelling across contexts
- Ensure that global partners are included, so that interoperability barriers do not hinder inclusive global collaboration


Conclusion

The gaps in the medical knowledge system stem from the systematic exclusion of the majority of the world's population from health research. These gaps, combined with implicit and explicit biases, lead to suboptimal medical decision making, which negatively impacts health outcomes for everyone, but especially for those in groups typically under-represented in health research. Recent developments in machine learning and AI technologies hold some promise to address the issues with the generation of scientific evidence and with human decision making. They have, however, also spurred concerns about their potential to maintain, if not exacerbate, these problems. These concerns must be aggressively addressed by adopting the necessary structural reforms to ensure that the field is both equitable and ethical by design. Claims of 'doing better' with respect to bias have, of course, come before in healthcare, and the burden is on MLHC as a field to grow in a fashion that is deserving of the hype it has received. MLHC is not a magic bullet, nor can it address issues of structural health inequity by itself, but its potential may be substantial. Healthcare is flawed, and it must be reformed so that it equitably benefits all. Effective and equitable machine learning, data science and AI will be an essential component of these efforts.
References (22 in total)

Review 1.  Bias in clinical intervention research.

Authors:  Lise Lotte Gluud
Journal:  Am J Epidemiol       Date:  2006-01-27       Impact factor: 4.897

2.  Can AI Help Reduce Disparities in General Medical and Mental Health Care?

Authors:  Irene Y Chen; Peter Szolovits; Marzyeh Ghassemi
Journal:  AMA J Ethics       Date:  2019-02-01

3.  Diversity: Boost diversity in biomedical research.

Authors:  Michael V Drake
Journal:  Nature       Date:  2017-03-29       Impact factor: 49.962

4.  Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available.

Authors:  Miguel A Hernán; James M Robins
Journal:  Am J Epidemiol       Date:  2016-03-18       Impact factor: 4.897

5.  Treating health disparities with artificial intelligence.

Authors:  Irene Y Chen; Shalmali Joshi; Marzyeh Ghassemi
Journal:  Nat Med       Date:  2020-01       Impact factor: 53.440

Review 6.  Physicians and implicit bias: how doctors may unwittingly perpetuate health care disparities.

Authors:  Elizabeth N Chapman; Anna Kaatz; Molly Carnes
Journal:  J Gen Intern Med       Date:  2013-04-11       Impact factor: 5.128

Review 7.  Women in clinical trials: a review of policy development and health equity in the Canadian context.

Authors:  Alla Yakerson
Journal:  Int J Equity Health       Date:  2019-04-15

8.  Assessment of Accuracy of an Artificial Intelligence Algorithm to Detect Melanoma in Images of Skin Lesions.

Authors:  Michael Phillips; Helen Marsden; Wayne Jaffe; Rubeta N Matin; Gorav N Wali; Jack Greenhalgh; Emily McGrath; Rob James; Evmorfia Ladoyanni; Anthony Bewley; Giuseppe Argenziano; Ioulios Palamaras
Journal:  JAMA Netw Open       Date:  2019-10-02

9.  Structural racism in precision medicine: leaving no one behind.

Authors:  Lester Darryl Geneviève; Andrea Martani; David Shaw; Bernice Simone Elger; Tenzin Wangmo
Journal:  BMC Med Ethics       Date:  2020-02-19       Impact factor: 2.652

10.  Diversity in Clinical and Biomedical Research: A Promise Yet to Be Fulfilled.

Authors:  Sam S Oh; Joshua Galanter; Neeta Thakur; Maria Pino-Yanes; Nicolas E Barcelo; Marquitta J White; Danielle M de Bruin; Ruth M Greenblatt; Kirsten Bibbins-Domingo; Alan H B Wu; Luisa N Borrell; Chris Gunter; Neil R Powe; Esteban G Burchard
Journal:  PLoS Med       Date:  2015-12-15       Impact factor: 11.069

Cited by (3 in total)

1.  Operationalising fairness in medical algorithms.

Authors:  Sonali Parbhoo; Judy Wawira Gichoya; Leo Anthony Celi; Miguel Ángel Armengol de la Hoz
Journal:  BMJ Health Care Inform       Date:  2022-06

2.  A novel decentralized federated learning approach to train on globally distributed, poor quality, and protected private medical data.

Authors:  T V Nguyen; M A Dakka; S M Diakiw; M D VerMilyea; M Perugini; J M M Hall; D Perugini
Journal:  Sci Rep       Date:  2022-05-25       Impact factor: 4.996

3.  Validating machine learning models for the prediction of labour induction intervention using routine data: a registry-based retrospective cohort study at a tertiary hospital in northern Tanzania.

Authors:  Clifford Silver Tarimo; Soumitra S Bhuyan; Quanman Li; Michael Johnson J Mahande; Jian Wu; Xiaoli Fu
Journal:  BMJ Open       Date:  2021-12-02       Impact factor: 3.006

