Big data in global health: improving health in low- and middle-income countries.

Rosemary Wyber¹, Samuel Vaillancourt², William Perry³, Priya Mannava³, Temitope Folaranmi³, Leo Anthony Celi⁴.

Abstract

Over the last decade, a massive increase in data collection and analysis has occurred in many fields. In the health sector, however, there has been relatively little progress in data analysis and application despite a rapid rise in data production. Given adequate governance, improvements in the quality, quantity, storage and analysis of health data could lead to substantial improvements in many health outcomes. In low- and middle-income countries in particular, the creation of an information feedback mechanism can move health-care delivery towards results-based practice and improve the effective use of scarce resources. We review the evolving definition of big data and the possible advantages of – and problems in – using such data to improve health-care delivery in low- and middle-income countries. As mobile-phone-based services improve, the collection of big data may mean that development phases required elsewhere can be skipped. However, poor infrastructure may prevent interoperability and the safe use of patient data. An appropriate governance framework must be developed and enforced to protect individuals and ensure that health-care delivery is tailored to the characteristics and values of the target communities.

Year: 2015 PMID: 25767300 PMCID: PMC4339829 DOI: 10.2471/BLT.14.139022

Source DB: PubMed Journal: Bull World Health Organ ISSN: 0042-9686 Impact factor: 9.408

Introduction

The delivery of health care is a complex endeavour at both individual and population levels. At the clinical level, the tailored provision of care to individuals is guided, in part, by medical history, examination, vital signs and evidence. In the 21st century these traditional tenets have been supplemented by a focus on learning, metrics and quality improvement. The collection and analysis of data of good quality are critical to improvements in the effectiveness and efficiency of health-care delivery. A substantial fraction of the waste in health-care expenditure results from not knowing what works for particular patients in particular clinical contexts. Interventions that appear effective in population-based studies are often widely implemented without any monitoring or any attempt to identify the patients more likely to benefit from – or to be harmed by – the interventions. The challenges of generating, analysing and applying clinical data are particularly acute in low- and middle-income countries. Given the sheer size of the human population and the incredible complexity of health-care delivery – with thousands of diseases and thousands of medications and interventions – the reconciliation of data-driven improvements in clinical medicine with good population health is complex. However, the recent development of new methods to collect, analyse and apply data on an unprecedented scale – the so-called big data approach – may allow the gap between health-care delivery and population health to be bridged and many health outcomes to be improved. These new methods of collecting, curating and conceptualizing numbers offer similar advantages to populations as those offered to patients by medical imaging – i.e. they offer the opportunity to see with greater precision. In many fields, over recent decades, the generation and use of data have rapidly expanded and new data applications have been developed. 
However, although there has been a concurrent increase in the production of data in the health sector, there has been little corresponding change in the use of such data to improve health care. Some proof-of-concept applications have been created for – and enthusiastically received by – health professionals but, so far, they have resulted in very few tangible, system-wide data initiatives. In this article we explore some current and potential applications of big data to public health and health-care delivery in low- and middle-income countries. We searched the PubMed and Embase databases and the grey literature for relevant articles and checked the references of selected articles for other sources of relevant information. We explored the benefits, risks and opportunities for big data in health and made recommendations for the use of big data in the delivery of health-care services in low- and middle-income countries. The multiple applications of big data to genomics and life sciences have been widely reviewed elsewhere and are not addressed in this article.

Definition

The term big data refers to the emerging use of rapidly collected, complex data in such unprecedented quantities that terabytes (10¹² bytes), petabytes (10¹⁵ bytes) or even zettabytes (10²¹ bytes) of storage may be required. The unique properties of big data are defined by four dimensions: volume, velocity, variety and veracity. As more information is accruing at an accelerating pace, both volume and velocity are increasing. Use of a variety dimension marks a shift from data as information that is collected directly to information that is assimilated from multiple sources. Big data outputs tend to increase in value as sources become more diverse. At a population level, traditional health data included information from vital statistics registries and hospital admission statistics. In the last few decades, however, more health data have been assimilated from electronic medical records, mobile phone and purchase records, geographical positioning systems, social media and beyond. The veracity dimension refers to the uncertainty around data and their collection, standardization and validation. As the quantification and articulation of the uncertainty in reported data have been a part of health-care research and practice for many years, health practitioners may be more familiar with data veracity than many other users of big data. Uncertainty and confidence intervals are now commonly reported in projects that use these data. Together, these dimensions enable a big data approach to health, in which health priorities and policies are driven by analytics of large data sets.
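The veracity dimension, and the reporting of confidence intervals in particular, can be illustrated with a minimal sketch. The article does not name a method, so the Wilson score interval used here is an assumption, chosen only because it is a common way to express uncertainty around a proportion. The function name and the counts in the usage note are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage).

    Illustrative only: the source article does not prescribe this method.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")

    p = successes / total                      # observed proportion
    denom = 1 + z ** 2 / total                 # shared denominator of the Wilson formula
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom

    # Clamp to [0, 1] to guard against floating-point overshoot at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)
```

For example, `wilson_interval(30, 200)` returns a lower and an upper bound around an observed proportion of 0.15 for a hypothetical survey in which 30 of 200 records show the condition of interest. Reporting the pair of bounds with the point estimate is one way to make the uncertainty in a reported figure explicit.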

Applications

The analysis of linked data sets from different sectors can provide new opportunities to improve health outcomes for populations. In the United States of America, for example, health-care and city authorities in Durham County, North Carolina, decided to pool data to direct social and health care. They created an integrated data system that allowed for the coordination of efforts to improve emergency department services and the care of patients with diabetes. By geographically pooling census data, tax payments and lead concentrations detected in blood tests, it was also possible to use the integrated system to map and stratify risks of lead exposure. Screening for lead exposure could then be focused on the high-risk areas and this led to vast improvements in the detection and management of childhood lead exposure. Such sharing and analysis of information can bridge the chasm that has traditionally divided population health from clinical medicine at individual level. The early adoption of the big data approach in well resourced settings highlighted some logistical, technical, ethical and governance challenges. In Iceland in 1998, for example, the health records – including genetic data – were declared to be a national resource by the government and made available to private industry without the consent of individuals. However, national and international opposition prevented any data transfer and the project had collapsed by 2003. Best et al. stated that “outside the world of carefully-controlled trials, between 50 and 80 per cent of electronic patient record projects fail”. Petabytes of health data have been collected, but have not been used. Although the use of big data in low- and middle-income countries is particularly complex, it also offers the greatest potential rewards. Most such countries have vertical programmes for the control of human immunodeficiency virus, tuberculosis, malaria and other infections. 
These programmes have detailed information requirements that need to be met by community health workers. There is often a mismatch between the information needs of the programme and the capacity of the associated field personnel to collect data with sufficient quality for reporting, tracking and – more importantly – learning. However, the advent of electronic tools is circumventing some of the logistic and quality issues in data collection. Community health workers can use mobile phones, tablets and computers for research and patient care. One of the most promising examples of big data in global health may emerge from India’s ambitious personal identification programme. Since 2010 the government of India has been issuing Aadhaar cards and unique identifying numbers to all 1.2 billion of its citizens. The cards, numbers and associated biometric identification offer the possibility of generating and monitoring health and social data – including electronic medical records and information on health insurance for low-income families – on a huge scale. Even if limited in its current reach, the Aadhaar system forms a backbone that could allow the more reliable and extensive collection of health statistics. This, in turn, could lead to dramatic improvements in the planning and delivery of public health interventions. Recent cross-sectoral data sharing is stimulating rapid innovation and growth. In 2013, for example, the government of Côte d’Ivoire consented to the release of five months of anonymized mobile phone data. These data were then used to develop a model for containing the spread of epidemics. This kind of data philanthropy or liberation – in which corporate entities or governments share anonymized information of potential public health significance – may provide new synergies between the for-profit sector and other stakeholders.
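The geographical pooling of census data, tax payments and blood lead concentrations described for Durham County can be sketched in a few lines. This is not the method used in Durham County: the field names, the thresholds and the three-level scoring rule are all hypothetical, and serve only to show how records from different sectors might be linked on a shared area identifier and then stratified.

```python
from typing import Dict


def link_by_area(census: Dict[str, dict], tax: Dict[str, dict], lead: Dict[str, dict]) -> Dict[str, dict]:
    """Merge three per-area data sets, keeping only areas present in all of them."""
    shared_areas = census.keys() & tax.keys() & lead.keys()
    return {area: {**census[area], **tax[area], **lead[area]} for area in sorted(shared_areas)}


def risk_stratum(row: dict, lead_threshold: float = 5.0, housing_threshold: float = 0.5) -> str:
    """Assign a coarse risk stratum from three hypothetical indicators.

    The thresholds are placeholders for illustration, not clinical guidance.
    """
    flags = 0
    if row["mean_blood_lead"] >= lead_threshold:        # from blood test results
        flags += 1
    if row["older_housing_share"] >= housing_threshold:  # from census-style data
        flags += 1
    if row["low_tax_value"]:                            # from tax records
        flags += 1
    return ("low", "medium", "high", "high")[flags]


# Hypothetical inputs keyed by an area identifier.
census = {"A": {"older_housing_share": 0.7}, "B": {"older_housing_share": 0.2}, "C": {"older_housing_share": 0.6}}
tax = {"A": {"low_tax_value": True}, "B": {"low_tax_value": False}}
lead = {"A": {"mean_blood_lead": 6.1}, "B": {"mean_blood_lead": 1.2}, "C": {"mean_blood_lead": 3.0}}

linked = link_by_area(census, tax, lead)
strata = {area: risk_stratum(row) for area, row in linked.items()}
```

In this sketch area C is dropped because it has no tax record, which shows why the completeness of each source matters as much as the linkage itself. Screening could then be prioritized in the areas assigned to the highest stratum.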

Use

The diffusion of new technology in low- and middle-income countries is commonly a mix of appropriation, diffusion and, often, the skipping of some of the intermediate development phases observed in high-income countries. One example of leapfrogging is the penetration of mobile phones – before the development of a widespread wired phone system – in low-resource settings over the past 15 years. The opportunities offered by the many mobile phones in such settings in the improvement of health delivery have spurred widespread enthusiasm for so-called mobile health – or m-health – projects. Several small-scale m-health programmes have demonstrated proof of concept, but few of the tested interventions have been implemented on a large scale. There are differences between the typical m-health and big data approaches. In general, m-health projects are relatively democratized, have low barriers to entry and capitalize on the mobile phones already owned by individuals. Although the impact of such projects is often rapidly appreciable, the tangible rewards are often very limited. In contrast, the big data approach inherently demands more technical skills, specialized equipment, interoperability standards, coherent data collection and analysis systems and regulatory oversight. Beyond the technical aspects, an organizational culture of quality is one of the key drivers of an effective health information system. Health-care providers and system administrators in most countries have not been trained in data science. To support good data collection, interoperable information systems based on fixed standards also need to be put in place. Global norms should be established before the highly regulated and path-dependent legacy systems found in high-income countries become fragmented in low- and middle-income settings. 
In the smart use of data to support implementation science, initiatives such as the results-focused approach to data analysis and sharing implemented at the high-volume, low-cost Narayana Hrudayalaya Heart Hospital in India have improved health outcomes. Although the pooling of data across systems presents some major obstacles, there are emerging signs of data ownership in some low- and middle-income countries. One such sign is the launch of a repository to collate health data.

Challenges

There are many challenges to the full-scale implementation of big data systems in low- and middle-income countries. The collection of information from individuals – a prerequisite for any big data approach – is fraught with ethical, regulatory and technological issues. Given the increasing complexity of the field, the protection of individuals and populations must move from purpose-specific consent to emphasize appropriate use, risk assessment and risk minimization. The anonymization of data must be robust, monitored and enforced. Appropriate use must remain coherent with evolving societal values. Furthermore, the big data approach can amplify the existing difficulties associated with health-care delivery in settings with scarce resources. In such settings, it may be impossible for front-line health workers to extend their remit to the non-essential collection of data. Some policy-makers view the big data approach simply as a distraction for low- and middle-income countries. Others consider big data to be a critical milestone on the path towards the improvement of such countries (Box 1). In low- and middle-income countries, the future could go well or badly for the big data approach.
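The requirement that anonymization be robust can be made more concrete with a small sketch. The technique shown is pseudonymization with a keyed hash (HMAC-SHA-256) plus removal of direct identifiers, which is weaker than full anonymization because the remaining fields may still allow re-identification. The field names and the handling of the secret key are assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical names of fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "phone", "national_id"}


def pseudonymize(record: dict, secret_key: bytes, id_field: str = "national_id") -> dict:
    """Replace direct identifiers with a keyed-hash pseudonym.

    The same identifier and key always yield the same pseudonym, so records
    can still be linked; without the key, the pseudonym cannot be recomputed.
    """
    identifier = str(record[id_field]).encode("utf-8")
    token = hmac.new(secret_key, identifier, hashlib.sha256).hexdigest()

    # Drop every direct identifier, then attach the pseudonym.
    cleaned = {key: value for key, value in record.items() if key not in DIRECT_IDENTIFIERS}
    cleaned["pseudonym"] = token
    return cleaned
```

In such a design the secret key would need to be held by a trusted custodian and kept apart from the shared data, which is a governance question as much as a technical one. Quasi-identifiers such as a date of birth or a small geographical area would need separate treatment before any release.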

Dystopian views

In the worst-case scenario, big data would be an expensive distraction driven by high-income countries, focused on disease-specific outcomes and unintelligible to those who most need data access. The assimilation of fragmented data – which cannot be readily shared or compared – could undermine the relatively fragile global health community. Breaches of data security could threaten personal safety and lead to discrimination and genocide and other violence. The global health community could oversee the spending of huge amounts of money on big data, with potentially little to show for the investment. In brief, the big data approach could be associated with: the diversion of focus and resources away from interventions that are more needed; poor data governance – with databases held by private companies, frequent leaks and no recourse for citizens; the offloading of consent through poorly designed consent systems, which could threaten the safety of individuals; a lack of interoperability, with balkanized information systems that cannot be aggregated; and information that is poorly presented and analysed, considered illegible or not credible.

Utopian views

Conversely, the big data era could represent a major and beneficial turning point in the improvement of global health. Decision-makers in low- and middle-income countries could develop a “demand-side” platform to identify the information they need most. Partnerships formed with academia, industry, governments, international organizations and the non-profit sector could help develop innovative solutions. Although this idealized approach is optimistic, it is no less ambitious than achieving the Millennium Development Goals, eradicating polio or controlling malaria. The development of a “best-case” model for deploying big data may help us achieve all of these targets. In brief, the big data approach could be associated with: health data that are owned by patients; robust governance processes that have been developed to ensure respect of values and principles in the use of data, with an emphasis on risk minimization; data that are aggregated automatically, with little effort and decreasing cost; interoperability standards that allow data to be seamlessly pooled and connected; laws that, while establishing adequate safeguards, allow the sharing and pooling of anonymized data in real time; and data that are presented in a usable format to patients, health-care providers, entrepreneurs and policy-makers. Even in the best of cases, threats to the privacy of personal health information will remain. This concern is amplified when the information relates to individuals in vulnerable populations and communities. Even basic health data – e.g. on ethnicity, reproductive health, sexually transmitted infections, diseases with a genetic basis and risk exposures for disease – can be misused and lead to discrimination and reductions in personal safety. Any electronic database can be hacked. 
The risk of accidental or intentional breaches of data security may be particularly high in settings with high levels of illiteracy and corruption that are undergoing rapid technological transition. In many such settings, legislation supporting the privacy and security of information services is frequently underdeveloped and rarely enforced. Even in high-income countries, purpose-specific informed consent is increasingly being rendered meaningless by high levels of complexity in the ways that collected data are – or might be – used. Privacy protection is a right and the preservation of public trust is a necessity. However, as the full potential of the big data approach to improve health becomes clearer, there is also a right for populations to reap all of the potential benefits of such an approach. The use of anonymized data for the greater good of populations needs to be incorporated into the process of risk minimization. There is an increasing need for traditional consent protocols to be replaced by – or supplemented with – transparent and effective processes for data governance. The values and concerns of the target populations need to be translated into best practices that balance the benefits and risks of data use. Concerns persist about data sharing and appropriate use. The promise of big data is tempered by the weak health systems and limited governance structures to be found in most low-income countries. Many of the countries in greatest need of health metrics struggle to collect statistics on births and deaths. The epidemiological data collected in these countries are of variable reliability, have often only been collected at small sentinel sites and are rarely digitized. Improvements in the provision of food, water and sanitation remain the top priorities for over two billion people. In many low-income countries, data collection may only be possible at the expense of tangible health services. 
As reported by the United Nations, “it is important to recognize that big data and real-time analytics are no modern panacea for age-old development challenges”. However, as the cost of aggregating and coordinating resources and services electronically decreases, the big data approach may deliver large benefits to low- and middle-income countries. The more limited the resources for interventions, the more important the targeting and focusing of interventions become. The persistent tension between vertical or disease-specific programmes and horizontal or health-system-focused approaches remains unresolved. The big data approach fits a horizontal programme better than a vertical programme and could potentially improve the control and treatment of all human disease. At the moment, global health remains driven by disease-specific interests and disease-specific advocacy groups may well head the queue for big data – risking further fragmentation of the health community.

The next step

The role of big data collection – whether it is perceived as a tool or a threat – remains unclear. For positive outcomes, informed, reflective and resourced stewardship of data is critical. At the moment, the structures for global health governance remain relatively fragile. In 2009, the United Nations established the Global Pulse initiative. “Its mission is to accelerate discovery, development and scaled adoption of big data innovation for sustainable development and humanitarian action.” Unfortunately, the current data protection standards for Global Pulse are badly outdated as they are grounded in guidelines – for the regulation of computerized personal data files – that were published in 1990. Some guidance on the collection and use of health data was provided within the World Economic Forum’s Global health data charter, as part of the Forum’s vision of “better data for better health”. For health data, the charter identified eight key challenges and highlighted several enabling activities. The expansive scope of big data requires the cooperation of multiple stakeholders. Universities, professional societies, government agencies and research-driven companies are examples of organizations that could develop and operate data systems to support health care. A clear governance and decision-making framework is needed to inform each stakeholder of its accountability and responsibility for each process. There needs to be transparency in addressing and troubleshooting any issues until major decisions are made. Issues often persist for lack of clear agreements on who should resolve them and how they should be resolved. In an emerging field such as big data, where protocols are still being developed, governance plays a major role in assuring stakeholders that there is a system for resolving issues. However, the global health community has a patchy record on governance of technological developments. 
Optimizing the application of big data will involve much more than confidentiality safeguards and minimum standards. A broad effort to establish interoperability standards is imperative to maximize the benefits of big data. Global health governance needs to move from a reactive model to a proactive, norm-forming approach.

Conclusion

In the field of health-care delivery, the big data approach may represent a major milestone – facilitating the development of learning systems of care and enabling more precise management of individuals to improve the health of entire populations. Sheer size increases both the potential risks and potential benefits of the approach. Although the approach may have most value in low-resource settings, it is also most vulnerable to fragmentation and misuse in such settings. Collaborative governance, careful analysis and technical partnerships are needed to minimize the risks. The complexities should not be underestimated. In low- and middle-income countries, the shepherding of the transition from paper records to petabytes of digital storage provides another opportunity for global health institutions to offer useful governance.

References (13 in total; 10 listed)

1. The Lancet-University of Oslo Commission on Global Governance for Health, in collaboration with the Harvard Global Health Institute.

Authors: Ole Petter Ottersen; Julio Frenk; Richard Horton
Journal: Lancet Date: 2011-11-05 Impact factor: 79.321

2. Sharing health data: good intentions are not enough.

Authors: Elizabeth Pisani; Carla AbouZahr
Journal: Bull World Health Organ Date: 2010-06 Impact factor: 9.408

3. Paperless registration during survey enumerations and large oral cholera mass vaccination in Zanzibar, the United Republic of Tanzania.

Authors: Mohammad Ali; Jaqueline L Deen; Ahmed Khatib; Godwin Enwere; Lorenz von Seidlein; Rita Reyburn; Said Mohammed Ali; Na Yoon Chang; Valérie Perroud; Frédérique Marodon; Abdul A Saleh; R Hashim; Anna Lena Lopez; James Beard; Benedikt N Ley; Kamala Thriemer; Mahesh K Puri; Binod Sah; Mohamed Saleh Jiddawi; John D Clemens
Journal: Bull World Health Organ Date: 2010-07-01 Impact factor: 9.408

4. Short report: Piloting paperless data entry for clinical research in Africa.

Authors: Michel A Missinou; Christopher H O Olola; Saadou Issifou; Pierre-Blaise Matsiegui; Ayola A Adegnika; Steffen Borrmann; David Wypij; Terrie E Taylor; Peter G Kremsner
Journal: Am J Trop Med Hyg Date: 2005-03 Impact factor: 2.345

5. Quantitative assessment of the benefits of specific information technologies applied to clinical studies in developing countries.

Authors: William Avilés; Oscar Ortega; Guillermina Kuan; Josefina Coloma; Eva Harris
Journal: Am J Trop Med Hyg Date: 2008-02 Impact factor: 2.345

6. Geographic health information systems: a platform to support the 'triple aim'.

Authors: Marie Lynn Miranda; Jeffrey Ferranti; Benjamin Strauss; Brian Neelon; Robert M Califf
Journal: Health Aff (Millwood) Date: 2013-09 Impact factor: 6.301

7. INDEPTH launches a data repository and INDEPTHStats.

Authors: Osman Sankoh; Abraham J Herbst; Sanjay Juvekar; Stephen Tollman; Peter Byass; Marcel Tanner
Journal: Lancet Glob Health Date: 2013-07-01 Impact factor: 26.763

8. Large-system transformation in health care: a realist review.

Authors: Allan Best; Trisha Greenhalgh; Steven Lewis; Jessie E Saul; Simon Carroll; Jennifer Bitz
Journal: Milbank Q Date: 2012-09 Impact factor: 4.911

9. Innovative health service delivery models in low and middle income countries - what can we learn from the private sector?

Authors: Onil Bhattacharyya; Sara Khor; Anita McGahan; David Dunne; Abdallah S Daar; Peter A Singer
Journal: Health Res Policy Syst Date: 2010-07-15

10. Scaling up mHealth: where is the evidence?

Authors: Mark Tomlinson; Mary Jane Rotheram-Borus; Leslie Swartz; Alexander C Tsai
Journal: PLoS Med Date: 2013-02-12 Impact factor: 11.069
