Literature DB >> 32733683

Data protection and ethics requirements for multisite research with health data: a comparative examination of legislative governance frameworks and the role of data protection technologies.

James Scheibner¹, Marcello Ienca¹, Sotiria Kechagia², Juan Ramon Troncoso-Pastoriza³, Jean Louis Raisaro⁴, Jean-Pierre Hubaux³, Jacques Fellay^4,5,6, Effy Vayena¹.

Abstract

Personalised medicine can improve both public and individual health by providing targeted preventative and therapeutic healthcare. However, patient health data must be shared between institutions and across jurisdictions for the benefits of personalised medicine to be realised. Whilst data protection, privacy, and research ethics laws protect patient confidentiality and safety they also may impede multisite research, particularly across jurisdictions. Accordingly, we compare the concept of data accessibility in data protection and research ethics laws across seven jurisdictions. These jurisdictions include Switzerland, Italy, Spain, the United Kingdom (which have implemented the General Data Protection Regulation), the United States, Canada, and Australia. Our paper identifies the requirements for consent, the standards for anonymisation or pseudonymisation, and adequacy of protection between jurisdictions as barriers for sharing. We also identify differences between the European Union and other jurisdictions as a significant barrier for data accessibility in cross jurisdictional multisite research. Our paper concludes by considering solutions to overcome these legislative differences. These solutions include data transfer agreements and organisational collaborations designed to `front load' the process of ethics approval, so that subsequent research protocols are standardised. We also allude to technical solutions, such as distributed computing, secure multiparty computation and homomorphic encryption.

Entities: Chemical Disease Gene Species

Keywords: Advanced cryptography; Biomedical data; Data protection; Data sharing; Multisite research; Personalised healthcare

Year: 2020 PMID： 32733683 PMCID： PMC7381977 DOI： 10.1093/jlb/lsaa010

Source DB: PubMed Journal: J Law Biosci ISSN： 2053-9711

Introduction

Personalized medicine has the potential to improve healthcare by providing targeted and more effective preventive and therapeutic strategies for individuals. For these benefits to be fully realized, doctors, healthcare providers, researchers, and government agencies must maximize sharing and linkage of patient and administrative data for multisite research. However, data sharing for medical data is subject to privacy and security risks, which may have two contrasting yet equally detrimental effects. On the one hand, privacy and security concerns can undermine public confidence and participation in research, degrading the overall value flowing from data linkages. On the other hand, overreacting to privacy and data security concerns could undermine attempts to share data for research, clinical, and public health purposes. Furthermore, biomedical research is increasingly dependent on multisite research, requiring the transfer of different forms of data between different jurisdictions and accelerating the risks described above. In this paper, we address two research questions. First, we wish to identify the regulatory obstacles, both ethical and legal, associated with multisite research with health-related data, including data sharing and linkage. Second, we wish to delineate possible organizational and technical solutions to overcome these obstacles, with a special focus on advanced privacy-enhancing technologies. This tight link between ethical, legal, and technical requirements is grounded in the observation that any technical solutions must be reinforced by an ethically and legally compliant governance framework. Accordingly, our paper is organized into three sections. The first section examines existing international frameworks governing research with health data. These include research ethics agreements and declarations, as well as international and supranational agreements governing data protection. Outside of these legal agreements, the first section also considers reports and governance documents that have attempted to define the scope of health data research. These include documents published by research consortiums as well as other entities such as the Organization for Economic Cooperation and Development (OECD). The second section then compares the national data protection and research ethics laws in seven OECD jurisdictions using five criteria for data accessibility published by the OECD. These criteria are used to assess four features of each national regime. The third section identifies the regulatory obstacles for data accessibility on a cross jurisdictional basis. It concludes by illustrating how organizational and technical solutions can be used to overcome these regulatory obstacles.

International Frameworks Governing Research with Health Data

By examining international research guidelines, a number of principles for research involving health-related data emerge. First, all biomedical and health-related research requires the free and informed consent of research participants. Second, researchers and research organizations must guarantee and be held accountable for the confidentiality of research participant data, as well as ensuring risk mitigation. Third, research participants must be compensated for and allowed to benefit from research involvement. Finally, efforts must be made to ensure the accessibility and maximum reuse of data collected from research. Although research ethics committees play an important role in protecting research participants, they must also balance these principles against each other, despite potential conflicts. For example, the Declaration of Taipei limits the use of biobanking data to where specific consent has been obtained, which may prevent general consent for longitudinal research. These conflicting principles can be resolved by ethics committees requiring researchers to implement a specific study protocol to minimize the risk to participants. Nevertheless, there is significant conflict between protecting privacy and maximizing the use of research materials due to the technical difficulty of anonymization or removal of identifiers, so individuals cannot be re-identified. Ohm argues that true anonymization of personal data is functionally impossible and that releases of data should instead be contextual. Multiple studies, since Ohm’s study, have proven the technical ineffectiveness of standard de-identification and anonymization techniques both for omics and clinical data. These questions of confidentiality and security for biomedical research data dovetail into the issue of data protection. The 2013 OECD Privacy Guidelines establish a series of eight data protection principles that underpin data protection laws in OECD jurisdictions. Furthermore, the 2015 OECD report on best practices for health data governance generated six data accessibility criteria. Five of these are relevant to our paper: (i) Identifiable data are shared with other data custodian or government entities. (ii) University and non-profit researchers may be approved access to de-identified data. (iii) Healthcare providers may be approved access to de-identified data. (iv) For-profit businesses may be approved access to de-identified data. (v) Foreign government, university, or non-profit researchers may be approved access to de-identified data. Moreover, the OECD 2015 report defines ‘de-identified data’ as ‘data that cannot be used to identify an individual directly’. The OECD distinguishes ‘de-identified data’ (used here synonymously with anonymized data) from ‘pseudonymization’, or techniques where identifying information is converted to meaningless values. Outside the OECD, the Global Alliance for Genomics and Health, a consortium of 580 research organizations, published their ‘Framework for the Responsible Sharing of Genomic and Health Related Data’. This framework highlights the importance of the right to free scientific inquiry, including data accessibility. This final principle mandates that researchers make data and results that widely accessible through publishing findings and digitally disseminating results while minimizing obstacles to data sharing. Therefore, we use the criteria of ‘data accessibility’ for comparing four features of the regulatory regime in each jurisdiction. These four features are responsibility for processing, the data types available for processing, the consent or approval required for processing, and where health data can be transferred. We restricted our analysis to OECD jurisdictions, as all have modeled their privacy laws around the OECD Privacy Guidelines. We included Switzerland as part of our study, which features historically strong data protection laws with high degrees of data reuse for research. We then extended our analysis to include three jurisdictions compliant with the European Union (EU) General Data Protection Regulation (GDPR): Italy, Spain, and the UK. We chose these jurisdictions as they have implemented specific rules for the processing of biological and health-related data as permitted by the GDPR. We also included the USA and Canada as part of our analysis. We made this decision as these two jurisdictions have historically leaned toward industry self-regulation or more fragmented approaches as opposed to the rights-based approach in Europe. Finally, we included Australia due to the high rates of adoption of electronic health record systems and research data linkage. An overview of the key terms that are considered as part of our analysis is contained in Table 1.

Table 1

A summary of commonly used terms in national and supranational data protection laws

Data controller	The natural or legal person who decides how data should be used or processed
Data processor	The natural or legal responsible for processing the data at the request of the data controller
Data custodian	The natural or legal person responsible for handling or holding the data
Data subject	The natural person whose data is collected, stored or otherwise processed
Personal data/information	Any information relating, directly or indirectly, to an identified or identifiable person. Includes information inferred about a person
PHI	Any information about health status, provision of healthcare, or payment of healthcare. Protected under HIPAA in the USA only
Anonymization	The process of rendering data so that it cannot be used to identify an individual directly or indirectly. Not regulated by data protection legislation
Pseudonymization	The process of removing identifiers from data so that data cannot be re-identified without these identifiers. Regulated by data protection legislation
De-identification	The process by which identifiers are removed from the health information, which mitigates privacy risks to individuals and thereby supports the secondary use of data for comparative effectiveness studies, policy assessment, life sciences research, and other endeavors
Specific consent	Explicit consent to use data for a particular purpose or research project. Data obtained under this consent cannot be reused for another purpose
General consent	Explicit consent to use data in a particular field. Data obtained under this consent can be reused for further purposes in the same field
Exchange	To physically move data from one organization to another
Transfer	To physically move personal data from one jurisdiction to another

Note that these terms reflect those used in the OECD, Health Data Governance: Privacy, Monitoring and Research (2015) report, as well as other legislation (such as HIPAA).

A summary of commonly used terms in national and supranational data protection laws Note that these terms reflect those used in the OECD, Health Data Governance: Privacy, Monitoring and Research (2015) report, as well as other legislation (such as HIPAA).

A Comparison of National Data Protection and Research Ethics Laws

Responsibility for Processing

The Federal Act on Data Protection (FADP) and the Ordinance to the FADP (OFADP) govern personal data processing in Switzerland by private persons or federal bodies, in the private or public sector. The Human Research Act (HRA) and the Human Research Ordinance (HRO) also govern research involving human beings. Likewise, the GDPR applies to all personal data processing by EU-established controllers or processors. Similarly, Canadian privacy legislation operates on multiple jurisdictional levels. Federal Canadian legislation applies to commercial personal information processing by private-sector organizations, particularly for interprovincial transfers and where provincial legislation is not substantially similar. In addition, the Privacy Act 1985 governs data processing by governments and public-sector organizations. Outside these acts, provincial data protection legislation in Canada displaces or supplants the federal legislation on a sectorial basis. All Canadian provinces have separate legislation governing the processing of health-related personal data. In Alberta, New Brunswick, Newfoundland, Nova Scotia, and Ontario, ‘health information custodians’, or entities handling health data, are also held responsible for data processing. These custodians include entities that process health data for commercial and non-commercial purposes, therefore supplanting federal legislation. In contrast, in the USA, the Health Insurance Portability and Accountability Act 1996 (HIPAA), the Privacy Rule, and the Security Rule extend responsibility to three types of covered entities. These entities include healthcare providers, health plan providers, and healthcare clearinghouses (that process non-standard health information). Although not all research laboratories are covered entities, laboratories become covered entities if they process patient billing or insurance information. Likewise, in Australia, ‘APP entities’ must comply with ‘Australian Privacy Principles’ (APPs) when processing personal information. APP entities include Australian government agencies or ‘organizations’ as defined as any natural or legal person making over 3 million dollars a year.

Data Available for Processing

The FADP defines personal data as ‘information about an identified or identifiable person’, with sensitive personal data including ‘data on… health’. Furthermore, the HRA and HRO apply to research involving human tissue or/and health-related data. These definitions encompass the processing of biological or medical data for any purpose, research-related, or otherwise and prohibit the processing of health-related data ‘without justification’. However, these regulations do not apply to research involving anonymized biological material or health data. The HRO describes anonymization as the irreversible masking or the deletion of all items that would enable the data subject to be identified without disproportionate effort. However, the HRA and HRO still apply to coded biological material and health-related data, which should be considered analogous to pseudonymized data. The GDPR adopts an equivalent definition of personal data and lists ‘genetic data’, ‘biometric data’, and ‘data concerning health’ as special categories of personal data. Furthermore, in a similar fashion to the FADP and HRA, the GDPR does not apply to anonymized data. Anonymized data are defined as information not relating to an identified or identifiable natural person. In contrast, pseudonymized data, or data no longer attributable to an individual without other data, remain personal data and are protected under the GDPR. In a working paper, the former Article 29 Working Party held that no anonymization or pseudonymization techniques completely protects against re-identification. As discussed previously, this conclusion is supported by technical literature. Therefore, the working paper suggests adopting a contextual, relative-based approach to prevent released data from being re-identified. This contextual-based approach is supported by EU Court of Justice (CJEU) precedent that IP addresses are personal data if combined with additional data to re-identify users. While this decision ostensibly extends the relative approach from the former Article 29 Working Group, it retains a relatively broad interpretation of personal data. In addition, EU member states that implementing GDPR into national law can also introduce further conditions, including limitations on processing biometric, genetic, or health-related data. In both Italy and Spain, relatives of deceased persons can exercise rights with respect to the processing of the deceased’s data. Although personal data under the Data Protection Act 2018 (UK), in practice, the Medical Research Council and the National Health Service (NHS) treat data from deceased persons as personal data. While the Italian Privacy Code is congruent with the GDPR regarding anonymization and pseudonymization, the Spanish legislation requires separation between the pseudonymized data and the entity conducting the pseudonymization. Historically, UK courts have adopted a more flexible definition of anonymization and pseudonymization. For example, in Common Services Agency v Scottish Information Commissioner, Baroness Hale held that if the data were delivered to an entity that had no means to re-identify the data, it would remain anonymized. Likewise, the High Court in R (Department of Health) v Information Commissioner held that low cell count statistics on abortion were anonymized when released without other identifying information. However, when these cases were decided, the Data Protection Act 1998 did not define anonymized data, with Baroness Hale referring to the Data Protection Directive. In contrast, the current Data Protection Act 2018 uses the same definition of anonymized as the GDPR. Furthermore, more recent guidance from the UK Information Commissioner’s Office (ICO) has realigned UK standards on anonymization with those of the EU Article 29 Working Party. Accordingly, pseudonymized data transferred to a third party without identifiers will no longer constituted the anonymized data under UK law. In the USA, HIPAA defines ‘protected health information’ (PHI) to include information regarding health status, provision of healthcare or healthcare payment, or genetic information of an individual. PHI retains this status for up to 50 years after the death of that individual. Nevertheless, this definition has a limited scope compared with the broad definition under European law. First, PHI does not extend to information which may nevertheless be used to draw conclusions about a patient’s health indirectly, such as life insurance or purchase data. Furthermore, HIPAA does not apply to health data generated about an individual without the use of healthcare, such as data generated using healthcare mobile applications. Finally, HIPAA provides three methods for de-identification: by expert determination, by removing 18 identifiers (including genetic information), or by removing 16 identifiers from a dataset for research purposes. These identifiers include any unique identifying number, character, or code. Furthermore, the covered entity de-identifying the data must have no actual knowledge or information that could re-identify an individual in the dataset. However, the covered entity is under no obligation to assess the potential for re-identification for the remaining data when it is placed in another environment. Therefore, the HIPAA standard of de-identification is arguably less strict than that found under Swiss and EU law. Under Canadian law, ‘personal information’ refers to any information about an identifiable individual. Canadian courts have defined ‘personal information’ expansively to include information that relates to or concerns the subject. PIPEDA also defines ‘personal health information’ as including physical or mental health information, tissue and information collected incidentally to providing health services, and health information from deceased persons. The question of whether the anonymized data, particularly anonymized genomic data, remain personal data under Canadian law that has not been addressed in federal legislation. However, in Gordon v Canada (Minister for Health), the Federal Court held that the test for anonymization is whether there is a ‘serious possibility’ of re-identification. This test has been confirmed as the appropriate pan-Canadian test for anonymization by the Office of the Privacy Commissioner of Canada. Nevertheless, some provincial legislation also provides separate standards for anonymization or de-identification. In Alberta, New Brunswick, and Newfoundland, health data custodians are empowered to ‘strip, encode, or otherwise transform’ health information to make it ‘non-identifying’. In contrast, the equivalent legislation in Ontario requires the removal of any information that could be used to identify the individual. In Australia, ‘personal information’ is defined information about an identified individual or an individual who is reasonably identifiable with ‘sensitive information’ including health, genetic, or biometric information. It is ambiguous as to whether this definition applies to just identifiable data or also aggregate data. Case law has failed to clarify this question with respect to IP addresses that may be re-identified using with other information. If this definition includes aggregate data that could re-identify an individual using other data, aggregate data would remain personal data. Finally, there is no clear definition in any of the jurisdictions under study as to whether encrypted data represent a special category of data or whether it falls into the categories supplied already.

Consent or Approval Required for Processing

In Switzerland, sensitive data cannot be disclosed unless the data subject has given consent or has made their data generally accessible and not prohibited disclosure. A person may be only involved in a research project if they have given consent in writing. Furthermore, a person must receive written information on the nature of the project, the burdens and benefits taken, and data protection strategies. A person must have also been given adequate time to decide whether to participate in the project. If the researcher intends to make further use of the data, they must seek general consent from the person when data are collected. The relevant ethics committee should approve any further use for research purposes in case of lack of consent or information to the right to dissent that has not been obtained. The HRO describes ‘further use’ as collecting, cataloguing, storing, making accessible, or transferring biological material or health-related data. This broad definition of further use encompasses linkages between different datasets. The HRA then specifies the type of consent available for different types of data, which are described in Table 2.

Table 2

Comparison of different forms of consent that are available for different types of data and data protection measures in Switzerland

	Types of data
	Further research use of biological material and genetic data (Art. 32 HRA)	Further use of non-genetic health-related personal data (Art. 33 HRA)
Data protection measure	Permissible Forms of Consent
Personal identifying data	Informed consent for specific research project(s) required (Art. 32(1) HRA, Art. 28 HRO)	General consent required (Art. 33(1) HRA, Art. 31 HRO)
Coded	General consent required (Art. 32(2) HRA, Art. 29 HRO)	General consent required (Art. 33(2) HRA, Art. 32 HRO)
Anonymized	Data subjects have to be informed about anonymization and should not object to it	Not regulated by HRA

Comparison of different forms of consent that are available for different types of data and data protection measures in Switzerland The Swiss Academy of Medical Sciences’ general consent form satisfies the requirements under the HRA and HRO. The General Consent Form sets out the principles for the reuse of materials where patients have provided general consent for their tissue, data, or other samples to be made available for research. Only authorized hospital personnel are permitted to access identifiable records, with researchers restricted to coded data. When informed consent requirements have not been met, researchers may use biological material or health-related data if consent is disproportionately difficult to obtain and there is no documented refusal. Furthermore, the interests of research must outweigh the interests of the person concerned in deciding on the further use of his or her biological material and data. Under the FADP, researchers can disclose sensitive data for scientific reasons, as well as process it for further purposes. However, researchers must ensure that the data are rendered anonymous as soon as possible, so data subjects cannot be re-identified. The GDPR requires express consent for the processing of sensitive personal data (Art. 9(2)(a)). Although the GDPR ostensibly recognizes the possibility for general consent via Recital 33, the former Article 29 Working Party held that only general consent for a specific research project is valid. However, the GDPR also creates a number of exceptions for processing sensitive data. First, sensitive data can be processed for preventative medicine, medical diagnosis, or providing healthcare services, subject to professional standards of secrecy. Second, sensitive data may be processed for reasons of public interest in the area of public health, such as protecting against threats to public health or ensuring medical device quality. Third, sensitive data may be processed for archiving purposes in the public interest, scientific or historical research purposes, or statistical purposes. However, any statistical or research-related processing requires researchers to first determine whether they can work without personal data or with anonymized data. If anonymized or non-personal data are not sufficient, researchers should determine whether processing can be conducted with pseudonymized data. Finally, under the GDPR, ‘scientific research purposes’ and ‘statistical purposes’ are not limited to research or processing in the public interest. Accordingly, privately funded research may be permissible under the exception. In addition, this exception may also permit a significantly broader range of processing on aggregate data on the grounds of ‘statistical processing’ without adequate ethics approval. This exception may have the effect of allowing larger data processors with technical infrastructure to bypass scrutiny while exposing smaller researchers to heightened compliance costs. In Italy, the Privacy Code requires processing of genetic, biometric, and health-related data to comply with best practices from European Data Protection Board (EDPB) (the replacement for the Article 29 Working Party), as well as any scientific and technological advances. The Privacy Code also requires a data privacy impact assessment and an ethics committee assessment before any processing of health-related data can occur. Finally, the Privacy Code requires technical measures for security, access control, and data minimization. However, the Privacy Code also permits data to be used for further purposes beyond those initially consented to by data subjects. In Spain, consent alone is not sufficient to lift the prohibition on processing special categories of data mandated by Art. 9(1) of the GDPR. Instead, processing requires the subject’s consent and a purpose defined in Arts. 9(2)(g), (h), or (i) of the GDPR, unless for cases of exceptional relevance and seriousness to public health. The reuse of health and biomedical data for scientific research must be ‘scientifically integrated’ into the initial study and approved by an independent ethics committee. In the UK, reuse of data for research is managed via NHS Digital, previously known as the Health and Social Care Information Centre. NHS Digital has the authority to distribute patient data for public health reasons, including efficient use of National Health System (NHS) resources, or with patient consent. Because NHS Digital has a broad ambit to disclose patient data for particular reasons, this data can be accessed by both private- and public-sector researchers. In the USA, PHI may only be disclosed to third parties without written consent in three circumstances: for law enforcement, administrative purposes, or to facilitate treatment, payment, or healthcare. In the alternative, data may be disclosed in de-identified form. The ‘Common Rule’ then requires all human research funded by the US federal government to comply with relevant national laws. The Common Rule also requires all relevant research to be subject to approval by an institutional review board (IRB). In addition, the US National Institute of Health (NIH) has published its own Genomic Data Sharing (GDS) Policy. This policy applies to all NIH funded research that involves generating or using human genomic data, including for secondary purposes or data sharing. The NIH GDS Policy implicitly encourages adopting tiered or dynamic consent and requires institutions that generate large quantities of data to supply it to a data repository. However, the NIH GDS Policy, while permitting such sharing, also makes the data processor and not the owner of the repository liable for any breaches. This displacement of liability might discourage further data linkage with overseas research institutes. In Canada, under Federal legislation, commercial organizations can only collect personal information with the valid consent of an individual or if an exception applies. However, PIPEDA provides that where certain circumstances exist, data controllers can use or disclose information without knowledge or consent of the subject. These include for emergency medical treatment and research purposes where it would be impracticable to obtain consent (provided the research has been disclosed to the Privacy Commissioner). Personal information that is no longer required to fulfil the identified purposes should be destroyed, erased, or made anonymous when held by a private organization. Likewise, under the Privacy Act, government agencies can only use personal information consistent with the purpose for which it was collected. This personal information can then only be disclosed for a consistent purpose or if an exception applies. As for PIPEDA, these exceptions include for research related purposes, provided the research could not be completed without identifying information, and no individuals will be re-identified. Finally, the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (‘TCPS2’) mandates that all research involving human participants in Canada require ethics committee approval. Public-sector data do not require ethics approval for use, except where data linkage could result in re-identification. The TCPS2 only permits the use of health-related data without consent for secondary purposes if it is impossible to seek patient’s consent and if identifiable information is essential to research. Nevertheless, these requirements will not apply to healthcare institutes or research agencies that are bound by provincial health and freedom of information legislation. These legislative instruments create additional consent or ethics approval requirements for processing or disclosing health-related data for research. For example, legislation in a number of provinces requires an agreement between a researcher and custodian or trustee specifying terms of use before the data transfer occurs. In Australia, APP entities may only use or disclosure personal health information for research where it is impractical to obtain consent or a permitted health situation applies. Permitted health situations include providing health-related services, public health management, or research defined under Sections 95A and 95A. These sections allow the National Health and Medical Research Council (NHMRC) to establish rules for the processing personal data for research. The NHMRC National Statement on Ethical Conduct in Research then allows researchers to link identifiable patient data for scientific research purposes. In particular, the Statement explicitly distinguishes between individually identifiable data and re-identifiable data (as opposed to ‘de-identified’ data due to the ambiguity of the latter term). However, the Statement does not define these terms. Furthermore, as part of the informed consent process, participants must be informed of the potential for data linkage. Finally, all research involving genomics must be subject to ethics committee review. However, if no linkage of genomic data is required, the project may proceed at low risk. An ethics committee might also waive the requirements for consent where there are no suitable arrangements for obtaining consent. In light of these more flexible data linkage policies, there are notable research projects that have been established to support the linkage of public-sector data. These projects will be discussed in further detail in the second and third sections of this paper.

Transfers of Personal Data across Borders

Reuse and transfer of data across borders under the GDPR are limited to countries assessed as providing adequate protection. At the time of writing, Switzerland has been assessed as offering adequate protection, with transfers limited to private organizations in Canada and the USA. Alternatively, transfers may be subject to appropriate organizational safeguards, or exceptions such as express consent or limited numbers of transfers. Given that only a handful of jurisdictions has been assessed as offering adequate protection, organizational safeguards are currently preferred as the mechanism of transfer. To this end, the EDPB has prepared guidelines on how to develop codes of conduct for transfer. These guidelines address the requirements for data transfer in medical research, and recommend data controllers install safeguards that allow data subjects to exercise their rights. In addition, these guidelines require data controllers and processors to introduce data minimization, security, and retention measures to protect patient data. PIPEDA requires equivalent protection in the target jurisdiction for transfer, although Canadian researchers may also be bound by provincial legislation that applies to public researchers. For example, limits on transfer by custodian may be determined by an ethics committee in Alberta. In contrast, personal health data cannot be disclosed outside New Brunswick without the express consent of the relevant individuals. Any transfer of personal data from Australia to a third-party jurisdiction requires compliance with Australian privacy law. In the alternative, the recipient jurisdiction must offer the same degree of protection as Australia or the data subjects must provide their explicit consent to transfer. The exception to these strict data localization guidelines on transfer exists within the USA. Specifically, PHI may be reused for secondary purposes and transferred to other organizations, although approval from an IRB as required by the Common Rule is necessary. A comparison between each of the key principles discussed in this section is contained in Table 3.

Table 3

Impact of data protection laws and research ethics laws on data accessibility

	Identifiable data are shared with other data custodian or government entities	University and non-profit researchers may be approved access to de-identified data	Health care providers may be approved access to de-identified data	For-profit businesses may be approved access to de-identified data	Foreign government, university, or non-profit researchers may be approved access to de-identified data
Switzerland	Can only be shared with other entities with consent (Art. 17, FADP)	May access data if anonymized as soon as practicable and disclosed in anonymous form (Art. 22)	See for university researchers	See for university researchers	Transfer of identifiable data forbidden without consent or appropriate protection (Art. 6)
GDPR	Can only be shared with other entities with (Art. 9(2)(a), GDPR)	May rely on the research exception to access pseudonymized health-related and biomedical data (Art. 9(2)(j)). Note this exception must be implemented into national law and therefore may not be available	May rely on public interest, diagnosis, or public health exceptions to access data with consent and confidentiality (Art. 9(2)(g), (h), (i))	May rely on the research exception to access pseudonymized health-related and biomedical data (Art. 9(2)(j)). Note this exception must be implemented into national law and therefore may not be available	Can only be transferred to a country with an adequacy determination, subject to safeguards or limited exceptions (Art. 45, 46, 48)
Italy	Can only be shared with consent, cannot be disseminated to third parties (Art. 167bis)	As for government analysts	As for government analysts	As for government analysts	As for GDPR (Art. 43–45); subject to criminal sanctions
Spain	Personal data cannot be shared without consent and a legitimate reason for exchange	As for GDPR, consent/ethics committee approval is required (Art. 9(2), 17^th additional provision)	As for university and non-profit researchers	As for university and non-profit researchers	As for GDPR (Art. 40–43)
UK	Can only be shared with consent or unless a legitimate reason for sharing applies (Schedule 2(1))	As for government analytics	As for government analytics	As for government analytics	As for GDPR (section 18, Data Protection Act)
USA	Can only be shared for law enforcement, administrative or procedural purposes	De-identified data must be subject to expert determination, safe harbor or limited data set de-identification	As for university and non-profit researchers	As for university and non-profit researchers	Does not apply overseas (but see research ethics laws below)
Canada	Can only be shared with consent unless exception applies, de-identified after processing	Should be de-identified as soon as possible, but accessible without consent and ethics committee approval (note provincial legislation)	As for university researchers	As for university researchers	Requires equivalent level of protection for transfer (Schedule 1, Clause 4.1.3)
Australia	Can only be shared with consent or where a permitted health situation creates an exception (ss16A, 16B)	May access where a permitted health situation applies subject to NHMRC rules (s16B, s95, s95A)	As for university researchers	As for university researchers	Only available if reasonable steps have been taken to ensure compliance, consent for transfer or requesting jurisdiction ensures adequate protection
New Zealand	Can only be used or disclosed with consent (s6, cl. 11 Privacy Act, s5 cL 10 and 11, HIPC)	Not reasonably identifiable can be used without consent subject to ethics approval (s5 cL 10 and 11, HIPC)	See for university researchers	Not reasonably identifiable can be used without consent subject to ethics approval (s6 cL. 10 and 11, Privacy Act)	Only available if reasonable steps have been taken to ensure compliance, consent or adequate protection (Part 11A, Privacy Act 1993)

Impact of data protection laws and research ethics laws on data accessibility

Potential Regulatory Obstacles to Multisite Research and Appropriate Regulatory or Technical Solutions

Regulatory Convergence

The comparison of legislation above demonstrates a number of impediments between each of the four features of each regulatory regime. First, with respect to entities that carry out processing, despite the different legislative approaches to data protection, there is considerable convergence toward a broad approach to processing liability. That is, processors of personal data will still be responsible under data protection law, irrespective of whether they are private individuals or working for public- or private-sector organizations. This responsibility extends to both the recipients and the providers of personal data. Two exceptions exist in the USA and Australia, where researchers that are not covered entities or APP entities, respectively, are not held responsible in either jurisdiction for data processing. The ambiguity regarding regulatory limits could manifest where researchers in the USA or Australia, who assuming they are not liable, link or reuse data without awareness as to their obligations. Likewise, researchers from outside the USA working with US datasets need to be mindful of whether their research is funded by the US Federal government and is therefore protected by the Common Rule. This question of data linkage dovetails into the question of what constitutes personal data and, particularly, whether data from deceased persons or anonymized data are ‘personal data’. With respect to data from deceased persons, there appears to be a convergence that personal data include data from deceased persons. The exception exists in Australia, where personal information does not include information from deceased person. However, as discussed previously, the NHMRC National Statement on Ethical Conduct in Research provides a mechanism for using data without obtaining consent where impractical to do so. Accordingly, this consent waiver might provide a pathway for researchers in Australia to reuse health-related data from the records of deceased individuals.

Regulatory Divergence

Nevertheless, a key point of ambiguity identified in each jurisdiction remains the question of what will satisfy the requirements for anonymization. Furthermore, the divergence between each jurisdiction is more severe when considering the question of anonymization and pseudonymization. As alluded to, this divergence is particularly pronounced when comparing the approach adopted in the USA under the Privacy Rule and those approaches adopted in Switzerland, Italy, and Spain. In the USA, if a covered entity has removed the requisite set of identifiers from the data, they are no longer under an obligation to identify any privacy risks that emerge from the remaining dataset. Therefore, from a legal perspective, this de-identified data may not meet the requisite standard of anonymization imposed by Swiss or EU data protection law. Furthermore, the standard of anonymization imposed in the UK remains uncertain as well. As discussed above, NHS Digital is responsible for disclosing anonymized or identifying patient information upon request. However, neither the Health and Social Care Act 2012 nor NHS Digital provides a definition of anonymization, instead relying on the ICO Anonymization Code of Practice. The Anonymization Code of Practice itself does not require any single one technique, but instead, it requires data processors to consider whether there is an ‘unacceptable risk of re-identification’. However, the UK Anonymisation Network (UKAN) is critical of this approach constituting ‘anonymization’, on the grounds that anonymization under UK law also requires considering the environment where data are released. Furthermore, because NHS Digital releases anonymized data where justified, as opposed to specific parties, there is uncertainty as to whether these environmental factors are considered for a release. Whether the Swiss or EU standard of anonymisation or pseudonymisation is congruent with the equivalent standard under Canadian law is unclear. This uncertainty is primarily due to the overlapping jurisdiction of federal and provincial laws. In particular, the ability to ‘strip, encode, or otherwise transform personal health information’ to create de-identified information appears equivalent to anonymization. However, guidance documents from these provinces provide varying anonymization. For example, the Alberta Health Services Guidelines distinguish between anonymous, aggregate, and de-identified data (which is roughly analogous to coded or pseudonymized data). The Guidelines then proceed to describe the second method of de-identification under HIPAA as the correct form of rendering personal information non-identifying. This approach can be contrasted by the approach adopted in Ontario, which appears to be closer to the contextual standard of anonymization required under Swiss and EU data protection law. The uncertainty about the scope of anonymization within each province has a consequent effect on reusing data or transferring data between jurisdictions. Furthermore, the TCPS2 lacks an express provision on the data sharing requirements, particularly between institutions in Canada. Therefore, this absence of data sharing guidance may undermine attempts to balance data protection against the need to ensure data accessibility and widely disseminate results. In Australia, the oblique precedent concerning the scope of personal data obscures what constitutes the processing of anonymized data. As discussed previously, linkages of patient data are permissible under the NHMRC National Statement, although the Statement does not define whether non-identifiable data remain personal data. The risks in Australia are increased by the fact that in practice, government agencies decide whether to open their datasets without reference to those with relevant expertise. However, it can be very difficult to preserve confidentiality in the absence of a universal patient identifier while matching patient data. This heightened risk of re-identifiability is due to the larger range of fields in the final linked dataset. The Office of the Australian Information Commissioner (OIAC) has published additional guidance on what constitutes appropriate anonymization or de-identification. Specifically, the OIAC has published an Australian adaptation of the UKAN Anonymisation Decision-Making Framework as a guide for anonymization and contextual control. The second relates to the circumstances in which health records can be reused for research-related purposes. As discussed in the introduction, there is a considerable benefit to linking data from electronic health records for generating new insights in public health or personalized medicine. Nevertheless, unless general consent for future has been sought from each individual before their records were processed, researchers may not have lawful grounds to process these data. Australia, Canada, Switzerland, and the USA all feature either federated or code of conduct driven regulations of general consent and data linkage. As discussed previously, in the context of anonymization, this style of regulation provides significant flexibility for both regulators and researchers to respond to technological change. However, it also provides regulators with the capacity to introduce new forms of consent mechanism to permit the reuse of data. This flexibility can be contrasted with the ostensible unavailability of general consent under the GDPR, which may place significant limits on the reuse of personal data. Although Italy, Spain, and the UK have created derogated exceptions for consent in biomedical research, there remain outstanding questions on whether these derogations will remain compatible with the GDPR. Furthermore, it may be impossible to seek general consent for research in countries that do not have these derogated exceptions. Outside of the risks of re-identification, the regulatory frameworks do not impose significant obstacles on the transfer of data between organizations. However, transfer across jurisdictions constitutes a significantly more severe regulatory to obstacle data sharing and data linkage. Aside from the USA, data cannot be transferred out of any of the jurisdictions discussed above without equivalent protection in the target jurisdiction or safeguards to ensure data protection. Both EU and Swiss law provide a narrow set of criteria under which personal data can be transferred in the absence of adequate legislative, including explicit consent of the subject. For example, the former Article 29 Working Party held that transfers of private data to Australia could only occur if EU equivalent safeguards were provided for this data in Australia. In part, the need for adequate protection was due to the fact that some sectors and activities were excluded from the protection of the act, as alluded to previously. Furthermore, the provincial framework in Canada places significant limitations on transfer even between different provinces in Canada. On the one hand, some provincial legislation provides ethics committees with a broad discretion to permit the exchange of data across borders. Others do not specify when data can be transferred or require appropriate consent for transfer. Furthermore, these frameworks override the operation of PIPEDA or the Privacy Act 1985. Due to the difficulty in re-obtaining consent for research as previously discussed, these divergences may therefore place significant limits on the exchange of data across jurisdictional borders without appropriate ethics approval. This legal conclusion is supported by survey and interview evidence from the OECD report discussing delays in the authorization of exchange of de-identified health-related data. Therefore, the greatest regulatory obstacles identified in this paper are as follows: inconsistent definitions regarding de-identification, divergences in consent requirements, and the difficulty in ensuring adequate protection for transfer between jurisdictions (Table 4).

Table 4

Comparison of transfer between jurisdictions and in particular whether data can be transferred without contractual mechanisms

From	To
From	Switzerland	EU-GDPR	US	Canada	Australia
Switzerland	N/A	Adequate	Adequate^a	Contract	Contract
EU-GDPR	Adequate	N/A	Adequate^a	Adequate^a	Contract
US	Adequate^b	Adequate^b	N/A	Adequate^c	Contract
Canada	Contract^d	Contract^d	Contract^d	N/A	Contract^d
Australia	Contract^e	Contract^e	Contract	Contract^e	N/A

Key: Adequate = can be transferred without contractual mechanisms. Contract = can only be transferred with contractual or corporate governance mechanisms.

aOnly transferable to private organizations (privacy shield).

bTransferrable, but requires compliance with common rule for federally funded research.

cTransferrable, but requires compliance with PIPEDA depending on the entity that is sending the data; while commercial entities must comply with PIPEDA, non-profit or government entities may be uncovered.

dDepends on provincial privacy laws, may require contractual mechanisms or consent.

eRequires contractual or corporate governance mechanisms or equivalence.

Comparison of transfer between jurisdictions and in particular whether data can be transferred without contractual mechanisms Key: Adequate = can be transferred without contractual mechanisms. Contract = can only be transferred with contractual or corporate governance mechanisms. aOnly transferable to private organizations (privacy shield). bTransferrable, but requires compliance with common rule for federally funded research. cTransferrable, but requires compliance with PIPEDA depending on the entity that is sending the data; while commercial entities must comply with PIPEDA, non-profit or government entities may be uncovered. dDepends on provincial privacy laws, may require contractual mechanisms or consent. eRequires contractual or corporate governance mechanisms or equivalence.

Appropriate Regulatory and Technical Solutions

Before offering appropriate regulatory and technical solutions, it is important to consider that the research ecosystem is in a constant state of regulatory flux. Traditionally, researchers submit a project protocol requesting access to health data stored in hospitals or some other facility. The research ethics committee at that institute approves the research protocol, checks whether consent extends to the purposes requested, and monitors the project. The introduction of electronic patient data storage facilitates not only the accessing but also the reuse of patient data. In the current research landscape, there are multiple internal organizations that often have overlapping roles in managing the release of data for secondary purposes. In universities, data access committees (DACs) play an increasingly important role in designing access request policies for biomedical data, including genomic data. These DACs are complemented by data access control offices, which were first established as part of the International Cancer Genome Consortium (ICGC) project. The purpose of this office was to standardize the access process for ICGC data across participating institutions. The change in the current research cycle is demonstrated in Diagram 1.

Diagram 1

The clinical research cycle.

The clinical research cycle. However, this ease of access and reuse raises privacy and confidentiality concerns that warrant organizational solutions and technological advances to guarantee privacy. Furthermore, as Shabani, Knoppers, and Borry notes, for these mechanisms to work, there must be clearly defined organizational responsibilities.

Organizational Solutions

A significant number of the legal and ethical obstacles discussed above result from a lack of focus upon the context for processing, disclosure, and use of health-related data. On one hand, the broad, absolute definition of personal data contained within EU and Swiss data protection law does not ostensibly reflect the potential for re-identification from a particular dataset. In other words, completely preventing any potential re-identification may reduce the usefulness of the data set entirely. In turn, useless data undermines the rationalization for using data for research purposes or may even result in poor conclusions being drawn from data. On the other hand, the narrow definition of PHI contained within HIPAA and the Security Rule may fail to prevent the disclosure of personally identifying information via inferences. Prima facie, both regimes do not ultimately acknowledge that the risk of re-identification may vary significantly depending on the content of each dataset. Therefore, one strategy to resolve this problem involves the use of contextual disclosure based on organizational agreements between agencies. This strategy reflects the best practices in the UKAN Anonymisation Decision-Making Framework to consider both technical and contextual risks associated with reuse and disclosure. Furthermore, the limited existing case studies of successful linkages of medical data already rely on organizational links for multisite research. For example, one of the oldest and most successful ongoing projects involving data linkage is the Western Australian Data Linkage System (WADLS), established in 1998. Initially, the WADLS only contained data generated from local hospital records. However, Australia features a hybrid federated system of governance, where both Federal and State governments are responsible for healthcare. Any healthcare provided to patients by state hospitals would nevertheless be funded by Australia’s national healthcare System Medicare and the Pharmaceutical Benefits Scheme. Linking this data together could provide rich information for longitudinal health studies, but it could also raise significant risks of re-identification or other breaches of confidentiality. Accordingly, a research protocol was developed and approved by the then Australian Privacy Commissioner to link these datasets to patient datasets from state-funded healthcare providers. First, this protocol involved a memorandum of understanding where the responsibilities of each party were established and ethics approval for creating linked data files was obtained from both State and Federal committees. Additional ethics approval for the use of linked data files in individual research projects then requires minimal bureaucratic delay, as researchers receiving the data only to see the unique identifiers. Although ethics approval was sought to reuse the data without consent, the previous benefits of longitudinal health projects served to engender community trust in health data linkage. Therefore, similar organizational strategies need to be adopted for health data linkage projects to ensure the successful reuse of data. First, data can only be used or shared between trusted organizations. The identity and responsibilities of these trusted organizations must be determined with the assistance of participating institutions. These include the data custodians such as hospitals and physicians who collect the data from patients, as well as the research institutions responsible for processing data. Furthermore, research ethics committees can play an important role in determining the responsibilities of each institution. In addition, research ethics committees should provide guidance on developing an initial ethics approval to minimize the bureaucratic burden for ongoing uses of data. Where the responsible institutions operate in separate jurisdictions, this process should account for any legislative divergence and can be reinforced by binding contractual mechanisms (as discussed below). Finally, research ethics committees should provide guidance to researchers on mechanisms under which the relevant data can be made available for reuse. These strategies can include promulgating general consent forms to patients to make data available for further use, or determining whether a waiver can be sought for the ongoing use of data. An additional but equally important step is ensuring that public trust is obtained for the use of the data in this fashion. Although the WADLS has been enormously successful, other projects such as the care.data program in the UK have been cancelled due to a lack of outreach and resultant community outrage. In particular, the exchange of patient data to private companies for commercial purposes can easily result in patients refusing to participate. Accordingly, data linkage projects must reflect the fact that they are ultimately designed to ensure the quality and effectiveness of public health services. There are a number of strategies that can be employed in this context to improve patient trust and reciprocity. The first is the use of dynamic or granular consent, where patients can opt in and out of certain research projects. Although dynamic consent has been considered in-depth within the legal and bioethical literature, Estonia has introduced this mechanism into their national electronic health record system. Via their web accessible record, patients are constantly notified as to whether they wish to continue participating in particular projects. A second important strategy is the use of patient education and awareness campaigns. Robertson et al. noted that expert interviewees identified the lack of public awareness about the care.data program as a key reason the program failed (compared with the WADLS). Furthermore, as Gille, Smith, and Mays note, public trust derives from a number of sources, including the existence of regulatory systems to enforce rights. Therefore, any public awareness campaign must be designed so that patients are aware of the ability to enforce their rights, as well as the permissible ethical uses of their data.

Contractual Solutions

Another strategy that has been increasingly used to promote data accessibility, particularly for international research projects, involves contractual solutions to support research and processing. In particular, standard form agreements for data use and transfer help demonstrate satisfactory protection for data transferred between organizations, as well as data transferred across borders. For example, the Swiss, EU, Italian, Spanish, UK, Australian, and some Canadian provincial legislation reference agreements for the transfer of data. These agreements set minimum standards of security and processing that all parties involved in data linkage must comply with. For countries with more flexible data processing standards, such as Australia, contracts can provide more explicit guidance to researchers in those jurisdictions as to how to meet their obligations. Finally, the existence of contracts for processing further reinforces the social license required from members of the public to use their data for processing. This question of public license also raises the related question of data ownership. Although not explicitly explored within the previous comparison of legislative frameworks, collections of health information also involve numerous stakeholders who each hold specific rights over their data. Specifically, any solution requires considering not only data protection laws and research ethics regulations but also intellectual property laws (such as copyright), and data use or material transfer agreements (MTAs). In the USA, there is no copyright protection for raw data, but only products of data. Similar limitations exist on copyright protection for compilations of data in Australia, although Canadian courts have acknowledged copyright for data compilations. In addition, although raw data are unlikely to receive copyright protection, copyright could extend to curated patient records and associated notes. To resolve these differences in intellectual property protection, standardized contractual mechanisms can be used to determine the rights held by each participating stakeholder. In biomedical research, MTAs are frequently used to govern the exchange of human tissue and data between institutions and to ensure provenance. Therefore, similar agreements have been attempted for data transfer and use agreements (DTUA). However, designing a uniform DTUA carries a number of additional challenges beyond a uniform MTA. The first is resolving the conflict between different data protection and privacy regimes. The analysis above demonstrates that data can only be transferred to jurisdictions where adequate protection is supplied. Therefore, any MTA or DTUA must ensure that arrangements for data sharing are ethically robust and legally compliant. However, one complicating factor is additional intellectual property rights that may vest in data. In the EU (but not Switzerland), copyright protection is further complemented by the sui generis regime for databases. Nevertheless, this right only protects the non-technical structure of the database. Furthermore, the Database Directive prevents rights holders from introducing specific contractual terms that would fall outside the exceptions to restricted acts. The effect of this provision is that database rights may act to prevent standardizing contractual terms. Accordingly, when developing collaboration agreements, institutions should consider all applicable intellectual property rights.

Technical Solutions

The previous contractual and organizational measures can support data sharing, but alone can neither entirely prevent intentional or unintentional re-identification. Instead, these solutions must be paired with technical solutions supporting data linkage and data reuse in multisite research while technically minimizing the risk of re-identification. In the example of WADLS above, linkage was achieved by matching together each dataset probabilistically and by replacing all personal details with a unique link ID. These files were created via isolated secure computer systems, with the technicians responsible for linkage not allowed to participate in data analysis and personal demographic data being destroyed afterwards. Furthermore, as mentioned previously, traditional de-identification and pseudonymization techniques (implemented as the removal of common identifiers) are not enough to properly minimize the re-identification risk. This risk is particularly heightened when dealing with clinical and omics data. However, advanced cryptographic privacy-preserving solutions have the potential to play a major role in the significant reduction of the risk of re-identification. The DPPH project specifically focuses on the practical application of these technologies in medical research. The purpose of cryptography historically has been to hide the content of information by transforming it into ciphertexts through a hard-to-invert or one-way function. However, the greatest challenge with encryption technology reflects that discussed by Ohm regarding anonymization, that is, ensuring anonymized data is both safe and useful. Traditional encryption is profusely used nowadays to protect data in transit and at rest. Data in transit can be encrypted being communicated between two end-points, protecting it against eavesdroppers. Data at rest can be encrypted while stored in an untrusted environment, to protect it against unauthorized actors. Nevertheless, data reuse for research goes beyond communication and storage, involving a significant re-identification risk during computation, in which traditional encryption cannot be mitigated. For this purpose, more advanced cryptographic techniques can enable the computation of some functions on data without disclosing these data to the processor. These techniques are aligned with the minimization and purpose principles. Specifically, the technology makes it unnecessary to provide the recipient with access to the data, but only to the computation results needed for the established purpose. These techniques can be classified into software- and hardware-based solutions. Software-based solutions comprise homomorphic encryption and secure multi-party computation (SMPC). Hardware-based solutions comprise trusted execution environments (enclaves) where data are decrypted inside a tamper-proof enclave and processed in the clear only inside the enclave. In this section, we focus on software-based solutions. Recent advances in fully homomorphic encryption, which can handle an arbitrary number of encrypted operations, are promising. Unfortunately, their computational overhead is still far from practical. However, the so-called Somewhat Homomorphic Encryption (SHE) or Practical Homomorphic Encryption can efficiently enable a limited set of operations on encrypted data without the need to decrypt them first. This technology protects the data from an untrusted processor (or from attacks happening at the processor premises). Nevertheless, the versatility of these algorithms is limited, and they have to be paired with other techniques in order to develop fully functional end-to-end systems. Conversely, SMPC enables several parties to evaluate a function on private data coming from distinct data sources without aggregating or sharing the input data. At the end of the protocol, the parties learn nothing more but the value of the function. Various privacy-preserving supervised machine learning algorithms have been proposed and analyzed in the SMPC setting. In addition, both training and prediction algorithms for deep learning have been developed. Crucially, the combination of the aforementioned techniques may provide the means to allow computation without needing to transfer data between jurisdictions. These cryptographic techniques minimize the re-identification risk for data at rest, in transit, and during computation. However, the re-identification risk is not decreased for disclosed computation outputs that are disseminated for research, and sufficient information is available to identify the individual. In other words, the same legal and ethical limits that apply to the exchange of data will continue to apply to these research results. Specifically, repeated query requests to these data can be used to conduct inference attacks against individual records. However, statistical protection techniques, such as k-anonymity, t-closeness, and, in particular, differential privacy, can be used to address this issue. These techniques operate by minimizing the re-identification risk based on mathematical guarantees at the cost of a controlled reduction on data utility. Concrete examples of these technologies in multi-site data sharing scenarios have been proposed to enable queries on a set of independent databases while protecting privacy and confidentiality. These solutions also aim at protecting the privacy of each individual storing its data in these shared databases. To compare genomic data among different patients in an honest-but-curious model, one secure two-party computation method involves patients encoding their genomic variants. The system ensures that query outputs only some statistical information about the queried variants and nothing else about the patients. Recently, several frameworks have been described which use homomorphic encryption or multiparty computation in a decentralized way. These frameworks enable statistical queries on distributed databases while avoiding single points of failure for the security and privacy guarantees. Within the DPPH project, we have also developed an operational system, called MedCo. MedCo protects the confidentiality of the patients while enabling very efficient data discovery and feasibility analyses on clinical and genomic data. Our system achieves this by combining homomorphic encryption, multiparty computation, and differential privacy. At the time of writing, this system is being tested at the Swiss university hospitals as a pilot deployment.

Conclusion

This paper highlights the key regulatory jurisdictional divergences in data protection and research ethics laws for health-related data through the lens of data accessibility. Specifically, we identify three key differences between Switzerland, the GDPR (and its national implementation in Italy, Spain, and the UK), the USA, Canada, and Australia. The first key difference concerns inconsistent and technically vague definitions of anonymization, pseudonymization, and de-identification. As a consequence, these definitions obfuscate whether anonymization and de-identification can be satisfied algorithmically or whether changing the environment where the data are stored is also necessary. Furthermore, it is unclear whether encrypted data constitute their own category of data or should be assigned to one of the existing categories. The second key difference concerns the potential for secondary uses of data beyond that for which consent has been obtained. In particular, the GDPR places significant limitations on the use of data for secondary purposes. National implementations of the GDPR explored in this paper permit secondary processing of genetic and other health-related data. Nevertheless, there remain outstanding questions whether these laws comply with the GDPR’s consent requirements and whether they will create further regulatory fragmentation in the EU. The third key difference dovetails into this issue and concerns the requirements for transfer of health-related data between jurisdictions. With the exception of transfers within the EU and transfers between the EU and Switzerland, there are conditions on the exchange of data between jurisdictions. For transfers to the USA and Canada, transfer on the grounds of adequacy is limited to private companies. Accordingly, for transfers from the EU to public research institutes in the USA or Canada, either binding contractual and corporate rules or standard data protection clauses are required. Furthermore, these safeguards are required for transfers to and from Australia to all other jurisdictions considered in our paper. Therefore, in this paper we suggest a number of organizational, contractual approaches which we believe are necessary to increase data accessibility for multi-site research consortia. Organizational approaches include clarifying the roles of each participant in the consortia, prioritizing ethics requirements prior to research, and obtaining patient trust for research. Contractual approaches include continuing to utilize and improve standardized DTUA for the exchange of data between institutions. These agreements can provide explicit and uniform protection for patient rights across jurisdictions while clearly allocating intellectual property rights between institutions. Nevertheless, we concede that these organizational and contractual solutions alone are not sufficient to entirely protect against the risk of re-identification. Accordingly, in this paper, we propose a number of software-based technologies to protect the privacy of patient data while ensuring data accessibility. In particular, these technologies include variants of homomorphic encryption, such as fully and SHE, and secure multiparty computation. Crucially, these solutions diverge from traditional encryption in that they extend beyond protecting data at rest and in transit, and instead only disclose the results of computation. This protection is achieved either by encrypting the inputs of computation or ensuring that each party can process data without the need for access to all the inputs. To decrease the risk of potential re-identification from results, secure computation can be coupled with further privacy guarantees such as differential privacy. Future research should explore, in-depth, the compatibility between these advanced privacy preserving technologies and the existing ethical and legal frameworks, as well as future standards and soft law. Finally, any use of these technologies must be backed by an appropriate education and public awareness campaign. The technological solutions that have been proposed in this paper are conceptually complicated, increasing the difficulty of explaining the benefits of this technology. Therefore, any public education campaign should be carefully crafted so as to reflect the computer literacy of members of the general public. Furthermore, this education campaign should be framed in such a way as explain to users how to exercise their rights with respect to the technology in question.

7 in total

Review 1. Rules of engagement: Promoting academic-industry partnership in the era of digital pathology and artificial intelligence.

Authors: Liron Pantanowitz; Marilyn M Bui; Chhavi Chauhan; Ehab ElGabry; Lewis Hassell; Zaibo Li; Anil V Parwani; Mohamed E Salama; Manu M Sebastian; David Tulman; Suryanarayana Vepa; Michael J Becich
Journal: Acad Pathol Date: 2022-05-30