Jennifer B McCormick1, Margaret A Hopkins1. 1. Department of Humanities, College of Medicine, Pennsylvania State University, Hershey, Pennsylvania, USA.
Abstract
OBJECTIVE: Researchers are increasingly collecting large amounts of deidentified data about individuals to address important health-related challenges and answer fundamental questions. Current US federal regulations permit researchers to use already collected and stored deidentified health-related data from a variety of sources without seeking consent from patients. The objective of this study was to investigate public views on the policies and processes institutions have in place for accessing, using, and sharing of data. MATERIALS AND METHODS: We conducted 5 focus groups with individuals living within a 20-mile radius of the local academic medical center. We also held a focus group with undergraduates at a local university. RESULTS: A total of 37 individuals participated, ages 18-76. Most participants were not surprised that researchers accessed and used deidentified personal information for research, and were supportive of this practice. Transparency was important. Participants wanted to know when their data were accessed, for what purpose, and by whom. Some wanted to have some control over the use of their data valuing the chance to opt-out. Finally, participants supported establishment of an advisory council or group with responsibility for deciding what data were used, who was accessing those data, and whether data could be shared. DISCUSSION AND CONCLUSIONS: The trust people have in their local institutions should be considered fragile, and institutions should not take that trust for granted. How institutions choose to govern patients' data and what voices are included in decisions about use and access are critical to maintaining the trust of the public.
Biomedical and public health sciences researchers are increasingly collecting and
combining large amounts of data about individuals to address important
health-related challenges and answer broad fundamental questions about healthy state
versus diseased state. Sometimes referred to as “big data,” these data are
typically deidentified and mined from both healthcare- and
nonhealthcare-related sources. Healthcare-related sources include electronic medical records, biobanks,
and insurance claims; nonhealthcare-related sources include lifestyle
questionnaires and social media sites.

Researchers have benefited from having access to big data in multiple ways. By
combining research participants’ data into large repositories, investigators
can examine differences, for example, between healthy populations and populations
with a specific disease phenotype. Because they can access sample sizes larger than
what they could collect on their own, researchers also can investigate variations
among populations of different ethnicities, socioeconomic backgrounds, and geocodes.
This volume of data also can enable deeper understanding of how different molecules
within a biological system interact and influence healthy and diseased states. In fact, the National
Institutes of Health All of Us program is based on using large volumes of data to
benefit public health.

While current U.S. federal regulations permit researchers to use collected and
combined deidentified health-related data without seeking explicit consent from
patients, patients often are unaware that their health
information, which was collected for clinical use, might be repurposed for
research. Nonetheless, studies have found broad support for this
practice, as evidenced by a systematic review of 25 qualitative studies. Studies have shown that
some research participants support data sharing as a means of contributing to
advancements in healthcare—the “greater good”—while
others see the potential to learn health information about themselves or to help
others with the same health condition or disease.

Still, research has shown that in many cases, participants’ support is
conditional. Researchers have found, for instance, that participants want
“granular control”—that is, they will share some information
but not necessarily information they consider “sensitive.” Furthermore, they want to
choose or control with whom information is shared. Other studies have found that
participants’ willingness for their data to be shared is linked to being
consulted or asked for consent, or to the data being shared with the healthcare systems in which they are patients
or with nonprofits.
However, participants are less inclined to want their health data shared if the
entities involved are private companies such as insurers or government
agencies. Perhaps not surprisingly, study participants
have fewer reservations when their data are anonymized, although some people hesitate
even then. It should be noted that many of these studies
specifically ask participants about biobank samples and data in the electronic
medical record.

A frequent concern cited by study participants involves the security of their
shared health-related data, or security breaches that would allow for inadvertent
sharing of their data. That concern is heightened when the sharing of data is
across healthcare organizations or with entities not directly delivering patient
care. Patients want to know more about the kinds of
protections in place to protect their information. Researchers also report that some
patients are more concerned about security of financial information or personal
identity than they are about their health data. In some studies, participants raise
concerns that broad data sharing could lead to stigmatization of communities or
negative treatment of and discrimination against certain ethnicities.

While multiple studies have explored patients’ views on the sharing of their
health-related data, few have investigated patient views on the policies
institutions have in place or should have in place for the sharing of those data.
Such policies would include who has oversight responsibility for the sharing of
deidentified patient data; who is deciding about what entities can access data; and
what mechanisms are in place to ensure that shared data are used appropriately.
These are questions about data governance. A European study involving patients in 10
countries is one of the few to specifically ask participants for their views on
governance structures for managing the large amount of health and genetic data being
collected. Those
researchers found that participants wanted experts within the organization that was
home to the data to review requests for data access. Those same experts should
monitor how those data were used. Participants defined those experts as healthcare
professionals, researchers, patient representatives, and lay persons, among
others.

Given the few studies that have explored who is making decisions about data access
and use, we conducted an exploratory study using focus groups to fill this gap and
investigated people’s views on how the access and use of their deidentified
health-related information should be managed. Here we discuss the most salient
findings: (1) participants have concerns about whether the data would be shared and
with whom; (2) participants view some institutions as trustworthy and others as not;
and (3) participants value transparency about who should make decisions about the
access and use of personal data by researchers.
MATERIALS AND METHODS
We held 5 focus groups with participants who live within a 20-mile radius of an
academic medical center in rural Pennsylvania. Participants were recruited through
flyers, articles in local newspapers, and StudyFinder. (StudyFinder is a
Pennsylvania State University website for the public to search for actively
recruiting university clinical research studies by keyword or browse by health
condition. The University of Minnesota Clinical and Translational Science Institute
developed the StudyFinder platform (UL1TR000114) and shared it throughout the
National Center for Advancing Translational Sciences’ Clinical and
Translational Science Award Program. Pennsylvania State University Clinical and
Translational Science Institute (UL1TR002014) customized the StudyFinder platform
and supports it for Pennsylvania State University.) Inclusion criteria were English
speaking, willingness to share in a group, and 18 years of age and older. To explore
whether generational differences might influence perspectives, we also recruited
undergraduate students (n = 10) at a nearby institution of
higher education.

Participants completed a demographic questionnaire that included questions on age,
gender, marital status, occupation, educational achievement, and ethnicity. They
were also asked if they or members of their families had previously taken part in a
research study. A discussion guide with open-ended questions was developed to
explore participants’ understanding of what personal data are being accessed
and used by researchers. One research team member (JBM) facilitated each group. Each
focus group lasted between 60 and 75 min. Discussions were audio recorded and
transcribed by a member of the research team (MAH).

At the start of each discussion, we provided general information on what is meant by
personal health-related data, sources of those data, and the concepts of
“deidentification” and “anonymity.” We encouraged
participants to ask questions and offer opinions in order to arrive at a shared
understanding of those concepts. With this framing, we then moved to
participants’ perspectives on whether they supported the use of their
deidentified health-related data for research, whether participants wanted to be
informed about research studies that access and use their data, and whether personal
data collected at one institution should be shared across academic institutions and
with other entities such as pharmaceutical companies. We also asked who should be
making decisions about how patients’ personal health-related data are
accessed and used. These questions were asked of each focus group. The discussion
guide is available upon request.

The initial 5 focus groups were held in Fall 2018, at which point saturation of data
was reached. However, because only 2 of 27 participants were younger than 40 years
of age, we deliberately sought to hold a focus group of younger participants in
order to explore generational differences. This focus group was held in Spring 2019
on the campus of a nearby institution for student convenience. This focus group lasted
45 min because of students’ class schedules.

Both authors (JBM, MAH) read each transcript individually, identifying themes that
emerged from the data. After sharing and discussing these, we developed a
preliminary codebook based on the most prominent themes that we agreed upon. We then
re-read the transcripts, revising and refining the codebook in an iterative process.
Disagreements about codes were resolved through discussion. NVivo 12 (QSR International) was used for
coding.

This study was approved by the Pennsylvania State University Institutional Review
Board, and all participants provided verbal informed consent.
RESULTS
Participant characteristics
The initial focus groups included 27 participants (20 women and 7 men; average
age was 58 years). Twenty-five participants self-identified as white, one
self-identified as American Indian, one as multiracial. Educational levels
ranged from completion of a GED (n = 1), graduation from
high school and junior college (n = 9), completion of a
4-year degree (n = 10) to completion of graduate and
professional degrees (n = 7). One participant worked in
health care as a registered nurse while 13 self-identified as retired.
Participants also were asked if they had concerns about who has access to their
personal information. While most didn’t (n = 17),
a notable number did (n = 10). The ages of those who
participated in the undergraduate student focus group were between 18 and 22
years with 1 returning adult who was 34 years old.

Below we describe 3 findings that emerged from the initial 5 focus groups. This
is followed by data from the student focus group.
Concerns about whether personal health data will be shared and with
whom
While participants were broadly supportive of research and willing to have their
data used by researchers with their academic medical institution, more than half
of participants were less willing for their data to be shared with other
institutions. These participants’ reasons mostly reflected uncertainty
about downstream use of their data. For instance, one participant worried that
the purpose of the research might change: “Cause you don’t know what
the next institution’s gonna do or who they’re gonna give it [the
data] to.” (Male 3, Focus Group 4). Another who had misgivings about
sharing her personal information with researchers at her healthcare institution
assumed that downstream sharing of information could lead to identification of
patients: “I would be worried at some point that people would be
identified. If it’s going so many places, so many people are involved, so
many people seeing that data, that would be a little worrisome to me”
(Female 3, Focus Group 1).

Concerns also were raised about what happens with data given possible mergers and
acquisitions. “You don’t know in the future what [this
institution] will evolve into. To some extent you’re just letting it
go” (Female 6, Focus Group 2).

Few participants wanted their data shared with pharmaceutical companies or
companies that stood to benefit financially from it. While pharmaceutical
companies were mentioned by 8 participants, 3 participants also took issue with
biotech companies such as 23andMe that share personal information.
“If my data is [sic] being used with private companies that are all about
profits, that’s when I think I would have more of an issue with
it” (Male 4, Focus Group 4).

Such concerns led a third of participants to want some control over the use of
their data. For instance, they wanted limits on how long their data could be
used or how their data were used: “…I want to know how the data
was used to help other people or you know, maybe led to another study to get
closer to what you’re trying to find” (Female 6, Focus Group 2).
Others wanted researchers to ask participants for permission to share their data
with one participant suggesting that patients be provided with a checklist of
possible entities with whom data could be shared.

A few, however, were comfortable with having their data shared as long as they
knew about the sharing or were notified. Four participants noted that they
assumed the practice of sharing their data was already occurring so were not
surprised when informed about it. Three additional participants were willing to
have their data shared with other institutions but only if those institutions
provided their research protocols or signed agreements to follow “the
same standards and responsibilities and ethical considerations….”
of the data-granting institution (Female 6, Focus Group 2).
Why some institutions are trustworthy and others are not
A quarter of participants described themselves as being inherently trusting, and
trusting “…until you give me a reason not to trust you”
(Female 2, Focus Group 4). A third of participants extended this trust to their
healthcare organization: “I trust [the academic medical center where this
study occurred] because I haven’t been burned by it” (Male 2,
Focus Group 4). Another cited her positive experience with her doctor as her
reason for her trust in the organization while a third participant said she
trusted the hospital because “it’s been around for a long
time” (Female 3, Focus Group 1). Others based their trust in their local
healthcare organization simply because they and their family members have
received treatment there or known people employed by it.

While participants noted the organization’s positive reputation, they also
acknowledged they had had some negative experiences. Reflecting on those, one
participant recognized she had not only learned more about the institution from
those experiences but they had also solidified rather than weakened her trust in
the institution: “I know them well enough to trust them” (Female
2, Focus Group 1).

Six participants expressed trust in the biomedical research enterprise. They
noted the existence of research protocols, protections such as HIPAA, and
physicians’ Hippocratic Oath as sources of trust: “I think
that’s why a lot people kind of trust the researchers here …
because you’re doing no harm, you’re doing good. That benefits you
to be doing research in this setting,” said a participant (Female 6,
Focus Group 2).

That trust didn’t always extend to other academic medical centers. Asked
about sharing data with other universities, one participant commented,
“I’d want to know more. You can’t just say, oh, it’s
Princeton. Who at Princeton, you know?” (Female 6, Focus Group 2). Said
another, “Cleveland Institution or whatever you mentioned, I would
probably not do anything with them at this point because I know nothing about
them” (Female 2, Focus Group 1).

A third of participants had even less trust in institutions such as
pharmaceutical companies and biotechnology corporations. Big Pharma, said one
participant, “isn’t really interested in a healthy population.
They’re interested in selling as many of their drugs as they can
produce” (Male 1, Focus Group 5). Another participant cited pricing
issues as a source of distrust: “I read that some pharmaceutical
companies when they find a particular medication that’s most popular,
they tend to increase the prices on them … I don’t like
that” (Female 4, Focus Group 1).

Participants who were distrustful of pharmaceutical companies also tended to be
distrustful of biotechnology companies and of government agencies’ use of
genetic databases produced by biotech companies. In response to a question about
providing a genetic sample to the National Institutes of Health, one participant
noted that the possibility “freaks me out. You know, it’s gonna
come back, and it’s gonna bite you in the butt” (Female 1, Focus
Group 2). Said another, “I do question the government thing [and am] wary
of what they’re (sic) gonna do with it…. Things like government, I
feel they’re always out to get you” (Female 5, Focus Group 2).
Who should make decisions about researchers’ access and use of
patients’ personal data
All participants supported establishment of an advisory council or group to make
decisions about what data were used, who was accessing those data, and whether
data could be shared. This group would function as a “gatekeeper between
the data and the use, so that a researcher who wants the data needs to make a
very formal proposal to this council before you open up the doors to all the
data” (Female 7, Focus Group 2). This group should be formalized through
policy so that “you don’t just have a group of people who get
together Monday morning with a cup of coffee and say, ok, we’re gonna let
him have the data” (Male 1, Focus Group 3).

No consensus was reached about the size of this group although almost all
participants across focus groups advocated for a team of individuals rather than
a sole individual. Their argument was that a team would keep the decisions from
being hijacked by a single decision-maker with an agenda: “There needs to
be a group table because everybody’s looking out for their own
agenda” (Female 2, Focus Group 5); “That’s why I want 5 [at
the table]. They’re gonna have 5 different agenda, but they have to agree
on how it’s [the data] is used, so one agenda can’t take priority
over the others” (Female 2, Focus Group 1).

While no consensus was reached about the number of seats at the table, there was
consensus that members of this advisory group were not just “Joe Blow
from down the road” (Female 5, Focus Group 1) and didn’t
“have to be all doctors” (Female 2, Focus Group 1). More
specifically, stakeholders should include lawyers, a cybersecurity expert or
computer scientist, medical professionals, and researchers, the last of which
included both those involved in the study and those with expertise about the
specific area of research such as department heads. “A team of folks that
understands what the whole mission of the research is, whatever the research
is,” noted one participant (Female 2, Focus Group 3). Said another,
“someone who can determine the need to know, who knows who needs to
know” (Female 4, Focus Group 1). Inclusion of hospital administrators,
members of hospital ethics committees, and privacy advocates also was
mentioned.

Opinions differed about whether patients and volunteers should also be
represented with one participant asking, “who would choose that person
from the public and what makes that person from the public qualified?”
(Female 2, Focus Group 1). However, about a third of participants endorsed
having volunteers or someone representing research volunteers on the advisory
group, noting the need for “representatives that are like us, that are
participants who can get their voice heard. That would at least give people the
sense that it’s not just researchers that are making the
decisions” (Female 6, Focus Group 2). It was also suggested that the
advisory group membership fluctuate depending upon the purpose of the study or
the population to be studied: “Maybe the group is multiple. It’s
not one set group because of the different types of research that you’re
dealing with” (Female 4, Focus Group 3).
Exploring university students’ attitudes about data sharing, data
governance
Participants in this focus group recognized that the sharing of personal data
through information technologies is ubiquitous, and they largely accepted that
as the price of being connected. However, opinions differed about the
acceptability of sharing of personal health data. Two students assumed that from
a medical perspective, their health data would have little value:
“Nothing special [has] happened to me, so I don’t really care [if
health data shared]—especially if it’s going to benefit in a
positive way” (Female 4, Focus Group 6).

Even so, half of the student participants wanted some control over the sharing of
and access to their information. They wanted to sign a consent form or be asked
for permission before their personal health information was shared. Another
wanted to be assured the purpose was good—“I want you to cure
something” (Female 4, Focus Group 6). Yet others wanted “the
ability to revoke the use of information,” depending upon the purpose of
the research and the recipient of the information (Female 7, Male 9, Focus Group
6).

Unlike other participants, these students wanted to know the results of any
research that used their data. Once the research was finished, students wanted
to see either the full report or a short summary explaining how their data were
used. As one student suggested, learning this could assure them the data were
deidentified: “You want to make sure you’re not out there”
(Male 10, Focus Group 6).

As for who should make decisions about the access and sharing of health-related
data, the student participants added a privacy advocate and an enforcer to
ensure that policies are followed. They hedged on inclusion of a student on the
advisory board: “Someone who can speak on behalf of students, but I
don’t know that a student would be in the best position to understand the
ramifications of things” (Female 7, Focus Group 6).
DISCUSSION
Overall, our participants were supportive of health-related research and trusting of
biomedical and public health sciences researchers. While generally unaware that
their personal data could be accessed and used by researchers, they had few
reservations about this practice when pursued at their local institution. However,
participants were less supportive of having their data shared with other
organizations. Some participants wanted only researchers at their local institution
to use their data while others wanted to be consented every time their data might be
shared outside the local institution.

For our participants, transparency was key. They wanted to know when their data were
accessed, for what purpose, and by whom. Some also wanted to know the purpose of the
study so as to determine if they agreed with it or to opt-out. For others, the
determining factor was whether their data were going to be shared outside of their
local institution. Across the focus groups, however, wanting some control
didn’t conflict with participants’ support of the use of their data
for research.

Participants saw advisory boards as foundational to that transparency. Those boards
would develop and implement policies and procedures that would address
participants’ questions about when their data were accessed, for what
purpose, and by whom as well as whether and with whom their data would be shared. As
such, these boards’ mission would be fundamentally different from that of
institutional review boards (IRBs), which oversee the safety and well-being of
individuals who participate in research. While the IRB has 1 or 2 community members,
the rest of the membership are individuals with expertise in research using humans
and the regulations for research using humans, that is, the Common Rule. A data
governance board would have individuals with expertise in data science, data sharing
and privacy, computer science and cybersecurity, and federal and institutional
policies about data access, use, and sharing.

Underlying participants’ concern about whether their data would leave the boundaries
of their local institution were issues of trust. As has been concluded by other
researchers, people generally are more trusting of known or familiar
organizations. Hence, our participants trusted their local
academic medical institution, but were less trusting of other academic medical
institutions—even nationally known ones such as Princeton University. Our
participants also didn’t trust commercial entities such as insurance
companies and pharmaceutical companies. Sharing of their data with pharmaceutical
companies in particular was not supported if that sharing were to result in profits
for a company or the industry as a whole.

The trust people have in their local institutions should be considered fragile, and
institutions should not take that trust for granted. Patients expect that their
healthcare institutions will be careful, conscientious, and responsible caretakers
of the personal information with which they are entrusted. To that
end, our participants supported establishment of an advisory council or group with
responsibility for deciding what data were used, who was accessing those data, and
whether data could be shared. Our participants also expressed interest in knowing
who serves on that data governance board and what their backgrounds and expertise
are.

How institutions choose to govern patients’ data and what voices they include
in decisions about use and access are critical to maintaining the trust of the
public. As a concept and a practice, governance is said to provide a way for
addressing the ethical, regulatory, and policy challenges of research with personal
information. More specifically, governance addresses how and why deidentified data
are accessed and used by researchers, who makes those decisions, and how these
decisions are made—all of which may not be known by patients who are
providing the data.

This study has several limitations. The study is qualitative with a small sample
size, and as such the findings are not generalizable. The majority of our
participants are 35 years or older. Additionally, 17 of the 27 participants were
college graduates or had advanced professional training, raising the questions of
whether focus groups of participants with less educational achievement would result
in similar findings and whether those with less education felt they could contribute
meaningfully. (Focus groups have been used successfully in communities with low
literacy.) Most of our participants are white as well as
patients at the local academic medical center. Their race/ethnicity may have
generated a sense of trust in institutional researchers that black or brown
participants might not have had, given historical medical inequities. The same is
true for whether familiarity with the academic medical center might have resulted in
more trust than a nonpatient participant population might have had.

Another limitation is the technical background of the participants in the student
focus group, all of whom are pursuing degrees in information technologies. Selection
of these students was not deliberate but occurred through friend-of-a-friend
recruitment by one student who also participated in the group.

Finally, we conducted these focus groups prior to the COVID-19 epidemic. In this new
era we are witnessing more surveillance and less trust in certain institutions,
though confidence in medical scientists has grown since the coronavirus
outbreak.

In spite of these limitations, our findings provide initial insights into what
patients think about who should be making decisions about data access and use and
how those decisions should be made, and as such, provide the basis for larger, more
generalizable future work.
CONCLUSION
Participants in our study were clear in wanting to know about their local
institution’s governance processes and policies. They advocated for a diverse
group of stakeholders from researchers to patients to serve on an advisory or
governance committee. They also advocated for more information either to be provided
or to be available that spelled out the governance processes and policies.

Healthcare institutions typically provide materials outlining that patient
information may be used for research, education, and quality improvement purposes.
In addition, institutions might consider notifying patients when and how their
personal information is being used for research by, for example, sending periodic
letters to patients or hanging posters in waiting rooms. Others have
suggested similar types of notification. Ultimately, transparency of this sort may have an
important influence on patient trust in both their healthcare institutions and the
biomedical research enterprise. That
said, little is known about whether patients understand what is meant by
“patient information” or how they conceptualize “research,
education, and quality improvement purposes.” Moreover, there is a paucity of
evidence on whether this kind of transparency actually increases patient trust.

While we initiated each focus group with background on what is personal health
information, what are sources of health information, and what does deidentification
mean, these topics are multilayered and complicated. Given that, providing materials
may not be the best means of addressing the transparency about data governance
policies and processes that our participants said they wanted. Our findings, thus,
provide a basis for additional investigation into what patients think about who
should be making decisions about researchers’ access and use of personal data
as well as how those decisions should be made.
FUNDING
This study was funded by the Department of Humanities, College of Medicine,
Pennsylvania State University. JBM was funded in part by National Center for
Advancing Translational Sciences/National Institutes of Health under grant number
UL1 TR002014. The content is solely the responsibility of the authors and does not
represent the official views of the NIH.
AUTHOR CONTRIBUTIONS
Both authors contributed to the study design, data collection and analysis, and
manuscript preparation.
ETHICS APPROVAL AND CONSENT TO PARTICIPATE
This study was approved by the Pennsylvania State University Institutional Review
Board (IRB). Approval was granted for obtaining oral consent. Everyone in the study
consented to participate, including audio recording of the conversation,
transcribing of the recording, and analyzing and publication of the data
produced.
CONSENT FOR PUBLICATION
Both authors have consented to the publication of this manuscript.