Anthony R Pisani, Nitya Kanuri, Bob Filbin, Carlos Gallo, Madelyn Gould, Lisa Soleymani Lehmann, Robert Levine, John E Marcotte, Brian Pascal, David Rousseau, Shairi Turner, Shirley Yen, Megan L Ranney.
Abstract
Data sharing between technology companies and academic health researchers has multiple health care, scientific, social, and business benefits. Many companies remain wary about such sharing because of unaddressed concerns about ethics, data security, logistics, and public relations. Without guidance on these issues, few companies are willing to take on the potential work and risks involved in noncommercial data sharing, and the scientific and societal potential of their data goes unrealized. In this paper, we describe the 18-month-long pilot of a data-sharing program led by Crisis Text Line (CTL), a not-for-profit technology company that provides a free 24/7 text line for people in crisis. The primary goal of the data-sharing pilot was to design, develop, and implement a rigorous framework of principles and protocols for the safe and ethical sharing of user data. CTL used a stakeholder-based policy process to develop a feasible and ethical data-sharing program. The process comprised forming a data ethics committee; identifying policy challenges and solutions; announcing the program and generating interest; and revising the policy and launching the program. Once the pilot was complete, CTL examined how well the program ran and compared it with other potential program models before putting in place the program that was most suitable for its organizational needs. By drawing on CTL's experiences, we have created a 3-step set of guidelines for other organizations that wish to develop their own data-sharing program with academic researchers. The guidelines explain how to (1) determine the value and suitability of the data and organization for creating a data-sharing program; (2) decide on an appropriate data-sharing and collaboration model; and (3) develop protocols and technical solutions for safe and ethical data sharing and the best organizational structure for implementing the program.
An internal evaluation determined that the pilot satisfied CTL's goals of sharing scientific data and protecting client confidentiality. The policy development process also yielded key principles and protocols regarding the ethical challenges involved in data sharing that can be applied by other organizations. Finally, CTL's internal review of the pilot program developed a number of alternative models for sharing data that will suit a range of organizations with different priorities and capabilities. In implementing and studying this pilot program, CTL aimed both to optimize its own future data-sharing programs and to inform similar decisions made by others. Open data programs are both important and feasible to establish. With careful planning and appropriate resources, data sharing between big data companies and academic researchers can advance their shared mission to benefit society and improve lives.
©Anthony R Pisani, Nitya Kanuri, Bob Filbin, Carlos Gallo, Madelyn Gould, Lisa Soleymani Lehmann, Robert Levine, John E Marcotte, Brian Pascal, David Rousseau, Shairi Turner, Shirley Yen, Megan L Ranney. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 17.01.2019.
Keywords: cooperative behavior; crisis intervention; data sharing; ethics, business; industry; information dissemination; privacy; technology; text messaging
Year: 2019 PMID: 30664452 PMCID: PMC6354196 DOI: 10.2196/11507
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Open data program: challenge, principles, and protocols.
| Challenges and principles | CTL^a protocols |
| Inform users in an unobtrusive way that anonymized data are shared with select research partners | CTL provides texters with a link to an easy-to-understand Terms of Service^b, including a disclosure of potential future data use, before every crisis conversation |
| Establish a review process that includes outside academics and ethics experts | An internal CTL team and external ethics committee review applications, with special attention paid to nonmaleficence and justice, texter confidentiality, data security, and social impact |
| Require human subjects review by academic institution before data sharing | CTL requires each team to procure institutional review board approval |
| Ensure adequate protection of marginalized groups | CTL reviews research proposals as well as final manuscripts before journal submission for inadvertent stigmatization of marginalized groups (eg, LGBTQ+) |
| Determine which data are released to each team | CTL creates custom datasets for each team, sharing variables on a need-to-know basis for up to 1 year |
| Protect against release of potentially identifying information | In addition to scrubbing all data for personally identifiable information such as names, addresses, emails, and social media handles, CTL transforms or coarsens any data found to pose a risk to texter confidentiality (eg, university name) |
| Maintain possession of and oversight over data and use | CTL gives each team a virtual machine (VM) hosted on Amazon Web Services and accessed via a virtual private network. All analyses are conducted and stored on the VM with copy/paste and export functionalities disabled |
| Authorize who can access the data | CTL grants access to university faculty only with demonstration of ethics approval, a signed data use agreement, and a clear data management plan |
| Require university oversight of, and liability for, researcher behavior when interacting with the data | CTL signs a Data Use Agreement with the lead researcher as well as his or her respective university |
| Limit the total number of teams to allocate sufficient resources, support, and oversight | CTL limits the number of teams to ≤6 per quarter |
| Prioritize research that can benefit users and the service | CTL reviews applications for |
| Assist with accurate and responsible reporting of results | CTL reviews data output requests and manuscripts before journal submission for accidental breaches of texter confidentiality and accurate contextualization of findings |
^a CTL: Crisis Text Line.
^b Terms of service: “We have created a formal process for sharing information about conversations with researchers at universities and other institutions. We typically share data with trusted researchers when it will result in insights that create a better experience for our texters. We follow a set of best practices for data sharing based on the University of Michigan’s Inter-University Consortium of Social and Political Research, one of the largest open data projects in the U.S., which includes stringent ethical, legal, and security checks. For more details, see our policies for open data collaborations” [19].
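The protocols for releasing variables on a need-to-know basis, scrubbing identifiers, and coarsening risky fields can be illustrated with a minimal Python sketch. This is not CTL's implementation, which the source does not describe at the code level. The regular expressions, the placeholder tokens, the `COARSEN` map, and the function and field names (`scrub_text`, `build_custom_dataset`, `message`, `university_name`) are all assumptions made for illustration, and a production scrubber would need far broader coverage plus human review.

```python
import re

# Hypothetical patterns for illustration only; real scrubbing needs
# much wider coverage (addresses, phone numbers, free-text names, etc.).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
HANDLE_RE = re.compile(r"(?<!\w)@\w{2,}")

# Hypothetical coarsening map: fields whose raw value is replaced by a
# less specific one (the table's example is university name).
COARSEN = {"university_name": "a university"}


def scrub_text(text, known_names=()):
    """Replace emails, social media handles, and listed names with tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)    # emails first, so their "@" is gone
    text = HANDLE_RE.sub("[HANDLE]", text)  # then standalone @handles
    for name in known_names:
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, "[NAME]", text, flags=re.IGNORECASE)
    return text


def build_custom_dataset(records, approved_variables, known_names=()):
    """Keep only the approved variables, coarsening or scrubbing each value."""
    dataset = []
    for record in records:
        row = {}
        for variable in approved_variables:
            if variable not in record:
                continue
            value = record[variable]
            if variable in COARSEN:
                value = COARSEN[variable]              # coarsen risky field
            elif isinstance(value, str):
                value = scrub_text(value, known_names)  # scrub free text
            row[variable] = value
        dataset.append(row)
    return dataset
```

In this sketch, any variable that is not on a team's approved list is dropped, which mirrors the need-to-know principle. The ordering of the substitutions matters: emails are replaced before handles so that the handle pattern does not match part of an email address.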
Figure 1. Key questions for organizations considering data sharing.
Research partnership models.
| | Open data-sharing program (pilot) | Resident researchers | Selective research partnership |
| Summary | Open application process for multiple teams to access data and conduct diverse studies at a distance | Researchers apply for a 3- to 6-month on-site residency with access to data via computers maintained by the organization | Collaborate closely with a select few trusted research partners on a long-term ongoing basis |
| Pros | Maximizes variety and quantity of research projects | Eliminates cost of developing data center; eases communication and collaboration; and reduces data security concerns | Increases the organization’s voice in guiding research questions and operating principles and increases control over dissemination of research findings |
| Cons | Most costly option, requiring both start-up and maintenance funding and personnel | Geographic limitation to research collaborators and fewer teams at once | More limited scope of research |
Data management models.
| | Internal data management | Third-party data management |
| Summary | The organization manages data warehousing and access solutions | A third-party vendor manages the data warehousing for an external partner |
| Pros | Provides maximum control over data security and increases responsiveness to needs of the organization and researchers | Reduces technical costs of starting and maintaining a data center; increases data security, given the third party’s core competencies; and enables focus on leveraging and communicating research outputs |
| Cons | Significant expenses involved in starting and maintaining a data center and draws focus away from organization’s core competencies | Organization loses some control over data, therefore must work with a vetted partner |