Regina Becker, Pinar Alper, Valentin Grouès, Sandrine Munoz, Yohan Jarosz, Jacek Lebioda, Kavita Rege, Christophe Trefois, Venkata Satagopam, Reinhard Schneider.
Abstract
BACKGROUND: The new European legislation on data protection, namely, the General Data Protection Regulation (GDPR), has introduced comprehensive requirements for documenting the processing of personal data and for informing data subjects of its use. GDPR's accountability principle requires institutions, projects, and data hubs to document their data processing and demonstrate compliance with the GDPR. In response to this requirement, commercial data-mapping tools have emerged, and institutions are creating GDPR data registers with such tools. One shortcoming of this approach is that these tools are generic, and their process-based model does not capture the project-based, collaborative nature of data processing in biomedical research.
Keywords: GDPR; accountability; data mapping
Year: 2019 PMID: 31800037 PMCID: PMC6892452 DOI: 10.1093/gigascience/giz140
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: DAISY target users and information flow.
Overview of key DAISY entities
| Entity | Definition |
|---|---|
| Project | A time-limited research activity with associated documentation on the ethical, legal, and administrative procedures for carrying out its implementation |
| Partner | A research collaborator who is the source and/or recipient of human data. Partners are also legal entities with whom contracts are signed. Clinical entities that run longitudinal cohorts, research institutes, or data hubs are examples of partners. |
| Contract | A legal agreement with one or more partners. Contracts are established by one or more mutually signed documents. Data-sharing agreements, consortium agreements, and material transfer agreements are examples of contracts. |
| Dataset | A physical/logical unit of data, which is typically treated as a resource with an associated location and access control policy |
| Data declaration | A sub-unit of data, which is traceable to a particular source, which could be the provider partner and source cohort or another data declaration |
| Cohort | A study that collects data and/or biosamples from a group of participants (e.g., longitudinal case-control or family studies). A cohort is linked to the creation of data and is considered its ultimate source. |
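The entity definitions above can be read as a small data model. The following is a minimal sketch in Python, assuming plain dataclasses; the class and field names (`name`, `title`, `acronym`, `location`, `source`, and so on) are hypothetical illustrations of the table and are not taken from DAISY's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Partner:
    """A collaborator and legal entity that provides and/or receives human data."""
    name: str


@dataclass
class Cohort:
    """A study collecting data/biosamples; treated as the ultimate data source."""
    title: str
    owner: Partner  # assumption: each cohort is run by one partner


@dataclass
class Contract:
    """A legal agreement with one or more partners."""
    title: str
    partners: List[Partner]


@dataclass
class Project:
    """A time-limited research activity."""
    acronym: str


@dataclass
class DataDeclaration:
    """A sub-unit of data traceable to a cohort or to another declaration."""
    title: str
    source: Union[Cohort, "DataDeclaration"]
    contract: Optional[Contract] = None


@dataclass
class Dataset:
    """A physical/logical unit of data with a location, holding declarations."""
    title: str
    project: Project
    location: str
    declarations: List[DataDeclaration] = field(default_factory=list)


# Hypothetical records, for illustration only.
clinic = Partner("Example Clinic")
cohort = Cohort("Example Cohort", owner=clinic)
agreement = Contract("Data-sharing agreement", partners=[clinic])
declaration = DataDeclaration("Clinical data", source=cohort, contract=agreement)
dataset = Dataset("Dataset 1", Project("X"), "secure-storage", [declaration])
```

In this sketch a data declaration, rather than the dataset as a whole, carries the link to its source and covering contract, which mirrors the table's description of a declaration as the traceable sub-unit of a dataset.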
Figure 2: DAISY information model.
Figure 3: Example DAISY records and the relationships between them. Projects and datasets create a matrix of associations. A project may involve several datasets with (sub)data declarations coming from different source cohorts and covered by different contracts with various partners, which may have been signed at different times. Additionally, data/biosamples obtained for 1 project may be re-used in another project (biosamples in Dataset 3 of Project Y are re-used in Project Z Dataset 4). Furthermore, a project may access data of other projects (Project Y and Dataset 2). Typically, cohort data are collected for an open number of projects. Therefore, biosamples and data of 1 cohort may end up in several datasets within an institution; we denote this with colours: data from a specific cohort are shaded with the colour of that cohort. Also, note that researchers may generate different data types using biosamples/data obtained from a cohort, e.g., generate genomics data from samples. DAISY ensures that cohort annotations are propagated to newly generated (derivative) data.
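The propagation of cohort annotations to derivative data described in this caption can be illustrated with a short sketch. It assumes, hypothetically, that each derived data declaration records the declaration it was generated from, so that the source cohort is found by following that chain. The function name `ultimate_cohort`, the two dictionaries, and all record names are illustrative and do not reflect how DAISY itself implements this.

```python
from typing import Dict


def ultimate_cohort(
    declaration: str,
    derived_from: Dict[str, str],
    cohort_of: Dict[str, str],
) -> str:
    """Follow derivation links until a declaration with a known cohort is found."""
    seen = set()
    current = declaration
    while current not in cohort_of:
        if current in seen:
            raise ValueError(f"derivation cycle detected at {current!r}")
        seen.add(current)
        parent = derived_from.get(current)
        if parent is None:
            raise LookupError(f"no source recorded for {current!r}")
        current = parent
    return cohort_of[current]


# Hypothetical records: biosamples obtained directly from a cohort,
# and genomics data later generated from those biosamples.
cohort_of = {"biosamples": "Cohort A"}
derived_from = {
    "genomics-data": "biosamples",
    "variant-calls": "genomics-data",
}

source = ultimate_cohort("variant-calls", derived_from, cohort_of)
```

Here `source` resolves to the cohort of the original biosamples, so a record generated two steps downstream still inherits the cohort annotation.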
Figure 4: Screenshot of Partner search page in DAISY.