F M Aarestrup1, A Albeyatti2,3, W J Armitage4, C Auffray5, L Augello6, R Balling7, N Benhabiles8, G Bertolini9, J G Bjaalie10, M Black11, N Blomberg12, P Bogaert13, M Bubak14, B Claerhout15, L Clarke16, B De Meulder17, G D'Errico18, A Di Meglio19, N Forgo20, C Gans-Combe21, A E Gray22, I Gut23, A Gyllenberg24, G Hemmrich-Stanisak25, L Hjorth26, Y Ioannidis27, S Jarmalaite28, A Kel29, F Kherif30, J O Korbel31, C Larue32, M Laszlo33, A Maas34, L Magalhaes35, I Manneh-Vangramberen36, E Morley-Fletcher37,38, C Ohmann39, P Oksvold40, N P Oxtoby41, I Perseil42, V Pezoulas43, O Riess44, H Riper45, J Roca46, P Rosenstiel25, P Sabatier47, F Sanz48, M Tayeb2,3, G Thomassen49, J Van Bussel50, M Van den Bulcke50, H Van Oyen14,51.
Abstract
The European Union (EU) initiative on the Digital Transformation of Health and Care (Digicare) aims to provide the conditions necessary for building a secure, flexible, and decentralized digital health infrastructure. Creating a European Health Research and Innovation Cloud (HRIC) within this environment should enable data sharing and analysis for health research across the EU, in compliance with data protection legislation while preserving the full trust of the participants. Such a HRIC should learn from and build on existing data infrastructures, integrate best practices, and focus on the concrete needs of the community in terms of technologies, governance, management, regulation, and ethics requirements. Here, we describe the vision and expected benefits of digital data sharing in health research activities and present a roadmap that fosters the opportunities while answering the challenges of implementing a HRIC. For this, we put forward five specific recommendations and action points to ensure that a European HRIC: i) is built on established standards and guidelines, providing cloud technologies through an open and decentralized infrastructure; ii) is developed and certified to the highest standards of interoperability and data security that can be trusted by all stakeholders; iii) is supported by a robust ethical and legal framework that is compliant with the EU General Data Protection Regulation (GDPR); iv) establishes a proper environment for the training of new generations of data and medical scientists; and v) stimulates research and innovation in transnational collaborations through public and private initiatives and partnerships funded by the EU through Horizon 2020 and Horizon Europe.
Year: 2020 PMID: 32075696 PMCID: PMC7029532 DOI: 10.1186/s13073-020-0713-z
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Glossary of cloud computing terms
| Term | Definition |
|---|---|
| Application | A set of programs running on one or more computers allowing a user to perform a set of tasks |
| Cloud application | An application running in the cloud |
| Cloud computing | The delivery of IT services over a network (e.g., the internet) by means of a combination of infrastructure, software, and data hosted by one or more cloud providers using a service model similar to that used by traditional utility companies (e.g., water or electricity) |
| Cloud federation | The combination of infrastructure, software, and services from separate networks and providers, having shared access mechanisms, to perform common actions, achieve load-balancing or optimize availability or costs |
| Cloud instance | A virtual server or container of resources running on a physical host computer possibly hosting several independent instances (see virtualization) |
| Cloud marketplace | An online marketplace of cloud services and applications operated by a Cloud Service Provider (CSP) |
| Cloud service provider (CSP) | A company or public entity that offers cloud services to individual users or other entities |
| Cloud storage | A model of data storage in which data are hosted across one or more facilities by a hosting entity or CSP and remotely accessed by users over the internet |
| Container | A type of virtualized instance running on a physical host server in isolated user spaces and possibly preloaded with applications |
| Hybrid cloud | A cloud computing infrastructure comprised of a mix of public and private cloud and on-premise instances and resources |
| Infrastructure | The combination of hardware resources (network, computing, storage, etc.) and virtualized instances supporting an IT environment |
| Infrastructure as a Service (IaaS) | A model of cloud computing in which a CSP provides an infrastructure of virtualized resources to users as a service over a network (e.g., the internet) |
| On-premise | Referring to infrastructure or software that is run on computing resources that are physically hosted by the entity using them |
| Platform | A computer system on which applications can run or be built |
| Platform as a Service (PaaS) | A model of cloud computing in which a CSP provides the infrastructure and the platforms where users can run and manage their own applications as a service over a network (e.g., the internet) |
| Private cloud | A cloud infrastructure used by a single organization either on-premise or hosted by a third-party CSP over the internet or dedicated private networks |
| Public cloud | A cloud infrastructure hosted by a CSP (or a federation) and used by the public or multiple organizations across the internet |
| Software as a Service (SaaS) | A model of cloud computing in which a CSP hosts and provides applications (software) to users as a service over a network (e.g., the internet) |
| Virtualization | A technology that allows users to run a software simulation of a physical computer on which a full operating system and applications can be installed |
Summary of the recommendations, details on the rationale, and suggestions for action points directed to the funding agencies and the actors in the field
| Recommendation | Rationale | Action points |
|---|---|---|
| Provide and foster standards, good practices, and guidelines necessary to establish the European Health Research and Innovation Cloud (HRIC) | The HRIC should be supported by predefined standards, data formats, protocols, and templates. The data standards and guidelines applied in the HRIC should be designed to facilitate interoperability between the diverse health systems and policies in Europe and globally | Suggest the adoption of data formats and architectures in policies, grant applications, and project calls throughout the EU and its member states |
| Develop and certify the infrastructure and services required for operation of the HRIC | The HRIC should provide computational infrastructures and services and analytical and visualization tools to all users as a platform to share knowledge, data, and guidelines. Services for data sharing, security, and analysis should be compliant with an EU certification system | - In future grant and project calls related to the development of the HRIC, the EU and the applicants should commit to complying with the highest standards of security, interoperability, and reproducibility - The EU should develop a certification system to validate compliance with the standards mentioned |
| Enable the HRIC to operate within an ethical and legal framework that is adequate for health systems | A robust ethical and legal framework has to be developed that defines rules for privacy, security, ownership, access, and usage of data within the HRIC. A federated system architecture should be preferred as it allows for comparison of data and results, while complying with EU General Data Protection Regulation (GDPR) and international data protection and sharing rules | - In grants and project calls, federated and GDPR-compliant data architectures should be preferred |
| Establish a proper environment for the training of a new generation of data and medical scientists | Education and training of health professionals need to be updated with the HRIC in mind, considering both international standards and practices for data sharing as well as national environments and regulations. The EU should take inspiration from existing large and successful infrastructures that foster multidisciplinary teams, such as the European Organisation for Nuclear Research (CERN) | - Scientists and health professionals in training should be made aware of the possibilities of the HRIC - Communication in relevant professional channels should be strengthened |
| Fund public and private initiatives for the development of the HRIC through EU Framework Programmes (Horizon 2020 and Horizon Europe) | The EU and its member states should, together with private investors, develop a coherent, ambitious, and long-term action plan supported by innovative funding mechanisms that consolidate the outcomes from the existing project portfolio into a long-term operational infrastructure | - The EU should invest through calls and grants in order to build and consolidate the HRIC - The existing industrial ecosystem should be supported, to remain competitive against the other actors in the world |
Relevant initiatives for the European Health Research and Innovation Cloud (HRIC)
| Project | Aims/summary | Cloud model used | References |
|---|---|---|---|
| CORBEL project | Creating a platform for harmonized user access to biological and medical technologies, biological samples, and data. The project has developed the data harmonization, ethics guidance, and user-access protocols necessary for transnational access to both pre-clinical and clinical research infrastructures and is piloting access to participant-level data from clinical trials [ | Scalable cloud-based provision of data access and compute across infrastructures | [ |
| ELIXIR | European research infrastructure with 21 members and over 180 research organizations. ELIXIR is creating a network of local instances of the European Genome-Phenome Archive that give users controlled and secure access to raw data and precomputed results [ | Hybrid cloud ecosystem: i) Local, private clouds (e.g., EMBL-EBI Embassy) ii) National community clouds (e.g., cPouta, MetaCentrum cloud, de.NBI) iii) European research and innovation-oriented clouds (e.g., European Open Science Cloud (EOSC)) iv) Public/commercial compliant clouds (e.g., Google, Azure, Amazon Web Services (AWS)) | [ |
| European Translational Information and Knowledge Management Services (eTRIKS) | IMI-funded highly scalable cloud-based platform for translational research, information, and knowledge management providing open-source applications that can securely host heterogeneous data types, including multi-omics data, preclinical laboratory data, and clinical information, including longitudinal data sets. The platform is a robust translational research knowledge management system that is able to host other data-mining applications and support the development of new analytical tools [ | Scalable cloud-based platform for translational research and applications development. OpenStack technology is used to run a private cloud for eTRIKS | [ |
| European Medical Information Framework (EMIF) | Innovative Medicines Initiative (IMI)-funded project that has successfully improved access to human health data by providing tools and workflows that can be used to discover, assess, access, and (re)use human health data. The efforts of this IMI project are being extended through the European Health Data and Evidence Network (EHDEN) project | Research analytical service approaches from EHR and cohort data platforms | [ |
| Human Brain Project (HBP) Medical Informatics Platform | The HBP Medical Informatics Platform allows researchers around the world to exploit medical data to create machine-learning tools that can analyze these data for new insights into brain-related diseases. The Medical Informatics web-portal [ | The HBP Joint Platform plans to adopt cloud technology and provide those services through its computer centers (JSC-Jülich and CSCS-Lugano) together with BSC-Barcelona, CINECA-Bologna, and CEA-Saclay. Software infrastructure: i) The (base) infrastructure layer is accessible through an ‘Infrastructure as a Service’ (IaaS) interface ii) Tools to enable simulation and modeling as well as data analytics workflows for neuroscience, which the HBP operates as a ‘Platform as a Service’ (PaaS) iii) Several software services for data-driven brain simulations and for virtual neurorobot design and operation offered in the form of ‘Software as a Service’ (SaaS). HBP operates the following SaaS: a) Model-driven brain simulations b) Neurorobotics simulation and development tools across the whole workflow of the neurorobotics life cycle | [ |
| Human Brain Project (HBP) Knowledge Graph Data Platform | The HBP Knowledge Graph (KG) is an online graph database that accepts submissions of anonymized human data, animal data, and models from the brain. All data that are made discoverable and accessible through a KG search have been curated. Data are also made available with integrated multilevel HBP Atlases, holding information about the brain in standard reference spaces | | [ |
| Helix Nebula | A pan-European public–private partnership initiative led by EIROforum and leading commercial cloud-computing partners. Since 2011, this project has been piloting the use of cloud computing to enable complex data analyses and large-scale data sharing, with life science-oriented projects ranging from complex genome assembly to assessing somatic variation in the context of different types of cancer | Helix Nebula Science Cloud (HNSciCloud): hybrid cloud platform that links together commercial cloud service providers and publicly funded research organizations’ in-house IT resources via the GEANT network to provide innovative solutions supporting data intensive science. These services support the connection of the research infrastructures identified in the European Strategy Forum on Research Infrastructures (ESFRI) Roadmap to the nascent European Open Science Cloud (EOSC) and are intended to create a single digital research space for Europe’s 1.8 million researchers | [ |
| European Open Science Cloud (EOSC) | The pilot project has recently investigated the benefits of data and cloud computer sharing at a pan-European level in life science-oriented projects on pan-cancer analyses [ | Cloud-based services for open sciences—integration and consolidation of e-infrastructure platforms, federation of existing European research infrastructures and scientific clouds | [ |
| Pancancer Analysis of Whole Genomes (PCAWG) | An international collaboration to identify common patterns of mutation in more than 2800 cancer whole genomes from the International Cancer Genome Consortium. This project is exploring the nature and consequences of somatic and germline variations in both coding and non-coding regions, with specific emphasis on cis-regulatory sites, non-coding RNAs, and large-scale structural alterations | Hybrid cloud model. The data-coordinating center lists collaborative agreements with the cloud provider AWS and with an academic computing cloud resource maintained at the cancer collaboratory by the Ontario Institute for Cancer Research and hosted at the Compute Canada facility | [ |
| RD-Connect | RD-Connect is an integrated platform connecting databases, patient registry data, biobanks, and clinical bioinformatics for rare disease research. It allows the integration of different data types (e.g., omics, clinical information, patient registries, and biobanks). Those integrated data can be accessed and analyzed by the scientific community to speed up research, diagnosis, and therapy development for patients with rare diseases | Online secured platform connecting different types of patient-related rare disease data, enabling genome-phenome analysis | [ |
| COMPARE | A network of collaborators of the Global Microbial Identifier initiative (GMI) that aims to improve the identification and mitigation of emerging infectious diseases and foodborne outbreaks | One-serve-all analytical framework and data exchange platform with various data integration for real-time analysis and interpretation of pathogen sequence data | [ |
Fig. 1 Proposed general architecture of the HRIC. European (inter)national databases, with varying data formats and data types, are referenced in a metadata repository, following formatting rules of the federated data commons as agreed at the HRIC governance level. The different users, after access control to the cloud, use the HRIC interface to access the repository, which gathers the relevant data and performs analysis, with outputs such as mathematical models, data visualizations, statistics, and patient profiles according to the users’ needs.
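
The flow described in this caption (access control, lookup in a metadata repository, analysis over data held in separate databases) can be illustrated with a minimal Python sketch. It is only a toy under stated assumptions, not the HRIC design: the class names `NodeDatabase` and `MetadataRepository`, the function `federated_mean`, and the sample records are all hypothetical, and the choice to return only per-node aggregates is one possible reading of a federated architecture.

```python
from dataclasses import dataclass, field


@dataclass
class NodeDatabase:
    """Hypothetical (inter)national database whose records stay local."""
    name: str
    records: list  # list of dicts, one per participant

    def local_summary(self, variable):
        # Only aggregate values leave the node, never participant-level rows.
        values = [r[variable] for r in self.records if variable in r]
        return {"node": self.name, "n": len(values), "sum": sum(values)}


@dataclass
class MetadataRepository:
    """Hypothetical catalogue recording which nodes hold which variables."""
    index: dict = field(default_factory=dict)

    def register(self, node, variables):
        for variable in variables:
            self.index.setdefault(variable, []).append(node)

    def nodes_for(self, variable):
        return self.index.get(variable, [])


def federated_mean(repo, authorized_users, user, variable):
    """Compute a pooled mean from per-node summaries (illustrative only)."""
    # Access control happens before any data source is contacted.
    if user not in authorized_users:
        raise PermissionError(f"{user} is not authorized")
    summaries = [n.local_summary(variable) for n in repo.nodes_for(variable)]
    total_n = sum(s["n"] for s in summaries)
    if total_n == 0:
        return None  # no node holds this variable
    return sum(s["sum"] for s in summaries) / total_n


# Usage with made-up data: two nodes, one shared variable.
repo = MetadataRepository()
repo.register(NodeDatabase("country_a", [{"age": 40}, {"age": 60}]), ["age"])
repo.register(NodeDatabase("country_b", [{"age": 50}]), ["age"])
print(federated_mean(repo, {"alice"}, "alice", "age"))
```

In this sketch the repository stores only references to nodes and variable names, so the analysis step never copies raw records into a central store. A real deployment would replace the in-memory objects with networked services and a proper authorization layer.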