Luis Martí Bonmatí, Ana Miguel, Amelia Suárez, Mario Aznar, Jean Paul Beregi, Laure Fournier, Emanuele Neri, Andrea Laghi, Manuela França, Francesco Sardanelli, Tobias Penzkofer, Phillipe Lambin, Ignacio Blanquer, Marion I Menzel, Karine Seymour, Sergio Figueiras, Katharina Krischak, Ricard Martínez, Yisroel Mirsky, Guang Yang, Ángel Alberich-Bayarri.
Abstract
The CHAIMELEON project aims to set up a pan-European repository of health imaging data, tools and methodologies, with the ambition to set a standard and provide resources for future AI experimentation for cancer management. This 4-year, EU-funded project tackles some of the most ambitious research in the fields of biomedical imaging, artificial intelligence and cancer treatment, addressing the four types of cancer that currently have the highest prevalence worldwide: lung, breast, prostate and colorectal. To this end, clinical partners and external collaborators will populate the repository with multimodality (MR, CT, PET/CT) imaging and related clinical data. Subsequently, AI developers will enable a multimodal analytical data engine facilitating the interpretation, extraction and exploitation of the information stored in the repository. The development and implementation of AI-powered pipelines will advance the automation of data de-identification, curation, annotation, integrity securing and image harmonization. By the end of the project, the usability and performance of the repository as a tool fostering AI experimentation will be technically validated, including a validation subphase in which world-class European AI developers participate in Open Challenges to the AI Community. Upon successful validation of the repository, a set of selected AI tools will undergo early in-silico validation in observational clinical studies coordinated by leading experts in the partner hospitals. Tool performance will be assessed, including external independent validation on hallmark clinical decisions in response to some of the currently most important clinical end points in cancer.
The project brings together a consortium of 18 European partners including hospitals, universities, R&D centers and private research companies, constituting an ecosystem of infrastructures, biobanks, AI/in-silico experimentation and cloud computing technologies in oncology.
Keywords: artificial intelligence-AI; cancer imaging; cancer management; image harmonization; quantitative imaging biomarkers; radiology
Year: 2022 PMID: 35280732 PMCID: PMC8913333 DOI: 10.3389/fonc.2022.742701
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. Project overview.
Project milestones.
| Milestone ID | Description | Due date (months) |
|---|---|---|
| M1 | Initial repository design available for regulatory clearance. Repository’s legal operational model established. | 12 |
| M2 | Start of data collection at data provider sites, with clearance for data to be incorporated into the CHAIMELEON repository. | 13 |
| M3 | Completion of the repository design phase and the verification of the repository’s compliance with GDPR. | 18 |
| M4 | First repository prototype released, fully interfaced with data provider sites | 24 |
| M5 | Start of the repository’s technical validation phase Stage 1 – Internal by project partners | 30 |
| M6 | Start of the repository’s technical validation phase Stage 2 – External validation | 31 |
| M7 | End of the repository’s technical validation phase Stage 1 – Internal validation completed and documented | 34 |
| M8 | Execution of the repository’s technical validation phase Stage 2 – External validation | 34 |
| M9 | Start of the repository’s data expansion stage – addition of new datasets provided by external collaborators. Legal clearance and IT interfacing with selected centers. | 37 |
| M10 | End of the repository’s technical validation phase Stage 2 – External validation | 38 |
| M11 | Start of the clinical validation phase: observational studies begin for AI-based solutions developed or refined using the repository | 41 |
| M12 | End of the clinical validation phase | 46 |
| M13 | Assessment of observational studies finalized. | 48 |
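The due dates in the milestone table imply how long each validation phase runs. The following is a minimal sketch of that arithmetic. The phase labels and the pairing of start and end milestones are shorthand read off the table for illustration, not names defined by the project.

```python
# Milestone due dates in project months, copied from the milestone table.
MILESTONES = {
    "M1": 12, "M2": 13, "M3": 18, "M4": 24, "M5": 30, "M6": 31, "M7": 34,
    "M8": 34, "M9": 37, "M10": 38, "M11": 41, "M12": 46, "M13": 48,
}

# (start milestone, end milestone) per phase. The labels are shorthand
# introduced for this sketch only.
PHASES = {
    "internal technical validation": ("M5", "M7"),
    "external technical validation": ("M6", "M10"),
    "clinical validation": ("M11", "M12"),
}


def phase_duration(phase: str) -> int:
    """Return the length of a phase in months (end due date minus start)."""
    start, end = PHASES[phase]
    return MILESTONES[end] - MILESTONES[start]


durations = {name: phase_duration(name) for name in PHASES}
for name, months in durations.items():
    print(f"{name}: {months} months")
```

On these dates the internal and external technical validation stages overlap between months 31 and 34, and the assessment of the observational studies (M13) closes the project two months after the clinical validation phase ends.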
Summary of CHAIMELEON’s main design and infrastructure features.
| Feature | Description |
|---|---|
| Distributed infrastructure | In the first phase of the project, data will be centralized after it has been collected, curated and anonymized by a set of tools deployed locally. In the second phase, a distributed architecture will be explored, composed of a central index and multiple physical repositories (local indexes), which may be regional, national or hospital-based data warehouses connected to the hospital’s PACS and EHR/RIS. Repositories will be connected using encrypted communications and interoperability standards such as DICOM-TLS or DICOMweb. Federated learning approaches and distributed data exploration solutions will also be explored. |
| Single-entry point for pan-European users | CHAIMELEON will be designed to facilitate AI developers’ access to any relevant curated datasets, independently of their origin. |
| Publicly available, upon user registration | The registration process will require researchers to sign their acceptance of the conditions of use and access to the data. These will include commitments on the purposes of data use and a commitment not to attempt to identify individuals. |
| Types of roles | Different entities and physical persons under different roles will be key parties to the repository, including data providers, entities providing infrastructure or services (primary data users), and researchers willing to access data for research purposes (data users). Roles will be carefully defined and assigned the applicable rights and obligations. |
| Powered with automatic tools, human refined | The latest machine learning advancements on data ingestion, curation, quality control, annotation, segmentation and harmonization will be incorporated into CHAIMELEON. During the project execution, extensive human resources will be devoted to the supervision and refinement of the automation tools. As technologies evolve, the repository will steadily progress towards less human supervision and more automated processes. |
| Pseudonymized and anonymized data | The repository will have two levels of de-identification. The first will be pseudonymization at local premises, in order to preserve traceability and enable potential linkage to other biobanks (e.g., pathological or genetic). The second, at the central repository level, will be complete anonymization, meaning the data will no longer be identifiable, even indirectly. |
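The difference between the two de-identification levels in the last row can be illustrated with a small sketch. It is not the project's pipeline: the field names, the site key and the keyed-hash approach are all assumptions made for illustration. Real anonymization of imaging data also has to deal with DICOM header attributes, quasi-identifiers and burned-in annotations, which this example ignores.

```python
import hashlib
import hmac

# Hypothetical field names; the actual CHAIMELEON data model is not given here.
DIRECT_IDENTIFIERS = {"patient_name", "birth_date"}
SITE_KEY = b"secret-held-only-by-the-hospital"  # placeholder, not a real key


def pseudonymize(record: dict, key: bytes = SITE_KEY) -> dict:
    """Level 1 (local): strip direct identifiers, replace the ID by a keyed hash.

    The same patient always maps to the same pseudonym, so the site holding
    the key can still link the record to other local sources.
    """
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hmac.new(key, record["patient_id"].encode(), hashlib.sha256)
    out["patient_id"] = digest.hexdigest()[:16]
    return out


def anonymize(pseudonymized: dict) -> dict:
    """Level 2 (central): drop the pseudonym so that no linkage key remains."""
    return {k: v for k, v in pseudonymized.items() if k != "patient_id"}


record = {
    "patient_id": "H1-000123",
    "patient_name": "Jane Doe",
    "birth_date": "1960-05-01",
    "modality": "CT",
    "cancer_type": "lung",
}
local_copy = pseudonymize(record)      # stays traceable at the hospital
central_copy = anonymize(local_copy)   # what would reach the central repository
print(local_copy)
print(central_copy)
```

The keyed hash (HMAC) stands in for whatever pseudonymization scheme a site actually uses. Its relevant property here is that re-linking is possible only for whoever holds the key, whereas the anonymized copy carries no identifier at all.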
Figure 2. Distributed architecture and IT infrastructure. (A) and (B) refer to two different types of hospitals based on their capacity to curate, complete, and anonymize data prior to their ingestion into the central repository. (A) Data processing is done on site. (B) Data processing is done via an intermediation platform.
Figure 3. High-level architecture of the central repository and technologies used.
Types of datasets to be accessible from the CHAIMELEON repository.
| Type of cancer | Imaging data | Estimated number of cases (training phase) | Estimated number of cases (validation phase) |
|---|---|---|---|
| Lung | CT, PET/CT | 7000 | 4500 |
| Breast | Mammography, digital breast tomosynthesis, ultrasound and MRI | 3500 | 2500 |
| Colorectal: colon | CT | 2334 | 1667 |
| Colorectal: rectum | MRI | 1167 | 833 |
| Prostate | MRI | 6000 | 4000 |
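The per-row estimates can be added up to get the overall size of the planned repository. The sketch below does so, assuming the table rows correspond, in order, to lung, breast, colon, rectum and prostate. That mapping is inferred from the imaging modalities and the four target cancers, and the variable names are illustrative.

```python
# (training cases, validation cases) per row of the dataset table.
# Row labels are an assumption inferred from the modalities listed.
CASES = {
    "lung": (7000, 4500),
    "breast": (3500, 2500),
    "colon": (2334, 1667),
    "rectum": (1167, 833),
    "prostate": (6000, 4000),
}

total_training = sum(train for train, _ in CASES.values())
total_validation = sum(val for _, val in CASES.values())

# Colon and rectum together make up the colorectal cohort.
colorectal = tuple(c + r for c, r in zip(CASES["colon"], CASES["rectum"]))

# Fraction of each row's cases allocated to the training phase.
training_share = {
    name: train / (train + val) for name, (train, val) in CASES.items()
}

print(total_training, total_validation, colorectal)
for name, share in training_share.items():
    print(f"{name}: {share:.1%} training")
```

Under these assumptions the table sums to 20,001 training and 13,500 validation cases. The colorectal cohort (3,501 and 2,500) is split 2:1 between colon and rectum, and every row puts roughly 58% to 61% of its cases in the training phase.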
Clinical end points to be addressed in CHAIMELEON for the four targeted types of cancer.
| Type of cancer | Current therapies | Clinical end points (CEPs) |
|---|---|---|
| Lung | Immunotherapy | Predicting patients with a positive response to immunotherapy |
| Colorectal | Surgery/neoadjuvant chemotherapy | (Rectal cancer) Prediction of patients with a positive response to chemoradiation and classification into different treatment response subgroups. (Colon cancer) Identification of patients at higher risk of distant metastases at an early timepoint. |
| Breast | Surgery, radiation and systemic therapy | Diagnostic performance and cancer staging. |
| Prostate | Wide range due to heterogeneity | Early staging/grading |