
A new standardized data collection system for brain stereotactic external radiotherapy: the PRE.M.I.S.E project.

Silvia Chiesa1, Barbara Tolu2, Silvia Longo3, Barbara Nardiello2, Nikola Dino Capocchiano3, Federica Rea2, Luca Capone2, Gerardina Stimato4, Roberto Gatta5, Alessandro Pacchiarotti1, Mariangela Massaccesi1, Giuseppe Minniti2, Francesco Cellini1, Andrea Damiani3, Mario Balducci1,3, Piercarlo Gentile2, Vincenzo Valentini1,3, Federico Bianciardi2.   

Abstract

BACKGROUND: In recent years, novel radiation therapy techniques have moved clinical practice toward tailored medicine. An essential role is played by the decision support system, which requires standardized data collection. The aim of the Prediction Models In Stereotactic External radiotherapy (PRE.M.I.S.E.) project is the implementation of systems that analyze heterogeneous datasets. This article presents the project design, focusing on brain stereotactic radiotherapy (SRT).
MATERIALS & METHODS: First, a raw ontology was defined by exploiting semiformal languages (block and entity relationship diagrams) and natural language; then, it was transposed into a Case Report Form, creating a storage system.
RESULTS: More than 130 brain SRT variables were selected. The dedicated software Beyond Ontology Awareness (BOA-Web) was set up and data collection is ongoing.
CONCLUSION: The PRE.M.I.S.E. project provides standardized data collection for a specific radiation therapy technique, such as SRT. Future aims are: including other centers and validating an extracranial SRT ontology.
© 2020 Silvia Longo.

Keywords:  big data; brain; personalized medicine; predictive model; stereotactic

Year:  2020        PMID: 32802398      PMCID: PMC7421993          DOI: 10.2144/fsoa-2020-0015

Source DB:  PubMed          Journal:  Future Sci OA        ISSN: 2056-5623


Background

Recently, oncological knowledge has grown exponentially in terms of both diagnosis and therapy. Qualitative improvements in different fields, for example, genomics, histology and technology, provide heterogeneous data regarding both patient and tumor characteristics. Therefore, a larger amount of different types of data, together with their increased complexity, needs to be considered in the decision-making process [1,2]. This process has historically been guided by international guidelines based on evidence from randomized clinical trials, which select patients beforehand. Population-based observational studies have recently emerged as a complementary form of research, often named ‘rapid learning healthcare’ (RLHC), which is essential to ensure that clinical trial results can be translated into tangible benefits for the general population [3]. Data collection quality in the RLHC approach can be low, as data are frequently collected using different procedures; thus, pooled multicenter research is difficult to perform. Standardized data collection improves the quality of this process by defining variables and the way they should be shared without ambiguity [4]. Data collection standardization methods benefit from the use of a common ontology system. Ontologies are commonly defined as an ‘explicit specification of a conceptualization’ [5]: in our specific context, this is equivalent to a classification system where uniform and unambiguous definitions represent each variable and all their relationships. A large heterogeneous database is required to store all the information without knowing beforehand what the research topic will be. The research hypothesis then determines which features should be included in the learning effort in order to obtain a predictive model that represents the distribution of those features and their relationships inside the dataset.
Predictive models are the basis of predictive tools, which include graphical calculating devices such as nomograms as well as the more appealing interactive websites [6,7]. In the oncological literature, several experiences have been published regarding decision support system (DSS) implementations in different anatomical sites [1,4,8,9], but a DSS for a specific radiation technique is not yet available. The PREdiction Models In Stereotactic External radiotherapy (PRE.M.I.S.E.) project is one of the research projects involved in the ‘umbrella protocol’ [9], which aims to facilitate RLHC. The aim of the PRE.M.I.S.E. project is to create a consistent dataset to support the future development of DSSs for stereotactic radiotherapy (SRT), moving toward a ‘shared decision making’ approach. Doctors, together with patients, will be able to evaluate the pros and cons of different treatment strategies. In addition, clinicians will be able to actively discuss and decide on the best therapeutic intervention, having assessed all the features needed to optimize a stereotactic treatment plan.

Materials & methods

A multidisciplinary team was created with members from the first two centers involved. Physicians, physicists, nurses and therapists were part of the team. The group planned a set of phases and scheduled periodic meetings to assess the development of each task: this iterative approach (design, implementation, validation, back to design) helped us exploit the synergy of the multiple disciplines in our team. The local ethics committee approved the protocol before patient accrual, according to the legislation of the country. Our general workflow was divided into three phases: ontology definition; setting of the storage system; and data analysis.

Ontology definition

To reduce ambiguity in collecting and analyzing data, the first step was the definition of a clear ontology. Although an ontology can intuitively be represented using natural language, this is commonly discouraged: it is the simplest solution for human-to-human communication, but it cannot be easily translated into a formal language that a computer can process. The opposite approach, the immediate development of an ontology in one of the many available formal languages, such as the options provided by the World Wide Web Consortium, requires specific training and a multidisciplinary task force, and the final result cannot easily be made understandable to physicians without that training for review or external validation. This also makes validating the ontology a complex task, as many participants in multicentric studies lack the required training. The radiation oncology ontology [10] is an example of an ontology written in the Web Ontology Language for radiotherapy, but the complexity of the language and its implications (in particular for automatic reasoning) are an important barrier to extended use in a real-world scenario. For these reasons, we adopted the following strategy: a raw ontology was defined, exploiting some semi-formal languages (block diagrams and entity relationship diagrams) and natural language; then it was transposed into a case report form (CRF). A CRF is a format that can be loaded, parsed and executed by a computer, where each clinical variable is described in terms of its type, admitted values and relations with other variables. Here each variable is framed as an attribute of more general entities (or classes) such as patients, treatments, visits and toxicities. The relations among the entities are also provided, with the specification of their cardinalities.
Once the ontology has been extensively validated and consolidated by practice, we will consider an implementation with the World Wide Web Consortium technology, in order to exploit automatic inference for some minor tasks (e.g., descriptive statistics on the cohort). In building the ontology, the complexity of the knowledge domain was separated into three distinct layers. The Registry level is the most general tier and includes the baseline patient and tumor characteristics (age, gender, ethnicity etc.), which are considered relevant for epidemiological analysis only. The Procedure level comprises treatment information and related toxicities, and the evaluation of outcome in terms of disease-free survival and acute and late toxicities. The final level is the Research level, which includes clinical and imaging information used for in-depth, advanced research projects only. In order to implement and use the ontology and guide the work of the designated data managers, the team created the CRF according to the format compatible with Beyond Ontology Awareness (BOA), a research electronic data capture software.
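To illustrate what describing a variable by its type and admitted values can look like in practice, the following is a minimal sketch in Python. It is not the BOA CRF format, which is not specified here: the class name `CodedVariable`, its fields and the use of 999 as a universal missing-data code are assumptions made for illustration (999 appears as the missing-data code in several rows of Table 2). The example variable is the ‘Intent’ row of Table 1.

```python
from dataclasses import dataclass, field
from typing import Dict

# Assumption: 999 marks missing data for every variable (it is used in Table 2).
MISSING = 999


@dataclass
class CodedVariable:
    """A single-select clinical variable: name, ontology level, admitted codes."""
    name: str
    level: str  # "registry", "procedure" or "research"
    admitted: Dict[int, str] = field(default_factory=dict)

    def validate(self, value: int) -> bool:
        """Accept a value only if it is an admitted code or the missing-data code."""
        return value == MISSING or value in self.admitted


# Hypothetical encoding of the "Intent" variable from Table 1.
intent = CodedVariable(
    name="Intent",
    level="registry",
    admitted={0: "curative", 1: "local control", 2: "palliative"},
)

print(intent.validate(1))  # an admitted code
print(intent.validate(7))  # not an admitted code
```

Keeping the admitted codes next to the variable definition means that a data-entry form can reject out-of-range values at input time, which is the kind of constraint the CRF is described as carrying.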

Setting of the storage system

BOA uses a relational database model as the base of the data layer. SQLite was chosen as the designated database in order to guarantee a degree of portability, by allowing installation of the complete service on a wide variety of devices. A part of the implemented ontology structured in the SQLite database is shown in Figure 1. Specifically, a single archive was created for the study and subsequently populated with patients, each of whom could have one or more pathologies. Each of these pathologies could have one or more treatments. The specific CRFs designed for this study were then imported into BOA and converted to the required structure, which meant that each CRF had a multitude of related questions of various types (e.g., dates, single-select lists, multiselect lists or other types of inputs) with specific constraints for allowable inputs set during the definition of the ontology. During the data entry phase, CRF links were automatically generated, linking each recorded answer to a specific question and finally linking each completed CRF link to a specific phase in the patient history (e.g., first contact with the patient). This architecture not only guarantees the integrity of the ontology, but also greatly eases any subsequent data extraction or data analysis effort.
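The relationships described above (patients with one or more pathologies, pathologies with one or more treatments, and answers linked to questions and to a phase of the patient history) can be sketched with Python's built-in `sqlite3` module. This is an illustrative schema only: all table and column names are hypothetical and are not taken from the actual BOA database shown in Figure 1.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE patient   (id INTEGER PRIMARY KEY, archive TEXT NOT NULL);
CREATE TABLE pathology (id INTEGER PRIMARY KEY,
                        patient_id INTEGER NOT NULL REFERENCES patient(id));
CREATE TABLE treatment (id INTEGER PRIMARY KEY,
                        pathology_id INTEGER NOT NULL REFERENCES pathology(id));
CREATE TABLE question  (id INTEGER PRIMARY KEY, crf TEXT NOT NULL, text TEXT NOT NULL);
CREATE TABLE crf_link  (id INTEGER PRIMARY KEY,
                        patient_id INTEGER NOT NULL REFERENCES patient(id),
                        phase TEXT NOT NULL);
CREATE TABLE answer    (id INTEGER PRIMARY KEY,
                        crf_link_id INTEGER NOT NULL REFERENCES crf_link(id),
                        question_id INTEGER NOT NULL REFERENCES question(id),
                        value TEXT);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one patient, two pathologies and one answer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the one-to-many links
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO patient (id, archive) VALUES (1, 'demo-archive')")
    conn.executemany("INSERT INTO pathology (id, patient_id) VALUES (?, 1)", [(1,), (2,)])
    conn.execute("INSERT INTO treatment (id, pathology_id) VALUES (1, 1)")
    conn.execute(
        "INSERT INTO question (id, crf, text) VALUES (1, 'set-up', 'RT treatment position')"
    )
    conn.execute("INSERT INTO crf_link (id, patient_id, phase) VALUES (1, 1, 'first contact')")
    conn.execute("INSERT INTO answer (crf_link_id, question_id, value) VALUES (1, 1, '1')")
    conn.commit()
    return conn


def answers_by_phase(conn: sqlite3.Connection, patient_id: int):
    """Return (phase, question text, recorded value) rows for one patient."""
    rows = conn.execute(
        "SELECT l.phase, q.text, a.value FROM answer a "
        "JOIN crf_link l ON l.id = a.crf_link_id "
        "JOIN question q ON q.id = a.question_id "
        "WHERE l.patient_id = ? ORDER BY a.id",
        (patient_id,),
    )
    return rows.fetchall()


conn = build_demo_db()
print(answers_by_phase(conn, 1))
```

Because every answer points both to its question and to a CRF link carrying the phase, a single join is enough to extract the data in analysis-ready form.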
Figure 1.

Ontology structure in the SQLite database.

CRF: Case report form.

BOA itself is structured as a portable Django web service, which allows data managers to quickly access the required interfaces and automatically handle all required data validation aspects. An example screenshot of a CRF is shown in Figure 2. BOA can store data in two different ways, depending on the needs of the center and the wishes of the participants: BOA.Cloud and BOA.Local.
Figure 2.

An example screenshot of a case report form.

CRF: Case report form.


BOA.Cloud

The collected data are automatically anonymized and transferred to a cloud-based large database. After the transfer, it will not be possible to reconstruct the history of the transferred data or the pertaining patient files, due to the complete anonymization algorithm that does not allow identifying information to be conveyed, including unique IDs.

BOA.Local

The data are stored in a local database, in a secure area that prohibits any data exchange between the local client and other computers in the local area network or the internet. The two distinct pathways and their optional convergence toward a final database are highlighted in Figure 3, which gives a general overview of how the BOA service is laid out. In this particular example, the institution marked in blue works through a BOA.Local installation that can (if desired) upload data on demand to the BOA.Cloud Master server installation, marked in purple, while the institutions marked in green and yellow connect directly to the BOA.Cloud Master server, without the need to store data locally.
Figure 3.

General overview of how the Beyond Ontology Awareness service is laid out.

BOA: Beyond ontology awareness.


Data analysis

One of the goals of the PRE.M.I.S.E. project is to support multicentric clinical studies. To face this challenge, the project cannot rely only on a collection of data stemming from a local repository or a centralized database, as these options present considerable problems concerning patient privacy. Techniques such as anonymization or de-identification are risky because part of the information is still shared; data encryption or homomorphic encryption are suboptimal as the data can potentially be decrypted. To ensure patient privacy and guarantee data ownership, PRE.M.I.S.E. exploits distributed learning to generate statistical models across multiple separate databases in the various BOA installation sites (both BOA.Local and BOA.Cloud infrastructures). Through this paradigm, only aggregated data are shared or transferred: the patient-level data never leave the databases [11] or the metaphorical walls of the institutions in which they are stored.

In more detail, the distributed learning architecture is composed of one central master node and many client nodes (called local learning nodes) distributed across all database end points. The master node has the primary task of coordinating and overseeing the learning protocols between the individual hospitals and, as such, never has direct access to clinical data; it only processes aggregate data, as necessary for the algorithms intended to be run. The local learning nodes are installed at each hospital. They have access to the local data and perform learning tasks as instructed by the master node. Patient data are not shared with the outside world. The complete algorithm can be summed up by the following steps:

1. The local application learns a model from local data.
2. This local model is sent to the master, where it is processed and compared with the models sent by the other hospitals.
3. A consensus model is generated and sent back to each hospital for refinement.
4. After preset convergence criteria are met, a final consensus model is generated.

The information exchanged between master and local nodes is limited to aggregate values (e.g., parameter weights, general statistics, coefficients) and contains no patient data. All traffic between master and local nodes is managed, monitored and audited by the infrastructure. An entire learning run is an iterative process that usually requires many cycles until the master determines that the learning process has been completed. In the distributed learning mode, distributed research nodes do not move data around at all: they only apply iterative algorithms whose results the distributed research master uses to build a consensus model and estimate the model's parameters.

Distributed learning can support many algorithms for data analysis. It has been widely used as an inferential regression analysis tool, mainly based on the relationship between outcomes (binary, continuous or multinomial) and covariates, or elements in the dataset. It establishes a one-way data-to-outcome link, investigated using traditional statistical tools such as linear models, generalized linear models, survival models and support vector machines [11-13], among others. The final model can then be presented to the end user in a variety of ways, such as nomograms or interactive websites.

In order to become a reliable tool for use in clinical contexts, each model must undergo a strict evaluation process, mainly based on internal and external validation [14]. Discrimination will be assessed using the c-statistic or the area under the curve of the receiver-operating characteristic. The c-statistic is equivalent to the area under the curve for dichotomous outcomes but can also be used for Cox regression analyses. Plotting the expected versus the observed outcomes will provide a graphical assessment of the calibration.
In addition, to identify variables to be inserted in the ontology, validate variables and build a system that defines variables’ characteristics and relationships among them, the Hosmer–Lemeshow test will be used. Future development of the sharing platform will involve other radiotherapy centers in order to combine multiple datasets.
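The master and local-node exchange described in this section can be illustrated with a toy example. The sketch below is not the PRE.M.I.S.E. or BOA implementation: it assumes a one-covariate linear model, a single gradient step per round at each hospital, and a sample-size-weighted average as the consensus rule. The function names, the learning rate and the data are all hypothetical, and the nodes are called in-process rather than over a network. Only parameters and sample counts cross the boundary between a node and the master.

```python
from typing import List, Tuple

Row = Tuple[float, float]  # (covariate x, outcome y); never leaves its hospital


def local_refine(data: List[Row], w: float, b: float, lr: float) -> Tuple[float, float, int]:
    """Run at a hospital: refine the consensus model with one gradient step on
    local data and return only the new parameters and the local sample count."""
    n = len(data)
    grad_w = sum((w * x + b - y) * x for x, y in data) / n
    grad_b = sum((w * x + b - y) for x, y in data) / n
    return w - lr * grad_w, b - lr * grad_b, n


def consensus(nodes: List[List[Row]], lr: float = 0.2, tol: float = 1e-9,
              max_rounds: int = 5000) -> Tuple[float, float]:
    """Run at the master: average the local models (weighted by sample count)
    and repeat until the consensus model stops changing."""
    w, b = 0.0, 0.0
    for _ in range(max_rounds):
        replies = [local_refine(data, w, b, lr) for data in nodes]
        total = sum(n for _, _, n in replies)
        new_w = sum(wi * n for wi, _, n in replies) / total
        new_b = sum(bi * n for _, bi, n in replies) / total
        converged = abs(new_w - w) + abs(new_b - b) < tol  # preset convergence criterion
        w, b = new_w, new_b
        if converged:
            break
    return w, b


# Toy data following y = 2x + 1, split across two hypothetical hospitals.
hospital_a = [(0.0, 1.0), (1.0, 3.0)]
hospital_b = [(2.0, 5.0), (3.0, 7.0)]
w, b = consensus([hospital_a, hospital_b])
print(round(w, 4), round(b, 4))
```

With one local step per round, the weighted average of the local updates coincides with a gradient step on the pooled data, so the consensus model matches what centralized training would give without any patient-level record being transferred.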

Results

The ‘umbrella protocol’ was used to standardize both the data and the procedures. This led to the creation of a consistent dataset reporting a ‘personalized treatment registry’, which is paving the way toward a trustworthy analysis for the DSS. A well-defined data collection model able to collect, standardize and organize features, called the ‘ontology’, was then created. The team identified more than 130 variables related to brain SRT. All features were collected and organized into three levels: the ‘Registry level’, containing epidemiological information; the ‘Procedural level’, which includes elements about treatment, toxicities and outcome evaluation; and the ‘Research level’, where dimensional data, such as imaging information, are collected (Table 1). When identifying variables for a specific technique, attention must be paid to the treatment characteristics of every single phase, from the simulation CT scan to delivery. We decided to start by grouping treatment variables into three separate categories: ‘contouring’, ‘planning’ and ‘delivery’. We then tackled features related to patient set-up and subsequently organized different aspects such as the contouring guidelines to be followed, imaging characteristics, clinical target volume and planning target volume margins, and lesion localization. In particular, regarding the planning phase, we considered the isodose line prescription, the conformity index, the calculation algorithms, the resolution grid, multi-lesion treatments with a single isocenter, the beam energy, the gradient index and the normalization method. For the delivery phase, we identified other variables describing image-guided radiation therapy techniques (Table 2).
Table 1.

Extract from brain stereotactic radiotherapy ontology registry level.

| Variables | Definition | Measurement |
|---|---|---|
| The phase | The phase of oncologic history in which the patient is evaluated | 0: at diagnosis; 1: at follow-up; 2: at progression or recurrence; 3: others; missing data |
| Intent | Curative: patient who can have a radical treatment. Local control: patient who can benefit of treatment with a stable disease. Palliative: patient with symptoms that can benefit from the treatment or can benefit in terms of quality of life | 0: curative; 1: local control; 2: palliative; missing data |
| Comorbidities | (Charlson Comorbidity Index) (total of the achieved score -> calculate automatically) ACE-27 COMORBIDITY SCORING http://www.rtog.org/LinkClick.aspx?fileticket=oClaTCMufRA%3D&ta | 0: no; 1: yes/specify (calculate automatically); missing data |
| Previous oncological history | Site | Specify |
| | Treatment | 0: no; 1: yes (if yes, complete relative fields); missing data |
| | State of previous disease (according to RECIST criteria; if not applicable, refer to specific disease ontology) http://www.eortc.org/investigators-area/recist | 0: NED; 1: stable complete response; 2: stable partial response; 3: progression disease; missing data |

NED: No evidence of disease; SRT: Stereotactic radiotherapy.

Table 2.

Radiotherapy treatment characteristics.

Set-up

| Variables | Definition | Measurement |
|---|---|---|
| RT treatment position | | 1: supine; 2: other (specify) |
| RT immobilization | | 0: none; 1: thermoplastic masks; 3: stereotactic system (true-point-arc?); 4: OSMS; 5: frame with bite block and head stabilizer; 6: stereotactic helmet; 4: others (specify); 999: missing data |
| Simulation CT scan | Thickness of CT slice | Value (mm) |
| | FOV | Value |
| | Cochlea CT scan | 0: no; 1: yes; 999: missing data |

Contouring

| Variables | Definition | Measurement |
|---|---|---|
| Reference guidelines | Guidelines to define target volumes | 0: RTOG; 1: Scoccianti et al. (http://www.thegreenjournal.com/article/S0167-8140(15)00080-8/pdf); 2: AIRO (file:///C:/Users/01903418/Downloads/linee%20guida%20snc%202%20OAR.pdf); 2: Gondi et al. (http://www.redjournal.org/article/S0360-3016(10)03477-2/fulltext); 3: Chera et al. (https://www.ncbi.nlm.nih.gov/pubmed/19194118); 4: other (specify); 999: missing data |
| Organs at risk | | 0: brain; 1: brainstem; 2: spinal cord; 3: pituitary gland; 4: chiasm; 5: optic nerve (right); 6: optic nerve (left); 7: eye ball (right); 8: eye ball (left); 9: lens (right); 10: lens (left); 11: hippocampus (right); 12: hippocampus (left); 13: cochlea; 14: others (specify); 999: missing data |
| Imaging to define field | | 1: T1-weighted brain MR with contrast-enhancement; 2: T2-weighted brain MR; 3: FLAIR-brain MR; 4: CT; 5: PET-CT; 6: PET-MR; 6: 3D MR; 7: other (specify); 999: missing data |
| Treatment volume | Tumour volume in cc | 1: value |
| Treatment volume | GTV or CTV | 1: tumor bed; 2: residual mass; 3: site of disease; 4: others |
| GTV-CTV margin | | mm |
| CTV-PTV margin | | mm |
| Type of expansion from CTV to PTV | | 0: isotropic; 1: anisotropic; 999: missing data |
| Margin value | | 0: none mm; 1: anterior mm; 2: posterior mm; 3: lower mm; 4: upper mm; 5: right mm; 6: left mm; 999: missing data |
| Distance between lesions | Distance between two equivalent spheres | $\sqrt{(x_2-x_1)^2+(y_2-y_1)^2+(z_2-z_1)^2}-(R_{Eq2}+R_{Eq1})$ |

Prescription

| Variables | Definition | Measurement |
|---|---|---|
| Prescription | Prescription for all CTVs (a CTV can contain more lesions) | 1: dose per fraction (Gy); 2: value of fractions; 3: total dose (Gy); 4: CTV description |

Planning

| Variables | Definition | Measurement |
|---|---|---|
| RT technique | | 1: 3D; 2: IMRT (sliding window); 3: IMRT (step & shoot); 4: Arc (VMAT); 5: VMAT-SIB; 6: IMRT-SIB; 7: D-conformed arcs MLC; 8: arc therapy with cones; 9: others; 999: missing data |
| Type of beam | | 1: photons; 2: heavy particles (specify); 999: missing data |
| Beam energy | | 1: energy; 2: FF; 3: FFF; 4: dose rate; 999: missing data |
| Geometry isocenters | | 1: number of isocenters; 2: number of lesions for isocenters; 3: distance between lesions with the same isocenter; 999: missing data |
| Geometry beams | Report export | 1: number of beams; 2: coplanar beams; 3: noncoplanar beams; 4: number of arcs; 5: coplanar arcs; 6: noncoplanar arcs; 7: complete arc; 8: partial arc; 9: collimator angle; 999: missing data |
| Distribution dose | | 1: homogeneous; 2: inhomogeneous; 999: missing data |
| Gradient index | | Value |
| Conformity index | | Value |
| Guidelines | | 1: TG101 (https://aapm.onlinelibrary.wiley.com/doi/pdf/10.1118/1.3438081); 2: ICRU 91 (https://www.icrp.org/docs/ICRU%20RELEASES%20REPORT%20NO.%2091.pdf); 3: Grimm 2011 (Grimm J, J. Appl. Clin. Med. Phys. 12, 267–292 [2011]); 4: Timmerman 2009 (http://mr.crossref.org/iPage?doi=10.3322%2Fcaac.20013); 5: Seminars of Radiation Oncology 2016 (Grimm J, Semin. Radiation Oncol. 26, 165–171 [2016]) |
| Method of normalization | | 0: ICRU point; 1: Dmax; 2: isocenter; 3: target mean; 4: isodose (specify); 5: others; 999: missing data |
| Prescription isodose | | Value |
| TPS version | | Specify |
| Algorithm | | Specify |
| Grid | | Specify |
| Dosimetric parameters | | DVH export (research levels) |

Delivery

| Variables | Definition | Measurement |
|---|---|---|
| Treatment device | | 1: model; 2: version; 3: width lamella-isocenter |
| Set-up | | 1: 6 DOF; 1: 4 DOF |
| RT IGRT technique | | 1: MV-MV; 2: MV-KV; 3: KV-KV; 4: CBCT; 5: MR; 6: OSMS |
| RT IGRT frequency | | Value |
| Date of start RT | | Date |
| Date of last day | | Value |
| Elapsed days | For each treatment plan | Value |
| RT total prescribed dose to PTVs | | Value |
| RT total delivered dose to PTVs | | Value |

CBCT: Cone beam computed tomography; CT: Computed tomography; CTV: Clinical target volume; DOF: Degrees of freedom; DVH: Dose volume histogram; FF: Flattening filter; FFF: Flattening filter free; FOV: Field of view; GTV: Gross tumour volume; IGRT: Image-guided radiation therapy; IMRT: Intensity-modulated radiation therapy; KV: Kilovoltage; MLC: Multileaf collimator; MR: Magnetic resonance; MV: Megavoltage; OSMS: Optical Surface Monitoring System; PTV: Planning target volume; RT: Radiation therapy; SIB: Simultaneous integrated boost; TPS: Treatment planning system; VMAT: Volumetric modulated arc therapy.

While the ontology was being written, the group realized that one of the most relevant aspects of brain metastases is that they are often multiple, and the possibility of treating them together with a single isocenter and a single plan depends on their location in the brain. For this reason, in order to easily calculate the distance between lesions, we decided to equate every lesion to an equivalent sphere (Figure 4). The BOA web service platform was completely set up and configured for both the BOA.Local and BOA.Cloud pathways, and the ontology has been successfully uploaded. A first institution is currently collecting data using the local server, allowing the institution to store all data without complete anonymization in accordance with the previously reported principles, while a second institution is collecting data through the cloud server. Both centers are using ontology-driven CRFs [15-17] and all collected data are now available on demand, ready to be further processed.
Figure 4.

Distance between lesions, equating every lesion to an equivalent sphere.
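The equivalent-sphere distance can be computed in a few lines. The sketch below takes the distance as the center-to-center distance minus the sum of the two equivalent radii, which is how the ‘Distance between lesions’ entry of Table 2 reads. How the equivalent radius is obtained is not stated in the text: here it is assumed to be the radius of the sphere with the same volume as the lesion, and coordinates are assumed to be in cm with volumes in cc. The function names are hypothetical.

```python
import math
from typing import Sequence


def equivalent_radius(volume_cc: float) -> float:
    """Radius (cm) of the sphere whose volume equals the lesion volume (1 cc = 1 cm^3).
    Assumption: this is how the 'equivalent sphere' is defined."""
    return (3.0 * volume_cc / (4.0 * math.pi)) ** (1.0 / 3.0)


def sphere_gap(center1: Sequence[float], volume1_cc: float,
               center2: Sequence[float], volume2_cc: float) -> float:
    """Center-to-center distance minus the two equivalent radii (cm).
    A negative value means the equivalent spheres overlap."""
    center_distance = math.dist(center1, center2)  # Euclidean distance, Python 3.8+
    return center_distance - (equivalent_radius(volume1_cc) + equivalent_radius(volume2_cc))


# Two lesions whose equivalent spheres have radius 1 cm, with centers 5 cm apart.
unit_sphere_volume = 4.0 / 3.0 * math.pi
gap = sphere_gap((0.0, 0.0, 0.0), unit_sphere_volume, (5.0, 0.0, 0.0), unit_sphere_volume)
print(round(gap, 6))
```

Subtracting the radii makes the value reflect the free space between lesion surfaces rather than between their centers, which is the quantity that matters when judging whether a single isocenter is feasible.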

Discussion

Stereotactic radiosurgery is a radiation therapy technique in which multiple focused radiation beams intersect over a target to deliver a conformal, high dose of radiation to the target and minimal radiation to surrounding normal tissues, thanks to the steep dose gradient. It is usually delivered in a single fraction but can sometimes be delivered over multiple once-daily fractions, usually to a maximum of five [18]. To our knowledge, no standardized data collection system or predictive model focusing on a treatment technique is available in the literature. Several ontologies focusing on pathologies and different anatomical sites (e.g., rectum, thyroid and prostate) [1,8,15] can be found; however, to date, none of these focuses on a specific radiotherapy technique (Table 3). Only the brachytherapy ontology can be considered a technique-specific tool, but it covers head and neck cancers only [4,19].
Table 3.

Examples of interactive DSS currently in use in clinical practice.

| Institution | Ref. |
|---|---|
| EORTC | [20] |
| MSKCC | [21] |
| Dana Farber Cancer Institute and Johns Hopkins Sidney Kimmel Comprehensive Cancer Center | [22] |
| MGH | [23] |
| Cancer Research UK | [24] |
| NCI | [25] |
| Maastro Clinic | [26] |
| Policlinico A. Gemelli | [4,19] |
| Policlinico A. Gemelli | [15] |

DSS: Decision support systems; EORTC: European Organization for Research and Treatment of Cancer; MGH: Massachusetts General Hospital; MSKCC: Memorial Sloan Kettering Cancer Center; NCI: National Cancer Institute.

Modified with permission from [16].

The PRE.M.I.S.E. project aims to focus on SRT in every anatomic site, which will lead us to strongly emphasize both the technical and dosimetric aspects of stereotactic treatments in our ontology. In fact, when approaching SRT from an ontology perspective, a large number of variables have to be taken into consideration. Gantry-based linear accelerator (LINAC) systems use either fixed circular collimators or multileaf collimators. Treatment planning imaging is based on CT scans, but other images, including magnetic resonance images and positron emission tomography, can be fused to the treatment CT. Different on-board imaging systems can be used to ensure patient alignment. The treatment can be delivered as either multiple arcs or one continuous arc. The isocenter is generally in the middle of the target lesion, but newer systems with volumetric modulated arc therapy allow for treatment of multiple lesions in a single arc. Dose prescription varies: treatments can be prescribed to the 60–80% isodose line or to 95–98% of the planning target volume, and dose distribution can be inhomogeneous or homogeneous [27]. To manage the complexity of such a sophisticated technique, we decided to start writing the SRT ontology by focusing on brain stereotactic treatment. The choice to build an ontology for each anatomic site was driven both by the need to reduce the risk of omitting relevant variables when dealing with a very large number of them, and by the necessity to reduce the bias of target motion in other anatomic sites (e.g., the lungs and liver). The team identified more than 130 variables related to brain SRT (isodose line prescription, resolution grid etc.) and organized them into three levels (registry, procedural and research) in order to classify all the information and easily address queries depending on requests.

In trying to create a common language for SRT we faced several difficulties (Table 4). First, the lack of a unique definition of SRT in terms of dose, fractions and dose homogeneity and, second, differences in treatment and planning modalities among centers led us to collect a greater number of variables in our ontology, in order to make it suitable for anyone. Another important aspect is the treatment of multiple lesions. In these cases, lesions can either be treated as a group with a single isocenter or individually, using multiple isocenters. When defining variables for lesion position, we realized that no standard exists in the literature for defining the three-dimensional distance between lesions. We decided to treat each lesion as a sphere and calculate both the distance between the equivalent spheres and the distance between their longitudinal axes. The latter parameter appears to be important in clinical practice when deciding whether to treat different lesions with single or multiple radiotherapy plans. Collecting the distance between lesions could also be useful because the predictive model could suggest how to treat multiple lesions (with one or multiple isocenters).
Table 4.

Expected data quality issues and measures of mitigation.

| Problem | Example problem | Mitigation | Example mitigation |
|---|---|---|---|
| Completely missing data | Hospital A does not have a diffusion MRI, so all features derived from diffusion-weighted MRI images are missing. Hospital B has and uses a diffusion MRI in patients with brain metastasis | Impute based on populations from other centers and what is known for the patient | Suppose a (probabilistic) relationship between tumor size and diffusion-weighted MRI features is learned from Hospital B; then the tumor size in Hospital A can be used to infer diffusion-weighted MRI parameters in Hospital A even if they don't have a diffusion MRI and are using the same scan protocols |
| Randomly missing data | Random physician in Hospital A forgets to note the TNM stage of the patient | Because data are missing randomly, the percentage of missing data is generally low and samples are large, machine learning techniques will be unaffected by these errors | Do nothing |
| Biased data: continuous | An MRI scanner is calibrated differently in Hospital A than in Hospital B, so the SUV values are different | Assuming patients are similar, a conversion is possible between two distributions | Determine the distribution of MRI features in Hospital A and B and derive a conversion function from SUVs in Hospital A to Hospital B |
| Biased data: scoring system | CTCAE v3 was used, but after a certain date CTCAE v4 was used to score toxicities | Impute the new score from the old score, if possible | A (probabilistic/deterministic) conversion between the two CTC systems is possible |
| Random errors | In Hospital A, a physician has noted an incorrect stage on an individual patient | Because errors are random, the percentage of errors will be low and samples are large, the effect when using machine learning will be low | Do nothing |
| Biased missing data | In Hospital A, severe toxicities are noted but mild toxicities are not. In Hospital B, toxicities are always noted | Compare occurrence of toxicities in Hospital A with Hospital B. Detect too low, unexplained mild toxicities in Hospital A | Infer a probability of mild toxicity for patients of Hospital A based on the distribution of Hospital B |

CTCAE: Common terminology criteria for adverse events; SUV: Standardized uptake value; TNM: Tumor, node, metastasis.
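Two of the mitigations in Table 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the conversion between hospitals is taken to be a simple match of mean and standard deviation, and the "probabilistic relationship" used for imputation is simplified to an ordinary least-squares line. The function names and the sample values are hypothetical and do not come from the project.

```python
from statistics import mean, pstdev


def fit_conversion(values_a, values_b):
    """Return a function that maps Hospital A values onto Hospital B's scale.

    Assumption: the two patient populations are similar, so matching the
    mean and standard deviation of the two distributions is adequate.
    """
    mu_a, sd_a = mean(values_a), pstdev(values_a)
    mu_b, sd_b = mean(values_b), pstdev(values_b)
    if sd_a == 0:
        raise ValueError("Hospital A values have zero spread")

    def convert(value):
        return mu_b + (value - mu_a) * sd_b / sd_a

    return convert


def fit_line(xs, ys):
    """Least-squares slope and intercept learned where both features exist (Hospital B)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has zero spread")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def impute(x, slope, intercept):
    """Infer the missing feature for a Hospital A patient from a known feature."""
    return slope * x + intercept


# Biased continuous data: rescale made-up Hospital A measurements to Hospital B.
convert = fit_conversion([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(convert(2.0))

# Completely missing data: learn size -> feature in Hospital B, apply in Hospital A.
slope, intercept = fit_line([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
print(impute(4.0, slope, intercept))
```

A quantile-based mapping or a model that returns a distribution rather than a point estimate would be closer to the probabilistic wording of the table; the linear versions are used here only because they are short and easy to check.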

Brain SRT is usually employed to treat brain metastases. The ontology therefore needs to include the primary tumor and its stage, as well as all complementary treatments. We considered variables regarding new therapies such as immunotherapy and targeted therapy, for which an internationally recognized standard timing for concomitant radiotherapy is not yet available. The perspective of PRE.M.I.S.E. lies in the need to develop a system that allows the clinical decision-making process to be shared between physicians and patients in order to choose the best tailored treatment. This project could lead to the development of predictive models based on individual patient features, complementing existing consensus statements or guidelines. The large amount of clinical data can then be processed either through classical statistical approaches or through modern machine learning tools, which can be further refined into reliable clinical decision support tools in order to guarantee a personalized approach to medicine. Clinical evidence is difficult to generate rapidly and reliably, and the analysis of retrospective case series can be affected by data collection biases due to known outcomes.

Conclusion

This project represents the first example of a standardized data collection system created for a particular radiation therapy technique, specifically SRT. The next step of this initiative is patient enrollment. The long-term aim of the project is to set up a validated, DSS-based prediction model, which could help in personalizing treatment choices, in terms of both efficacy and toxicity, and in identifying the most suitable patients to be included in future randomized clinical trials [8,28].

Future perspective

We intend to substantially expand the number of institutions engaged in the project and the data collection efforts, and to initiate a parallel effort to incorporate an ontology for stereotactic body radiation therapy into the workflow. Moreover, we aim to provide a DSS capable of individualizing SRT treatment by developing, validating and improving prediction models for overall survival, local control and disease-free survival, as well as for acute and late radiation-induced side effects relevant for patients who undergo a stereotactic treatment. These prediction models could be very useful to better inform patients about the risks (acute and late toxicity) and benefits of the treatment. ‘Personalized medicine’ is defined by the National Cancer Institute (MD, USA) as a “form of medicine that uses information about a person’s genes, proteins and environment to prevent, diagnose and treat disease. In cancer, personalized medicine uses specific information about a person’s tumor to help diagnose, plan treatment, find out how well treatment is working or make a prognosis”. The tendency toward individualized medicine and the increasing amount and complexity of data make it extremely difficult to identify which clinical decisions are best for specific patients. In daily clinical practice, decision support systems could help to personalize clinical choices. The ontology is a system for collecting heterogeneous data in a standardized way in order to create large databases. The creation of an ontology increases descriptive power by moving from local data dictionaries to a global data vocabulary. The storage system architecture is based on dedicated software called Beyond Ontology Awareness, which proposes two distinct data consolidation approaches and two data processing strategies.
The complete algorithm can be summed up by the following steps:
1. The local application learns a model from local data.
2. This local model is sent to the master, where it is processed and compared with the models sent by the other hospitals.
3. A consensus model is generated and sent back to each hospital for refinement.
4. After preset convergence criteria are met, a final consensus model is generated.

The innovation of the PRE.M.I.S.E. project resides mainly in having created an ontology for a particular radiation therapy technique instead of a model that concerns only a specific pathology. The aim is to provide a decision support system capable of individualizing the treatment through:
1. Development, validation and improvement of prediction models for overall survival, local control and disease-free survival for patients who undergo a stereotactic treatment.
2. Development, validation and improvement of prediction models for acute and late radiation-induced side effects relevant for patients who undergo a stereotactic treatment.
3. Use of prediction models to better inform patients about the risks (acute and late toxicity) and benefits of the treatment.
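The distributed learning steps summarized above (local learning, sending the local model to the master, consensus, refinement until convergence) can be sketched as follows. The text does not specify the model type or the consensus rule, so this sketch assumes a one-parameter linear model refined by gradient descent at each hospital and a sample-size-weighted average at the master. All function names, the learning rate and the data are hypothetical.

```python
def local_update(slope, data, lr=0.01, steps=20):
    """Refine the consensus slope on one hospital's (x, y) pairs.

    Gradient descent on the mean squared error of y = slope * x.
    Only the resulting slope leaves the hospital, never the data.
    """
    for _ in range(steps):
        grad = sum(2.0 * (slope * x - y) * x for x, y in data) / len(data)
        slope -= lr * grad
    return slope


def consensus(local_slopes, sizes):
    """Master step: average the local models, weighted by local sample size (assumed rule)."""
    return sum(s * n for s, n in zip(local_slopes, sizes)) / sum(sizes)


def train(hospitals, tol=1e-6, max_rounds=100):
    """Iterate local refinement and consensus until the model stops changing."""
    slope = 0.0
    sizes = [len(data) for data in hospitals]
    for round_no in range(1, max_rounds + 1):
        local_slopes = [local_update(slope, data) for data in hospitals]
        new_slope = consensus(local_slopes, sizes)
        if abs(new_slope - slope) < tol:  # preset convergence criterion
            return new_slope, round_no
        slope = new_slope
    return slope, max_rounds


# Two made-up hospitals whose data follow y = 2 * x.
hospitals = [
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    [(1.0, 2.0), (4.0, 8.0)],
]
slope, rounds = train(hospitals)
print(f"consensus slope {slope:.4f} after {rounds} rounds")
```

The point of this design is that patient-level data stay inside each hospital and only model parameters are exchanged, which matches the description of local models being sent to the master for comparison.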
  15 in total

Review 1.  Standardized data collection to build prediction models in oncology: a prototype for rectal cancer.

Authors:  Elisa Meldolesi; Johan van Soest; Andrea Damiani; Andre Dekker; Anna Rita Alitto; Maura Campitelli; Nicola Dinapoli; Roberto Gatta; Maria Antonietta Gambacorta; Vito Lanzotti; Philippe Lambin; Vincenzo Valentini
Journal:  Future Oncol       Date:  2015-12-17       Impact factor: 3.404

2.  WebDISCO: a web service for distributed cox model learning without patient-level data sharing.

Authors:  Chia-Lun Lu; Shuang Wang; Zhanglong Ji; Yuan Wu; Li Xiong; Xiaoqian Jiang; Lucila Ohno-Machado
Journal:  J Am Med Inform Assoc       Date:  2015-07-09       Impact factor: 4.497

3.  Nomograms for predicting local recurrence, distant metastases, and overall survival for patients with locally advanced rectal cancer on the basis of European randomized clinical trials.

Authors:  Vincenzo Valentini; Ruud G P M van Stiphout; Guido Lammering; Maria Antonietta Gambacorta; Maria Cristina Barba; Marek Bebenek; Franck Bonnetain; Jean-Francois Bosset; Krzysztof Bujko; Luca Cionini; Jean-Pierre Gerard; Claus Rödel; Aldo Sainato; Rolf Sauer; Bruce D Minsky; Laurence Collette; Philippe Lambin
Journal:  J Clin Oncol       Date:  2011-07-11       Impact factor: 44.544

4.  The future of predictive models in radiation oncology: from extensive data mining to reliable modeling of the results.

Authors:  Vincenzo Valentini; Nicola Dinapoli; Andrea Damiani
Journal:  Future Oncol       Date:  2013-03       Impact factor: 3.404

5.  An umbrella protocol for standardized data collection (SDC) in rectal cancer: a prospective uniform naming and procedure convention to support personalized medicine.

Authors:  Elisa Meldolesi; Johan van Soest; Nicola Dinapoli; Andre Dekker; Andrea Damiani; Maria Antonietta Gambacorta; Vincenzo Valentini
Journal:  Radiother Oncol       Date:  2014-05-19       Impact factor: 6.280

6.  Nomograms for predicting survival of patients with newly diagnosed glioblastoma: prognostic factor analysis of EORTC and NCIC trial 26981-22981/CE.3.

Authors:  Thierry Gorlia; Martin J van den Bent; Monika E Hegi; René O Mirimanoff; Michael Weller; J Gregory Cairncross; Elizabeth Eisenhauer; Karl Belanger; Alba A Brandes; Anouk Allgeier; Denis Lacombe; Roger Stupp
Journal:  Lancet Oncol       Date:  2007-12-21       Impact factor: 41.316

7.  A new standardized data collection system for interdisciplinary thyroid cancer management: Thyroid COBRA.

Authors:  Luca Tagliaferri; Carlo Gobitti; Giuseppe Ferdinando Colloca; Luca Boldrini; Eleonora Farina; Carlo Furlan; Fabiola Paiar; Federica Vianello; Michela Basso; Lorenzo Cerizza; Fabio Monari; Gabriele Simontacchi; Maria Antonietta Gambacorta; Jacopo Lenkowicz; Nicola Dinapoli; Vito Lanzotti; Renzo Mazzarotto; Elvio Russi; Monica Mangoni
Journal:  Eur J Intern Med       Date:  2018-02-21       Impact factor: 4.487

8.  Stereotactic radiosurgery and stereotactic radiotherapy for brain metastases.

Authors:  Lia M Halasz; Jason K Rockhill
Journal:  Surg Neurol Int       Date:  2013-05-02

9.  Randomised controlled trials and population-based observational research: partners in the evolution of medical evidence.

Authors:  C M Booth; I F Tannock
Journal:  Br J Cancer       Date:  2014-01-14       Impact factor: 7.640

Review 10.  ENT COBRA ONTOLOGY: the covariates classification system proposed by the Head & Neck and Skin GEC-ESTRO Working Group for interdisciplinary standardized data collection in head and neck patient cohorts treated with interventional radiotherapy (brachytherapy).

Authors:  Luca Tagliaferri; Ashwini Budrukkar; Jacopo Lenkowicz; Mauricio Cambeiro; Francesco Bussu; Jose Luis Guinot; Guido Hildebrandt; Bengt Johansson; Jens E Meyer; Peter Niehoff; Angeles Rovirosa; Zoltán Takácsi-Nagy; Luca Boldrini; Nicola Dinapoli; Vito Lanzotti; Andrea Damiani; Roberto Gatta; Bruno Fionda; Valentina Lancellotta; Tamer Soror; Rafael Martìnez Monge; Vincenzo Valentini; György Kovács
Journal:  J Contemp Brachytherapy       Date:  2018-06-30