Clinical implementation of MRI-based organs-at-risk auto-segmentation with convolutional networks for prostate radiotherapy.

Mark H F Savenije^1,2, Matteo Maspero^3,4, Gonda G Sikkes¹, Jochem R N van der Voort van Zyp¹, Alexis N T J Kotte¹, Gijsbert H Bol¹, Cornelis A T van den Berg^1,2.

Abstract

BACKGROUND: Structure delineation is a necessary, yet time-consuming manual procedure in radiotherapy. Recently, convolutional neural networks have been proposed to speed-up and automatise this procedure, obtaining promising results. With the advent of magnetic resonance imaging (MRI)-guided radiotherapy, MR-based segmentation is becoming increasingly relevant. However, the majority of the studies investigated automatic contouring based on computed tomography (CT).
PURPOSE: In this study, we investigate the feasibility of clinical use of deep learning-based automatic OARs delineation on MRI.
MATERIALS AND METHODS: We included 150 patients diagnosed with prostate cancer who underwent MR-only radiotherapy. A three-dimensional (3D) T1-weighted dual spoiled gradient-recalled echo sequence was acquired with 3T MRI for the generation of the synthetic-CT. The first 48 patients were included in a feasibility study training two 3D convolutional networks called DeepMedic and dense V-net (dV-net) to segment bladder, rectum and femurs. A research version of an atlas-based software was considered for comparison. Dice similarity coefficient, 95% Hausdorff distances (HD95), and mean distances were calculated against clinical delineations. For eight patients, an expert RTT scored the quality of the contouring for all the three methods. A choice among the three approaches was made, and the chosen approach was retrained on 97 patients and implemented for automatic use in the clinical workflow. For the successive 53 patients, Dice, HD95 and mean distances were calculated against the clinically used delineations.
RESULTS: DeepMedic, dV-net and the atlas-based software generated contours in 60 s, 4 s and 10-15 min, respectively. Performances were higher for both the networks compared to the atlas-based software. The qualitative analysis demonstrated that delineation from DeepMedic required fewer adaptations, followed by dV-net and the atlas-based software. DeepMedic was clinically implemented. After retraining DeepMedic and testing on the successive patients, the performances slightly improved.
CONCLUSION: High conformality for OARs delineation was achieved with two in-house trained networks, obtaining a significant speed-up of the delineation procedure. A comparison of different approaches was performed, leading to the successful adoption of one of the neural networks, DeepMedic, in the clinical workflow. In a clinical setting, DeepMedic maintained the accuracy obtained in the feasibility study.

Keywords: Artificial intelligence; Contouring; Deep learning; Delineation; MR-only treatment planning; Magnetic resonance imaging; Prostate cancer; Radiotherapy; Segmentation

Year: 2020 PMID: 32393280 PMCID: PMC7216473 DOI: 10.1186/s13014-020-01528-0

Source DB: PubMed Journal: Radiat Oncol ISSN: 1748-717X Impact factor: 3.481

Background

Structure delineation is a necessary, yet time-consuming manual procedure in radiotherapy. Consistent and accurate delineation of organs-at-risk (OARs) and target structures for prostate patients is vital when performing dose escalation and treating patients with highly conformal plans [1]. Traditionally, computed tomography (CT) has been used for radiotherapy simulation and structure delineation [2]. In the last few decades, magnetic resonance imaging (MRI) has found its way into radiotherapy simulation as it provides superior soft-tissue contrast compared to CT [3, 4], thus enabling more accurate delineation of target regions and critical structures [5-7]. The manual segmentation of anatomical structures is, however, a time-consuming process [8]. Moreover, with the advent of MR-guided radiotherapy [9-11], the accuracy and speed of delineation have become the weakest link [12]: slow delineation lengthens the fraction time and thereby hinders online adaptive radiotherapy [13]. To automatically perform delineations of target and OARs for patients affected by prostate cancer, various methods have been developed over the past years. For example, three-dimensional (3D) deformable surface models [14], organ-based modelling [15], and atlas-based solutions [16, 17] have been demonstrated. For all these methods, the time required to perform segmentation is in the order of minutes, if not hours, which is too long to enable online adaptive treatments. As a workaround, in current online treatments only the target delineations and the OARs in the vicinity of the target (e.g. within a ring of 3-5 cm) are adjusted [18-20]. Recently, deep learning has been proposed to speed up and automate segmentation, obtaining promising results [8, 21, 22].
Deep learning is a branch of artificial intelligence and machine learning that involves the use of neural networks to generate a hierarchical representation of the input data to achieve a specific task without the need for hand-engineered features [23, 24]. Many studies focused on target delineations [8], reaching mean dice similarity coefficients against manual delineations in the range 0.82-0.95 [25-31]. Automatic delineation of OARs is also crucial to achieve full online adaptive radiotherapy and may save time compared with manual contouring. In this study, we aim at investigating the feasibility of convolutional neural network-based automatic OARs delineation on MRI. A preliminary retrospective study was conducted to select a suitable network architecture and prepare for clinical implementation. After choosing the most suitable convolutional network and implementing it clinically, we present the performance of automatic deep learning-based OARs delineation in our clinic.

Material and methods

Patient data collection

Patients diagnosed with intermediate and high-risk prostate cancer undergoing MR-only radiotherapy [32] in the period between June 2018 and January 2020 were included in the study. Further inclusion criteria were the presence of four gold fiducial markers for position verification and the absence of hip implants. The patients were also scanned with a specific radio-frequency spoiled gradient-recalled echo (SPGR) sequence that will be described in more detail further on. The clinical exclusion criteria for MR-only radiotherapy were: patients with more than four positive lymph-nodes (N1, as on PET-CT or after pelvic lymph-nodes dissection), life expectancy <10 years (as from WHO >3), prior pelvic irradiation, IPSS >20, presence of prostatitis, active Crohn’s disease, colitis ulcerosa or diverticulitis, an anastomotic bowel in the high dose region and patients undergoing trans-rectal prostate resection less than three months before treatment. With the application of these criteria, a total of 150 patients were included in this study and treated with external beam radiotherapy. For all patients, 3T MRI (Ingenia MR-RT, v 5.3.1, Philips Healthcare, the Netherlands) was acquired after requesting the patients to empty their bladder and drink 200-300 ml of water one hour before the acquisition. Patients were positioned on a vendor-provided flat table using a knee support cushion (lower extremity positioning system, without adjustable FeetSupport, MacroMedics BV, the Netherlands). Patients were tattooed at the MRI with the aid of a laser system (Dorado3, LAP GmbH Laser Applikationen, Germany) to facilitate treatment positioning. Also, MR-visible markers (PinPoint® for Image Registration 128, Beekley Medical, USA) were used to identify the set-up location on MRI. MR images were acquired using anterior and posterior phased array coils (dS Torso and Posterior coils, 28 channels, Philips Healthcare, the Netherlands).
Two in-house-built bridges supported the anterior coil to avoid skin contour deformation. OARs were contoured on Dixon images [33] obtained with a dual-echo three-dimensional (3D) Cartesian radio-frequency SPGR sequence. For each patient, in-phase (IP), water (W), and fat (F) images [34] (Fig. 1) were reconstructed as in [35]. Dixon images were generated as part of a proprietary solution (MRCAT, rev. 257, Philips Healthcare, Finland) that enabled MR-based dose calculation for patients with prostate cancer [36, 37]. The imaging parameters, reported in Table 1, were locked by the vendor; therefore, they were stable through the whole study. Radiotherapy technicians (RTTs) with dedicated experience in contouring delineated bladder, rectum and femurs using IP, W and F Dixon images. The OARs delineations were approved or revised by a radiation oncologist. Besides, the radiation oncologist delineated the target structures. The delineation indications followed RTOG guidelines [38] requiring that the rectum was delineated from the outer part of the sphincter (anus) until the sigmoid fold (expected length of the rectum was 10-15 cm), as described in [39], with the sphincter delineated as a separate structure. The bladder was entirely delineated, while the femurs were delineated in the whole FOV of the image. In the case of regional radiotherapy, the bowel bag was also included.

Fig. 1

Transverse view of in-phase (IP), water (W) and fat (F) images for a patient (69 yo) diagnosed with T2b cancer. Note the large portion of void space surrounding the patient body. Cropping has been applied as preprocessing to remove such void regions

Table 1

Image parameters of the sequences used for the OARs contouring. The term FOV refers to the field-of-view, while AP to anterior-posterior and RL to right-left

Imaging parameters	Value
TE₁/TE₂/TR [ms]	1.2/2.5/3.9
Flip Angle [°]	10
FOV^∗ [cm³]	55.2x55.2x30
Acquisition Matrix^∗	324x324x120
Reconstruction Matrix^∗	528x528x120
Reconstructed Voxel^∗ [mm³]	1x1x2.5
Bandwidth [Hz/px]	1072
Readout direction	AP
Phase direction	RL
Geometry correction	3D
Acquisition time	2 min 17 s

∗expressed in terms of anterior-posterior (AP), right-left (RL) and superior-inferior directions

Study design

The first 48 patients (treated until January 2019) were included in a feasibility study training two state-of-the-art 3D convolutional networks called DeepMedic [40] and dense V-net (dV-net) [41] (“Networks architecture, image processing and training” section). Three-fold cross-validation was performed, splitting the patients in 32/16 for train/validation. The network hyperparameters were optimised on the first fold and maintained for the other two folds. For example, the number of epochs was chosen considering the loss function in the validation set by performing early stopping when loss function did not decrease after five consecutive epochs. The performance of the networks was compared against a research version of commercial software based on multi-atlases and deformable registration and against the clinically used delineations (“Evaluation” section). This preliminary study enabled us to choose among the three automatic methods. The preferred approach was retrained on 97 patients that were imaged and treated until August 2019; it was implemented for automatic use in the clinical workflow. The performances of the implemented model were reported on the 53 successive consecutively treated patients. A schematic overview of the study design is presented in Fig. 2.
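The splitting and stopping rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the contiguous fold assignment and the exact stopping convention (stop at the fifth consecutive epoch without a new best validation loss) are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def three_fold_splits(patient_ids: Sequence[str]) -> List[Tuple[List[str], List[str]]]:
    """Return three (train, validation) splits; each third is used once for validation.

    Contiguous folds are an assumption; the paper does not state how patients
    were assigned to folds.
    """
    ids = list(patient_ids)
    fold_size = len(ids) // 3
    splits = []
    for k in range(3):
        val = ids[k * fold_size:(k + 1) * fold_size]
        train = ids[:k * fold_size] + ids[(k + 1) * fold_size:]
        splits.append((train, val))
    return splits


def early_stopping_epoch(val_losses: Sequence[float], patience: int = 5) -> int:
    """Return the 0-based epoch at which training would stop.

    Training stops once the validation loss has not improved on its best
    value for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


# Hypothetical patient identifiers for the 48-patient feasibility cohort.
patients = [f"pt{i:03d}" for i in range(48)]
folds = three_fold_splits(patients)
```

With 48 patients each fold yields the 32/16 train/validation split reported in the text.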

Fig. 2

Schematic of the study design representing the timeline and the number of patients included. Also, the length and the number of patients for the preliminary study, the training of the final model and the patients used for testing the clinical implementation are reported

Networks architecture, image processing and training

Three-dimensional network architectures were chosen to investigate performance differences when the receptive field covers either whole volumes or smaller patches. In particular, DeepMedic [40] was chosen to perform patch-based training, while dV-net [41] was chosen to perform training on whole volumes. The two architectures, which are described in detail in the next sections, required similar pre-processing. Three channels were used as input: IP, W and F images. The OARs considered as targets were bladder, rectum, right and left femur; they were encoded as non-overlapping masks with values from 1 to 4. To increase the amount of contextual information, the CTV was also encoded with a value of 5, which means that the networks also output the CTV. Note that the CTV was not considered in our study given that CTV delineation is clinically based on a different MRI, i.e. T2-weighted turbo spin-echo sequences [42]. The networks were trained on a Tesla P100 graphical processing unit (GPU) (NVIDIA Corporation, USA) with 16 GB of memory. To allow the whole volume to fit on the GPU, the IP, W and F images were initially cropped by 90 voxels at each border in the anterior-posterior and lateral directions, obtaining matrices of 348x348x120 voxels. Note that an observer verified the presence of the femurs within the FOV. Also, the image intensities of IP, W and F were clipped at their respective 99.9th percentile for each patient volume. Images were subsequently divided by the standard deviation (σ), and then a fixed value of 1 was subtracted. After training and inference of the networks, the delineations were post-processed, generating four binary volumes. Morphological operations of closure and hole filling by one voxel were applied. The largest 3D connected region was selected for each delineated structure. These operations were performed to remove possible small-sized delineations that may have been found by the networks.
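The pre-processing steps for one input channel can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the authors' pipeline: the function name is hypothetical, the first two array axes are assumed to be the anterior-posterior and lateral directions, and the percentile and σ are assumed to be computed on the cropped volume (the text does not specify the order).

```python
import numpy as np


def preprocess_channel(volume: np.ndarray, crop: int = 90,
                       percentile: float = 99.9) -> np.ndarray:
    """Crop, clip and normalise one Dixon channel (IP, W or F).

    Assumes a non-constant image, so that the standard deviation is non-zero.
    """
    vol = volume.astype(np.float64)
    # Remove `crop` voxels at both borders of the two in-plane axes,
    # e.g. 528 - 2 * 90 = 348 voxels per axis.
    cropped = vol[crop:-crop, crop:-crop, :]
    # Clip bright outliers at the per-volume 99.9th percentile.
    upper = np.percentile(cropped, percentile)
    clipped = np.minimum(cropped, upper)
    # Divide by the standard deviation, then subtract a fixed value of 1.
    return clipped / clipped.std() - 1.0


# Synthetic stand-in for a reconstructed channel (only 4 slices to keep it light).
rng = np.random.default_rng(0)
example = rng.random((528, 528, 4))
normalised = preprocess_channel(example)
```

The same function would be applied independently to the IP, W and F volumes before stacking them as the three input channels.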

DeepMedic

The DeepMedic [40] implementation employed was provided by Kamnitsas et al. in Tensorflow v1.7. The model employed a three-pathway architecture for multi-resolution processing of 3D patches. Low, medium and high-resolution pathways with receptive fields of 85³, 51³ and 17³ voxels were employed, with each pathway consisting of 11 layers. A fully connected network (FCN) was used for combining the pathways and post-processing, as presented by Kamnitsas et al. [40]. Note that the size of the receptive fields has been modified compared to the original implementation. The training configuration was kept as the original, with learning rate = 0.001, Adam optimiser with momentum = 0.6, epochs = 35, batch size = 10, and L1 and L2 regularisations weighted with factors 0.000001 and 0.0001, respectively. The configuration file is reported in the Supplementary Material. All the OARs were equally sampled during training, enforcing that the patches considered in each epoch contained each of the four OARs the same number of times. Also, as in Kamnitsas et al. [40], the volumetric dice similarity coefficient was adopted as the loss function. Data augmentation was applied in terms of random shifts and rescaling perturbation of the intensity (I) by the following: I′ = (I + s) · m, where s and m were Gaussian distributed with μ = 0 and 1 and σ = 0.05 and 0.01, respectively. For training, DeepMedic made use of about 9 GB of GPU memory.
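The intensity perturbation I′ = (I + s) · m can be illustrated with a short NumPy sketch. The function name is hypothetical, and drawing a single scalar s and m per patch is an assumption; the actual sampling granularity is set by the DeepMedic configuration, which is not reproduced here.

```python
import numpy as np


def augment_intensity(patch: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random shift s and scale m to a patch: I' = (I + s) * m."""
    s = rng.normal(loc=0.0, scale=0.05)  # additive shift, mean 0, sigma 0.05
    m = rng.normal(loc=1.0, scale=0.01)  # multiplicative factor, mean 1, sigma 0.01
    return (patch + s) * m


rng = np.random.default_rng(42)
patch = np.ones((17, 17, 17))
augmented = augment_intensity(patch, rng)
```

Because the intensities were already normalised by σ during pre-processing, a shift of σ = 0.05 corresponds to a perturbation of a few percent of the image standard deviation.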

Dense v-net

The dV-net implementation provided in NiftyNet was employed. It consisted of a 3D U-Net with a sequence of three downsampling and dense upsampling feature strided stacks with skip connections to propagate higher resolution information to the final segmentation. Dilated convolutions were employed to reduce the number of features [41]. The training configuration was kept as the original, with learning rate = 0.001, Adam optimiser with momentum = 0.6, batch size = 6, regularisation (weight = 0.001) and epochs = 25. The configuration file is reported in the Supplementary Material. Dice was adopted as the loss function, and data augmentation was applied in terms of elastic deformation, as implemented within NiftyNet. For training, dV-net made use of about 16 GB of GPU memory.
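Both networks were trained with a Dice-based loss. A generic soft Dice loss over one-hot labels can be sketched as below; this is a common textbook formulation written in NumPy for illustration, and it is an assumption that it resembles the variants implemented in DeepMedic and NiftyNet, which may differ in smoothing and class weighting.

```python
import numpy as np


def soft_dice_loss(probs: np.ndarray, onehot: np.ndarray, eps: float = 1e-6) -> float:
    """Soft Dice loss averaged over classes.

    probs, onehot: arrays of shape (classes, x, y, z); probs holds the
    predicted class probabilities, onehot the reference labels.
    """
    spatial_axes = tuple(range(1, probs.ndim))
    intersection = (probs * onehot).sum(axis=spatial_axes)
    denominator = probs.sum(axis=spatial_axes) + onehot.sum(axis=spatial_axes)
    dice_per_class = (2.0 * intersection + eps) / (denominator + eps)
    return float(1.0 - dice_per_class.mean())


# Two-class toy example: a perfect prediction and a completely wrong one.
mask = np.zeros((4, 4, 4))
mask[:2] = 1.0
reference = np.stack([mask, 1.0 - mask])
perfect_loss = soft_dice_loss(reference, reference)
worst_loss = soft_dice_loss(reference[::-1], reference)
```

A loss close to 0 indicates near-complete overlap for every class, and a loss close to 1 indicates no overlap.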

Evaluation

Preliminary study

The first 48 patients treated between June 2018 and August 2019 were included in a preliminary study to compare the performance of the two networks and an atlas-based approach to the delineation used during clinical treatment planning. The advanced medical imaging registration engine (ADMIRE, research version 1.13.5, Elekta AB, Sweden) was the software considered; ADMIRE is based on multi-atlases [43, 44] and gradient-free dense mutual information deformable registration [45]. In particular, the rectum was delineated based on the F image, while bladder and femurs were delineated based on IP images using an atlas of 9 patients that were previously acquired with the same sequence. ADMIRE took about 10 to 15 minutes to generate automatic contouring on a Tesla K20c GPU (NVIDIA Corporation, USA) with 6 GB of memory. Performances of the three automatic approaches were evaluated in terms of (volumetric) dice similarity coefficients (DSC), 95% boundary Hausdorff distances (HD95) [46] and mean surface distance (MSD) against clinical delineations. All the metrics were calculated using Plastimatch, except for the surface distance, which was calculated as from https://github.com/deepmind/surface-distance. In particular, violin plots [47] representing the mean, σ, 95th percentile and the probability distribution were obtained for the three metrics. Also, Wilcoxon signed-rank tests were conducted for the three evaluation metrics with a significance level of 0.05. For a subset of 8 patients, an RTT with five years of experience in contouring scored the quality of the delineations for all three methods. The delineations were scored from zero to three, corresponding to clinically acceptable contours, contours requiring small modifications, contours requiring large modifications, or clinically unacceptable contours, respectively. In total, the RTT scored 96 delineations. The percentage of each score over all the contours was reported for the three methods and visualised in a pie chart. Also, the most challenging structures (structures with an average score ≥2) were reported for each method.
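To make the three metrics concrete, the sketch below computes the volumetric DSC from two binary masks, and HD95 and MSD from two sets of surface points given in millimetres. It is a brute-force NumPy illustration under stated assumptions: the function names are hypothetical, surfaces are assumed to be already extracted as point coordinates, and the symmetric definitions chosen here (maximum of the two directed 95th percentiles, average of the two directed means) may differ in detail from those used by Plastimatch and the surface-distance package.

```python
import numpy as np


def dice(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Volumetric Dice similarity coefficient of two binary masks."""
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return float(2.0 * np.logical_and(a, b).sum() / total)


def directed_distances(points_a: np.ndarray, points_b: np.ndarray) -> np.ndarray:
    """Distance from every point in A to its nearest point in B (brute force)."""
    diff = points_a[:, None, :] - points_b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


def hd95_and_msd(points_a: np.ndarray, points_b: np.ndarray) -> tuple:
    """Symmetric 95% Hausdorff distance and mean surface distance."""
    d_ab = directed_distances(points_a, points_b)
    d_ba = directed_distances(points_b, points_a)
    hd95 = max(np.percentile(d_ab, 95), np.percentile(d_ba, 95))
    msd = 0.5 * (d_ab.mean() + d_ba.mean())
    return float(hd95), float(msd)


# Toy masks: 4 voxels each, 2 voxels in common.
auto = np.zeros((1, 1, 6), dtype=bool)
auto[0, 0, 0:4] = True
clinical = np.zeros((1, 1, 6), dtype=bool)
clinical[0, 0, 2:6] = True
dsc = dice(auto, clinical)

# Toy surfaces: the second is the first shifted by 2 mm.
surface_auto = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
surface_clinical = surface_auto + np.array([0.0, 0.0, 2.0])
hd95, msd = hd95_and_msd(surface_auto, surface_clinical)
```

The brute-force distance matrix scales with the product of the two point counts, so a real implementation would typically rely on distance transforms or spatial trees instead.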

Clinical implementation

After a choice was made among the three automatic approaches, the best performing network was retrained on the first 97 patients that were included up to August 2019. The hyperparameters were identical to the preliminary study. The network was implemented for clinical use complying with the medical device regulation (MDR 2017/745). Quantitative evaluation was performed in terms of DSC, HD95 and MSD for the 53 consecutive patients undergoing MR-only radiotherapy from August 2019 to January 2020. The delineations adopted for clinical use, i.e. delineated by RTTs and approved or re-adjusted by a radiation oncologist, were considered as reference. Also, the surface dice similarity coefficient (SDSC) [48] was calculated to enable comparison with previous work [49]. Besides, the performance of the clinically implemented network was compared with the performance of the same network obtained during the preliminary study.
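The idea behind the surface DSC is to count how much of each surface lies within a tolerance τ of the other surface. The sketch below is a simplified point-based approximation written for illustration only: the function name is hypothetical, and it counts surface points rather than weighting by surface area as the reference definition in [48] does, so it should not be read as that implementation.

```python
import numpy as np


def surface_dice_points(points_a: np.ndarray, points_b: np.ndarray, tau: float) -> float:
    """Fraction of surface points of A and B lying within `tau` mm of the other surface."""
    diff = points_a[:, None, :] - points_b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    a_within = (dist.min(axis=1) <= tau).sum()  # points of A close to B
    b_within = (dist.min(axis=0) <= tau).sum()  # points of B close to A
    return float((a_within + b_within) / (len(points_a) + len(points_b)))


# Toy surfaces in mm (hypothetical coordinates).
surface_auto = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
surface_clinical = np.array([[0.0, 0.0, 0.5], [1.0, 0.0, 3.0]])
sdsc_1mm = surface_dice_points(surface_auto, surface_clinical, tau=1.0)
sdsc_2mm = surface_dice_points(surface_auto, surface_clinical, tau=2.0)
```

As the tolerance grows, more of the two surfaces is counted as agreeing, which is why the metric is reported as a function of τ.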

Results

Timing performance

The inference time of the network was about 60 s for DeepMedic and approximately 4 s for dV-net using the full resolution images of 328x328x120 voxels on GPU. ADMIRE generated contours in approximately 14 min on GPU.

Preliminary study

Figure 3 represents the violin plots for DSC, HD95 and the MSD. One can observe that performances were higher for both the networks compared to ADMIRE. For the bladder, no significant differences were observed between the networks, but significant differences were observed between the networks and ADMIRE. For the rectum, no significant differences were observed among the three automatic methods. When considering the femurs, DeepMedic outperformed both dV-net and ADMIRE. For example, for the right femur, the mean (±σ) HD95 was 2.2 ±1.4, 2.5 ±1.8 and 3.2 ±1.4 mm for DeepMedic, dV-net and ADMIRE, respectively.

Fig. 3

Violin plots representing the mean (white dot), σ (black vertical rectangle), 95th percentile (black vertical line) and the probability distribution for the dice similarity coefficient (DSC, top), 95% Hausdorff distance (HD95, middle) and mean surface distance (MSD, bottom) for the OARs against clinical contours in the preliminary study. The statistical significance of the Wilcoxon signed-rank test is reported as well as the mean (±σ) of each metric. The asterisks represent p ≤0.05 (∗), p ≤0.01 (∗∗) and p ≤0.001 (∗∗∗)

The qualitative scoring by an expert RTT (Fig. 4) demonstrated that delineations from DeepMedic required fewer adaptations, followed by dV-net and then ADMIRE. Specifically, the expert RTT stated that, for all the structures, the percentage of delineations that were acceptable or needed small adjustments was 81%, 59% and 3% for DeepMedic, dV-net and ADMIRE, respectively. For both the networks, the rectum followed by the bladder were indicated as the most challenging structures, while for ADMIRE, the bladder followed by rectum and femurs (same scoring) were the structures considered as the most challenging (score ≥ 2).

Fig. 4

Pie chart reporting the percentage of the qualitative scoring performed by the expert RTT for each auto-segmentation method

Clinical implementation

On the basis of the preliminary analysis, we decided to implement DeepMedic in our clinic. Clinical implementation was performed in August 2019. The performance of DeepMedic in the preliminary study and after clinical implementation is presented in Table 2. After retraining DeepMedic and testing on the successive patients, the performance slightly improved. For example, it can be observed that, on average, DSC, HD95 and MSD after retraining the network on a more extensive set improved by 0.01-0.03, 1.2-1.4 mm and 0.1-0.4 mm, respectively. Delineations obtained with DeepMedic for a patient in the test set are presented in Fig. 5.

Table 2

Comparison of performance between the preliminary study (PS) and after the clinical implementation (Clinic) for DeepMedic in terms of (volumetric) dice similarity coefficient (DSC), 95% Hausdorff distance (HD95) and mean surface distance (MSD)

Site	DSC		HD₉₅		MSD
	PS	Clinic	PS	Clinic	PS	Clinic
			[mm]	[mm]	[mm]	[mm]
Bladder	0.95 ±0.03	0.96 ±0.02	3.8 ±3.4	2.5 ±1.1	1.0 ±0.6	0.6 ±0.3
Rectum	0.85 ±0.07	0.88 ±0.05	8.3 ±5.0	7.4 ±4.4	2.1 ±1.1	1.7 ±0.8
Femur_L	0.96 ±0.01	0.97 ±0.01	2.2 ±1.4	1.6 ±0.5	0.6 ±0.2	0.5 ±0.1
Femur_R	0.96 ±0.01	0.97 ±0.01	1.9 ±0.4	1.5 ±0.6	0.6 ±0.1	0.5 ±0.1

Fig. 5

Example of in-phase MRI after cropping along with segmentations (OARs) obtained with DeepMedic (contours) versus clinical segmentations (filled contours) in the transverse (left), coronal (centre) and sagittal (right) view for a patient in the test set. For this patient, average performance was obtained in terms of DSC: 0.96, 0.86, 0.97 and 0.97 for bladder, rectum, left and right femur, respectively. Note that DeepMedic also outputs CTV, but it was not considered for clinical evaluation

Also, the SDSC was calculated for several thresholds, τ = 0.5, 1, 1.5, 2 and 3 mm, as reported in Fig. 6. The mean (±σ) SDSC for τ = 2 mm was 0.98 ±0.03, 0.92 ±0.05, 0.989 ±0.008 and 0.997 ±0.003 for bladder, rectum, left and right femur, respectively.

Fig. 6

Boxplots for each structure of surface Dice similarity coefficient (SDSC) as a function of threshold (τ) for the 53 patients after clinical implementation. The data is plotted for the range of τ from sub-pixel (0.5 mm) to above the voxel size (3 mm). Box plots are shown with an inter-quartile range from 25 to 75% with the horizontal line representing the mean value. Lower and upper whiskers represent the 2.5 and 97.5 percentiles

Discussion

The use of MRI for prostate radiotherapy delineation is becoming increasingly common among radiotherapy departments [50]. MRI are used to plan radiotherapy [32, 51]. Besides, use of MRI is also accelerated by the adoption of new advancements in linear accelerator technology, whereby daily MR imaging in treatment position is possible [9-11]. In this study, we demonstrated that deep learning-based approaches can utilise MRI to automatically segment OARs achieving high conformality. Also, a convolutional network has been implemented for clinical use, demonstrating the capability to maintain the performances obtained in a preliminary study. Table 3 compiles previous work based on the use of convolutional networks and a selection of conventional approaches [16, 17, 52] for OARs delineation in the pelvic area. One can notice that CT-based segmentation [53-55] achieved mean DSC in the range 0.88-0.95 for prostate, rectum and bladder. Also, MRI-based segmentation [27, 49, 56] achieved mean DSC in the range 0.82-0.95. This study seems to outperform previous studies in almost all the metrics (in bold in the Table) except for the rectum, as obtained by Kazemifar et al. [54] and the HD95 and MSD as obtained by Kazemifar et al. and Dong et al. [56]. Comparing the results of automated contouring methods should be done with caution. For example, the guidelines used for clinical delineation may be different, and the impact of inter-observer variability on deep learning-based methods is not generally investigated [57]. In this sense, our study is novel given that a comparison of approaches based on CNNs to an atlas-based method is presented.

Table 3

Study	Pts	Modality	Method(s)	Bladder	Rectum	Femur_L	Femur_R
				DSC	DSC	DSC	DSC
				HD₉₅	HD₉₅	HD₉₅	HD₉₅
				MSD	MSD	MSD	MSD
Convolutional network-based
Men2017 [53]	218/60 ^∗	CT	2D	0.92		0.93	0.92
			dilated
			VGG-16
Feng2018 [27]	30/10 ^∗	MRI	Multi-task	0.952 ±0.007	0.88 ±0.03
			residual
			2D FCN
Kazemifar2018 [54]	51/9/20 ^∗	CT	2D	0.95 ±0.04	0.92±0.06
			U-net	0.4±0.6	0.2±0.3
				1.1 ±0.8^a	0.8±0.6^a
Balagopal2018 [55]	108/28	CT	2D U-net	0.95 ±0.02	0.84 ±0.04	0.96 ±0.03	0.95 ±0.01
	mean		+ 3D U-net	17.0 ±14.6	4.9 ±3.9
	4 models		(ResNeXT)	0.5 ±0.7	0.8 ±0.7
Dong2019 [56]	140x5 ⁺	MRI	3D Cycle-GAN	0.95 ±0.03	0.89 ±0.04
			+ deep attention	6.81 ±9.25	10.84 ±15.59
			U-net	0.52±0.22	0.92 ±1.03
Elguindi2019 [49]	40/10/50	MRI		0.93 ±0.04	0.82 ±0.05
			DeepLabV3+
				0.92 ±0.1^b	0.87 ±0.07^b
This study	97/53 ^∗	MRI	3D	0.96±0.02	0.88 ±0.05	0.97±0.01	0.97±0.01
			multi-scale	2.5 ±1.1	7.4 ±4.4	1.6±0.5	1.5±0.5
			DeepMedic	0.6±0.3	1.7 ±0.8	0.5±0.1	0.5±0.1
				0.98 ±0.03^c	0.92 ±0.05^c	0.989±0.008^c	0.997±0.003^c
Conventional
LaMacchia2012 [16]	5	CT	ABAS 2.0	0.93 ±0.03	0.77 ±0.07	0.94 ±0.04	0.94 ±0.04
			VelocityAI 2.6.2	0.72 ±0.15	0.75 ±0.04	0.92 ±0.02	0.92 ±0.03
			MIM 5.1.1	0.93 ±0.02	0.87 ±0.05	0.94 ±0.02	0.94 ±0.01
Dowling2015 [17]	39	MRI	multi-atlas	0.86 ±0.12	0.84 ±0.06	0.91 ±0.03
			voting
			diffeomorphic reg	5.1 ±4.6	2.4 ±1.0	1.5 ±0.5
Delpon2016 [52]	10/10 ^∗	CT	Mirada	0.76 ±0.12	0.73 ±0.07	0.89 ±0.05	0.91 ±0.03
				15 ±9	10 ±3	0.2 ±6.4	8.1 ±5.6
			MIM	0.80 ±0.14	0.75 ±0.07	0.89 ±0.08	0.92 ±0.02
				14.0 ±6.3	9.9 ±3.4	9.9 ±7.9	8.2 ±5.3
			ABAS	0.81 ±0.13	0.75 ±0.09	0.91 ±0.04	0.92 ±0.02
				13.6 ±7.9	9.9 ±4.4	8.6 ±6.9	8.5 ±6.1
			SPICE	0.76 ±0.26	0.68 ±0.12	0.70 ±0.05	0.72 ±0.03
				9.2 ±11.7	13 ±5	29.7 ±9.0	30 ±6.5
			Raystation	0.59 ±0.15	0.49 ±0.12	0.91 ±0.03	0.92 ±0.02
				28.5 ±13.1	16.5 ±3.7	8.8 ±7.2	6.4 ±5.0

∗ training/(validation)/test; + indicating x... cross-fold validation; a mean surface Hausdorff distance; b, c surface dice similarity coefficient as in [48] with τ=3 or 2 mm, respectively

Overview of the performance of automatic OARs delineations based on MRI and CT subdivided in convolutional network-based and conventional approaches. The number of patients included in the study (Pts), the imaging modality, a brief description of the method and metrics as dice similarity coefficient (DSC), 95% boundary Hausdorff distance (HD95) and mean surface distance (MSD) were reported for each study. HD95 and MSD are expressed in mm

In this study, a qualitative assessment by a human observer has been presented. Unfortunately, it was not recorded whether the overall delineation time was reduced. Previous studies investigated this aspect [58] when introducing deep learning-based techniques in their clinic. Also, it is unclear whether the performance of the network may further improve when a dataset larger than 97 patients is used for training. This may be the subject of future research. The time necessary for automatic delineation on the full FOV is within a minute. Such a time scale can be of interest for both conventional radiotherapy and MR-guided treatments. On the one hand, for conventional radiotherapy, fast automatic OAR segmentation may help reduce delays in the start of treatment, which may otherwise lead to hampered clinical outcomes [59]. On the other hand, for online adaptive MR-guided radiotherapy, fast OAR segmentation may relieve clinicians from dedicating effort to OAR segmentation while facilitating the delineation of the target [60]. Currently, it has been reported that about 5-10 min is necessary for delineation in an online setting [19]. The time frame reported in our work may facilitate online adaptive radiotherapy, especially with an integrated automatic workflow.

Conclusion

High conformality for OARs delineation was achieved with two in-house trained networks, obtaining a significant speed-up of the delineation procedure. One of the networks, DeepMedic, was successfully adopted in the clinical workflow, maintaining in the clinical setting the accuracy obtained in the feasibility study conducted before clinical implementation.

Additional file 1 Configuration files for the network architectures. As part of the supplementary material, it is possible to download the configuration files (ConfigFiles.zip) for DeepMedic (DeepMedic_model.cfg, DeepMedic_inference.cfg, DeepMedic_train.cfg) and dV-net (NiftyNet_train.ini). The three configuration files for DeepMedic contain the information regarding the model (DeepMedic_model.cfg), training (DeepMedic_train.cfg) and inference (DeepMedic_inference.cfg) of the network.
