Purpose: Various deformable image registration (DIR) methods have been used to evaluate organ deformations in 4-dimensional computed tomography (4D CT) images scanned during the respiratory motions of a patient. This study assesses the performance of 10 DIR algorithms using 4D CT images of 5 patients with fiducial markers (FMs) implanted during the postoperative radiosurgery of multiple lung metastases. Methods: To evaluate DIR algorithms, 4D CT images of 5 patients were used, and ground-truths of FMs and tumors were generated by physicians based on their medical expertise. The positions of FMs and tumors in each 4D CT phase image were determined using 10 DIR algorithms, and the deformed results were compared with ground-truth data. Results: The target registration errors (TREs) between the FM positions estimated by optical flow algorithms and the ground-truth ranged from 1.82 ± 1.05 to 1.98 ± 1.17 mm, which is within the uncertainty of the ground-truth position. Two algorithm groups, namely, optical flow and demons, were used to estimate tumor positions with TREs ranging from 1.29 ± 1.21 to 1.78 ± 1.75 mm. With respect to the deformed position for tumors, for the 2 DIR algorithm groups, the maximum differences of the deformed positions for gross tumor volume tracking were approximately 4.55 to 7.55 times higher than the mean differences. Errors caused by the aforementioned difference in the Hounsfield unit values were also observed. Conclusions: We quantitatively evaluated 10 DIR algorithms using 4D CT images of 5 patients and compared the results with ground-truth data. The optical flow algorithms showed reasonable FM-tracking results in patient 4D CT images. The iterative optical flow method delivered the best performance in this study. With respect to the tumor volume, the optical flow and demons algorithms delivered the best performance.
Purpose: Various deformable image registration (DIR) methods have been used to evaluate organ deformations in 4-dimensional computed tomography (4D CT) images scanned during the respiratory motions of a patient. This study assesses the performance of 10 DIR algorithms using 4D CT images of 5 patients with fiducial markers (FMs) implanted during the postoperative radiosurgery of multiple lung metastases. Methods: To evaluate DIR algorithms, 4D CT images of 5 patients were used, and ground-truths of FMs and tumors were generated by physicians based on their medical expertise. The positions of FMs and tumors in each 4D CT phase image were determined using 10 DIR algorithms, and the deformed results were compared with ground-truth data. Results: The target registration errors (TREs) between the FM positions estimated by optical flow algorithms and the ground-truth ranged from 1.82 ± 1.05 to 1.98 ± 1.17 mm, which is within the uncertainty of the ground-truth position. Two algorithm groups, namely, optical flow and demons, were used to estimate tumor positions with TREs ranging from 1.29 ± 1.21 to 1.78 ± 1.75 mm. With respect to the deformed position for tumors, for the 2 DIR algorithm groups, the maximum differences of the deformed positions for gross tumor volume tracking were approximately 4.55 to 7.55 times higher than the mean differences. Errors caused by the aforementioned difference in the Hounsfield unit values were also observed. Conclusions: We quantitatively evaluated 10 DIR algorithms using 4D CT images of 5 patients and compared the results with ground-truth data. The optical flow algorithms showed reasonable FM-tracking results in patient 4D CT images. The iterative optical flow method delivered the best performance in this study. With respect to the tumor volume, the optical flow and demons algorithms delivered the best performance.
Intensity-modulated radiation therapy (IMRT) has become the standard technique in
radiation therapy as an accurate and reliable dose delivery method for treating
tumors, while sparing normal tissues from being affected by radiation.[1-4] While implementing IMRT, it is
desirable to define the tumor volume as small as possible, but tumor movements due
to respiratory motions disturb the clear delineation of the tumor volume.[5,6]In this case, 4-dimensional computed tomography (4D CT) images are an effective
option to provide information on the trajectories of tumors during respiratory motion,
and the real-time tumor tracking method based on a 4D CT-based plan is one of
the most effective ways to deliver a high dose to a focal area while accounting for
the respiratory motion.
Tumor tracking techniques mostly require the implant of fiducial markers
(FMs) around the tumor as surrogates for the tumor position. However, in patients
with small lung cancers (eg, oligometastases
), the implant of FMs before radiation treatment is clinically unacceptable
due to the risk of pneumothorax up to 50%.Pop et al
suggested a new method, that is, peri-operative fiducial implant, for
postoperative CyberKnife® (CyberKnife SRS, Accuray, Inc.) radiosurgery for the
tumor-tracking treatment of lung tumor patients. For potential risks and local
recurrences, they implanted preventive FMs near tumors during surgery, and the FMs
were utilized for complementary radiation treatments. However, these methods can
increase tumor localization errors.[12-14] Therefore, to use the tumor
tracking method with a limited number of FMs, uncertainties related to motion
discrepancies between FMs and tumors should be quantified to be considered in the
treatment planning.In this field, a deformable image registration (DIR) algorithm is one of the feasible
solutions for finding a correlation between the tumor positions and FMs in 4D CT
images.[15-17] DIR algorithms have evolved over the last few decades and have
been utilized to achieve specialized objectives, such as matching of images
generated by homogeneous/heterogeneous modalities and dose accumulation, considering
different motion/posture phases.[18,19] Moreover, DIR algorithms have
been frequently utilized for the assessment of respiratory motion based on 4D CT
images.[20-24] Several literature reports have also presented the accuracies
of DIR algorithms in different cases, including the occurrence of high- and
low-contrast regions[25-36] and multiorgan deformation.[37,38-40] Based on this information,
various DIR algorithms have been implemented in commercial oncology software and
treatment planning systems (eg, MIM [MIM Software Inc.] and RayStation [RaySearch
Laboratories]) and are being used for practical purposes.Yeo et al
quantitatively investigated the accuracy of various DIR algorithms in terms
of the displacement of FMs and tumors in the low-contrast regions of images. They
also evaluated the deformation performance of DIR algorithms using a deformable
phantom. Based on the literature,
the DIR methods have been already used clinically for lung cases.
Unfortunately, these DIR algorithms were not quantitatively evaluated using the CT
data of patients who received perioperative FM implant in the lungs to treat
multiple lung metastases using postoperative radiosurgery.
Accordingly, evaluating the trajectory correlation between FMs and tumors in
tumor-tracking radiation therapy is necessary using implanted FMs. Thus, in this
study, we evaluated the tracking performance of 10 DIR algorithms for FMs and tumors
using 4D CT images of 5 patients with multiple lung metastases.
Materials and Methods
Patients and 4D CT Images
This study was approved by the institutional review board of the Severance
Hospital (4-2020-0311), and all methods were performed under relevant guidelines
and regulations. The requirement for informed consent waived with the approval
of the ethics committee given that patient anonymity was ensured.The 4D CT images of 5 consecutive patients with intraoperative FM implant were
retrospectively evaluated in this study. Patient characteristics are summarized
in Table 1. The
number of multiple metastatic tumor lesions (gross tumor volume) in the lungs
varied among the patients (range: 1-10). Four to six FMs were implanted for each
patient during the surgery performed prior to radiation therapy. A maximum of 3
FMs were implanted in each of the right and left lobes. In general, one FM was
implanted into each of the upper, middle, and lower lobes of the lung. The
number of implanted FMs was determined during surgery by the surgeon considering
the number and location of the tumor and patient factors. Figure 1 shows a schematic illustration
and representative CT image with 6 FMs.
(a) Schematic and (b) patient CT images of perioperative fiducial implant
for postoperative radiosurgery in the lung.
(a) Schematic and (b) patient CT images of perioperative fiducial implant
for postoperative radiosurgery in the lung.Summary of Patient Characteristics.Abbreviations: FM, fiducial markers; GTV, gross tumor volume.Patients were set up in the head-first supine position and immobilized using a
vacuum BodyFix® (BodyFix, Elekta AB) and Kneefix™ (CIVCO, Civco Medical
Solutions). For respiratory motion control, all 4D CT simulations were scanned
with free breathing and continuous positive airway pressure (C-PAP). The C-PAP
levels were set to 10 to 15 cmH2O depending on the breathing
condition of the patient. Under the consideration of clinical practice, the
pixel resolution of CT images was approximately 1.0 × 1.0 mm2, and
the slice thickness of each image was fixed at 3.0 mm. All 4D CT images were
acquired with a 16-slice scanner (Aquilion LB, Canon Medical Systems) using a
respiratory monitoring system (Anzai, Anzai Medical) and retrospectively sorted
(reconstructed) into 10-phase bins within the respiratory cycle. Table 1 provides
information regarding the number of FMs and the tumors (or gross tumor volumes
[GTVs]).
Generation of the Ground-Truth From 4D CT
For the quantitative evaluation of the DIR algorithms, we compared the deformed
position and its ground-truth position.
First, in this study, the end exhalation (T50) phase was set to the
reference phase of 4D CT, and the ground-truth positions for tracking FMs and
GTVs in 4D CT were generated from the reference phase to others (ie, T60, …, T0,
…, T30, T40). Regarding the ground-truth, the trajectory of an FM was generated
by following the position of the voxel that had the maximum Hounsfield unit (HU)
value near the FM position of the previous phase, considering the smaller size
of the FM as compared to the voxel resolution. A contour of GTVs was defined on
each phase of the 4D CT by physicians based on their medical expertise, and
subsequently, the trajectory of each GTV was estimated according to the center
of mass of the tumor volumes (average volume of GTV <0.5 mL). The final
contours used in the analysis were reviewed by the same experienced physician to
avoid inter-physician bias. The uncertainty of the ground-truth position was
within 1.00 × 1.00 × 3.00 mm3, which is as much as the voxel
resolution used in this study.
Assessment of DIR Algorithms
This study aims to evaluate whether the previous phantom-based research could be
applied to a real patient image in clinical practice. For this, a total of 10
DIR algorithms were investigated, selected by considering the results obtained
by Yeo et al
: 4 optical flow algorithms,
3 demon algorithms,
2 level-set motion algorithms, and 1 free-form deformation algorithm.
Based on the results reported in the literature,
an algorithm in which FMs would not be identifiable (ie, b-spline) and
the worst demons algorithm producing significantly incorrect deformation results
(ie, double force demons) were excluded in this study. Table 2 lists the DIR algorithms
investigated in this study.
Table 2.
List of the Deformable Image Registration (DIR) Algorithms Investigated
in This Study.
Index
DIR Algorithm
A
Original Horn and Schunck45
B
Combined Lucas Kanade and Horn and Schunck46
C
Inverse consistent Horn and Schunck47
D
Iterative optical flow method43
E
Fast demons34
F
Modified demons48
G
Original demons44
H
Original level set motion49
I
Affine approximation of level set motion49
J
Free-form deformation method50
List of the Deformable Image Registration (DIR) Algorithms Investigated
in This Study.All DIR algorithms listed in Table 2 were implemented by the
Deformable Image Registration and Adaptive Radiotherapy (DIRART) software.
Four multigrid stages were used (n = 1 to 4) with
10n iterations per pass to establish the optimum DIR performance.
The DIRART software has been widely used in research and clinical areas
with respect to image registration study to this day.[36,52-54] The DIRs were calculated
without any postprocessing methods, such as smoothing, and the results were
extracted as an array type, including deformation vector fields (DVFs) generated
by DIR for the calculation of voxel movements. For performing DIR, the reference
phase (ie, T50) and other phases (T60, T70, …, T30, and T40) were selected as a
moving image and fixed images, respectively. The entire field of view of the CT
image was selected as a region of interest of DIR.Each deformed position of the FMs and GTVs was collected by direction vectors
extracted from the DVF corresponding to the index of the maximum HU value and
the center of mass of the GTV in the reference phase, respectively.
Subsequently, for the evaluation of the DIR performance, the results obtained by
the 10 DIR algorithms were compared, in terms of the target registration error
(TRE), with the ground-truth generated, as detailed in the section “Generation
of the Ground-Truth From 4D CT”.
Results
Assessment of the Deformed Images According to the DIR Algorithms
Figure 2 presents an
example of deformed images according to each of the 10 DIR algorithms (top) and
their difference maps compared with the original fixed image (bottom).
Qualitatively, our results indicate that the images deformed by optical flow
algorithms (Algorithms A–D) and demons algorithms (Algorithms E–G) were
reasonably well deformed. The images deformed by other algorithms were slightly
(Algorithms H) or fairly (Algorithms I–J) different from the original fixed
image (Figure 2).
Figure 2.
Example of deformed images according to 10 deformable image registration
(DIR) algorithms (top) and their difference maps compared with the
original fixed image (bottom). The DIR algorithms are listed in Table 2.
Example of deformed images according to 10 deformable image registration
(DIR) algorithms (top) and their difference maps compared with the
original fixed image (bottom). The DIR algorithms are listed in Table 2.
Assessment of the Deformed Position for FMs
Figure 3 details the
TREs of the X, Y, Z, and 3D displacements between the ground-truth and estimated
positions obtained using the DIR algorithms for FM movements. In the figure, the
optical flow methods (Algorithms A–D) delivered better tendency regardless of
the position of FMs, and the mean and standard deviations of TREs were in the
range of 1.82 ± 1.05 mm (best performance, Algorithm D) to 1.98 ± 1.17 mm (worst
performance, Algorithm B). These errors were within the uncertainty of the
ground-truth. With respect to the demons algorithms (Algorithms E–G), the TREs
were slightly higher (2.23 ± 2.78 [Algorithm F] to 3.41 ± 4.42 mm [Algorithm G])
than those of the optical flow algorithms. In other DIR algorithms (Algorithms
H–J), considerable differences were observed between the ground-truth and
deformed positions, varying from 5.43 ± 7.82 mm (Algorithm I) to 7.11 ± 9.76 mm
(Algorithm J).
Figure 3.
Comparison of the target registration errors (TREs) between the
ground-truth and the DIR results for fiducial marker (FM) movements: (a)
X-axis, (b) Y-axis, (c)
Z-axis, and (d) distance in the 3D coordinate
space. The DIR results were obtained from 9 images deformed from the
reference phases (ie, T50) of 5 patients, and outliers over the 14 mm
range were not visualized.
Comparison of the target registration errors (TREs) between the
ground-truth and the DIR results for fiducial marker (FM) movements: (a)
X-axis, (b) Y-axis, (c)
Z-axis, and (d) distance in the 3D coordinate
space. The DIR results were obtained from 9 images deformed from the
reference phases (ie, T50) of 5 patients, and outliers over the 14 mm
range were not visualized.
Assessment of the Deformed Position for Tumors
Figure 4 presents the
TREs of the X, Y, Z, and 3D displacements between the ground-truth and estimated
positions obtained using the DIR algorithms for GTV movements. Figure 4 indicates the
reasonable results obtained by the optical flow methods (Algorithms A–D). These
results are comparable to the previous FM results, where the mean and standard
deviations of TREs ranged from 1.29 ± 1.21 mm (best performance, Algorithm D) to
1.64 ± 1.22 mm (worst performance, Algorithm B). These deformation errors were
within the uncertainty of the ground-truth. With respect to the demons
algorithms (Algorithms E–G), the TREs showed better tendency than before,
ranging from 1.32 ± 1.22 mm (Algorithm E) to 1.78 ± 1.75 mm (Algorithm G)),
which has a similar level of error to that in the case of the optical flow
methods. In other cases (ie, Algorithms H–J), the TREs were slightly lower than
those of the previous FM results but still higher than those of the optical flow
and demons algorithms, varying from 3.07 ± 2.16 mm (Algorithm I) to
5.07 ± 9.33 mm (Algorithm J).
Figure 4.
Comparison of the target registration errors (TREs) between the
ground-truth and the DIR results for GTV movements: (a)
X-axis, (b) Y-axis, (c)
Z-axis, and (d) distance in the 3D coordinate
space. The DIR results were obtained from 9 images deformed from the
reference phases (ie, T50) of 5 patients, and outliers over the 14 mm
range were not visualized.
Comparison of the target registration errors (TREs) between the
ground-truth and the DIR results for GTV movements: (a)
X-axis, (b) Y-axis, (c)
Z-axis, and (d) distance in the 3D coordinate
space. The DIR results were obtained from 9 images deformed from the
reference phases (ie, T50) of 5 patients, and outliers over the 14 mm
range were not visualized.
Discussion
The deformed images obtained by the 10 DIR algorithms exhibited the same trends in
all patient images, and our results show that the deformed positions obtained using
Algorithms A–F were well estimated within the uncertainty of the tested CT
resolution. Here, Algorithm D showed the best performance. Our results were similar
to those observed in a prior study on phantom images conducted by Yeo et al.
Nevertheless, a few outliers were detected in real patient cases.With respect to the deformed position of FMs, in the 2 DIR algorithm groups, that is,
optical flow and demons, the maximum TREs were observed owing to the large
discrepancy between the HU values in fixed and moving images. Figure 5 presents an example of outlier
results deformed by the original Horn and Schunck optical flow algorithm (Algorithm
A). As illustrated in Figure
5a and b, the HU values of the FM included in the fixed image (ie, T30)
were relatively lower than those of the moving image (ie, T50), and this HU
discrepancy caused the DIR error. In this case, the voxels had to move somewhere,
not the ground-truth position, and consequently, TREs would be higher than those in
general situations. In Figure
5, the maximum TREs were 2.79 to 4.05 times higher than the mean TREs for
all optical flow algorithms. Particularly, when outlier cases that had unclear FMs
in 4D CT images were excluded in these samples, the mean, standard deviation, and
maximum TREs of the results deformed by the iterative optical flow method were
extremely better than before, with values of 1.07, 0.60, and 2.92 mm, respectively.
In the clinical approach, FMs may not appear (or appear faintly) in 4D CT images,
depending on the status of the 4D CT machine and its configuration (eg, slice
thickness and slice increment). This condition would be overcome using an
interpolation method between 2 well-deformed results.
Figure 5.
Sample outlier results deformed by the original Horn and Schunck optical flow
method (Algorithm A). (a) T50, (b) T30, (c) deformed T50, and (d) the
difference between (c) and (b). The red contour in (a) shows a fiducial
marker (FM), but the FM is unclear in (b).
Sample outlier results deformed by the original Horn and Schunck optical flow
method (Algorithm A). (a) T50, (b) T30, (c) deformed T50, and (d) the
difference between (c) and (b). The red contour in (a) shows a fiducial
marker (FM), but the FM is unclear in (b).Figure 6 presents an example
of another outlier result deformed by the fast demons (Algorithm E) algorithm. Even
if FMs were well and clearly defined in moving (Figure 6a) and fixed (Figure 6b) images, under the demons
algorithms, the deformation of the FM positions included in the deformed T50 (Figure 6c) was superior to
that of the fixed image.
Figure 6.
Sample outlier results deformed by the fast demons algorithm (Algorithm E).
(a) T50, (b) T90, (c) deformed T50, and (d) the difference between (c) and
(b). The red contours in (a)–(c) show the different positions of an
identical fiducial marker (FM) inserted in the patient.
Sample outlier results deformed by the fast demons algorithm (Algorithm E).
(a) T50, (b) T90, (c) deformed T50, and (d) the difference between (c) and
(b). The red contours in (a)–(c) show the different positions of an
identical fiducial marker (FM) inserted in the patient.This trend in the deformation errors did not exhibit any significant difference, in
comparison with the results obtained via DIR in phantoms reported by Yeo et al.
Hence, the limitations of the demons algorithms described in the literature
apply not only to phantom images but also to patient CT images.Considering the deformed positions for tumors, in the 2 DIR algorithm groups, that
is, optical flow and demons, the maximum differences of the deformed positions for
GTV tracking were approximately 4.55 to 7.55 times higher than the mean differences.
Here, errors caused by the aforementioned difference in the HU values were also
observed. As shown in Figure
7, when GTV was located at the diaphragm level, it was not clearly
distinguished from adjacent normal tissues (ie, liver, stomach, bowel, and spleen),
resulting in an error in GTV tracking. Consequently, even if the GTV position could
be estimated by the DIR algorithms, the physician should carefully and manually
check the position of the GTV in each phase.
Figure 7.
Sample case of DIR errors with similar Hounsfield unit (HU) values between
the gross tumor volume (GTV) and nearby normal tissues. In this figure,
slices of 4-dimensional computed tomography (4D CT) images for (a) T50 and
(b) T0 show an indistinguishable GTV. Hence, DIR could not clearly define
the GTV deformation in the (c) deformed T50; (d) shows the difference
between (c) and (b) images. Contour lines in (a) and (b) represent the GTV
region, and red and green lines in (c) represent the deformed GTV and GTV at
T0, respectively.
Sample case of DIR errors with similar Hounsfield unit (HU) values between
the gross tumor volume (GTV) and nearby normal tissues. In this figure,
slices of 4-dimensional computed tomography (4D CT) images for (a) T50 and
(b) T0 show an indistinguishable GTV. Hence, DIR could not clearly define
the GTV deformation in the (c) deformed T50; (d) shows the difference
between (c) and (b) images. Contour lines in (a) and (b) represent the GTV
region, and red and green lines in (c) represent the deformed GTV and GTV at
T0, respectively.The limitation of this study is that the DIR algorithms were evaluated only by using
a limited number of patients’ 4D CT images (ie, 5 4D-CT datasets). Nevertheless, the
trend of the deformed results of more than 20 FMs and GTVs by the DIR algorithms was
virtually similar to those of the previous phantom-based study.
Another limitation is that the DIR algorithm used in this study was not
evaluated at different operational parameters. The parameters used in the detailed
options of the DIR process were similar with those suggested by Yeo et al.
Hence, the limitations due to the parameters may be the same as those in the
previous literature. Properly establishing the best parameters can improve the
performance of the algorithm. Other parameters might have a possibility to bring
better results than our results.
Conclusions
In this study, we quantitatively evaluated 10 common DIR algorithms in real 4D CT
images of patients. The results for 2 categories, namely, FM and GTV, were compared
with the ground-truth positions. Based on our results, the optical flow algorithms
yielded reasonable results for FMs and GTVs, and the iterative optical flow
algorithm yielded the best performance. The mean and standard deviations of TREs
were 1.82 ± 1.05 mm for the FM position and 1.29 ± 1.21 mm for the GTV position. The
estimations all the GTV positions had some limitations. The demons algorithms
exhibited large discrepancies between the deformed and fixed images with respect to
FMs, and the other 2 algorithms (level-set motion and free-form deformation) had the
worst results. Accordingly, we conclude that FM positions could be estimated using
the best DIR algorithms, such as the iterative optical flow method, within the
uncertainties of the CT resolution, with respect to the tumor position. However, the
DIR algorithms could not be used to estimate several tumor positions tested in this
study. Hence, deformed positions for a tumor might need to be manually checked by
physicians. We expect that this limitation would be overcome in the near future with
the use of advanced methods, such as AI-based techniques.
Authors: Julia A Schnabel; Christine Tanner; Andy D Castellano-Smith; Andreas Degenhard; Martin O Leach; D Rodney Hose; Derek L G Hill; David J Hawkes Journal: IEEE Trans Med Imaging Date: 2003-02 Impact factor: 10.048
Authors: Weiguo Lu; Gustavo H Olivera; Quan Chen; Kenneth J Ruchala; Jason Haimerl; Sanford L Meeks; Katja M Langen; Patrick A Kupelian Journal: Phys Med Biol Date: 2006-08-15 Impact factor: 3.609
Authors: Paul J Keall; Gig S Mageras; James M Balter; Richard S Emery; Kenneth M Forster; Steve B Jiang; Jeffrey M Kapatoes; Daniel A Low; Martin J Murphy; Brad R Murray; Chester R Ramsey; Marcel B Van Herk; S Sastry Vedam; John W Wong; Ellen Yorke Journal: Med Phys Date: 2006-10 Impact factor: 4.071
Authors: B T Thomas Yeo; Mert R Sabuncu; Tom Vercauteren; Nicholas Ayache; Bruce Fischl; Polina Golland Journal: IEEE Trans Med Imaging Date: 2009-08-25 Impact factor: 10.048