Jie Zhang1, Xiaolin Huang1, Yuxiaotong Shen1, Ying Chen1, Jing Cai2, Yun Ge1. 1. 1 School of Electronic Science and Engineering, Nanjing University, Nanjing, Jiangsu Province, China. 2. 2 Department of Radiotherapy, Nantong Tumor Hospital, Nantong, Jiangsu Province, China.
Abstract
PURPOSE: This work proposed a nearest neighbor estimation method to track the respiration-induced tumor motion. METHODS: Based on the simultaneously collected motion traces of external surrogate and internal target during the modeling phase prior to treatment, we first obtain the nearest neighbors of the current surrogate in external space. Subsequently, the concurrent targets in internal space are determined and used to estimate the current target position. The method was validated on 71 cases that were from 3 open access databases. In addition, to evaluate the method's estimation and prediction accuracy, we compared the method with other works. RESULTS: Except for 2 cases, the nearest neighbor estimation achieved the root-mean-square error of <3 mm. The comparison indicated that the method had better estimation accuracy than polynomial model and good prediction performance. DISCUSSION: The 2 exceptive cases were further analyzed for failure causes. We inferred that one was because of the lack of estimating new target in our method, and the other one was because of the mistake during data collection. Accordingly, the potential solutions were suggested. Besides, the method's estimation for surrogate outliers, effects of modeling length, calibration, and extension were discussed. CONCLUSION: The results demonstrated nearest neighbor estimation's effectiveness. Except for this, the method imposes no restrictions on the modality of the pretreatment target images and does not assume a specific correspondence function between the surrogate and the target. With only 1 critical parameter, this nearest neighbor estimation method is easy to implement in clinical setting and thus has potential for broad applications.
PURPOSE: This work proposed a nearest neighbor estimation method to track the respiration-induced tumor motion. METHODS: Based on the simultaneously collected motion traces of external surrogate and internal target during the modeling phase prior to treatment, we first obtain the nearest neighbors of the current surrogate in external space. Subsequently, the concurrent targets in internal space are determined and used to estimate the current target position. The method was validated on 71 cases that were from 3 open access databases. In addition, to evaluate the method's estimation and prediction accuracy, we compared the method with other works. RESULTS: Except for 2 cases, the nearest neighbor estimation achieved the root-mean-square error of <3 mm. The comparison indicated that the method had better estimation accuracy than polynomial model and good prediction performance. DISCUSSION: The 2 exceptive cases were further analyzed for failure causes. We inferred that one was because of the lack of estimating new target in our method, and the other one was because of the mistake during data collection. Accordingly, the potential solutions were suggested. Besides, the method's estimation for surrogate outliers, effects of modeling length, calibration, and extension were discussed. CONCLUSION: The results demonstrated nearest neighbor estimation's effectiveness. Except for this, the method imposes no restrictions on the modality of the pretreatment target images and does not assume a specific correspondence function between the surrogate and the target. With only 1 critical parameter, this nearest neighbor estimation method is easy to implement in clinical setting and thus has potential for broad applications.
By delivering a targeted high dose of radioactive rays, radiotherapy kills malignant cells (eg, tumors) while sparing the surrounding health tissues. Therefore, accurate target location is of great importance in radiotherapy. Although modern medical imaging technologies, for example, computed tomography (CT), are used to assist physicians to locate the target prior to treatment, real-time target tracing images during the treatment are typically unavailable. In these cases, the motion induced by patients’ spontaneous respiration during treatment will adversely affect therapy accuracy and efficiency, particularly when the target locates in the thorax or abdomen. As previously reported, the scale of induced motion typically exceeds 20 mm,[1,2] which can significantly deteriorate the target coverage and increase the radiation damage to surrounding normal structures.To address this problem, respiratory motion is estimated and compensated in real time during treatment. Studies concerning the respiratory motion model have been conducted in the past years.[3-10] Conventionally, such respiratory motion models comprise data from 2 spaces, that is, the internal target space and the external surrogate space. In the modeling phase prior to treatment, 2 types of data are simultaneously collected and the correlation between them is modeled, whereas in the estimating phase, only external surrogate data are collected, and the internal motion of the target is estimated based on the surrogate combined with the model.According to the correspondence between the target and the surrogate, the model can be classified as direct or indirect: In direct models, the location of internal targets is estimated as an explicit polynomial,[11-17] B-spline function[11,18-24] or derivative of the concurrent surrogate.[14,25-27] This type of model is simple and easy to implement. However, its estimation accuracy greatly depends on the selected explicit function of correlation/correspondence. Moreover, the underlying assumption of this type models that the correlation between the internal and external data can be determinately and constantly defined by the selected function is not necessarily valid. Cyberknife System (Accuray Inc, Sunnyvale, California) introduced by Schweikard et al
[28] is a typical application of direct models. In this system, an infrared tracking system combined with external markers, that is, infrared emitters, is used to obtain the external surrogate data, and a stereo X-ray camera system combined with internal markers of gold fiducials are used to obtain the internal target data. The correlation between the target and the surrogate is fitted as a linear, quadratic, or constrained fourth-order polynomial. According to Hoogeman et al,[16] the Synchrony Respiratory Tracking System (a subsystem of the Cyberknife robotic treatment device) has accuracy ranging from 0.2 to 1.9 mm in the superior–inferior direction, 0.1 to 0.9 mm in the left–right direction, and 0.2 to 2.5 mm in the anterior–posterior direction. However, to keep up with variations in the respiratory pattern, the system needs updating parameters of correspondence function during treatment, and thus introduces nontherapeutical radiation that may be detrimental to patients.Indirect correlation models[3,25,29,30] do not directly relate the internal target to the external surrogate through an explicit function. Instead, the motions of both the internal target and the external surrogate are correlated with a set of parameters optimized by maximizing the match between the estimated surrogate and the real surrogate.[25] The indirect model is more robust as long as the modeling phase takes into account sufficient information concerning breathing cycles,[31] and its estimation accuracy is comparable to that of the direct model when the selected parameters construct a complete description of the motion. For example, King et al
[29] tracked liver motion using a surface-based statistical shape model. By registering information from preoperative CT or magnetic resonance images (MRIs) to physical space, which was intraoperatively acquired using a 3-dimensional (3D) ultrasound (US) system, this model resulted in an approximately 5-mm root-mean-square error (RMSE). Zhang et al further considered the deformable behavior of the entire region of interest (ROI). Based on principle component analysis combined with the motions of sparser surrogates, that is, fiducial markers or the diaphragm, they proposed to reconstruct the motion field embracing all voxels in ROI and estimate 3D deformable motions. It was reported that a median error magnitudes <2.63 mm can be achieved.[32]Most current models, as introduced above, require a prior assumption about the pivotal parameters depicting the respiratory motion or the definite correspondence function, whether explicit or implicit. However, such assumptions may be easily violated by great interindividual or interfraction variations.[25] Herein, we present an indirect model based on a nearest neighbor estimation (NNE) method using 2 types of tracking data collected from the external surrogate space and the internal target space. In this method, no specific correspondence function between the surrogate and the target was assumed a priori. The only 2 assumptions are that the trajectories of the surrogate and the target are continuous, bounded, and intercorrelated, and the system is a quasi-determinate system with memory of the past. Subsequently, based on the synchronization principle and nearest neighbor principle, estimation rules are constructed with careful consideration of different situations involving the current surrogate. The proposed method has been validated on several open access databases.Notably, in the presented study, only a single-target point, which is typically the tumor center defined by the physician prior to treatment, is concerned for the estimation of motion. During treatment, this point is used as a reference point whose movement is followed by the treatment beam, similar to the Cyberknife and TrackBeam[33] (a multileaf collimator-based beam tracking system; Initia Ltd, Petah Tikva, Israel) systems. Considering the deformability of human tissues, other models based on the motion field for the entire ROI have also been proposed.[32] Indeed, the motion field method is intriguing. Nevertheless, this method imposes a high demand on the instrument to identify the trajectory of each voxel in the entire ROI. Such instruments are typically unavailable in most hospitals in China. Therefore, the proposed NNE method is more applicable to the commonly used instruments (eg, Cyberknife) in Chinese hospitals.The rest of the manuscript is organized as follows: Details of the NNE method are described in the second section. Validation experiments and results are presented and analyzed in the third section. In the fourth section, we further discuss the exceptive cases in which NNE method fails and suggest potential solutions. The study’s conclusions are presented in the fifth section.
Methods
The proposed NNE method is applicable to the case in which both the external and the internal signals are sequences of positions in real space. In this situation, the correspondence model is a mapping from external space to internal space. Considering the topological ductility of human soft tissues and the limited range of respiratory motion, it is reasonable to assume that (1) trajectories of the surrogate and the target are continuous, bounded and intercorrelated and (2) the system is a quasi-determinate system with memory of the past. Therefore, it is not difficult to infer that if the surrogate travels to a location near to a certain past one, the target will also travel to the neighborhood of the target at that time. Thus, based on the synchronized surrogate and target data collected simultaneously in the modeling phase, we construct the following rules in estimation phase:Denoting the surrogate and the target in modeling phase as u
= (u, u, u)T, (i = 1,2,…, N) and v
= (v, v, v)T, (i = 1,2,…, N), respectively, we construct a one-to-one mapping set as Ψ:= {(u
, v
) | i = 1,2,…, N} by aligning in time. For a current u′
during the estimating phase, we provide 3 different estimating rules according to different situation of u′
location relative to the surrogate cloud in modeling phase:(i) In the case that u′
is inside the cloud, we first search in surrogate space for M nearest neighbors of u′
and construct a subset Φj ⊂ Ψ as,in which ‖ ‖ is the Euclidean distance and ε equals to the Mth nearest distance between u
and u′
. Subsequently, v
s in Φ are used to estimate the location of current internal target v′
as.(ii) In the case that u′
lies outside the cloud, whereas the previous u′
is in the cloud, we search for the neighbors of u
′
and construct Φj − 1 ⊂ Ψ asand correspondingly, the estimation of v′
is modified as(iii) In the case that both u′
and u′
lie outside the surrogate cloud in the modeling phase, we project the original 3D surrogate space into a 2-dimension (2D) singular subspace. That is, given the N×3 matrix U = [u
,u
,…,u
]T representing the surrogate cloud in modeling phase, U can be decomposed asin which I
∈ R
and r
∈ R
3 are left- and right-singular vectors, respectively, and σ denotes the singular value sorting as σ1 > σ2 > σ3. By ignoring the direction with the minimal σ, that is, r
3, we project both U and u′
onto a 2D subspace with orthogonal bases r
1, r
2. That is, by multiplying u′
and U by r
1 and r
2, we obtain and U
2D. Subsequently Φj ⊂ Ψ will be constructed asin which ‖ ‖ is the Euclidean distance in the projected 2D subspace. Finally, v′
is estimated in the same way as in Equation 2.Rules (2) and (3) aim to address the situation in which the current surrogate u′
during the estimation phase is out of the motion range during the modeling phase. Specifically, rule (2) addresses the outlier using a temporal correlation when the disturbance is occasional, whereas rule (3) addresses more persistent disturbances. Specifically, when outliers continually appear, this rule suggests that there may be an overall shift, that is, the surrogate may be drifted by an external force that is irrelevant to breathing or internal motion. Since the variance in the cloud along the minimum singular vector direction is minimum, an interruption along this direction makes the surrogate more easily to step out of the cloud compared to the other 2 directions. Therefore, we compress the original 3D surrogate space into the 2D singular subspace (the 2D subspace comprising 2 orthogonal vectors with 2 primary singular values) to filter out the shift.To determine whether a surrogate point is inside or outside the cloud, we defined the corresponding range for each direction as [umin + 0.1 × s, u
max − 0.1 × s], in which u
min and u
max are the minimum and maximum values of the modeling surrogates, respectively, along the corresponding direction, and s is their standard deviation along the same direction. Only when a surrogate point falls into range in all 3 directions is the surrogate labeled inside the cloud; otherwise, the surrogate is labeled as an outlier.This method is referred to as the NNE method. Taking M = 4 for example, this method is illustrated in Figure 1. This method has no specific restrictions on the modality of the tracing data. Thus, any tracking device, for example, US imaging or X-ray imaging, can be used prior to treatment.
Figure 1.
Illustrations of the proposed nearest neighbor estimation method with 3 estimating rules (A-C). Empty circles represent modeling phase clouds in corresponding spaces, empty triangles indicate clouds whose nearest neighbors are sought, solid circles indicate the neighbors, and asters represent the final estimations for current targets. (A) Rule (1) applied when the current surrogate u′j lies inside the surrogate cloud of the modeling phase. (B) Rule (2) applied when the current surrogate u′j lies outside the surrogate cloud of the modeling phase, while its previous point u′j − 1 lies inside the surrogate cloud. (C)** Rule (3) applied for the situation in which u′j and u′j − 1 both lie outside the surrogate cloud. In this case, the 2D subspace is constructed using the first and second singular vectors of the surrogate cloud in modeling phase. *For a clearer vision, the x, y, and z-axes have been modified to different scales. **In this situation, the surrogate cloud is exactly the same as those in (A) and (B), whereas the axes have been rotated to the direction of 3 singular vectors. Additionally, the new x, y, and z-axes have been modified to different scales for a clearer vision effect.
Illustrations of the proposed nearest neighbor estimation method with 3 estimating rules (A-C). Empty circles represent modeling phase clouds in corresponding spaces, empty triangles indicate clouds whose nearest neighbors are sought, solid circles indicate the neighbors, and asters represent the final estimations for current targets. (A) Rule (1) applied when the current surrogate u′j lies inside the surrogate cloud of the modeling phase. (B) Rule (2) applied when the current surrogate u′j lies outside the surrogate cloud of the modeling phase, while its previous point u′j − 1 lies inside the surrogate cloud. (C)** Rule (3) applied for the situation in which u′j and u′j − 1 both lie outside the surrogate cloud. In this case, the 2D subspace is constructed using the first and second singular vectors of the surrogate cloud in modeling phase. *For a clearer vision, the x, y, and z-axes have been modified to different scales. **In this situation, the surrogate cloud is exactly the same as those in (A) and (B), whereas the axes have been rotated to the direction of 3 singular vectors. Additionally, the new x, y, and z-axes have been modified to different scales for a clearer vision effect.In addition, there is only 1 critical parameter, that is, M. To select an appropriate M, several independent variables, including the respiratory rates, sampling rates, and the modeling phase duration, must be considered. When the sampling rates are less than 100 times the respiratory rates, we suggest that M is set less than the total number of breath cycles in the modeling duration, which implies roughly 1 neighbor in each cycle. For example, as to respiratory rate 15/minute, sampling rate 15 Hz, and modeling duration 5 minutes, M should be <75 and represent a trade-off between fit goodness and noise tolerance. When the sampling rates are higher, we suggest a higher M to take into account more than 1 point labeled as neighbors in 1 breath cycle.
Validation Experiments and Results
Data and Setting Descriptions
The proposed NNE method was applied to 3 open access databases shared on the website of the Institute of Robotics and Cognitive Systems at the University of Lübeck.[8,34,35]
Database I
This database includes 7 sets of 3-dimension bimodal liver motion traces induced by spontaneous breathing collected from 6 humans (all males, aged 23-30). In each set, 1 external marker and 1 internal marker are employed to label the external surrogate and the internal target, respectively. The external traces were recorded using the infrared tracking system, and the internal traces were simultaneously obtained by 4-dimensional (4D) US imaging. These 2 types signals were aligned in time with sampling or resampling rates of 17.5∼21.3 Hz by the data provider. All signals in this database are 6∼7 minutes in length.
Database II
This database includes 2 sets of 3-dimensional bimodal liver motion traces induced by a simulated respiration collected from a pig in 2 sessions. In each set, 6 external markers labeled 6 potential external surrogates and 4 implanted gold fiducials marked 4 potential internal targets (Please refer to the study by Ernst[8] for detailed position information.) The external traces were recorded using the infrared tracking system, and the internal traces were simultaneously obtained through X-ray imaging. Both types of signals had been aligned in time with sampling or downsampling rates of 15 Hz. The data set no. 1 is approximately 10 minutes in length and the data set no. 2 is 2 minutes in length.
Database III
This database includes 8 sets of 1-dimensional bimodal liver motion traces induced by free breathing collected from 8 humans (4 females and 4 males, aged 21-31). In each set, 2 external markers, placed at the lower end of the sternum and next to the navel, respectively, labeled 2 potential surrogates, and 1 internal marker tracked the internal target. Traces of surrogates were obtained using infrared tracking, and the trace of the target was simultaneously obtained through 4D US imaging. The provider aligned all signals in time with sampling or resampling rates of 17 Hz. Because the data provider provided the first principal components (PCs) of the traces, the motion traces were assumed to be in 1 dimension, thus only estimating rules (1) and (2) in the methods section were applied to this database. All signals are 15∼20 minutes in length.To simulate the modeling phase and estimating phase in real applications, each original consecutive series were segmented into 2 sections. The first sections were used to construct one-to-one mapping sets for modeling phase, and the second sections were used for estimation and verification. For signals longer than 6 minutes, the modeling length was fixed to 5 minutes, while for shorter signals, the series were segmented into 2 sections with equal length. For the number of neighbors, that is, M was set as 5∼20 in the presented study.
Evaluation Measurements
Defining the estimation error at time index j asin which v′
represents the actual value and is the corresponding estimation, we investigate the median (denoted as e
(50)), the 75th percentile (denoted as e
(75)), the 95th percentile e
(95), and 99th percentile e
(99) of the estimation errors.Considering that the motion scale may vary from target to target, we also investigate the motion scale for each target, defined asin which j1 and j2 represent 2 different time indices, thus and v′j2 represent 2 arbitrary and different points in the estimating phase.In addition, to obtain an overall evaluation for the entire estimating duration, we also calculate the commonly used RMSE for each data, defined aswhere L denotes the number of the data points upon which the estimation was performed.Finally, assuming that a 5-mm error is unacceptable,[36-38] we define the false ratio f asin which # represents the number.
Results
Results for Databases I and II
In databases I and II, 3D traces were given, so the estimation errors were calculated in 3D. The results for database I are listed in Table 1.
Table 1.
Estimation Errors’ Statistics for Databases I.a
No.
e(50), mm
e(75), mm
e(95), mm
e(99), mm
r, mm
RMSE, mm
f
1
1.49
1.86
2.60
3.64
13.10
1.62
0.0008
2
1.22
1.66
2.46
2.91
14.29
1.53
0.0016
3
0.71
1.22
1.79
2.17
32.48
1.02
0.0000
4a
1.57
2.68
18.39
23.38
44.36
6.80
0.1326
5a
3.26
5.94
9.68
14.41
34.86
5.05
0.3697
6
0.70
0.86
1.22
1.65
8.50
0.76
0.0000
7
1.22
2.18
3.99
5.17
22.46
2.00
0.0152
Abbreviations: RMSE, root-mean-square error.
a These 2 cases are not as good as the others. Further investigation will be given in the discussions section.
Estimation Errors’ Statistics for Databases I.aAbbreviations: RMSE, root-mean-square error.a These 2 cases are not as good as the others. Further investigation will be given in the discussions section.Notably, in database II, for each data set, there are 6 surrogates and 4 targets. We made all potential combinations between surrogates and targets, obtaining 48 (2 × 6 × 4) implementations in total for database II. The results for database II are presented in Table 2.
Estimation Errors’ Statistics for Databases II.Abbreviations: RMSE, root-mean-square error; Sur, surrogate; Tar, Target.As shown in Tables 1 and 2, except for nos. 4 and 5 in database I, the proposed method performs well in general, since RMSEs in 3D are <3 mm and false ratios are <0.06, and the values for the 75th percentile are much less than the motion scales, suggesting that the estimation can greatly improves the therapy accuracy. Further investigations of 2 exceptions are provided in the “Discussion” section.In addition, Table 2 shows that the estimation error varies from surrogate to surrogate, even for the same target. For example, for data no. 1, estimations from surrogate no. 3, 5, and 6 are much better than those from no. 1, 2, and 4, indicating that the placement of the surrogate affects estimation efficiency. To obtain a better target estimation, the surrogate should be placed at a position close to the target or whose motions are more correlated with those of the target.
Results for Database III
In database III, the traces were shown in 1 dimension (the first primary component), thus the estimation errors were calculated in 1 dimension, and estimation rule (2) in the Methods section was not applied. In this database, for each data set, there are 2 surrogates and 1 target, thus all potential combinations between surrogates and targets were made, generating 16 (8 × 2 × 1) implementations in total for database III. All 16 results for database III have been listed in Table 3.
Estimation Errors’ Statistics for Databases III.Abbreviations: RMSE, root-mean-square error; Sur, surrogate.a These 4 cases are not as good as the others.As shown in Table 3, on one hand, except for 4 cases, all the RMSEs are <3 mm and the false ratios are <0.06, and with the values for the 75th percentile far less than the motion scales, the estimation can greatly improve the therapy accuracy.On the other hand, the number of exceptions (4) is more than that in databases I and II (2). We attributed this phenomenon to 2 reasons: (1) This database only provided 1-dimension tracings, thus the information is not as sufficient as in databases I and II. (2)According to the data providers,[35] in these experiments, after 2 minutes of freely breathing, the patients were told to produce breathing artifacts by coughing, sneezing, harrumphing, or speaking every 15 seconds; nevertheless, the modeling phase and the estimation phase do not necessarily contain the same type of artifacts. Thus, the proposed method still performs well on this database. The results of Database I∼III (i.e. Table 1
∼3) were also plotted in a supplementary figure.
Comparison with other methods
In this subsection, we compare the proposed NNE with several other methods that have been applied to the human free breathing data, that is, databases I and III. In Table 4, we compared the results of the presented study with those of the polynomial and the support vector regression (SVR)-based methods for database I.[13] Since Ernst et al
[13] only listed the results for the first PC of the estimated target tracings, we processed in the same way for a better comparison. The best estimation has been printed in bold for each data set.
a Only results of the first principal component were given in this table, and the minimum RMSE for each data is printed in bold.
The RMSE (mm) Comparison for Database I.aAbbreviations: NNE, nearest neighbor estimation; RMSE, root-mean-square error; SVR, support vector regression.a Only results of the first principal component were given in this table, and the minimum RMSE for each data is printed in bold.As shown in Table 4, the proposed NNE method performs better than the polynomial model and equally well compared to the SVR model for all patients except patient no. 4. However, the NNE method is much easier to implement in the clinical settings than SVR model. In term of time complexity, NNE is of the order less than square of the sample size, O(N
2), while SVR is of the order cube of the sample size, O(N
3). For a typical sample length of N = 1022, NNE takes 1.608 seconds to build the model and estimate target motion, compared to 79.748 seconds for SVR (with the grid search method provided by Faruto)[39,40] on Matlab 2013a (Windows 7, CPU 3.10 GHz).We also compared the performances of the NNE method and those proposed in study by Durichen et al
[41] on database III. To maintain consistency with study by Durichen et al,[41] we predicted the subsequent no. 1∼5 sampling points based on modeling of the first 30 seconds. Only minor modifications of the methods are needed. Specifically, we first searched M (M = 5) nearest neighbors of the last point of the modeling phase in the surrogate space, and the subsequent no. 1∼5 samples of their concurrent target were used for predictions. Root-mean-square errors across patients are listed in Table 5, and the best prediction is indicated in bold. Thus, the proposed NNE method resulted in the best prediction accuracy compared to the others.
Table 5.
RMSE (mm) Over All Patients in Database III for Subsequent No. 1∼5 Samples Predictions.a
No.
NNE
DP-wLMS[41]
SVR-wLMS[41]
MTGPSE-best
[41]
MTGPSE-NLML
[41]
1
0.339
2.148
1.854
1.854
2.465
2
0.775
2.168
1.862
1.860
2.465
3
0.885
2.223
1.889
1.871
2.468
4
1.180
2.305
1.946
1.892
2.482
5
1.368
2.391
1.999
1.931
2.510
Abbreviations: DP, dual-polynomial model; MTGP, multitask Gaussian Process; NLML, negative logarithmic marginal likelihood; NNE, nearest neighbor estimation; RMSE, root-mean-square error; SE, squared-exponential; SVR, support vector regression; wLMS, wavelet-based least mean squares algorithm.
aThe minimum RMSE for each data is printed in bold.
RMSE (mm) Over All Patients in Database III for Subsequent No. 1∼5 Samples Predictions.aAbbreviations: DP, dual-polynomial model; MTGP, multitask Gaussian Process; NLML, negative logarithmic marginal likelihood; NNE, nearest neighbor estimation; RMSE, root-mean-square error; SE, squared-exponential; SVR, support vector regression; wLMS, wavelet-based least mean squares algorithm.aThe minimum RMSE for each data is printed in bold.
Discussion
Detailed Investigation of 2 Failure Cases in Database I
Since database I provides the complete 3D tracings, we provide a more detailed investigation for data no. 4 and 5 to characterize the poor performances and suggest potential solutions in this section.
Estimation error of the 2 cases
In Figure 2, the estimation errors for data no. 4 and 5 were plotted versus time. As shown in Figure 2, although the method in the presented study fails in both cases, these situations are not exactly the same. For no. 4, small errors and large errors alternately appear, suggesting that the performance is not always bad and the performance deteriorates only in certain occasions. As to no. 5, although the error maximum is not the greatest, for most part of time the error exceeds 3 mm, suggesting that the proposed method is completely unsuitable for this case. We further investigate why the NNE method does not perform well in these 2 cases.
Figure 2.
Estimation error versus time for data nos. 4 and 5 in database I. The x-axis represents time and y-axis represents estimation error in 3D space. The black dotted line denotes the preferable 3-mm threshold. Note that the beginning of the estimating phase is set as 0 seconds.
Estimation error versus time for data nos. 4 and 5 in database I. The x-axis represents time and y-axis represents estimation error in 3D space. The black dotted line denotes the preferable 3-mm threshold. Note that the beginning of the estimating phase is set as 0 seconds.
Analysis for the failure causes
The greatest estimation error for data no. 4, occurring at exactly 18.0577 seconds referencing to the first point in estimation phase, is illustrated in Figure 3A. From this figure, we inferred that the error of data no. 4 was because of a fairly new state which wasn’t recorded in the modeling phase.
Figure 3.
Illustration of 2 representative cases with greatest estimation errors. A, is at 18.0577 seconds for data no. 4 and (B) is at 12.8504 seconds for data no. 5. Denotations are exactly same as those in Figure 1.
Illustration of 2 representative cases with greatest estimation errors. A, is at 18.0577 seconds for data no. 4 and (B) is at 12.8504 seconds for data no. 5. Denotations are exactly same as those in Figure 1.As shown in Figure 3A, although the current surrogate lies inside the surrogate cloud in modeling phase, the concurrent actual target lies outside the modeling target cloud. In the case that data quality is guaranteed, since both the current surrogate and estimated concurrent target are close to the end of corresponding clouds, it is likely that an extreme event occurred; therefore, we consider this case as extrapolation. Intrinsically, the proposed NNE is an interpolation method, suggesting that the estimation for the current state is always the interpolation of the former similar states in the modeling phase. Therefore, for a current state without sufficient similar former states, the performance will deteriorate. Indeed, for a new state in surrogate space, the presented method involves the rules (2) and (3) in the Methods section to generate the surrogate cloud in the modeling phase. This method works when the new surrogate state reflects interruptions and the target still lies in target cloud in modeling phase. However, the method fails when the target to estimate is rather new for the modeling target cloud. In other words, the proposed method shows the insufficiency in extrapolating or expecting a new state far from previous states. Most poor performances for data no. 4 result from this type of situation.A representative example for great estimation error occasion for data no. 5 is illustrated in Figure 3B. This event occurs at 12.8504 seconds after the first data point in estimation phase. For good visualization, the target space in Figure 3B has been rotated to directions of singular vectors, denoted as PC1, PC2, and PC3, and axes have been adjusted to the same scale. Denotations are exactly same as those in Figure 1. Gray markers in the bottom of the target box represent the projection onto the 2D principle component subspace. In addition, since the estimation and the actual target are both hidden in the cloud, we add arrow annotations to point them out.From Figure 3B, we inferred that the reason of the error for no. 5 was the mistake during data collection. It is because that, in the target space of Figure 3B, there are several holes in the target cloud. Even in the projected 2D principle component subspace, the estimated value () and the actual value () are separated by some hollow structures. In addition, the error in PC1 direction is much less than that in PC2 direction. This is not common for a continuous dynamical system (CDS) with only 1 driving force (ie, breathing). Besides, in such a CDS, 2 correlating signals (surrogate and target) are supposed to have a descending correlation along with the decreasing singular value (σ), because σ refers to the variance in motion component along its corresponding PC direction. Table 6 shows that σ2 of data no. 5 is the greatest among all σ2 s; however, its r
tar2−sur1 is disproportionately low. It is very evident when comparing the data no. 5 with no. 2. It suggests that the target’s motion in PC2 direction might result from other attributions beyond respiratory motions. Therefore, we inferred that the target space had been shifted along PC2 direction during the experiment, likely reflecting the nonstatic US probe or rotation of the templates, as the data providers previously reported.[8]
Table 6.
Normalized Singular Value for Database I.a
Data No.
Target
Surrogate
Correlation
σ1
σ2
σ1
rtar2-sur1
1
0.705
0.177
0.880
−0.03
2
0.649
0.248
0.835
−0.89
3
0.871
0.096
0.786
−0.04
4
0.800
0.119
0.786
−0.14
5
0.647
0.300
0.892
−0.09
6
0.719
0.195
0.729
0.37
7
0.803
0.154
0.885
−0.14
a σ1 and σ2 are the normalized singular value along PC1 and PC2. r
tar2-sur1 is the linear correlation coefficients between the target PC2 and the surrogate PC1. “PC” is the principal component.
Normalized Singular Value for Database I.aa σ1 and σ2 are the normalized singular value along PC1 and PC2. r
tar2-sur1 is the linear correlation coefficients between the target PC2 and the surrogate PC1. “PC” is the principal component.Finally, we suggest a few potential solutions to improve the performance. In addition to guarantee the quality of the data, we also suggest: (1) including more potentially extreme events in modeling phase, for example, asking the patient to breathe deeply several times in modeling phase so that there are cases of extrapolation as least as possible and (2) employing an explicit extrapolation correspondence function which will work when the current surrogate approaches the boundary of the cloud. Nonetheless, the proposed NNE method can achieve a desirable good performance, and further modification should be a trade-off between performance and efficiency.
Estimation of Surrogate Outliers
The proposed NNE method provides solutions to estimate the internal target motion when the surrogate runs outside its prior motion range, that is, rules (2) and (3) in Methods section. Herein, to demonstrate the performance of NNE on the surrogate outliers, we plot in Figure 4 a representative example of rule (3). By using rule (3), we pulled the outliers back to the modeling cloud.
Figure 4.
Illustration of the nearest neighbor estimation method applied to a target estimation at 35.7380 seconds for the combination of surrogate 3 and target 2 of data no. 2 in database II. Denotations are exactly same as those in Figure 1.
Illustration of the nearest neighbor estimation method applied to a target estimation at 35.7380 seconds for the combination of surrogate 3 and target 2 of data no. 2 in database II. Denotations are exactly same as those in Figure 1.This event occurs at 35.7380 seconds after the first point of estimation phase for data no. 2 in database II. In this example, the current surrogate u′
and its previous surrogate u′
−1 are both outliers. By ignoring the motion component along the direction of the minimum singular vector, the current surrogate falls in the motion range of modeling phase. As shown in Figure 4, the estimation error is 0.2 mm, suggesting a good estimation.
Effects of Modeling Length and Number of Neighbors
In clinical application, the length of the modeling phase is quite important. Although we use 5 minutes in most cases of this research, it does not mean that 5 minutes is obligatorily required. We also tried different length in the data except for those of only 2 minutes long. We present the result in the Figure 5. According to Figure 5, except for data no. 2 in database I, although the modeling length affects the results, the effect is not that great. It is more evident in Figure 6. The maximum RMSE variation is <3 mm for most instances. For the data no. 2 in database I, the high RMSE is brought by the extreme values in 2∼5 minutes. For further investigation, we tried various modeling phases as Table 7 listed. As seen, even under the same modeling length, the RMSE was greatly reduced when taking the 4 to 5 minutes into modeling. This further proves that the 4 to 5 minutes includes the “extreme states.”
Figure 5.
A, Root-mean-square error with different modeling length, (B) a zoomed in image with the RMSE of 0∼7.5 mm.
Figure 6.
The maximum RMSE variation (Δ) when modeling length varies from 1 to 5 minutes. RMSE indicates root-mean-square error.
Table 7.
Root-Mean-Square Error With Various Modeling Phases for No. 2 of Database I.
A, Root-mean-square error with different modeling length, (B) a zoomed in image with the RMSE of 0∼7.5 mm.The maximum RMSE variation (Δ) when modeling length varies from 1 to 5 minutes. RMSE indicates root-mean-square error.Root-Mean-Square Error With Various Modeling Phases for No. 2 of Database I.Abbreviations: min, minutes; RMSE, root-mean-square error.In fact, the modeling phase including enough information is what really matters. Therefore, in practical use, we would suggest the patient to breathe as deeply as possible in the modeling phase. After that, 40 breath cycles are considered statistically reliable.The number of neighbors (M) is another important parameter in our method. To examine its influence on the model performance, we tried different M in our model. The result is presented in Figure 7. It showed that, for most data, the results were not that sensitive to M <100.
Figure 7.
The RMSE (root-mean-square error) of our model with different M (number of neighbors). The modeling length was 1 minute for no. 24∼48 cases in database II and 5 minute for all of the others.
The RMSE (root-mean-square error) of our model with different M (number of neighbors). The modeling length was 1 minute for no. 24∼48 cases in database II and 5 minute for all of the others.
Calibration and Extension of the NNE Method
For a good performance of the NNE method, one should ensure that the surrogate signal in the estimating phase is comparable to what is in the modeling phase. The camera should be fixed, and the patient should remain motionless as long as possible. However, the practical radiation treatment typically takes time. It is necessary to apply a calibration to avoid the accumulated drifts. Considering the Real-time Position Management (RPM) system (Varian Medical Systems, Palo Alto, California) as an example, we can place an extra RPM marker block at a relatively static location that is hardly affected by respiration, for example, the shoulder of the patient, as the reference box. In the modeling phase, the 3D coordinates of the reference box are recorded. Subsequently, at the beginning of the estimating phase, or whenever the experimenter thinks necessary, the location of the reference box is acquired and is compared to the original one. If there is a discrepancy, a transformation matrix between the 2 locations of the reference box is calculated and treated as the transformation matrix between the 2 different relative coordinates. Thus, the trajectories in the modeling phase and in the estimating phase become comparable. In the entire procedure, the reference box and the detecting box are fixed on the patient; however, the patient’s position relative to the camera is not strictly restricted.The NNE method introduced in the presented study can be extended to nonlocation surrogates. Considering the bellows system, which outputs the respiratory parameters, as an example, we can replace the current surrogate space with a state space, constructed based on different respiratory parameters, or phase space, constructed using the delay embedding method. Accordingly, the estimation rule needs modification to refer to the trajectory features in state space. This will be investigated in subsequent studies.
Future Work
Our future work involves the following 2 problems.
Limitation of the current target motion imaging modalities
Currently, the most important target motion imaging modalities are 4D-CT2 and 4D-MRI.[42] Because they cannot provide real-time 3D images as the 4D-US, our primary future work is to resolve the problem of sparse sampling of the internal target.One potential solution is the big data technique. We are now collaborating with 3 hospitals to build a respiratory database of 4D-CT, 4D-MRI, and 4D-US images, mainly on the splanchnocoele. With these data, we would try machine-learning technique to construct the motion model and then realize the interpolation. After that, the basic nearest-neighbor strategy will be applied to correlate the surrogate and the target. Once we solve the problem of sparse sampling of the internal target, the basic idea of synchronization and motion continuity, as presented in this manuscript, would still be the key to the final destination.
Different surrogate–target correlations between deep and free breath
The involved muscles for intentional deep breath and free breathing are different. That may lead to surrogate–target correlation difference between modeling phase and actual treatment. In other words, it may result in a one-to-many mapping. One potential solution is to increase the dimension of data space. For example, by introducing an extra dimensionality that distinguishes deep and free breath, we can turn the target/surrogate space into a 4D space. Thus, the one-to-many mapping in 3D space is unfolded in 4D space. As to the extra dimensionality, we suggest using electromyography collected from the surface of the representative muscle, which is involved differently in deep or free breath, for example, internal intercostal muscles,[43] sternocleidomastoid,[44] and abdominal muscles.[45]
Conclusions
In this article, we propose an NNE method to estimate the current internal target position based on the concurrent external surrogate. Specifically, nearest neighbors of the current surrogate in surrogate cloud of modeling phase are first obtained; subsequently, their synchronizing targets are determined and used for calculating the estimated value. According to verification results on several open access databases, the NNE method is proved effective. Because the algorithm is easy to implement, it has high potential in real-time tumor tracking during radiotherapy. We also suggest potential solutions to further improve the performance, including embracing more deep breaths in the modeling phase, incorporating an extrapolating function, and so on. However, these solutions only constitute supplementary methods that should be adopted with careful consideration of the trade-off between performance and efficiency.Further validation of the proposed method is needed, in which more bimodal motion traces are simultaneously collected from more patients.Click here for additional data file.Supplementary_figure20171206 for Nearest Neighbor Method to Estimate Internal Target for Real-Time Tumor Tracking by Jie Zhang, Xiaolin Huang, Yuxiaotong Shen, Ying Chen, Jing Cai, and Yun Ge in Technology in Cancer Research & Treatment
Authors: Eun Kyung Paik; Mi-Sook Kim; Chul Won Choi; Won Il Jang; Sung Hyun Lee; Sang Hyoun Choi; Kum Bae Kim; Dong Han Lee Journal: Radiat Oncol J Date: 2015-09-30