Literature DB >> 27277044

Semiautomated segmentation of head and neck cancers in 18F-FDG PET scans: A just-enough-interaction approach.

Reinhard R Beichel¹, Markus Van Tol², Ethan J Ulrich², Christian Bauer², Tangel Chang³, Kristin A Plichta³, Brian J Smith⁴, John J Sunderland⁵, Michael M Graham⁵, Milan Sonka⁶, John M Buatti⁷.

Abstract

PURPOSE: The purpose of this work was to develop, validate, and compare a highly computer-aided method for the segmentation of hot lesions in head and neck 18F-FDG PET scans.
METHODS: A semiautomated segmentation method was developed, which transforms the segmentation problem into a graph-based optimization problem. For this purpose, a graph structure around a user-provided approximate lesion centerpoint is constructed and a suitable cost function is derived based on local image statistics. To handle frequently occurring situations that are ambiguous (e.g., lesions adjacent to each other versus lesion with inhomogeneous uptake), several segmentation modes are introduced that adapt the behavior of the base algorithm accordingly. In addition, the authors present approaches for the efficient interactive local and global refinement of initial segmentations that are based on the "just-enough-interaction" principle. For method validation, 60 PET/CT scans from 59 different subjects with 230 head and neck lesions were utilized. All patients had squamous cell carcinoma of the head and neck. A detailed comparison with the current clinically relevant standard manual segmentation approach was performed based on 2760 segmentations produced by three experts.
RESULTS: Segmentation accuracy measured by the Dice coefficient of the proposed semiautomated and standard manual segmentation approach was 0.766 and 0.764, respectively. This difference was not statistically significant (p = 0.2145). However, the intra- and interoperator standard deviations were significantly lower for the semiautomated method. In addition, the proposed method was found to be significantly faster and resulted in significantly higher intra- and interoperator segmentation agreement when compared to the manual segmentation approach.
CONCLUSIONS: Lack of consistency in tumor definition is a critical barrier for radiation treatment targeting as well as for response assessment in clinical trials and in clinical oncology decision-making. The properties of the authors approach make it well suited for applications in image-guided radiation oncology, response assessment, or treatment outcome prediction.

Entities: Chemical Disease Gene Species

Year: 2016 PMID： 27277044 PMCID： PMC4874930 DOI： 10.1118/1.4948679

Source DB: PubMed Journal: Med Phys ISSN： 0094-2405 Impact factor: 4.071

INTRODUCTION

Background

FDG PET/CT has become an essential tool for clinical management of head and neck (H&N) squamous cell carcinoma (SCC). This disease typically originates from the normal squamous mucosa that lines the open air spaces in the H&N region. Cancers are most often caused by irritants and carcinogens from cigarette smoke, alcohol, and chewing tobacco although more recent studies also show a role for human papilloma virus (HPV). Once cancerous, these mucosal neoplasms have access to associated lymphatic drainage, leading to local/regional spread of the disease to neck lymph nodes. Staging for these cancers is defined by the size and invasion pattern of the primary site cancer (T stage) as well as the presence, size, and location of regional nodes (N stage). More rarely these cancers spread to regions beyond the H&N region, which can be defined as metastases (M stage). TNM staging is an important determinant of both prognosis and oncologic treatment decision-making for every patient. Because of limitations of physical exam, imaging has rapidly become a critical part of staging. CT was the initial imaging modality of choice and provided volumetric anatomic assessment based on contrast uptake into tumor versus normal mucosa. The size of local-regional lymph nodes could readily be determined and lymph nodes larger than 1.5 cm were more likely involved in cancer. Unfortunately, CT imaging was not very sensitive with limited involvement of the cancer. The advent of PET/CT provided a more sensitive imaging tool that detected cancer based on the metabolic differences in glucose metabolism with tumors showing greater FDG uptake than normal tissues. FDG PET/CT is an effective tool for diagnosis and is effective in accurately detecting sites of cancer involvement and thereby improves the accuracy of staging and helps define areas for surgical excision and radiation therapy. FDG PET/CT has thus become the most important staging and therapy-planning tool for H&N cancers. Finally, the ultimate success of the treatment can be assessed with this more sensitive tool, although it requires a sufficiently long interval (generally 2–3 months) after radiation and/or surgery to allow the inflammation of treatment to abate. FDG PET/CT is now a standard in many centers for staging, clinical decision-making, and response assessment. Despite the advances associated with FDG PET/CT use, the previously described applications are mostly based on standard visual inspection of images with only limited use of quantitative indices. If quantification is employed, the most commonly used metrics to characterize lesions are SUVmax, SUVmean, metabolic tumor volume (MTV), and total lesion glycolysis (TLG). PERCIST is an approach that has proposed another index, SUVpeak (i.e., the maximum average activity in a 1 cm sphere inside the tumor of interest) to assess response to therapy. The work presented here focuses on improvements that can be achieved by employing more automated and therefore more consistent approaches to quantitative analysis of tumors and associated lymph nodes. We argue that highly automated segmentation algorithms can provide greater accuracy and consistency in defining radiotherapy treatment volumes and are important for quantifying responses to therapy.

Problem statement and requirements

Segmenting H&N lesions in PET images is demanding, as demonstrated by the examples shown in Fig. 1, which are part of our evaluation set (Sec. 3.A). Also, for applications like outcome prediction, it is not a priori known if segmenting and quantitatively describing the primary tumor only is sufficient or if all lesions need to be quantified individually. Thus, to answer this question, all lesions need to be segmented individually and can be later combined for comparison. Specifically, our requirements for a head and neck cancer (HNC) segmentation approach are as follows.

FIG. 1.

Examples depicting the complexity of HNC segmentation in FDG PET scans. [(a)–(c)] Volume rendering of PET scans, showing the primary cancer [arrow in (d)–(i)] and lymph nodes with uptake. [(d)–(f)] Rendering of segmentations of individual lesions shown in (a)–(c) combined with a volume rendering of the corresponding PET volume. [(g)–(i)] Examples of PET cross sections with outlined tumor segmentations. (g) Necrotic primary tumor with metabolically active adjacent lymph node. [(h) and (i)] Cases with multiple lesions and varying degrees of tracer uptake.

The segmentation method must offer the ability to individually segment lesions with varying contrast that are in close proximity to each other or to normal structures with FDG uptake, which is especially important for HNC FDG PET imaging (Fig. 1). The segmentation approach must allow physicians to segment lesions in a time-efficient manner, as required for clinical use. Note that due to the complexity of HNC FDG PET images, the (correct) interpretation of scans (i.e., locating lesions, differentiating between normal and abnormal structures, etc.) is already time intensive, and therefore, this point becomes even more important for this specific application. The method must be intuitive to use. Also, in the case of semiautomated segmentation methods, it must allow physicians to make corrections to semiautomated segmentation results as deemed appropriate. The approach must offer good agreement with manual segmentation, which is the de facto standard. The method must offer better consistency to reduce intra- and interoperator variability. This is especially important for applications like radiation treatment planning, where operator induced variability can compromise outcome.

Related work

The definition of the gross tumor volume (GTV) or calculation of FDG uptake metrics like average SUV, MTV, or total glycolytic volume (TGV) requires the segmentation of target structures like primary tumors and metabolically active lymph nodes in volumetric FDG PET images. For PET image segmentation, a number of approaches have been proposed, including adaptive thresholding, “standard” region growing, a combination of adaptive region-growing and dual-front active contours, k-means-based partitioning into tumor and background regions, a gradient-based method, a combination of two segmentation methods to reduce inconsistencies, and fuzzy locally adaptive Bayesian (FLAB) method. For an overview of published methods, the reader is referred to the review by Foster et al. Recently, a number of graph-based segmentation methods have been proposed for PET image segmentation, and a summary is given in Table I. Similarly as in other medical image analysis applications, these methods show great promise and represent a quite powerful segmentation framework. However, formulating a suitable graph-based cost function for HNC segmentation (Fig. 1) is not straightforward due to potentially contradicting/conflicting demands/requirements (e.g., lesion with inhomogeneous uptake vs. lesions in close proximity). Such an issue can be avoided by utilizing graph-segmentation algorithms in an interactive fashion, allowing the user to contribute expert knowledge, if needed. For example, in case of random walks and graph cuts, the user can add additional object and background seeds to alter the segmentation result. However, these changes can affect the segmentation globally, which can lead to unexpected effects, making segmentation refinement unintuitive. In contrast, we have developed approaches that follow the just-enough-interaction (JEI) principle for segmentation of livers, lymph nodes, and lungs in 3D (Ref. 25) and 4D (3D inspiration and expiration scan) MDCT scans as well as for segmentation of IVUS data sets. The idea behind this principle is that the user can provide simple cues to guide the segmentation algorithm in an efficient and intuitive manner. Besides the belief that the general JEI principle can offer improved performance in PET image analysis, no such approach for PET image segmentation has been available so far. Specifically, this paper represents the first demonstration that JEI strategy is suitable for this application domain.

TABLE I.

Overview of graph-based segmentation methods utilized for PET image segmentation.

Bagci et al. (Refs. 28 and 29)	Random-walk-based segmentation of lesions in PET images
Bagci et al. (Ref. 29)	Random-walk-based cosegmentation of PET and volumes produced with other imaging modalities
Ballangan et al. (Ref. 30)	Graph-cuts-based segmentation of lung cancer in PET images
Han et al. (Ref. 31) and Song et al. (Ref. 32)	Cosegmentation of single lesions in PET and CT images utilizing a Markov Random Field (MRF) approach; the optimization is solved using a graph-cuts-based method

Contribution of this work

In this paper, we present a clinically relevant JEI-based approach for FDG PET image segmentation, which utilizes a graph-based segmentation method. For this purpose, we transform the segmentation task into a suitable graph-based optimization problem. In addition, we introduce several efficient interaction approaches to enable the user to modify the initial segmentation result. To demonstrate the applicability and performance of our approach for the task of HNC segmentation, we evaluate our approach on a large cohort of FDG PET scans and compare its results to manual segmentation, the current clinical de facto standard for HNC segmentation in FDG PET volumes.

METHODS

We utilize a highly automated optimal surface segmentation (OSS) approach, which is a variant of the LOGISMOS (layered optimal graph image segmentation of multiple objects and surfaces) segmentation framework, for the segmentation of uptake in FDG PET volumes. OSS was introduced by Li et al. The basic idea behind this approach is to formulate a segmentation problem as a graph-based optimization problem, which enables finding an optimal solution (i.e., segmentation surface) according to the utilized cost function. For more information on OSS, the reader is referred to Li et al. In Subsections 2.A–2.E, the approach for lesion segmentation in FDG PET volumes is described in detail. First, we introduce the main segmentation approach (Secs. 2.A–2.C). Second, we outline several additional segmentation modes, which allow effective handling of frequently occurring situations like lesions in close proximity (Sec. 2.D). Third, we describe an approach that enables the user to efficiently refine segmentations (Sec. 2.E), if needed.

Graph construction

To construct a graph that represents the segmentation problem, the user first needs to indicate a lesion to be segmented. For this purpose, the user specifies a rough center point ce inside of lesion k. This can be done in two ways: (a) the user-selected point ce is directly utilized as ce or (b) ce can be automatically recentered to the highest uptake voxel within a search radius of 7 mm from ce. Option (b) typically leads to more consistent results because a property of the PET image I is utilized to adjust the location of the center point. Therefore, it is used as the default setting, which can be changed by the user. However, recentering may cause problems, for example, when segmenting lesions with necrotic centers. Based on ce, a graph structure is generated (Fig. 2). For this purpose we assume that the lesion to be segmented is roughly spherical in shape, which is frequently the case for lesions in the H&N area. For more complex shaped lesions, additional center points can be used, and individual segmentation results will be automatically combined with a logical OR operation. A graph G = (V, E), consisting of nodes V and edges E, is constructed by placing a spherical mesh with radius r and ncolumn evenly spaced mesh vertices p with i ∈ {0, 1, …, ncolumn − 1} centered around ce. The volume (voxels) inside the spherical mesh will be denoted as Mregion. Then, columns with sample points (nodes) are introduced between ce and mesh vertices p. The spacing between nodes is gap, resulting in nnode nodes that represent sample points of the physical volume [Fig. 2(a)]. The node n is closest to the center, while node n is furthest away from the center. All remaining nodes are placed in order between them. In addition to nodes, several edges are added to represent the segmentation problem. First, for all nodes on a column, the edges {n, n} with infinite capacity are added to E if j ≠ 0 to ensure that only one node can be selected on a column [Fig. 2(b)]. Second, for all nodes j and j′, where j′ = max(j − sc, 0) on every pair of adjacent columns i and i′, edges {n, n} and {n, n} with infinite capacity are added to E to implement a hard smoothness constraint [Fig. 2(c)]. Thus, when solved, the nodes of two adjacent surface points on columns cannot be more than sc nodes apart. Third, a soft smoothness constraint is included by adding edges {n, n} and {n, n} with capacity sp to E [Fig. 2(d)]. Consequently, for a possible solution, the cost is increased by sp times the difference in j between surface nodes on a neighboring column.

FIG. 2.

Graph construction for OSS. (a) Centerpoint ce and utilized node structure; the spherical mesh is symbolized by a circle. For each node, a cost is assigned. (b) Intracolumn edge structure (red) with infinite capacity, pointing toward the center. [(c) and (d)] Intercolumn edge structure. (c) The hard smoothness constraint is implemented by introducing edges with infinite capacity shown in green. (d) The soft smoothness constraint is implemented by introducing bidirectional edges with low finite capacity shown in cyan. (See color online version.)

Based on experiments performed on a small set of PET volumes dedicated for algorithm development, we found that r = 60.0 mm, ncolumn = 1026, gap = 1.0 mm (resulting in nnode = 60), sc = 5, and sp = 0.005 work well for our application.

Cost function design

For graph-based segmentation, nodes n that represent the object boundary should incur low costs, while nodes that represent the selected object (uptake region), adjacent objects, or background should incur high costs to make it less likely that the object surface (mesh) passes through these nodes. To generate node costs, uptake values up(n) at corresponding spatial node locations are sampled from the PET volume by means of linear interpolation and subsequently converted into node costs cbase(n). This conversion is based on the analysis of local image properties around the center point ce, which are robustly estimated. Our cost function design is based on an adaptive strategy to mimic the typical tracing preference of radiation oncologists [Fig. 3(a)]. The boundary drawn by the radiation oncologist [Fig. 3(a)] almost reaches voxels with background uptake, whereas a typical segmentation approach based on the isocontour corresponding to 50% of the maximum lesion uptake-to-background uptake ratio has boundary voxels with much higher uptake values [Fig. 3(b)]. In this context, we observed that there is no linear relation between the average uptake value at the boundary of the manual segmentation and the maximum uptake to background uptake ratio. Also, note that the segmentation boundary appears “noisy” in the sagittal cross section depicted in Fig. 3(a), because the manual segmentation was performed in an axial slice-by-slice fashion. The proposed approach to cost function calculation addresses all these issues [Fig. 3(c)]. All specific design decisions and parameter selection are based on experiments performed on the same set of PET volumes that have been used for optimizing graph construction parameters (Sec. 2.A).

FIG. 3.

Comparison of PET segmentation approaches. The same sagittal cross section is shown for all segmentations. (a) Manual slice-by-slice segmentation result. (b) Result of a 50% isocontour segmentation approach. (c) Result of proposed graph-based segmentation method.

Local image properties

Based on our assumption of roughly spherical object shape, we introduce the notion of a shell to robustly measure local uptake parameters. A shell Ω is the set of all nodes with the same node level j: Ω = {n ∣ i = 0, …, ncolumn − 1}. The average uptake for each shell Ω is robustly estimated by upΩ(j) = median{up(n) ∣ n ∈ Ω}. A typical plot of upΩ(j) as a function of j is given in Fig. 4(a). Specifically, we are interested in finding the “peak” pe and the “knee” kn uptake values [Fig. 4(a)], which will be utilized for determining a threshold Th that is used for calculating cbase(n). The peak pe is given by and represents the maximum uptake in a shell. The knee kn is the approximate uptake value at which the object fades into the background, but is not intended to be the background itself. To find kn, the gradient is calculated for j = 1, …, nnode − 2 [Fig. 4(b)]. Then the index of the node representing the steepest descent in upΩ is found with [Fig. 4(c)]. The linear center bias term (nnode − (j + 1))/nnode helps dealing with some rare, occasional cases of one large or several small outside objects with uptake appearing in the shell profile. To find the point where the shell uptake transitions to background values, we utilize a modified version of , such that it never decreases if j > jlow

FIG. 4.

Calculation of local image statistics. (a) Plot of upΩ with points of interest pe and kn. (b) The gradient of upΩ. (c) The gradient with added center bias to detect jlow. (d) Deriving jhi from . Points on the curve that are labeled with low and hi mark the location of indices jlow and jhi, respectively.

The node index where upΩ starts leveling off is jhi [Fig. 4(d)] with . The knee node index is found by evaluating and kn = upΩ(jknee).

Threshold calculation

Once the peak and knee are determined, the threshold Th can be calculated evaluating Th = kn + Th%(pe − kn) with A plot of Eq. (2) as a function of the ratio γ = pe/kn is provided in Fig. 5, and the rationale behind this design is as follows. If γ = 2, roughly a 50% threshold is used. For larger γ-values (e.g., pe ≫ kn), the Th% will be lower than 50%, and for smaller γ-values, Th% will be increased moderately. With this empirical design, the typical tracing behavior of radiation oncologists is mimicked; higher uptake values above background are included to form a perceived safe tumor margin definition (Fig. 3). High acceptance (i.e., avoiding the urge of users to manually edit/postprocess segmentations that were generated by an algorithm) of computer-generated thresholds/segmentations is important to achieve lower inter- and intraobserver variability. Furthermore, Eq. (2) can be tailored to specific needs or applications. For example, variants based on some standard threshold methods can be used as an alternative to focus more on volume estimation than treatment. Some examples are Th% = 0.4 and Th% = 0.5, which are based on typical 40% and 50% values of maximum thresholds that are fairly common. Also, note that during calculation of Th, the most inner and outer shells are not considered because they represent extreme solutions that are not of interest.

FIG. 5.

Plot of Th% as a function of γ.

Cost function

Once the threshold Th is found, the cost function cbase(n) is constructed as follows (Fig. 6). Essentially, cbase consists of the following three components:

FIG. 6.

Cost function design. (a) Histogram H of a region around a typical lesion with corresponding envelope function . (b) Example of a cost function; the individual components are clearly visible. The left part follows , while the right part is linear. The lowest point where both parts meet is at Th. (c) Example of a typical cost profile with and without creject on a single column i, as a function of j, paired with the uptake along the column.

The cost reflects the likeliness of the uptake to be part of the background. For this purpose, the image volume inside Mregion is isotropically resampled, and a normalized image histogram H with max(H) = 1 is generated. is then generated by taking the right-to-left monotonic increasing envelope function of H [Fig. 6(a)]. The cost is defined as cbase(n) = 0. The cost is a linear function given by cbase(n) = (up(n) − Th)/(up(ce) − Th), which has the value of 0 at the threshold uptake and 1 at the center uptake. The final cost function is given by c(n) = cbase(n) + creject(n). The term creject(n) is added to reject trivial solutions or parts of adjacent, unrelated objects. It is defined as with r1(n) = j < jmin, , and jmin = 3. Therefore, the costs of nodes in close proximity to the center (j < jmin) as well as nodes further away where the uptake has fallen below the median of spherical region Mregion are increased by rej = 6, far above the typical cost range between 0 and 1. An example of a complete cost function profile is given in Fig. 6(c).

Segmentation

Once the graph is constructed and all costs have been determined, a globally optimal solution can be found in low degree polynomial time, as described by Li et al. The result is a set of nodes with exactly one node per column selected, which represents the segmentation boundary. To visually represent the segmentation result, the initial spherical triangle mesh can be utilized by moving mesh vertices to selected boundary nodes on the same column. Several examples of segmentations and their cost functions are shown in Fig. 7.

FIG. 7.

Examples of lesion segmentation results. [(a)–(c)] Segmentation results with a marked column along the x-axis. [(d)–(f)] Plots of the uptake and cost function corresponding to (a)–(c).

Once the user has produced a valid segmentation of lesions found in a PET volume, the meshes are converted to labeled volumes by means of voxelization, resulting in segmentations S. Voxelization can lead to a result where some voxels are not within a 6-neighborhood of the main part of the lesion that includes ce. Such voxels are removed. Also, the user has the option to close one-voxel-wide gaps between adjacent lesions (e.g., hot lymph nodes within a chain of nodes).

Segmentation options

Beyond the base algorithm (Secs. 2.A–2.C), there are alternative modes that can adapt the behavior of the algorithm to better handle frequently occurring situations that are ambiguous. Thus, the options enable the user to provide additional expert knowledge, which is utilized to modify the behavior of the algorithm accordingly. For our method, the following segmentation options were implemented, which are described in appendices. Label Avoidance prevents new lesion segmentation from overwriting existing segmentations (Appendix A). Splitting simplifies individually segmenting lesion in close proximity with similar uptake (Appendix B). Necrotic Mode simplifies the segmentation of necrotic lesions (Appendix C).

JEI-Based refinement approach

Designing a cost function that is appropriate for all possible situations is difficult. Consequently, while the above-described base algorithm will perform flawlessly for the majority of lesions found in H&N FDG PET scans, suboptimal results can occur in some cases. For practical applicability in clinical trials and routine care, it is important that even difficult cases can be processed with the same segmentation tool without major additional effort. Thus, we utilize the JEI principle to effectively deal with segmentation errors of the base algorithm. The basic idea behind JEI is that user input is kept at a minimum during refinement of a segmentation. Thus, instead of manually correcting (local) errors (i.e., user driven boundary delineation), the user provides only high-level input to the algorithm to correct the segmentation boundary by utilizing refinement modes of the approach. Also, an undo function is provided that allows the user to reverse a refinement step. Two examples of JEI-based refinement are given in Fig. 8, and a detailed description of refinement modes is given in the sections below.

FIG. 8.

Examples of JEI-based segmentation refinement. (a) The segmentation surface expanded outward too much. (b) A part of the lesion was excluded. (c) The segmentation shown in (a) is corrected with one mouse click by using the global refinement option, affecting the whole boundary. (d) The segmentation shown in (b) is corrected with one mouse click by using local refinement, affecting only a local portion of the boundary. (See color online version.)

Global refinement

Global refinement changes the value of Th in order to modify the boundary of the segmentation. The user can place a point R at an image location where the lesion boundary should go through, and the corresponding uptake value is utilized as a new value for Th, which will change the cost function and thus the entire segmentation surface. In addition, the closest column to R is modified to force the result (surface) through the closest node to R, which will be called n. This is accomplished by adding another cost change function to cbase during the calculation of the final cost c. As a result of this cost change, the entire graph-based optimization needs to be rerun. If the user is not satisfied with the result, he/she can repeat the process, which will override the previous settings. An example for a “single click” global refinement action is provided in Figs. 8(a) and 8(c). As can be seen, a single user provided point [red dot in Fig. 8(c)] is sufficient to update the whole object surface.

Local refinement

Local refinement allows the user to correct cases where only a portion of the segmentation boundary needs to be corrected. This is accomplished as described below. User-interaction. To start the local refinement process, the user needs to inspect the segmentation and, if needed, specify a surface point R through which the correct surface should go [blue point in Fig. 8(d)]. Because the user can perform multiple local refinement interactions, the index p is used to keep track of them. The node closest to R will be denoted as n(i, j). Local search for similar columns. Subsequent to finding n(i, j), neighboring columns of the surface segment to be altered are determined based on the similarity of the uptake pattern around n(i, j) (Fig. 9). For this purpose, a breadth-first search (BFS) on columns with a hard constraint of ds(i, i) ≤ 5 is performed. The function ds(i, i) represents the number of edges on the shortest path on selected columns (mesh vertices) between the surface mesh vertices a and b (graph geodesic) that are associated with columns i and i, respectively. An uptake pattern on a column is considered similar if and |j − j| ≤ ds(i, i) is fulfilled with . Because the BFS can leave “holes” in the set of selected columns, left out columns are added in a “closing” step; columns that were skipped during BFS due to dissimilarity are included, if two thirds of its immediate neighbors were selected by smc. Finally, for all selected columns with i≠i, the node with the highest profile similarity is stored in the set Ψ.

FIG. 9.

Examples illustrating local refinement. (a) In this example, the vector around R on the column marked C is compared to vectors on adjacent columns within range to find the best match for the refinement. Note that the vector is smaller than in practice and for simplicity, only columns in a plane are shown. (b) Illustration of BFS to find neighboring columns whose costs need to be adapted. Note that in this case, comparisons are based on column C.

Modification of costs. To locally refine the segmentation result, the costs on column i itself and selected neighboring columns in set Ψ are modified. On column i, the cost update function is given by For all nodes of columns that include a node , the function is utilized; for the definition of function notch see Eq. (A2) in Appendix A. Thus, the cost is decreased around the target. The decrease is narrow in close proximity to column i, but becomes much wider further away. For all other nodes, c is equal to zero. The motivation behind this design is as follows. First, it forces the boundary to pass trough n. Second, it models uncertainty further away from the user specified refinement point. Third, it enables a smooth transition to the unchanged portions of the surface. Calculating a new solution. After adding c to the original cost function c, a new solution is calculated and the segmentation result is displayed. Calculating a new solution can be done efficiently by building on the previously calculated one, as outlined by Boykov and Jolly. Multiple local refinement points can be utilized for a lesion, each resulting in a cost change, which are accumulated to update c. The only exceptions are columns directly associated with user-selected refinement points, which will not have their costs further changed.

METHOD VALIDATION

Image data

For method validation, 60 PET/CT scans from 59 different subjects with H&N cancer were utilized, which were acquired with different scanners and reconstruction parameters (Table II) between 2004 and 2008. Out of this set, 59 scans were performed pretreatment, and for one patient, an additional post-treatment scan with uptake was included. All subjects were injected with 370 MBq ± 10% of [F-18]FDG with an uptake time of 90 min ± 10%. In all cases subjects were fasted for >4 h and had blood glucose <200 mg/dl. Because of the interest in the H&N region, patients were imaged with arms down, and CT-based attenuation correction was performed. All reconstructions were performed with 2D OSEM iterative algorithms (Table II). Siemens Biograph 40 PET images were reconstructed onto a 168 × 168 pixel image matrix, while Siemens Biograph Duo PET and GE Medical Systems Discovery LS images were reconstructed onto a 128 × 128 pixel image matrix. The number of axial image slices ranged from 191 to 545. The complexity of cases to segment covered a wide spectrum, ranging from simple to complex (Figs. 10 and 1). Primary cancers included the following anatomical regions: base of tongue, oropharynx, pyriform sinus, tonsil, hypopharynx, and nasopharynx. The cases had varying TNM stage, but no metastasis. An experienced radiation oncologist (JMB) inspected all scans and identified primary tumors and all positive (hot) lymph nodes. Overall, the PET scans include a total of 230 different lesions with 59 primary tumors and 171 lymph nodes, and the average number of lesions was 3.83/scan. All scans were assessed in terms of complexity and classified into the following categories: low, med, and high.

TABLE II.

Utilized PET-CT scanners and image reconstruction parameters.

PET-CT scanner	Number	Voxel size (mm)	Reconstruction algorithm
Siemens Biograph 40	13	3.394 × 3.394 × 2.025	2D OSEM, 4 iterations,
Siemens Biograph 40	2	3.394 × 3.394 × 5.000	8 subsets and a 7 mm Gaussian filter
Siemens Biograph Duo	1	3.432 × 3.432 × 3.375	2D OSEM, 2 iterations,
Siemens Biograph Duo	43	3.538 × 3.538 × 3.375	8 subsets and a 5 mm Gaussian filter
GE Medical Systems Discovery LS	1	4.297 × 4.297 × 4.250	2D OSEM, 2 iterations, 28 subsets

FIG. 10.

Range of complexity of utilized H&N PET image data. (a) Case with low complexity (single primary tumor) and (b) a case with high complexity with primary cancer and multiple hot lymph nodes in close proximity.

Experimental setup

To assess the performance of our semiautomated algorithm, it was implemented as an extension for 3D Slicer, a multiplatform, free, and open source software package for visualization and medical-image computing. In addition, the manual segmentation (2D drawing) tools offered by 3D Slicer were used for comparison. Three physicians (experts) with different levels of radiation oncology experience (one faculty professor—JMB and two physician residents with instructions—KAP and TC) participated in our validation experiment. Before the start of the experiment, all experts were trained on both segmentation methods and tools, using a separate set of ten H&N PET scans. For validation, the 60 PET scans described in Sec. 3.A were randomly divided into three sets of 20 scans each. The partitioning was stratified such that each set contains approximately the same range of case complexities. These three sets were processed sequentially in four processing steps. In each step, the complete set of 20 PET scans was segmented in a random sequence by alternating the use of semiautomated and manual segmentation methods such that after two steps, all 20 cases in a set were processed with both methods by each physician. To assess intraoperator variability, two additional steps were performed by all three experts. Thus, all 230 lesions present in the 60 PET scans partitioned into three sets were segmented four-times by three experts, resulting overall in 2760 segmentations. Half were done by using the semiautomated (JEI) method and half with the standard manual segmentation method. For both methods, PET scans were loaded into the segmentation software using a default display setting, consisting of a window of 6 SUV and level of 3 SUV. The experts were allowed to change this setting if deemed necessary for the segmentation task at hand (e.g., chain of active lymph nodes with similar uptake). For segmentation, a set of indicator images was provided to the experts that roughly identified the object to be segmented and the corresponding object label to be used (Fig. 11) for each lesion to avoid differences in the interpretation of the PET scans. The indicator images were generated by an experienced radiation oncologist (JMB) with access to medical patient records.

FIG. 11.

Example of an indicator image provided to experts to specify what lesions should be segmented and the label (seven in this case) that should be assigned.

Independent reference standard and quantitative indices

Segmentation accuracy. Due to the lack of a ground truth for lesions, the following approach was used to form an independent reference for comparison. For a given expert, the four manual segmentations performed by the two other experts were combined to produce a segmentation reference by utilizing the Simultaneous Truth and Performance Level Estimation (STAPLE) algorithm proposed by Warfield et al. To assess segmentation accuracy, the Dice coefficient was computed between each expert’s observed and reference segmentations. Given two segmentations, A and B, the Dice coefficient is given by D = (2|A∩B|)/(|A| + |B|). Coefficient values close to one indicate high agreement, and values close to zero indicate low agreement. Segmentation agreement. For the assessment of inter- and intraoperator segmentation agreement, the following approach was used. To assess intraoperator agreement, the Dice coefficient between trial one and two with a given method was calculated. To measure interoperator agreement, Dice coefficients for a given method were calculated for all user pairs within each trial. Time and user effort. For each user, the time required to segment all lesions in a given PET scan was recorded for both methods. Additionally, for the semiautomated approach, the number of actions (specifying a center point for segmentation, refinement operations, etc.) including and excluding discarded actions was recorded and analyzed.

Statistical analysis of performance indices

Statistical analysis was performed on results from the tumor segmentations. Reported Dice coefficient means, 95% confidence intervals, and p-values were obtained with linear mixed effects regression models. Random effects were included in the models for experts and patients in order to account for repeated segmentation within the two factors. Estimated intra- and interoperator variability in segmentation accuracy Dice coefficients are reported as standard deviations. Intraoperator variability measures the amount of variation in the trial one and two segmentations about their average. Interoperator variability measures the variation in trial one and two averages across experts.

RESULTS

Segmentation accuracy

For both methods, the average Dice coefficient is given in Table III. The Dice coefficients were not found to be statistically different (p = 0.2145). Also, Table III provides summaries of the intra- and interoperator standard deviation of segmentation accuracy assessed with the Dice coefficient. The provided 95% confidence intervals (CIs) show that intra- and interoperator standard deviations are significantly lower for the semiautomated method compared to the manual segmentation approach.

TABLE III.

Estimated mean Dice coefficient, intraoperator variability, and interoperator variability of segmentation accuracy.

	Manual		Semiautomated
Measure	Value	95% CI	Value	95% CI
Dice coefficient mean	0.764	(0.741, 0.786)	0.766	(0.718, 0.815)
Intraoperator standard deviation	0.044	(0.0419, 0.0452)	0.037	(0.0359, 0.0387)
Interoperator standard deviation	0.049	(0.0462, 0.0513)	0.043	(0.0410, 0.0456)

Segmentation agreement

Table IV summarizes results for intra- and interoperator segmentation agreement analysis. In both cases, the semiautomated method shows a significantly higher agreement compared to the manual method. Figure 12 depicts typical examples of intra- and interoperator variation in HNC segmentation.

TABLE IV.

Estimated mean Dice coefficients for intra and interoperator segmentation agreement.

	Manual		Semiautomated
Agreement	Mean	95% CI	Mean	95% CI
Intraoperator	0.770	(0.747, 0.794)	0.926	(0.902, 0.949)
Interoperator	0.713	(0.699, 0.727)	0.905	(0.891, 0.919)

FIG. 12.

Example of intra- and interoperator segmentation agreement for manual and semiautomated segmentation methods. [(a)–(d)] Manual slice-by-slice segmentation results. [(e)–(h)] Semiautomated full 3D segmentation results. (i) Same PET image as in images (a)–(h), but with a different grey-value transfer function, showing uptake peaks corresponding to individual lymph nodes in close proximity.

Time and user effort

Table V and the box-plots provided in Fig. 13 summarize the time required for producing manual and semiautomated PET segmentations per HNC PET scan. The 95% confidence intervals in Table V show that the proposed semiautomated method is significantly faster than its manual counterpart. Note that while the average time for segmenting all lesions (average: 3.83 lesions) with our semiautomated method in a PET scan was 3.74 min, the segmentation algorithm itself runs at interactive speeds. Thus, almost all of the reported time is spent by the user on understanding the scene and inspecting produced segmentations.

TABLE V.

Estimated mean times in minutes with standard deviation (SD) and 95% confidence intervals (CIs) for manual and semiautomated segmentations per PET data set.

Method	Mean	SD	95% CI
Manual	8.88	7.1	(6.47, 11.28)
Semiautomated	3.74	3.3	(2.40, 5.08)

FIG. 13.

Boxplots of operator segmentation times for the manual and semiautomated segmentation methods per PET scan.

Table VI provides statistics about the number of actions required by experts to perform a semiautomated segmentation of all lesions in a PET scan. The plots in Fig. 14 show the accumulative percentage of cases that were completely segmented in dependence of the number of actions required. Differences between used (all) and actually necessary (final) actions in Table VI and Fig. 14 indicate that the user effectiveness can be further increased with additional training.

TABLE VI.

Statistics of actions required by expert 1 to expert 3 for the semiautomated method. All actions denotes the number of actions actually performed by the expert, and final actions denotes the number of actions that would have been required to generate the segmentations (i.e., undone actions are not counted).

		Number of actions
Expert	Actions	Mean	Median	For 90%
1	All	1.57	1	3
2	All	2.43	1	6
3	All	1.80	1	3
1	Final	1.36	1	2
2	Final	1.37	1	2
3	Final	1.18	1	2

FIG. 14.

Plots showing the accumulative number of cases completely segmented in dependence of number of user actions required for the semiautomated method for experts one, two and three. “All actions” denotes the number of actions actually performed by the expert, and “final actions” denotes the user actions that were actually required to perform the segmentations (i.e., undone actions are not counted).

DISCUSSION

Performance

Our JEI graph-based segmentation approach offers a high degree of automation, requires little user interaction, and enables clinically practical and efficient computer-aided segmentation refinement. Compared to a manual segmentation approach, this combination results in a versatile segmentation tool that showed equivalent average segmentation error with significantly reduced standard deviation. The performed assessment of user agreement showed significantly higher intra- and interoperator consistency for the semiautomated approach compared to manual segmentation. This can be seen by the example provided in Fig. 12; the manual segmentation approach leads to considerable variation in segmentations produced by the same user as well as across users, whereas little variation is observed between the segmentations produced with the semiautomated method. Thus, the approach meets the requirements defined in Sec. 1.B. Despite the relatively short training phase, all three experts were able to produce valid segmentations of target lesions in all utilized PET scans. However, as Table VI and Fig. 14 show, training of users is important to achieve good efficiency. Based on the plots shown in Fig. 14, we estimate that more than 90% of the 230 lesions can be segmented with one or two user input actions (mouse clicks), which will likely reduce the required user interaction time (Table V and Fig. 13) even further.

Current limitations

As can be seen from Fig. 14, in some rare cases, more than ten user actions were necessary to produce a segmentation of a lesion. This issue can be addressed by adding suitable segmentation modes and/or refinement tools that enables the user to more efficiently handle such cases. While the results presented in Sec. 4 are very promising in terms of segmentation error, segmentation agreement, and interaction time, it does not offer insight into the impact of different PET image reconstruction methods on segmentation results. Thus, we plan to study this aspect by utilizing PET phantoms in the near future. Harmonization of image acquisition and quality assurance is another critical component of assuring optimal clinical trials, target definitions, and response assessment that is also being addressed nationally. In the current implementation, only the PET image information of PET-CT scans is utilized. However, the chosen graph-based LOGISMOS framework is well suited for expanding our approach to segmenting PET and CT volumes at the same time. In addition, the method can be adapted for segmenting dynamic PET (volume and time) scans. Also, by defining a suitable cost function, our segmentation approach can be adapted for segmenting PET scans with tracers other than FDG.

Impact

Target or lesion definition is a critical component for both radiation therapy treatment planning and delivery as well as for response assessment of tumors after treatment. Our current standard involves a process wherein the radiation oncologist or radiologist manually outlines a tumor or lesion. This is inherently both time consuming and relatively inconsistent because of the nature of both human perception and limits of human dexterity. Algorithmic tools have the capacity to speed the process and at the same time improve consistency. While concerns exist regarding automation of critical treatment-determining steps and decision-making metrics, the ability to improve consistency is unquestionably paramount, particularly when considering clinical trials in which multiple institutions and investigators will determine targets and hence outcomes. The ability to substantially improve consistency enables better comparisons of tumor control probabilities. Although substantial quality assurance efforts within the radiation therapy community have been extremely effective and robust in consistently defining dose delivery parameters across institutions using a variety of treatment planning and delivery systems for clinical trials, the same level of robustness has not been possible for tumor definitions. Algorithmic tools are an essential method for achieving this improved robustness. Our tool illustrates a 26.9% improvement in segmentation agreement across physicians combined with a 20.3% improvement of agreement within the same physician contouring the same target. In addition, the tool reduced the time required for segmentation by 57.9% compared to the standard manual method per PET scan. In considering radiation therapy trials, this can impact both the cost by diminishing the numbers of patients needed and decreasing the time for defining the tumor target for treatment. Additional tools to apply these algorithmic benefits for response assessment will be both useful for outcome determination as well as potentially for therapy adaptation.

CONCLUSIONS

We have presented a novel approach for the highly computer-aided segmentation of lesions in H&N FDG PET volume data sets. By utilizing the JEI paradigm, a good trade-off between automation and input required from human experts was achieved, enabling segmentation of simple and complex cases with the same tool in an efficient and consistent manner. Due to the higher intra- and interoperator segmentation agreement compared to manual segmentation without a loss in segmentation accuracy, the proposed method is well suited for applications like image-guided radiation treatment planning as well as image based assessment of treatment response or treatment outcome prediction, which all benefit from these traits. The application of these tools in multi-institutional clinical trials can improve their efficiency and cost-effectiveness.

34 in total

1. Liver segmentation in contrast enhanced CT data using graph cuts and interactive 3D segmentation refinement methods.

Authors: Reinhard Beichel; Alexander Bornik; Christian Bauer; Erich Sorantin
Journal: Med Phys Date: 2012-03 Impact factor: 4.071

2. Optimal graph search segmentation using arc-weighted graph for simultaneous surface detection of bladder and prostate.

Authors: Qi Song; Xiaodong Wu; Yunlong Liu; Mark Smith; John Buatti; Milan Sonka
Journal: Med Image Comput Comput Assist Interv Date: 2009

3. LOGISMOS--layered optimal graph image segmentation of multiple objects and surfaces: cartilage segmentation in the knee joint.

Authors: Yin Yin; Xiangmin Zhang; Rachel Williams; Xiaodong Wu; Donald D Anderson; Milan Sonka
Journal: IEEE Trans Med Imaging Date: 2010-07-19 Impact factor: 10.048

4. Optimal surface segmentation in volumetric images--a graph-theoretic approach.

Authors: Kang Li; Xiaodong Wu; Danny Z Chen; Milan Sonka
Journal: IEEE Trans Pattern Anal Mach Intell Date: 2006-01 Impact factor: 6.226

5. A novel PET tumor delineation method based on adaptive region-growing and dual-front active contours.

Authors: Hua Li; Wade L Thorstad; Kenneth J Biehl; Richard Laforest; Yi Su; Kooresh I Shoghi; Eric D Donnelly; Daniel A Low; Wei Lu
Journal: Med Phys Date: 2008-08 Impact factor: 4.071

6. Computer-aided lymph node segmentation in volumetric CT data.

Authors: Reinhard R Beichel; Yao Wang
Journal: Med Phys Date: 2012-09 Impact factor: 4.071

7. Metastatic neck disease. Evaluation by computed tomography.

Authors: M Friedman; V K Shelton; M Mafee; P Bellity; V Grybauskas; E Skolnik
Journal: Arch Otolaryngol Date: 1984-07

8. Lung tumor segmentation in PET images using graph cuts.

Authors: Cherry Ballangan; Xiuying Wang; Michael Fulham; Stefan Eberl; David Dagan Feng
Journal: Comput Methods Programs Biomed Date: 2012-11-10 Impact factor: 5.428

9. Computed tomography of cervical lymph nodes. Staging and management of head and neck cancer.

Authors: M H Stevens; H R Harnsberger; A A Mancuso; R K Davis; L P Johnson; J L Parkin
Journal: Arch Otolaryngol Date: 1985-11

Review 10. Quality assurance for clinical trials.

Authors: Geoffrey S Ibbott; Annette Haworth; David S Followill
Journal: Front Oncol Date: 2013-12-19 Impact factor: 6.244

18 in total

1. Simultaneous cosegmentation of tumors in PET-CT images using deep fully convolutional networks.

Authors: Zisha Zhong; Yusung Kim; Kristin Plichta; Bryan G Allen; Leixin Zhou; John Buatti; Xiaodong Wu
Journal: Med Phys Date: 2019-01-04 Impact factor: 4.071

2. Classification and evaluation strategies of auto-segmentation approaches for PET: Report of AAPM task group No. 211.

Authors: Mathieu Hatt; John A Lee; Charles R Schmidtlein; Issam El Naqa; Curtis Caldwell; Elisabetta De Bernardi; Wei Lu; Shiva Das; Xavier Geets; Vincent Gregoire; Robert Jeraj; Michael P MacManus; Osama R Mawlawi; Ursula Nestle; Andrei B Pugachev; Heiko Schöder; Tony Shepherd; Emiliano Spezi; Dimitris Visvikis; Habib Zaidi; Assen S Kirov
Journal: Med Phys Date: 2017-05-18 Impact factor: 4.071

3. Gross tumor volume segmentation for head and neck cancer radiotherapy using deep dense multi-modality network.

Authors: Zhe Guo; Ning Guo; Kuang Gong; Shun'an Zhong; Quanzheng Li
Journal: Phys Med Biol Date: 2019-10-16 Impact factor: 3.609

4. Graph-based segmentation of abnormal nuclei in cervical cytology.

Authors: Ling Zhang; Hui Kong; Shaoxiong Liu; Tianfu Wang; Siping Chen; Milan Sonka
Journal: Comput Med Imaging Graph Date: 2017-01-31 Impact factor: 4.790

5. Multi-site quality and variability analysis of 3D FDG PET segmentations based on phantom and clinical image data.

Authors: Reinhard R Beichel; Brian J Smith; Christian Bauer; Ethan J Ulrich; Payam Ahmadvand; Mikalai M Budzevich; Robert J Gillies; Dmitry Goldgof; Milan Grkovski; Ghassan Hamarneh; Qiao Huang; Paul E Kinahan; Charles M Laymon; James M Mountz; John P Muzi; Mark Muzi; Sadek Nehmeh; Matthew J Oborski; Yongqiang Tan; Binsheng Zhao; John J Sunderland; John M Buatti
Journal: Med Phys Date: 2017-02 Impact factor: 4.071

6. Identifying ¹⁸F-FDG PET-metabolic radiomic signature for lung adenocarcinoma prognosis via the leveraging of prognostic transcriptomic module.

Authors: Jin Li; Yixin Liu; Wenlei Dong; Yang Zhou; Jingquan Wu; Kuan Luan; Lishuang Qi
Journal: Quant Imaging Med Surg Date: 2022-03

7. Joint Tumor Segmentation in PET-CT Images Using Co-Clustering and Fusion Based on Belief Functions.

Authors: Chunfeng Lian; Su Ruan; Thierry Denoeux; Hua Li; Pierre Vera
Journal: IEEE Trans Image Process Date: 2018-10-05 Impact factor: 10.856

Review 8. The Use of Quantitative Imaging in Radiation Oncology: A Quantitative Imaging Network (QIN) Perspective.

Authors: Robert H Press; Hui-Kuo G Shu; Hyunsuk Shim; James M Mountz; Brenda F Kurland; Richard L Wahl; Ella F Jones; Nola M Hylton; Elizabeth R Gerstner; Robert J Nordstrom; Lori Henderson; Karen A Kurdziel; Bhadrasain Vikram; Michael A Jacobs; Matthias Holdhoff; Edward Taylor; David A Jaffray; Lawrence H Schwartz; David A Mankoff; Paul E Kinahan; Hannah M Linden; Philippe Lambin; Thomas J Dilling; Daniel L Rubin; Lubomir Hadjiiski; John M Buatti
Journal: Int J Radiat Oncol Biol Phys Date: 2018-06-30 Impact factor: 7.038

9. Impact of total variation regularized expectation maximization reconstruction on the image quality of ⁶⁸Ga-PSMA PET: a phantom and patient study.

Authors: Feng-Jiao Yang; Shu-Yue Ai; Runze Wu; Yang Lv; Hui-Fang Xie; Yun Dong; Qing-Le Meng; Feng Wang
Journal: Br J Radiol Date: 2021-02-16 Impact factor: 3.039

10. DICOM for quantitative imaging biomarker development: a standards based approach to sharing clinical data and structured PET/CT analysis results in head and neck cancer research.

Authors: Andriy Fedorov; David Clunie; Ethan Ulrich; Christian Bauer; Andreas Wahle; Bartley Brown; Michael Onken; Jörg Riesmeier; Steve Pieper; Ron Kikinis; John Buatti; Reinhard R Beichel
Journal: PeerJ Date: 2016-05-24 Impact factor: 2.984