Coreline N Burggraaff1, Fareen Rahman2, Isabelle Kaßner3, Simone Pieplenbosch1, Sally F Barrington4, Yvonne W S Jauw1,5, Gerben J C Zwezerijnen5, Stefan Müller3, Otto S Hoekstra5, Josée M Zijlstra1, Henrica C W De Vet6, Ronald Boellaard7.
Abstract
PURPOSE: This pilot study aimed to determine interobserver reliability and ease of use of three workflows for measuring metabolic tumor volume (MTV) and total lesion glycolysis (TLG) in diffuse large B cell lymphoma (DLBCL).

PROCEDURES: Twelve baseline [18F]FDG PET/CT scans from DLBCL patients with wide variation in number and size of involved organs and lymph nodes were selected from the international PETRA consortium database. Three observers analyzed scans using three workflows. Workflow A: user-defined selection of individual lesions followed by four automated segmentations (41%SUVmax, A50%SUVpeak, SUV≥2.5, SUV≥4.0). For each lesion, observers indicated their "preferred segmentation." Individually selected lesions were summed to yield total MTV and TLG. Workflow B: fully automated preselection of [18F]FDG-avid structures (SUV≥4.0 and volume≥3ml), followed by removing non-tumor regions with single mouse clicks. Workflow C: preselected volumes based on Workflow B modified by manually adding lesions or removing physiological uptake, subsequently checked by experienced nuclear medicine physicians. Workflow C was performed 3 months later to avoid recall bias from the initial Workflow B analysis. Interobserver reliability was expressed as intraclass correlation coefficients (ICC).
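The threshold-based segmentations named in Workflow A can be illustrated with a minimal sketch. It assumes a lesion is available as a flat list of voxel SUVs and that TLG is computed as MTV multiplied by the mean SUV inside the segmented volume (the usual definition). The function name `segment_metrics`, the voxel volume, and all SUV values are hypothetical and are not taken from the study or its software. The background-adapted A50%SUVpeak method is not sketched here.

```python
def segment_metrics(voxel_suvs, voxel_volume_ml, threshold):
    """Return (MTV in ml, TLG) for all voxels with SUV >= threshold.

    TLG is taken as MTV * SUVmean of the selected voxels.
    """
    selected = [s for s in voxel_suvs if s >= threshold]
    if not selected:
        return 0.0, 0.0
    mtv = len(selected) * voxel_volume_ml
    suv_mean = sum(selected) / len(selected)
    return mtv, mtv * suv_mean


# Hypothetical lesion: voxel SUVs and voxel volume are made up for illustration.
lesion = [1.5, 2.6, 3.0, 4.2, 5.0, 8.0, 12.0]
voxel_ml = 0.5

fixed_25 = segment_metrics(lesion, voxel_ml, 2.5)                # SUV >= 2.5
fixed_40 = segment_metrics(lesion, voxel_ml, 4.0)                # SUV >= 4.0
rel_41 = segment_metrics(lesion, voxel_ml, 0.41 * max(lesion))   # 41% of SUVmax

for name, (mtv, tlg) in [("SUV>=2.5", fixed_25), ("SUV>=4.0", fixed_40), ("41%MAX", rel_41)]:
    print(f"{name}: MTV = {mtv:.1f} ml, TLG = {tlg:.1f}")
```

In this toy lesion the volume shrinks as the threshold rises. Total MTV and TLG for a patient would then be the sum over all selected lesions, as described for Workflow A.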
Keywords: Diffuse large B cell lymphoma; Metabolic tumor volume; PET/CT; Total lesion glycolysis
Year: 2020 PMID: 31993925 PMCID: PMC7343740 DOI: 10.1007/s11307-020-01474-z
Source DB: PubMed Journal: Mol Imaging Biol ISSN: 1536-1632 Impact factor: 3.488
Interobserver reliability of semi-automated MTV and TLG assessment for the different workflows
| Method | MTV: mean (range) | MTV: mean CoV (range) | MTV: ICC (95% CI) | TLG: mean (range) | TLG: mean CoV (range) | TLG: ICC (95% CI) |
|---|---|---|---|---|---|---|
| Workflow A (individual lesion selection) | ||||||
| 41%MAX | 1106(33–4991) | 65.54(0–164.38) | 0.43(0.08–0.76) | 6236(471–21,431) | 54.57(0–151.84) | 0.37(0.02–0.72) |
| A50%P | 550(34–4153) | 36.74(0–139.73) | 0.86(0.68–0.95) | 5736(245–45,441) | 26.76(0–118.26) | 0.93(0.82–0.98) |
| SUV≥2.5 | 2399(73–7404) | 13.34(0–54.21) | 0.96(0.91–0.99) | 15,902(347–55,588) | 7.11(0–33.81) | 0.99(0.98–1.00) |
| SUV≥4.0 | 1289(30–5688) | 13.78(0–83.59) | 0.94(0.86–0.98) | 13,617(220–50,068) | 11.32(0–82.52) | 0.97(0.93–0.99) |
| MV2 | 1505(59–6258) | 22.68(0–83.59) | 0.92(0.80–0.97) | 14,422(301–51,908) | 15.84(0–82.52) | 0.97(0.91–0.99) |
| MV3 | 927(33–4654) | 33.54(0–154.17) | 0.91(0.79–0.97) | 12,181(229–43,669) | 24.91(0–135.92) | 0.96(0.89–0.99) |
| Workflow B (automated preselection) | ||||||
| SUV≥4.0, Volume≥3ml | 1004(23–5723) | 2.32(0–10.43) | 1.00(1.00–1.00) | 8446(189–50,779) | 1.85(0–7.49) | 1.00(1.00–1.00) |
| Workflow C (automated preselection with manual modification) | ||||||
| Final MTV | 1115(53–5589) | 16.71(0–109.46) | 0.92(0.82–0.98) | 8610(284–48,079) | 13.33(0–111.83) | 0.97(0.93–0.99) |
MV, majority vote; MTV, metabolic tumor volume; CoV, coefficient of variation; ICC, intraclass correlation coefficient; CI, confidence interval; TLG, total lesion glycolysis
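The two agreement statistics in the table above can be sketched as follows. The source does not state which ICC form or which standard-deviation convention was used, so this sketch makes two assumptions: a two-way random-effects, absolute-agreement, single-rater ICC (often written ICC(2,1)), and a CoV computed per patient as the sample standard deviation across observers divided by their mean, in percent. The function names and the example MTV values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CoV in percent across observers for one patient (sample SD / mean)."""
    m = mean(values)
    return 0.0 if m == 0 else 100.0 * stdev(values) / m


def icc_2_1(ratings):
    """ICC(2,1): rows are patients, columns are observers.

    Assumed form: two-way random effects, absolute agreement, single rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between observers
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical MTV values (ml): four patients, three observers.
mtv = [
    [100.0, 110.0, 95.0],
    [520.0, 500.0, 540.0],
    [60.0, 58.0, 75.0],
    [1500.0, 1450.0, 1600.0],
]
per_patient_cov = [coefficient_of_variation(row) for row in mtv]
print("mean CoV (%):", round(mean(per_patient_cov), 2))
print("ICC(2,1):", round(icc_2_1(mtv), 3))
```

A patient on whom all observers agree exactly contributes a CoV of 0, which is consistent with the lower bound of 0 in every CoV range in the table.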
Mean analysis time for the different workflows in minutes (mean ± standard deviation (range))
| Workflow | A (individual lesion selection) | B (automated preselection) | C (with manual modification) |
|---|---|---|---|
| Observer 1 | 29.1 ± 20.8(5–63) | 7.2 ± 3.7(3–15) | 23.3 ± 13.4(5–45) |
| Observer 2 | Not reported* | Not reported* | 26.7 ± 15.6(10–62) |
| Observer 3 | 28.2 ± 13.7(15–60) | 7.3 ± 3.5(1–12) | 16.7 ± 9.7(8–42) |
| Mean | 28.7† | 7.3† | 22.2 |
*Observer 2 summed the total time for Workflow A + B; mean 27.3 ± 19.2 (7–75) minutes
†Mean value based on 2 observers
Most preferred method per observer for Workflow A
| Patient | Observer 1 | Observer 2 | Observer 3 |
|---|---|---|---|
| 1 | 41%MAX | 41%MAX | SUV≥4.0 |
| 2 | 41%MAX | 41%MAX/A50%P/SUV≥4.0 | SUV≥2.5 |
| 3 | A50%P | 41%MAX | SUV≥4.0 |
| 4 | SUV≥4.0 | A50%P | SUV≥4.0 |
| 5 | SUV≥4.0 | A50%P | SUV≥4.0 |
| 6 | SUV≥4.0 | 41%MAX/A50%P | SUV≥4.0 |
| 7 | A50%P | A50%P | SUV≥2.5 |
| 8 | A50%P | 41%MAX/A50%P | SUV≥4.0 |
| 9 | A50%P | 41%MAX | SUV≥4.0 |
| 10 | A50%P | A50%P | A50%P |
| 11 | SUV≥4.0 | 41%MAX | A50%P |
| 12 | 41%MAX/SUV≥4.0 | 41%MAX | A50%P |
Each observer indicated their “preferred segmentation” for individual lesions. The most preferred method per patient was defined as the method most often noted as “preferred segmentation”
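The "most preferred method per patient" rule described above can be sketched as a simple tally over one observer's per-lesion choices. The function name `most_preferred` and the lesion lists are hypothetical. Joining tied methods with "/" mirrors how ties appear in the table (for example "41%MAX/A50%P"), but the exact tie-handling used in the study is an assumption.

```python
from collections import Counter


def most_preferred(lesion_preferences):
    """Return the method(s) chosen most often for one patient by one observer.

    Ties are joined with "/" in first-seen order.
    """
    if not lesion_preferences:
        return ""
    counts = Counter(lesion_preferences)
    top = max(counts.values())
    # Counter keeps insertion order, so tied methods appear as first encountered.
    return "/".join(method for method, c in counts.items() if c == top)


# Hypothetical per-lesion choices for two patients.
patient_a = ["SUV≥4.0", "A50%P", "SUV≥4.0", "SUV≥4.0"]
patient_b = ["41%MAX", "A50%P", "41%MAX", "A50%P"]

print(most_preferred(patient_a))
print(most_preferred(patient_b))
```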
Fig. 1. Scatterplot of MTV for Workflow A (user-defined selection with SUV≥4.0) and Workflow B (automated preselection). PET images represent examples of different MTV interpretations between the workflows. Top left images (patient 10): Workflow B contains only lymphoma lesions around the large vessels (left), while in Workflow A, the liver and spleen were also included in the lesion selection (right). Bottom right images (patient 8): in Workflow B, the large lesion was selected (left), while it was interpreted as not being lymphoma in Workflow A (right).
Fig. 2. Scatterplot of final MTV assessment in Workflow C.
Fig. 3. Scatterplot of MTV assessment in Workflow C: automated preselection before manual modification (C1) versus final MTV after manual modification (C2), in milliliters. Data points from two challenging patients (patients 10 and 11) are indicated by lines. The numbers in the boxes refer to the patient numbers described in the main text.
Fig. 4. Bland-Altman plot showing the effect of manual modification on MTV assessment in Workflow C: automated preselection before manual modification (C1) versus final MTV after manual modification (C2). Solid lines: mean value and upper and lower limits of agreement without exclusion of outliers. Dashed lines: mean value and upper and lower limits of agreement after exclusion of patients 10 and 11.
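The quantities drawn in a Bland-Altman plot such as Fig. 4 can be sketched as below: the mean difference (bias) and the limits of agreement, taken here as bias ± 1.96 times the sample standard deviation of the differences. The direction of the difference (C2 minus C1), the function name `bland_altman`, and all MTV values are assumptions for illustration. The optional `exclude` argument mimics recomputing the lines after leaving out selected patients.

```python
from statistics import mean, stdev


def bland_altman(before, after, exclude=()):
    """Return (bias, lower limit, upper limit) for paired measurements.

    `before` and `after` map patient number -> MTV in ml.
    Differences are computed as after - before (an assumed convention).
    """
    patients = [p for p in before if p in after and p not in exclude]
    diffs = [after[p] - before[p] for p in patients]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical MTV values (ml) before (C1) and after (C2) manual modification.
mtv_c1 = {1: 100.0, 2: 200.0, 3: 300.0, 10: 900.0}
mtv_c2 = {1: 110.0, 2: 190.0, 3: 320.0, 10: 400.0}

all_patients = bland_altman(mtv_c1, mtv_c2)
without_outlier = bland_altman(mtv_c1, mtv_c2, exclude={10})

print("all patients:    bias=%.1f, limits=(%.1f, %.1f)" % all_patients)
print("without outlier: bias=%.1f, limits=(%.1f, %.1f)" % without_outlier)
```

In this toy data a single patient with a large change dominates both the bias and the width of the limits, which is the kind of effect the solid and dashed lines in Fig. 4 are meant to contrast.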