| Literature DB >> 34136402 |
Xinxin Zhi1,2,3, Jin Li4, Junxiang Chen1,2,3, Lei Wang5, Fangfang Xie1,2,3, Wenrui Dai4, Jiayuan Sun1,2,3, Hongkai Xiong4.
Abstract
BACKGROUND: Endoscopic ultrasound (EBUS) strain elastography can diagnose intrathoracic benign and malignant lymph nodes (LNs) by reflecting the relative stiffness of tissues. Due to strong subjectivity, it is difficult to give full play to the diagnostic efficiency of strain elastography. This study aims to use machine learning to automatically select high-quality and stable representative images from EBUS strain elastography videos.Entities:
Keywords: endobronchial ultrasound; image selection; lymph nodes; machine learning; strain elastography
Year: 2021 PMID: 34136402 PMCID: PMC8201408 DOI: 10.3389/fonc.2021.673775
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1 The process of automatic selection of representative images. Frames are first extracted from the video stream to construct a frame pool. Inferior frames are then dropped during the quality evaluation procedure, and the eligible frames are kept as candidates for representative images. Next, PCA is employed for dimension reduction. Ultimately, the clustering model selects representative images from the candidates. PCA, principal component analysis.
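The pipeline in Figure 1 (frame pool → quality evaluation → PCA → clustering → representative frames) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the quality filter, the number of PCA components, and the use of plain k-means are all assumptions.

```python
# Hedged sketch of the Figure 1 pipeline. The quality-evaluation rule
# (variance threshold) and clustering choice (k-means) are illustrative only.
import numpy as np

def select_representatives(frames, n_rep=3, n_components=8, seed=0):
    """frames: (N, D) array of flattened frame features; returns n_rep frames."""
    rng = np.random.default_rng(seed)
    # 1) Quality evaluation (placeholder): drop frames with very low
    #    intensity variance, e.g. blank or frozen frames.
    var = frames.var(axis=1)
    keep = frames[var > 0.5 * np.median(var)]
    # 2) PCA via SVD for dimension reduction of the candidate frames.
    X = keep - keep.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Z = X @ Vt[:n_components].T
    # 3) k-means clustering; one representative frame per cluster.
    centers = Z[rng.choice(len(Z), n_rep, replace=False)]
    for _ in range(50):
        labels = np.argmin(((Z[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([Z[labels == k].mean(axis=0) if np.any(labels == k)
                            else centers[k] for k in range(n_rep)])
    # Representative image = candidate frame closest to each cluster centre.
    reps = [int(np.argmin(((Z - c) ** 2).sum(-1))) for c in centers]
    return keep[reps]
```

Picking the frame nearest each cluster centre (rather than the centre itself) guarantees that every output is an actual frame from the video.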
Figure 2 Delineation of the ROI on strain elastography images and output of quantitative parameters. Representative images from the machine learning, expert and trainee groups were all input into a computer program developed in Matlab™. The schematic diagram shows an elastography image of an LN with nonspecific lymphadenitis in station 4R, with the ROI outlined on the elastography image. Results of the four quantitative methods, including SAR, the RGB elasticity ratios, mean hue value and mean gray value, were then output by the program. ROI, region of interest; LN, lymph node; SAR, stiff area ratio; B/G, elasticity ratio of blue/green; B/R, elasticity ratio of blue/red; G/R, elasticity ratio of green/red.
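The quantitative indicators computed on the ROI can be illustrated as below. The exact definitions (the stiff-pixel rule behind SAR, the hue scaling, the grayscale weights) are assumptions for this sketch, not the paper's published formulas.

```python
# Hedged sketch of four quantitative indicators on an RGB ROI. In strain
# elastography, blue conventionally encodes stiff tissue; the blue-dominance
# rule used for SAR here is an assumption.
import colorsys
import numpy as np

def elastography_indicators(roi):
    """roi: (H, W, 3) uint8 RGB pixels inside the delineated ROI."""
    r = roi[..., 0].astype(float)
    g = roi[..., 1].astype(float)
    b = roi[..., 2].astype(float)
    stiff = (b > r) & (b > g)                      # blue-dominant pixels treated as stiff
    sar = float(stiff.mean())                      # stiff area ratio
    bg = float(b.mean() / g.mean())                # elasticity ratio of blue/green
    hues = [colorsys.rgb_to_hsv(*px)[0] for px in roi.reshape(-1, 3) / 255.0]
    mean_hue = 255.0 * float(np.mean(hues))        # hue rescaled to 0-255 (assumed)
    mean_gray = float((0.299 * r + 0.587 * g + 0.114 * b).mean())  # ITU-R BT.601 weights
    return {"SAR": sar, "B/G": bg, "mean_hue": mean_hue, "mean_gray": mean_gray}
```

For example, an ROI that is half pure blue and half pure green yields SAR = 0.5 and B/G = 1.0 under these definitions.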
Characteristics of LNs in the training, validation and test sets.
| Characteristic of LNs | Training and validation sets No. (%) | Test set No. (%) |
|---|---|---|
| Size of LNs on ultrasound | | |
| Long axis, mean ± SD, mm | 21.55 ± 6.71 | 22.48 ± 7.18 |
| Short axis, mean ± SD, mm | 17.90 ± 9.56 | 17.23 ± 6.45 |
| Size of LNs on CT※ | | |
| Long axis, mean ± SD, mm | 25.50 ± 9.48 | 24.35 ± 8.49 |
| Short axis, mean ± SD, mm | 16.70 ± 6.73 | 16.45 ± 7.02 |
| Station of LNs | | |
| 2L | 1 (0.24) | 0 (0.00) |
| 2R | 8 (1.93) | 1 (1.10) |
| 3P | 2 (0.48) | 0 (0.00) |
| 4L | 19 (4.58) | 7 (7.69) |
| 4R | 135 (32.53) | 30 (32.97) |
| 7 | 160 (38.55) | 26 (28.57) |
| 10L | 2 (0.48) | 1 (1.10) |
| 10R | 3 (0.72) | 1 (1.10) |
| 11L | 32 (7.71) | 10 (10.99) |
| 11Rs | 32 (7.71) | 5 (5.49) |
| 11Ri | 19 (4.58) | 10 (10.99) |
| 12L | 1 (0.24) | 0 (0.00) |
| 12R | 1 (0.24) | 0 (0.00) |
| Malignant LNs | 256 (61.69) | 53 (58.24) |
| Adenocarcinoma | 110 (26.51) | 25 (27.47) |
| Squamous carcinoma | 39 (9.40) | 5 (5.49) |
| NSCLC-NOS | 13 (3.13) | 4 (4.40) |
| Small cell lung cancer | 60 (14.46) | 15 (16.48) |
| Large cell neuroendocrine carcinoma | 1 (0.24) | 0 (0.00) |
| NET-NOS | 11 (2.65) | 2 (2.20) |
| Unknown type of lung cancer | 13 (3.13) | 1 (1.10) |
| Carcinosarcoma | 1 (0.24) | 0 (0.00) |
| Lymphoma | 3 (0.72) | 0 (0.00) |
| Metastatic tumors (non-lung primary malignancy) | 5 (1.20) | 1 (1.10) |
| Benign LNs | 159 (38.31) | 38 (41.76) |
| Nonspecific lymphadenitis | 97 (23.37) | 16 (17.58) |
| Sarcoidosis | 53 (12.77) | 15 (16.48) |
| Tuberculosis | 9 (2.17) | 7 (7.69) |
※The size of LNs on CT images was measured for 393 LNs in the training and validation sets and 88 LNs in the test set; CT measurements were missing for a total of 25 LNs across the two sets.
LNs, lymph nodes; NSCLC-NOS, non-small cell lung cancer not otherwise specified; NET-NOS, neuroendocrine tumor not otherwise specified.
Figure 3Representative images selected by machine learning, expert and trainee groups. (A) 1–3 are representative images selected by the machine learning model; (B) 1–3 are representative images selected by the three experts; (C) 1–3 are representative images selected by the three trainees.
Differences between images within each group by qualitative grading score.
| Comparison | Machine learning group (p value) | Expert group (p value) | Trainee group (p value) |
|---|---|---|---|
| Images 1–3 | 0.210 | 0.036 | 0.205 |
| Image 1 vs 2 | 0.134 | 0.058 | 0.862 |
| Image 1 vs 3 | 0.088 | 0.029 | 0.105 |
| Image 2 vs 3 | 0.637 | 0.366 | 0.059 |
Diagnostic efficiency of the three groups by qualitative grading score.
| Group | Sen | Spe | PPV | NPV | Acc | FPR | FNR |
|---|---|---|---|---|---|---|---|
| Machine learning 1 | 84.91% | 78.95% | 84.91% | 78.95% | 82.42% | 21.05% | 15.09% |
| Machine learning 2 | 83.02% | 73.68% | 81.48% | 75.68% | 79.12% | 26.32% | 16.98% |
| Machine learning 3 | 79.25% | 71.05% | 79.25% | 71.05% | 75.82% | 28.95% | 20.75% |
| Expert 1 | 92.45% | 73.68% | 83.05% | 87.50% | 84.62% | 26.32% | 7.55% |
| Expert 2 | 90.57% | 73.68% | 82.76% | 84.85% | 83.52% | 26.32% | 9.43% |
| Expert 3 | 88.68% | 78.95% | 85.45% | 83.33% | 84.62% | 21.05% | 11.32% |
| Trainee 1 | 62.26% | 60.53% | 68.75% | 53.49% | 61.54% | 39.47% | 37.74% |
| Trainee 2 | 64.15% | 71.05% | 75.56% | 58.70% | 67.03% | 28.95% | 35.85% |
| Trainee 3 | 69.81% | 60.53% | 71.15% | 58.97% | 65.93% | 39.47% | 30.19% |
Sen, sensitivity; Spe, specificity; PPV, positive predictive value; NPV, negative predictive value; Acc, accuracy; FPR, false positive rate; FNR, false negative rate.
Comparison of quantitative mean values between machine learning and expert groups.
| Variable | p value |
|---|---|
| SAR | 0.801 |
| B/G | 0.693 |
| Mean hue value | 0.862 |
| Mean gray value | 0.514 |
SAR, stiff area ratio; B/G, elasticity ratio of blue/green.
Differences in CV values of quantitative methods among the three groups.
| Indicator | Machine learning (mean ± SD) | Experts (mean ± SD) | Trainees (mean ± SD) | Machine learning vs experts (p) | Machine learning vs trainees (p) | Experts vs trainees (p) |
|---|---|---|---|---|---|---|
| SAR | 0.127 ± 0.109 | 0.167 ± 0.124 | 0.200 ± 0.156 | 2.18E−03 | 3.22E−06 | 5.68E−02 |
| B/G | 0.079 ± 0.061 | 0.105 ± 0.067 | 0.127 ± 0.070 | 6.69E−03 | 2.72E−06 | 2.86E−02 |
| Mean hue value | 0.036 ± 0.029 | 0.042 ± 0.024 | 0.053 ± 0.036 | 1.44E−01 | 7.57E−05 | 9.50E−03 |
| Mean gray value | 0.013 ± 0.044 | 0.020 ± 0.062 | 0.022 ± 0.062 | 3.96E−01 | 2.56E−01 | 8.07E−01 |
CV, coefficient of variation; SD, standard deviation; SAR, stiff area ratio; B/G, elasticity ratio of blue/green.
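The CV in the table above measures how consistent the three representative images of one LN are: a lower CV means the three selected images give more similar quantitative values. A minimal sketch of CV = SD / mean (using the population SD here; the paper may use the sample SD, and the numbers below are made up for illustration):

```python
# Coefficient of variation across repeated measurements of one LN.
# Population SD is an assumption; illustrative values only.
import statistics

def cv(values):
    """CV = standard deviation / mean of the per-image measurements."""
    return statistics.pstdev(values) / statistics.fmean(values)

# e.g. SAR measured on the three representative images of the same node:
print(round(cv([0.40, 0.44, 0.42]), 3))
```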
Figure 4ROC curves of four quantitative methods for machine learning, expert and trainee groups. (A–D) illustrate four quantitative indicators including SAR, B/G, mean hue value and mean gray value, respectively. ROC, receiver operating characteristic; SAR, stiff area ratio; B/G, elasticity ratio of blue/green.
Diagnostic efficiency of the three groups by quantitative methods.
| Group | AUC | Cut-off | Sen | Spe | PPV | NPV | Acc |
|---|---|---|---|---|---|---|---|
| Machine learning group | | | | | | | |
| SAR | 0.819 | 0.402 | 84.91% | 81.58% | 86.54% | 79.49% | 83.52% |
| B/G | 0.798 | 1.176 | 75.47% | 81.58% | 85.11% | 70.45% | 78.02% |
| Mean hue value | 0.801 | 133.762 | 84.91% | 73.68% | 81.82% | 77.78% | 80.22% |
| Mean gray value | 0.805 | 194.632 | 81.13% | 78.95% | 84.31% | 75.00% | 80.22% |
| Expert group | | | | | | | |
| SAR | 0.822 | 0.403 | 84.91% | 73.68% | 81.82% | 77.78% | 80.22% |
| B/G | 0.812 | 1.116 | 86.79% | 73.68% | 82.14% | 80.00% | 81.32% |
| Mean hue value | 0.808 | 134.870 | 84.91% | 78.95% | 84.91% | 78.95% | 82.42% |
| Mean gray value | 0.809 | 194.329 | 84.91% | 78.95% | 84.91% | 78.95% | 82.42% |
| Trainee group | | | | | | | |
| SAR | 0.746 | 0.452 | 71.70% | 71.05% | 77.55% | 64.29% | 71.43% |
| B/G | 0.750 | 1.043 | 81.13% | 63.16% | 75.44% | 70.59% | 73.63% |
| Mean hue value | 0.758 | 139.811 | 64.15% | 84.21% | 85.00% | 62.75% | 72.53% |
| Mean gray value | 0.744 | 195.976 | 64.15% | 78.95% | 80.95% | 61.22% | 70.33% |
SAR, stiff area ratio; B/G, elasticity ratio of blue/green; AUC, area under curve; Sen, sensitivity; Spe, specificity; PPV, positive predictive value; NPV, negative predictive value; Acc, accuracy.
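One standard way to obtain the AUC and cut-off columns above is the Mann-Whitney formulation of AUC plus a Youden-index cut-off; whether the authors used exactly this procedure is an assumption, and the data below are illustrative.

```python
# ROC AUC (Mann-Whitney U, ties count 0.5) and the cut-off maximising
# Youden's J = Sen + Spe - 1. Pure-Python sketch on illustrative data.
def roc_auc_and_cutoff(scores, labels):
    """labels: 1 = malignant, 0 = benign; higher score = more likely malignant."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # AUC = probability a random malignant score exceeds a random benign one.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    # Candidate thresholds are the observed scores; classify "malignant" if
    # score >= threshold, and keep the threshold with the largest Sen + Spe.
    best = max(set(scores), key=lambda t: (
        sum(p >= t for p in pos) / len(pos) + sum(n < t for n in neg) / len(neg)))
    return auc, best
```

On perfectly separated toy scores such as `[0.9, 0.8, 0.7]` (malignant) vs `[0.4, 0.3, 0.2]` (benign), this returns AUC = 1.0 with cut-off 0.7.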