Takahiro Nakao, Shouhei Hanaoka, Yukihiro Nomura, Masaki Murata, Tomomi Takenaga, Soichiro Miki, Takeyuki Watadani, Takeharu Yoshikawa, Naoto Hayashi, Osamu Abe.
Abstract
This study proposes an unsupervised anomaly detection method based on a deep neural network (DNN) model that requires only normal images for training, and evaluates its performance on a large chest radiograph dataset. We used the auto-encoding generative adversarial network (α-GAN) framework, which combines a GAN and a variational autoencoder, as the DNN model. A total of 29,684 frontal chest radiographs from the Radiological Society of North America Pneumonia Detection Challenge dataset were used (16,880 male and 12,804 female patients; average age, 47.0 years). All images were labeled as "Normal," "No Opacity/Not Normal," or "Opacity" by board-certified radiologists. About 70% (6,853/9,790) of the Normal images were randomly sampled as the training dataset, and the remaining images, including all abnormal ones, were randomly split into the validation and test datasets in a ratio of 1:2 (7,610 and 15,221). Our anomaly detection system could correctly visualize various lesions, including a lung mass, cardiomegaly, pleural effusion, bilateral hilar lymphadenopathy, and even dextrocardia. The system detected abnormal images with an area under the receiver operating characteristic curve (AUROC) of 0.752. The AUROCs for the abnormal labels Opacity and No Opacity/Not Normal were 0.838 and 0.704, respectively. Our DNN-based unsupervised anomaly detection method could successfully detect various diseases or anomalies in chest radiographs after training with only normal images.
Keywords: Anomaly detection; Chest radiograph; Deep learning; Generative adversarial network; Unsupervised learning; Variational autoencoder
Year: 2021 PMID: 33555397 PMCID: PMC8289984 DOI: 10.1007/s10278-020-00413-2
Source DB: PubMed Journal: J Digit Imaging ISSN: 0897-1889 Impact factor: 4.056
Fig. 1 Overview of our anomaly detection system. (a) Anomaly detection based on reconstruction error. The anomaly (a lung mass in this figure) disappears after the reconstruction, and the total reconstruction error of an abnormal image is expected to be larger than that of a normal image. (b) Anomaly detection using code norm. Abnormal images are expected to fall outside the distribution of the normal images in the latent space (ideally, the standard Gaussian distribution) and to lie farther from the origin than normal ones. The 128-dimensional latent space is drawn in two dimensions for illustration.
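The two anomaly scores described in Fig. 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-pixel squared difference is an assumed error metric (the caption does not specify one), the function names are hypothetical, and the "reconstruction" is a stand-in array rather than the output of a trained α-GAN.

```python
import numpy as np


def reconstruction_error_map(image, reconstruction):
    """Per-pixel squared difference (assumed metric, for illustration only)."""
    return (image - reconstruction) ** 2


def reconstruction_score(image, reconstruction):
    """Total reconstruction error: larger values suggest an anomaly."""
    return float(reconstruction_error_map(image, reconstruction).sum())


def code_norm_score(latent_code):
    """Euclidean distance of the latent code from the origin."""
    return float(np.linalg.norm(latent_code))


# Toy data standing in for a radiograph scaled to [-1, 1].
rng = np.random.default_rng(0)
normal_image = rng.uniform(-1.0, 1.0, size=(256, 256))

# Paint a bright square to mimic a lesion.
abnormal_image = normal_image.copy()
abnormal_image[100:120, 100:120] = 1.0

# Idealized reconstruction: the anomaly "disappears", as in Fig. 1 (a).
reconstruction = normal_image

normal_score = reconstruction_score(normal_image, reconstruction)
abnormal_score = reconstruction_score(abnormal_image, reconstruction)
```

The error map returned by `reconstruction_error_map` is the kind of image that could be overlaid on the original to localize an anomaly, while `code_norm_score` needs only the encoder output.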
Details of the RSNA dataset
| | | Normal | Lung opacity | No lung opacity/not normal | Total |
|---|---|---|---|---|---|
| Age (years) | Range | 2–91 | 1–92 | 1–92 | 1–92 |
| | Mean (SD) | 45.0 (16.3) | 49.4 (16.4) | 45.6 (17.5) | 47.0 (16.8) |
| Gender | Male | 5,496 | 4,158 | 7,226 | 16,880 |
| | Female | 4,294 | 2,948 | 5,562 | 12,804 |
| View position | PA | 7,995 | 1,614 | 6,520 | 16,129 |
| | AP | 1,795 | 5,492 | 6,268 | 13,555 |
| Total | | 9,790 | 7,106 | 12,788 | 29,684 |
SD standard deviation, PA posteroanterior, AP anteroposterior
Details of splitting dataset in our study
| | Training | Validation | Test | Total |
|---|---|---|---|---|
| Normal | 6,853 | 979 | 1,958 | 9,790 |
| Abnormal* | 0 | 6,631 | 13,263 | 19,894 |
| Total | 6,853 | 7,610 | 15,221 | 29,684 |
*“No lung opacity/not normal” or “lung opacity”
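The counts in the split table can be reproduced with a short sketch. It assumes the 1:2 validation/test split was applied separately to the remaining normal images and to the abnormal images, which is consistent with the table but is an inference rather than a stated procedure; `split_dataset` and the identifier tuples are hypothetical.

```python
import random


def split_dataset(normal_ids, abnormal_ids, train_fraction=0.7, seed=0):
    """Return (train, validation, test) lists of image identifiers.

    Training uses normal images only. The remaining normal images and all
    abnormal images are each divided 1:2 into validation and test
    (assumed stratification, see the lead-in).
    """
    rng = random.Random(seed)
    normal = list(normal_ids)
    abnormal = list(abnormal_ids)
    rng.shuffle(normal)
    rng.shuffle(abnormal)

    n_train = round(train_fraction * len(normal))
    train = normal[:n_train]
    rest_normal = normal[n_train:]

    # One third goes to validation, two thirds to test.
    n_val_normal = len(rest_normal) // 3
    n_val_abnormal = len(abnormal) // 3

    validation = rest_normal[:n_val_normal] + abnormal[:n_val_abnormal]
    test = rest_normal[n_val_normal:] + abnormal[n_val_abnormal:]
    return train, validation, test


# Placeholder identifiers with the class sizes from the dataset table.
normal_ids = [("normal", i) for i in range(9790)]
abnormal_ids = [("abnormal", i) for i in range(19894)]

train, validation, test = split_dataset(normal_ids, abnormal_ids)
print(len(train), len(validation), len(test))
```

With the class sizes above, the three lists have 6,853, 7,610, and 15,221 entries, matching the table.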
Fig. 2 Illustration of the α-GAN model. The grayed-out components are used only for training and are not used for our anomaly detection method. 128-D: 128-dimensional
Architectures of the networks
| Generator | | | Encoder | | |
|---|---|---|---|---|---|
| Layer | Activation | Output shape | Layer | Activation | Output shape |
| (Latent vector) | | 128 | (Input image) | | 256 × 256 × 1 |
| Linear | LReLU | 4 × 4 × 512 | Convolution 1 × 1 | LReLU | 256 × 256 × 8 |
| Upsampling | | 8 × 8 × 512 | Convolution 3 × 3 | LReLU | 256 × 256 × 16 |
| Convolution 3 × 3 | LReLU | 8 × 8 × 256 | Downsampling | | 128 × 128 × 16 |
| Upsampling | | 16 × 16 × 256 | Convolution 3 × 3 | LReLU | 128 × 128 × 32 |
| Convolution 3 × 3 | LReLU | 16 × 16 × 128 | Downsampling | | 64 × 64 × 32 |
| Upsampling | | 32 × 32 × 128 | Convolution 3 × 3 | LReLU | 64 × 64 × 64 |
| Convolution 3 × 3 | LReLU | 32 × 32 × 64 | Downsampling | | 32 × 32 × 64 |
| Upsampling | | 64 × 64 × 64 | Convolution 3 × 3 | LReLU | 32 × 32 × 128 |
| Convolution 3 × 3 | LReLU | 64 × 64 × 32 | Downsampling | | 16 × 16 × 128 |
| Upsampling | | 128 × 128 × 32 | Convolution 3 × 3 | LReLU | 16 × 16 × 256 |
| Convolution 3 × 3 | LReLU | 128 × 128 × 16 | Downsampling | | 8 × 8 × 256 |
| Upsampling | | 256 × 256 × 16 | Convolution 3 × 3 | LReLU | 8 × 8 × 512 |
| Convolution 3 × 3 | LReLU | 256 × 256 × 8 | Downsampling | | 4 × 4 × 512 |
| Convolution 1 × 1 | Tanh | 256 × 256 × 1 | Linear | | 128 |
| **Discriminator** | | | **Code discriminator** | | |
| Layer | Activation | Output shape | Layer | Activation | Output shape |
| (Input image) | | 256 × 256 × 1 | (Latent vector) | | 128 |
| Convolution 1 × 1 | LReLU | 256 × 256 × 8 | Linear | LReLU | 1500 |
| Convolution 3 × 3 | LReLU | 256 × 256 × 16 | Linear | | 1 |
| Downsampling | | 128 × 128 × 16 | | | |
| Convolution 3 × 3 | LReLU | 128 × 128 × 32 | | | |
| Downsampling | | 64 × 64 × 32 | | | |
| Convolution 3 × 3 | LReLU | 64 × 64 × 64 | | | |
| Downsampling | | 32 × 32 × 64 | | | |
| Convolution 3 × 3 | LReLU | 32 × 32 × 128 | | | |
| Downsampling | | 16 × 16 × 128 | | | |
| Convolution 3 × 3 | LReLU | 16 × 16 × 256 | | | |
| Downsampling | | 8 × 8 × 256 | | | |
| Convolution 3 × 3 | LReLU | 8 × 8 × 512 | | | |
| Downsampling | | 4 × 4 × 512 | | | |
| Linear | | 1 | | | |
LReLU leaky rectified linear unit, Tanh hyperbolic tangent
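The encoder and generator columns follow a regular pattern: each 3 × 3 convolution doubles (encoder) or halves (generator) the channel count, and each down- or upsampling step changes the spatial size by a factor of two. The sketch below only traces the output shapes implied by the table; it is not a network definition, and the function and parameter names are hypothetical.

```python
def encoder_shapes(size=256, base_channels=8, n_blocks=6, latent_dim=128):
    """Output shapes of the encoder, layer by layer, as (H, W, C) tuples."""
    shapes = [(size, size, 1)]                 # input image
    channels = base_channels
    shapes.append((size, size, channels))      # 1x1 convolution
    for _ in range(n_blocks):
        channels *= 2
        shapes.append((size, size, channels))  # 3x3 convolution doubles channels
        size //= 2
        shapes.append((size, size, channels))  # downsampling halves H and W
    shapes.append((latent_dim,))               # linear layer to the latent code
    return shapes


def generator_shapes(latent_dim=128, start_size=4, start_channels=512, n_blocks=6):
    """Output shapes of the generator, layer by layer."""
    shapes = [(latent_dim,)]                   # latent vector
    size, channels = start_size, start_channels
    shapes.append((size, size, channels))      # linear layer, reshaped
    for _ in range(n_blocks):
        size *= 2
        shapes.append((size, size, channels))  # upsampling doubles H and W
        channels //= 2
        shapes.append((size, size, channels))  # 3x3 convolution halves channels
    shapes.append((size, size, 1))             # 1x1 convolution with tanh
    return shapes


for shape in encoder_shapes():
    print(shape)
```

Both functions return 15 entries, one per table row, so the two paths mirror each other: the encoder ends at 4 × 4 × 512 before its linear layer, which is where the generator starts after its own.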
Fig. 3 Examples of anomaly location visualization. The original images are shown on the left side, and the reconstruction error images overlaid on the original images are shown on the right side. (a) Mass. (b) Cardiomegaly (arrow) and pleural effusion (arrowheads). (c) Bilateral hilar lymphadenopathy. (d) Dextrocardia
Fig. 4 Images with the (a, b) highest and (c) lowest code norm anomaly scores. The images in (b) are limited to posteroanterior adult chest images; incorrectly rotated or color-inverted images are also excluded
Fig. 5 Receiver operating characteristic (ROC) curves for per-image anomaly detection tasks. Each value in parentheses represents the area under the corresponding ROC curve and its 95% confidence interval. AUROC area under the ROC curve
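For readers who want to see how a per-image AUROC and a 95% confidence interval of the kind reported in Fig. 5 might be computed, here is a minimal sketch. The AUROC uses the standard pairwise (Mann-Whitney) definition; the percentile bootstrap for the interval is an assumption, since the record does not say how the intervals were obtained. The function names and toy scores are hypothetical.

```python
import random


def auroc(labels, scores):
    """Probability that a random abnormal image outscores a random normal one.

    Ties count as one half. The pairwise loop is quadratic, which is fine
    for a sketch but slow for large test sets.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # a resample needs both classes to define an AUROC
        stats.append(auroc(sample_labels, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Toy anomaly scores: label 1 = abnormal, 0 = normal.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))
```

In practice the labels would be the per-image normal/abnormal annotations of the test set and the scores the anomaly scores produced by the system.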