Jingang Zhang, Runmu Su, Qiang Fu, Wenqi Ren, Felix Heide, Yunfeng Nie.
Abstract
Hyperspectral imaging enables many versatile applications owing to its competence in capturing abundant spatial and spectral information, which is crucial for identifying substances. However, the devices for acquiring hyperspectral images are typically expensive and complicated, hindering their adoption in consumer applications such as daily food inspection and point-of-care medical screening. Recently, many computational spectral imaging methods have been proposed that directly reconstruct hyperspectral information from widely available RGB images. These reconstruction methods avoid burdensome spectral camera hardware while maintaining high spectral resolution and imaging performance. We present a thorough investigation of more than 25 state-of-the-art spectral reconstruction methods, categorized as prior-based and data-driven. Simulations on open-source datasets show that prior-based methods are more suitable for data-scarce situations, while data-driven methods can unleash the full potential of deep learning when big data are available. We identify the current challenges faced by these methods (e.g., loss function, spectral accuracy, data generalization) and summarize several trends for future work. With the rapid expansion of datasets and the advent of more advanced neural networks, learnable methods with fine feature-representation abilities are very promising. This comprehensive review can serve as a fruitful reference for peer researchers, paving the way for the development of computational hyperspectral imaging.
Year: 2022 PMID: 35831474 PMCID: PMC9279412 DOI: 10.1038/s41598-022-16223-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1 Schematic diagrams of an RGB camera (a) and a typical hyperspectral imager (b). In an RGB image, each pixel comprises three discrete color values, integrated over the wide R, G, and B spectra. In a hyperspectral image, each pixel is a continuous spectral curve sampled through a series of narrow spectral bands. FL: focusing lens group; FO: front objective; CL: collimating lens group. The image data are from the KAUST-HS open-source dataset[26].
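The RGB formation model sketched in Figure 1 can be written as one dot product per channel: each color value is a discrete integral of the scene spectrum weighted by a wide camera sensitivity curve. The Gaussian sensitivities below are hypothetical stand-ins, not any real camera's response functions.

```python
import numpy as np

wavelengths = np.arange(400, 701, 10)          # 31 bands, 400-700 nm

def gaussian(mu, sigma):
    return np.exp(-0.5 * ((wavelengths - mu) / sigma) ** 2)

# Hypothetical wide R, G, B sensitivities (illustrative only).
sensitivity = np.stack([gaussian(610, 40),     # R
                        gaussian(540, 40),     # G
                        gaussian(470, 40)])    # B

spectrum = gaussian(550, 60)                   # a toy per-pixel spectrum

# Discrete approximation of the spectral integral: one dot product per channel.
rgb = sensitivity @ spectrum
print(rgb.shape)                               # (3,)
```

Spectral reconstruction inverts this many-to-few mapping: recovering the 31-band `spectrum` from the 3-vector `rgb` is ill-posed, which is why the priors and learned models surveyed below are needed.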
Properties of five open-source hyperspectral datasets.
| Dataset | Amount | Spectral channels | Spectrum (nm) | Featured scenes |
|---|---|---|---|---|
| CAVE | 32 | 31 | 400–700 | Skin, hair, food and drink |
| ICVL | 203 | 31 | 400–700 | Urban, rural, indoor and plant |
| BGU-HS | 286 | 31 | 400–700 | Urban, rural, indoor and plant |
| ARAD-HS | 510 | 31 | 400–700 | Statue, vehicle and paint |
| KAUST-HS | 409 | 34 | 400–730 | Vehicle, food, building and toy |
Figure 2 Classification of state-of-the-art spectral reconstruction methods.
An overview of prior-based spectral reconstruction methods.
| Category | Methods | Priors |
|---|---|---|
| Dictionary learning | Sparse coding | Sparsity |
| | SR A+ | Sparsity, local Euclidean linearity |
| | Multiple non-negative sparse dictionaries | Spatial structure similarity, spectral correlation |
| | Local linear embedding sparse dictionary | Color and texture, local linearity |
| | Spatially constrained dictionary learning | Spatial context |
| Manifold learning | SR manifold mapping | Low-dimensional manifold |
| Gaussian process | SR Gaussian process | Spectral physics, spatial structure similarity |
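The sparsity prior in the table can be made concrete with a minimal sketch: assume each spectrum is a sparse combination of atoms from a spectral dictionary D, observe only its RGB projection through a camera matrix C, and recover the sparse code from the projected dictionary C·D. The random D and C below are hypothetical toy stand-ins (a real dictionary would be learned from hyperspectral data), and orthogonal matching pursuit is used here as one standard sparse solver, not necessarily the exact algorithm of any cited method.

```python
import numpy as np

rng = np.random.default_rng(0)

B, K = 31, 8            # spectral bands, dictionary atoms (toy sizes)
D = rng.random((B, K))  # hypothetical spectral dictionary (learned offline)
C = rng.random((3, B))  # hypothetical 3xB camera response matrix

# Ground-truth spectrum built from 2 atoms, so it is sparse in D by design.
a_true = np.zeros(K)
a_true[[1, 5]] = [0.7, 0.3]
spectrum = D @ a_true
rgb = C @ spectrum      # the only measurement we observe

# Orthogonal matching pursuit: greedily pick atoms of the projected
# dictionary C @ D that best correlate with the current RGB residual.
CD = C @ D
residual, support = rgb.copy(), []
for _ in range(2):                        # sparsity level = 2
    support.append(int(np.argmax(np.abs(CD.T @ residual))))
    sub = CD[:, support]
    coef, *_ = np.linalg.lstsq(sub, rgb, rcond=None)
    residual = rgb - sub @ coef

a_hat = np.zeros(K)
a_hat[support] = coef
reconstructed = D @ a_hat                 # back to 31 spectral bands
print(reconstructed.shape)                # (31,)
```

With only three measurements the recovered support is not guaranteed to match the true one; the prior-based methods in the table differ mainly in how they regularize exactly this ambiguity.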
Overview of data-driven deep learning methods. Depth is the number of convolutional layers; Filters is the number of convolution kernels. Framework is the platform used in the original references, without excluding the use of any other framework. The loss functions are defined in the Supplementary Material.
| Category | Methods | Depth | Filters | Framework | Optimizer |
|---|---|---|---|---|---|
| Linear CNN | HSCNN | 5 | 64 | Caffe | Adam |
| | SR2D/3DNet | 5 | 64 | Keras/Tensorflow | Adam |
| | Residual HSRCNN | 6 | 64 | Caffe | SGD |
| U-Net | SRUNet | 5 | 128 | Pytorch | SGD |
| | SRMSCNN | 10 | 1024 | Pytorch | Adam |
| | SRMXRUNet | 56 | 4096 | FastAI | AdamW |
| | SRBFWU-Net | 4 | 512 | Pytorch | Adam |
| GAN | SRCGAN | 8 | 512 | Keras | Adam |
| | SAGAN | 10 | 1024 | Pytorch | Adam |
| Dense network | SRTiramisuNet | 23 | 16 | Keras | Adam |
| | HSCNN+ | 160 | 64 | Pytorch/Tensorflow | Adam |
| Residual network | SREfficientNet | 9 | 128 | Tensorflow | Adam |
| | SREfficientNet+ | 21 | 128 | Tensorflow | Adam |
| Attention network | SRAWAN | 61 | 200 | Pytorch | Adam |
| | SRHRNet | 57 | 256 | Pytorch | Adam |
| | SRRPAN | 135 | 64 | Pytorch | Adam |
| Multi-branch network | SRLWRDNet | 40 | 32 | Keras | Adam |
| | SRPFMNet | 9 | 64 | Pytorch | Adam |
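The Depth and Filters columns can be read off a concrete forward pass. The sketch below mimics an HSCNN-style linear CNN (depth 5, 64 filters) mapping a 3-channel RGB patch to 31 spectral channels; the 3×3 "same"-padding convolution is hand-rolled in numpy and the weights are random, untrained placeholders rather than any published model.

```python
import numpy as np

rng = np.random.default_rng(1)

def conv3x3(x, w):
    """'Same'-padded 3x3 convolution; x: (Cin, H, W), w: (Cout, Cin, 3, 3)."""
    cin, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for i in range(3):
        for j in range(3):
            # accumulate each spatial tap via a channel-mixing tensordot
            out += np.tensordot(w[:, :, i, j], xp[:, i:i+h, j:j+wd], axes=1)
    return out

# HSCNN-style stack (toy, untrained): 3 -> 64 -> 64 -> 64 -> 64 -> 31 channels.
channels = [3, 64, 64, 64, 64, 31]           # depth 5, 64 filters per layer
weights = [0.01 * rng.standard_normal((co, ci, 3, 3))
           for ci, co in zip(channels[:-1], channels[1:])]

x = rng.random((3, 8, 8))                    # a tiny RGB patch
for k, w in enumerate(weights):
    x = conv3x3(x, w)
    if k < len(weights) - 1:
        x = np.maximum(x, 0)                 # ReLU between layers
print(x.shape)                               # (31, 8, 8)
```

The deeper entries in the table (e.g., HSCNN+ at depth 160, SRRPAN at 135) extend exactly this channel-to-channel mapping with residual, dense, or attention connections; the input and output shapes stay the same.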
Figure 3 A glimpse of various deep neural network architectures used for spectral reconstruction from RGB images. In some architectures, the RGB image is first upsampled in the spectral domain so that its 3D dimensions are consistent with the hyperspectral cube H.
Reconstruction accuracy comparison of representative SR methods in terms of RMSE, MRAE and SAM on the BGU-HS and ARAD-HS datasets.
| Category | Method | RMSE (BGU-HS) | MRAE (BGU-HS) | SAM (BGU-HS) | RMSE (ARAD-HS) | MRAE (ARAD-HS) | SAM (ARAD-HS) |
|---|---|---|---|---|---|---|---|
| Dictionary learning | Sparse coding | 51.48 | 0.0808 | 5.01 | 0.0331 | 0.0787 | 6.46 |
| | SR A+ | 26.09 | 0.0448 | 2.83 | 0.0226 | 0.0725 | 4.61 |
| Linear CNN | HSCNN | 17.006 | 0.0190 | – | – | – | – |
| | SR-2DNet | 21.394 | 0.020 | – | – | – | – |
| | SR-3DNet | 20.010 | 0.018 | – | – | – | – |
| U-Net | SRUNet | 15.88 | 0.0156 | 1.11 | 0.0152 | 0.0395 | 2.74 |
| | SRMSCNN | 19.28 | 0.0231 | 1.47 | 0.0235 | 0.0724 | 4.91 |
| | SRMXRUNet | – | – | – | – | 0.0454 | – |
| | SRBFWU-Net | – | – | – | 0.0151 | 0.0434 | – |
| Dense network | SRTiramisuNet | 20.98 | 0.0272 | 1.57 | 0.0251 | 0.0850 | 4.34 |
| | HSCNN-R | 13.911 | 0.0145 | 1.05 | 0.0143 | | 2.63 |
| | HSCNN-D | | | | | | |
| Attention network | SRAWAN | | | | | | |
| | SRHRNet | | | – | – | 0.0423 | |
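The three metrics in the table can be stated precisely. The sketch below follows the definitions commonly used in the SR literature, applied to per-pixel spectra of shape (pixels, bands); the paper's exact normalization (e.g., 8-bit vs. unit-scale RMSE, which explains the differing magnitudes between the two datasets) may differ, and the test data here are random placeholders.

```python
import numpy as np

def rmse(gt, pred):
    """Root-mean-square error over all pixels and bands."""
    return np.sqrt(np.mean((gt - pred) ** 2))

def mrae(gt, pred, eps=1e-8):
    """Mean relative absolute error, normalized by the ground truth."""
    return np.mean(np.abs(gt - pred) / (gt + eps))

def sam(gt, pred, eps=1e-8):
    """Spectral angle mapper in degrees, averaged over pixels."""
    cos = np.sum(gt * pred, axis=1) / (
        np.linalg.norm(gt, axis=1) * np.linalg.norm(pred, axis=1) + eps)
    return np.degrees(np.mean(np.arccos(np.clip(cos, -1.0, 1.0))))

rng = np.random.default_rng(2)
gt = rng.random((100, 31)) + 0.1        # keep denominators away from zero
pred = gt + 0.01 * rng.standard_normal(gt.shape)
print(rmse(gt, pred), mrae(gt, pred), sam(gt, pred))
```

Note that RMSE and MRAE measure per-band intensity error, while SAM measures only the angle between spectral vectors, so a method can rank differently across the three columns.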
Figure 4 Performance comparison of selected SR methods using the residual heat map at the 640 nm channel. The ground truth image is from the BGU-HS open-source dataset[27]. MRAE errors of Sparse Coding[29] (a), SRTiramisuNet[59] (b), SRMSCNN[35] (c), SRUNet[55] (d), HSCNN-R[60] (e), HSCNN-D[60] (f), and SRAWAN[63] (g) in the spatial domain.
Figure 5 Spectral error comparison of SR methods on selected spatial locations. Data from BGU-HS (a) and ARAD-HS (b).