Abstract
Recently, deep learning-based techniques have shown great power in image inpainting, especially when dealing with square holes. However, they fail to generate plausible results inside irregular and large missing regions because they do not model the relationship between the missing regions and the existing content. To overcome this limitation, we combine two non-local mechanisms, a contextual attention module (CAM) and an implicit diversified Markov random fields (ID-MRF) loss, with a multi-scale architecture that uses several dense fusion blocks (DFB), built on the dense combination of dilated convolutions, to guide the generative network in restoring discontinuous and continuous large masked areas. To prevent color discrepancies and grid-like artifacts, we apply the ID-MRF loss, which improves visual appearance by comparing the similarities of long-distance feature patches. To further capture long-range relationships between different regions in large missing areas, we introduce the CAM. Although the CAM can create plausible results by reconstructing refined features, it depends on the initial predicted results. Hence, we employ the DFB to obtain larger and more effective receptive fields, which helps predict more precise and fine-grained information for the CAM. Extensive experiments on two widely used datasets demonstrate that our proposed framework significantly outperforms state-of-the-art approaches both quantitatively and qualitatively.
Keywords: contextual attention mechanism; dense connection of dilated convolution; image inpainting; implicit diversified Markov random fields
Year: 2021 PMID: 34068573 PMCID: PMC8126100 DOI: 10.3390/s21093281
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. The overall architecture of our method. Region-wise convolution applies different convolution filters to different regions; see [15] for details. In this architecture, 256 × 256 and 32 denote the spatial size and the number of channels of the feature map, respectively.
Figure 2. The framework of the dense fusion block. “Conv-3-2” denotes a 3 × 3 convolution layer with a dilation rate of 2. Feature maps are merged by element-wise summation. All convolutional layers have 64 output channels, except the last layer, which has 256.
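To make the link between dilated convolutions and receptive field size concrete, the following is a minimal sketch of the standard receptive-field arithmetic for stride-1 convolutions applied in sequence. The function name `receptive_field` and the dilation schedule `[1, 2, 4, 8]` are illustrative assumptions; the caption only names the "Conv-3-2" layer, and this is not the dense fusion block itself.

```python
def receptive_field(dilations, kernel_size=3):
    """Receptive field along one axis of stride-1 convolutions applied in sequence.

    Each layer with dilation d widens the field by (kernel_size - 1) * d pixels.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# A single "Conv-3-2" layer (3 x 3 kernel, dilation rate 2).
single = receptive_field([2])

# Four plain 3 x 3 layers versus four dilated ones.
# The dilation rates below are hypothetical, chosen only for illustration.
plain = receptive_field([1, 1, 1, 1])
dilated = receptive_field([1, 2, 4, 8])

print(single, plain, dilated)
```

With the same number of layers and parameters, the dilated stack covers a much wider context, which is the property the block relies on to supply better initial predictions to the CAM.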
Figure 3. Qualitative comparisons of different methods on discontinuous missing areas.
Figure 4. Qualitative comparisons of different methods on continuous missing areas.
Quantitative comparisons on discontinuous missing regions. Bold indicates the best performance and underline the second-best result; + indicates that higher is better, and − indicates that lower is better.
| Metric | Mask | CelebA-HQ | | | | | | | Paris StreetView | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | CA | PConv | EC | PIC | GC | RWC | Ours | CA | PConv | EC | PIC | GC | RWC | Ours |
| PSNR + | 0–10% | 34.89 | 34.24 | 34.58 | 34.69 | 39.28 | | | 35.30 | 34.20 | 34.56 | 34.02 | 37.85 | | |
| | 10–20% | 27.54 | 31.01 | 31.22 | 31.31 | 32.65 | | | 28.83 | 30.52 | 30.91 | 30.26 | 30.97 | | |
| | 20–30% | 24.14 | 28.21 | 28.58 | 28.61 | 29.07 | | | 25.56 | 27.67 | 28.13 | 27.30 | 27.01 | | |
| | 30–40% | 22.83 | 26.95 | 27.35 | 27.56 | 27.92 | | | 23.85 | 26.31 | 26.70 | 25.83 | 25.96 | | |
| SSIM + | 0–10% | 0.972 | 0.938 | 0.945 | 0.944 | 0.984 | | | 0.972 | 0.948 | 0.950 | 0.947 | 0.983 | | |
| | 10–20% | 0.904 | 0.908 | 0.917 | 0.912 | 0.950 | | | 0.902 | 0.909 | 0.915 | 0.904 | 0.937 | | |
| | 20–30% | 0.838 | 0.872 | 0.879 | 0.876 | 0.908 | | | 0.835 | 0.869 | 0.871 | 0.850 | 0.873 | | |
| | 30–40% | 0.763 | 0.839 | 0.847 | 0.843 | 0.873 | | | 0.754 | 0.810 | 0.829 | 0.799 | 0.828 | | |
| L1 − (10⁻³) | 0–10% | 5.40 | 13.40 | 12.94 | 13.00 | 4.10 | | | 5.06 | 13.34 | 13.19 | 13.48 | 4.26 | | |
| | 10–20% | 18.09 | 16.22 | 16.00 | 6.03 | 8.68 | | | 12.49 | 17.13 | 16.79 | 17.56 | 9.92 | | |
| | 20–30% | 24.43 | 21.03 | 20.33 | 20.18 | 16.82 | | | 21.46 | 22.17 | 21.78 | 23.64 | 18.20 | | |
| | 30–40% | 32.98 | 24.12 | 23.29 | 23.08 | 18.40 | | | 29.93 | 27.09 | 25.99 | 28.37 | 23.22 | | |
| L2 − (10⁻³) | 0–10% | 0.61 | 0.54 | 0.50 | 0.48 | 0.19 | | | 0.42 | 0.55 | 0.52 | 0.58 | 0.29 | | |
| | 10–20% | 2.25 | 1.12 | 0.98 | 0.95 | 0.69 | | | 1.51 | 1.19 | 1.13 | 1.27 | 1.09 | | |
| | 20–30% | 4.64 | 1.94 | 1.75 | 1.70 | 1.58 | | | 3.28 | 2.33 | 2.10 | 2.54 | 2.73 | | |
| | 30–40% | 6.19 | 2.55 | 2.23 | 2.11 | 1.95 | | | 4.68 | 3.15 | 2.80 | 3.38 | 3.37 | | |
| FID − | 0–10% | 1.97 | 1.58 | 1.47 | 1.37 | 0.37 | | | 8.32 | 5.52 | 4.47 | 6.12 | 3.06 | | |
| | 10–20% | 8.95 | 2.77 | 2.61 | 2.29 | 1.37 | | | 25.37 | 13.12 | 10.33 | 14.06 | 12.66 | | |
| | 20–30% | 13.98 | 4.26 | 3.74 | 3.20 | 2.40 | | | 36.14 | 17.53 | 15.67 | 21.94 | 24.58 | | |
| | 30–40% | 27.06 | 6.18 | 5.40 | 4.36 | 3.55 | | | 57.93 | 30.06 | 26.26 | 34.20 | 34.18 | | |
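As a reading aid for the PSNR, L1 and L2 rows above, here is a minimal sketch of how such pixel-wise metrics are commonly computed. It assumes images flattened to lists of floats in [0, 1], L1 as mean absolute error and L2 as mean squared error; the exact definitions and normalization used for the reported numbers are not stated in this record, and the function names are illustrative.

```python
import math


def l1_error(pred, target):
    """Mean absolute error between two equally sized pixel sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def l2_error(pred, target):
    """Mean squared error between two equally sized pixel sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    mse = l2_error(pred, target)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 4-pixel "images" (hypothetical values, not data from the paper).
target = [0.0, 0.5, 1.0, 0.25]
pred = [0.1, 0.5, 0.9, 0.25]
print(l1_error(pred, target), l2_error(pred, target), psnr(pred, target))
```

Lower L1 and L2 values and higher PSNR values indicate a reconstruction closer to the ground truth, which matches the + and − markers in the table.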
Quantitative comparisons on continuous missing regions. Bold indicates the best performance and underline the second-best result; + indicates that higher is better, and − indicates that lower is better.
| Metric | Mask | CelebA-HQ | | | | | | | Paris StreetView | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | CA | PConv | EC | PIC | GC | RWC | Ours | CA | PConv | EC | PIC | GC | RWC | Ours |
| PSNR + | 0–10% | 32.11 | 32.02 | 32.70 | 32.91 | 36.43 | | | 32.83 | 31.48 | 33.09 | 32.30 | 33.98 | | |
| | 10–20% | 25.33 | 27.33 | 28.05 | 28.00 | 29.00 | | | 26.39 | 27.14 | 28.74 | 27.18 | 27.28 | | |
| | 20–30% | 22.86 | 24.59 | 25.11 | 24.81 | 25.63 | | | 23.40 | 24.47 | 26.02 | 24.26 | 24.33 | | |
| | 30–40% | 20.21 | 21.32 | 21.99 | 21.58 | 22.31 | | | 20.69 | 21.73 | | 21.60 | 21.79 | | |
| SSIM + | 0–10% | 0.971 | 0.936 | 0.945 | 0.942 | | | | 0.962 | 0.901 | 0.933 | 0.926 | 0.960 | | |
| | 10–20% | 0.905 | 0.898 | 0.906 | 0.901 | | | | 0.900 | 0.880 | 0.895 | 0.881 | 0.903 | | |
| | 20–30% | 0.863 | 0.845 | 0.866 | 0.853 | 0.896 | | | 0.837 | 0.833 | 0.844 | 0.828 | 0.847 | | |
| | 30–40% | 0.790 | 0.790 | 0.801 | 0.792 | 0.835 | | | 0.756 | 0.757 | 0.782 | 0.748 | 0.770 | | |
| L1 − (10⁻³) | 0–10% | 7.15 | 15.38 | 14.26 | 14.21 | 4.95 | | | 6.28 | 15.10 | 14.14 | 14.83 | 5.78 | | |
| | 10–20% | 17.26 | 20.25 | 19.77 | 19.91 | 11.05 | | | 15.60 | 22.67 | 19.55 | 22.42 | 14.49 | | |
| | 20–30% | 27.49 | 28.12 | 26.93 | 27.69 | 18.98 | | | 27.03 | 28.85 | 26.23 | 31.54 | 24.46 | | |
| | 30–40% | 45.36 | 41.68 | 39.75 | 41.24 | 31.91 | | | 42.51 | 40.24 | 36.98 | 45.10 | 37.66 | | |
| L2 − (10⁻³) | 0–10% | 1.23 | 0.96 | 0.84 | 0.64 | 0.54 | | | 0.82 | 1.14 | 0.73 | 0.97 | 0.72 | | |
| | 10–20% | 3.93 | 2.55 | 2.15 | | 1.49 | 1.58 | | 3.08 | 3.03 | 1.92 | 3.12 | 2.81 | | |
| | 20–30% | 6.92 | 4.87 | 4.00 | | 3.07 | 3.21 | | 6.00 | 5.56 | | 5.79 | 5.23 | 3.56 | |
| | 30–40% | 12.99 | 9.21 | 8.01 | | 6.38 | 6.66 | | 10.60 | 9.23 | | 9.77 | 8.91 | 6.53 | |
| FID − | 0–10% | 1.09 | 1.63 | 1.56 | 1.48 | 0.53 | | | 7.52 | 9.14 | 6.72 | 8.39 | 6.21 | | |
| | 10–20% | 3.28 | 3.06 | 2.84 | 2.38 | | 1.49 | | 15.44 | 14.03 | | 13.32 | 15.79 | 12.45 | |
| | 20–30% | 8.02 | 4.15 | 4.02 | 3.31 | | 3.48 | | 27.66 | 23.45 | | 24.53 | 27.41 | 21.56 | |
| | 30–40% | 12.44 | 6.23 | 5.86 | 4.63 | | 6.20 | | 38.53 | 36.23 | | 37.37 | 38.85 | | 30.53 |
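The Mask column groups test cases by how much of the image is missing. The sketch below shows one plausible way to assign a binary mask to such a bucket, assuming the percentage is the share of masked pixels and that a ratio on a boundary falls into the upper bucket; both points, and the name `mask_bucket`, are assumptions rather than details given in this record.

```python
def mask_bucket(mask, n_buckets=4):
    """Return the 10%-wide bucket label for a binary 2-D mask (1 = missing pixel).

    Integer arithmetic avoids floating-point issues at bucket boundaries.
    Returns None when the ratio lies beyond the last bucket.
    """
    total = sum(len(row) for row in mask)
    missing = sum(1 for row in mask for value in row if value)
    index = (missing * 10) // total
    if index >= n_buckets:
        return None
    return f"{index * 10}–{(index + 1) * 10}%"


# A hypothetical 4 x 5 mask with 3 of 20 pixels missing (15%).
example = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(mask_bucket(example))
```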
Figure 5. Qualitative results of ablation studies on discontinuous missing regions. (Best viewed with zoom-in.)
Figure 6. Qualitative results of ablation studies on continuous missing regions. (Best viewed with zoom-in.)
Quantitative results of ablation studies on discontinuous missing regions. Bold indicates the best performance; + indicates that higher is better, and − indicates that lower is better.
| Metric | Mask | CelebA-HQ | | | | | | | | | Paris StreetView | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | L2 | L1 | CL | SL | IM | IM+AD | BL+AT | BL(Rcdc) | All | L2 | L1 | CL | SL | IM | IM+AD | BL+AT | BL(Rcdc) | All |
| PSNR + | 0–10% | 39.40 | 41.49 | 41.47 | 41.45 | 41.46 | 41.60 | 41.55 | 41.61 | | 40.49 | 41.63 | 41.31 | 41.18 | 41.69 | 41.74 | 42.14 | 42.01 | |
| | 10–20% | 32.87 | 34.76 | 34.13 | 34.05 | 34.75 | 34.88 | 34.85 | 34.93 | | 33.66 | 34.57 | 34.23 | 34.01 | 34.63 | 34.70 | 35.00 | 34.90 | |
| | 20–30% | 29.35 | 30.95 | 30.26 | 30.19 | 30.95 | 31.09 | 31.04 | | | 29.92 | 30.57 | 30.20 | 30.03 | 30.54 | 30.69 | 31.03 | 30.99 | |
| | 30–40% | 28.35 | 30.02 | 29.26 | 29.18 | 30.02 | 30.14 | 30.12 | | 30.18 | 28.82 | 29.51 | 29.16 | 28.96 | 29.63 | 30.03 | 29.92 | 29.87 | |
| SSIM + | 0–10% | 0.983 | 0.990 | 0.990 | 0.990 | 0.989 | 0.989 | 0.989 | 0.989 | | 0.987 | 0.989 | 0.989 | 0.989 | 0.989 | 0.989 | | | |
| | 10–20% | 0.947 | 0.966 | 0.962 | 0.962 | 0.965 | 0.965 | 0.965 | 0.965 | | 0.954 | 0.962 | 0.960 | 0.960 | 0.961 | 0.962 | 0.964 | 0.963 | |
| | 20–30% | 0.907 | 0.935 | 0.925 | 0.925 | 0.933 | 0.933 | 0.933 | 0.934 | | 0.909 | 0.922 | 0.916 | 0.916 | 0.920 | 0.921 | 0.925 | 0.923 | |
| | 30–40% | 0.872 | 0.913 | 0.898 | 0.899 | 0.911 | 0.912 | 0.912 | 0.912 | | 0.878 | 0.896 | 0.889 | 0.889 | 0.896 | 0.897 | 0.902 | 0.899 | |
| L1 − (10⁻³) | 0–10% | 4.54 | 3.80 | 1.73 | 1.70 | 3.77 | 3.76 | 3.78 | 3.76 | | 2.11 | 1.73 | 2.59 | 2.54 | 1.74 | 1.73 | 1.64 | 1.64 | |
| | 10–20% | 8.15 | 7.00 | 6.03 | 5.92 | 6.91 | 6.85 | 6.92 | 6.86 | | 7.03 | 5.68 | 6.59 | 6.51 | 5.78 | 5.67 | 5.36 | 5.35 | |
| | 20–30% | 14.70 | 11.16 | 11.12 | 10.80 | 10.96 | 10.80 | 10.92 | 10.76 | | 13.75 | 11.69 | 12.39 | 12.22 | 11.56 | 11.18 | 10.60 | 10.59 | |
| | 30–40% | 19.32 | 13.75 | 14.61 | 14.25 | 13.50 | 13.32 | | 13.29 | 13.67 | 18.06 | 14.67 | 15.87 | 15.68 | 14.54 | 14.54 | 13.78 | 13.76 | |
| L2 − (10⁻³) | 0–10% | 0.19 | 0.14 | 0.13 | 0.14 | 0.14 | 0.14 | 0.14 | 0.13 | | 0.16 | 0.14 | 0.16 | 0.16 | 0.14 | | | | |
| | 10–20% | 0.64 | 0.45 | 0.51 | 0.52 | 0.45 | 0.44 | 0.44 | 0.43 | | 0.60 | 0.53 | 0.58 | 0.60 | 0.52 | 0.52 | | 0.50 | |
| | 20–30% | 1.39 | 1.02 | 1.19 | 1.20 | 1.01 | 0.99 | 1.00 | 0.97 | | 1.42 | 1.31 | 1.44 | 1.49 | 1.29 | 1.28 | 1.23 | 1.23 | |
| | 30–40% | 1.72 | 1.23 | 1.46 | 1.48 | 1.22 | 1.20 | 1.20 | 1.18 | | 1.81 | 1.63 | 1.79 | 1.87 | 1.65 | 1.61 | 1.54 | 1.51 | |
| FID − | 0–10% | 0.51 | 0.30 | 0.27 | 0.24 | 0.24 | 0.22 | 0.22 | 0.21 | | 2.39 | 2.01 | 1.96 | 1.80 | 1.39 | 1.37 | 1.26 | 1.32 | |
| | 10–20% | 1.84 | 1.11 | 1.07 | 0.93 | 0.84 | 0.77 | 0.78 | 0.74 | | 8.68 | 7.47 | 7.31 | 6.67 | 5.47 | 4.91 | 4.68 | 4.85 | |
| | 20–30% | 3.53 | 2.28 | 2.19 | 1.79 | 1.67 | 1.51 | 1.53 | 1.45 | | 20.77 | 18.59 | 16.22 | 14.37 | 12.17 | 10.23 | 9.98 | | 9.90 |
| | 30–40% | 4.81 | 3.02 | 3.01 | 2.39 | 2.21 | 1.93 | 1.95 | 1.87 | | 28.51 | 25.04 | 27.71 | 19.62 | 15.82 | 13.40 | 13.97 | 13.20 | |
Quantitative results of ablation studies on continuous missing regions. Bold indicates the best performance; + indicates that higher is better, and − indicates that lower is better.
| Metric | Mask | CelebA-HQ | | | | | | | | | Paris StreetView | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | L2 | L1 | CL | SL | IM | IM+AD | BL+AT | BL(Rcdc) | All | L2 | L1 | CL | SL | IM | IM+AD | BL+AT | BL(Rcdc) | All |
| PSNR + | 0–10% | 34.70 | 37.65 | 37.00 | 37.03 | 37.17 | 37.29 | 37.44 | 38.19 | 38.30 | 34.39 | 36.92 | 35.49 | 35.67 | 37.16 | 37.26 | 37.47 | 37.58 | |
| | 10–20% | 27.95 | 29.81 | 29.08 | 28.98 | 29.72 | 29.84 | 29.93 | 29.95 | | 28.22 | 29.94 | 28.90 | 28.95 | 30.13 | 30.22 | 30.47 | 30.32 | |
| | 20–30% | 24.77 | 26.39 | 25.71 | 25.58 | 26.31 | 26.44 | 26.44 | 26.53 | | 25.31 | 27.01 | 25.79 | 25.83 | 27.04 | 27.11 | 27.28 | 27.38 | |
| | 30–40% | 21.65 | 23.14 | 22.54 | 22.42 | 23.02 | 23.15 | 23.30 | 23.34 | | 22.81 | 23.93 | 22.95 | 22.83 | 24.01 | 24.19 | 24.23 | | 24.31 |
| SSIM + | 0–10% | 0.975 | 0.982 | 0.980 | 0.980 | 0.980 | 0.981 | 0.981 | | 0.982 | 0.974 | 0.979 | 0.976 | 0.976 | 0.979 | 0.980 | 0.980 | 0.980 | |
| | 10–20% | 0.931 | 0.947 | 0.938 | 0.939 | 0.944 | 0.945 | 0.946 | | | 0.924 | 0.935 | 0.926 | 0.925 | 0.935 | 0.936 | 0.937 | 0.937 | |
| | 20–30% | 0.881 | 0.904 | 0.887 | 0.889 | 0.900 | 0.901 | 0.901 | | | 0.868 | 0.886 | 0.869 | 0.868 | 0.886 | 0.887 | 0.889 | 0.889 | |
| | 30–40% | 0.821 | 0.851 | 0.824 | 0.824 | 0.846 | 0.846 | | | 0.849 | 0.811 | 0.830 | 0.803 | 0.800 | 0.830 | 0.832 | 0.833 | 0.834 | |
| L1 − (10⁻³) | 0–10% | 6.62 | 2.86 | 3.13 | 3.04 | 5.15 | 5.09 | 5.03 | 2.69 | | 4.43 | 3.96 | 4.33 | 4.20 | 3.71 | 2.97 | 2.95 | 2.92 | |
| | 10–20% | 14.89 | 8.85 | 9.69 | 9.42 | 10.80 | 10.61 | 10.38 | 8.39 | | 13.37 | 10.44 | 11.53 | 11.22 | 9.91 | 9.13 | 8.92 | 9.04 | |
| | 20–30% | 23.84 | 16.61 | 18.18 | 17.76 | 18.22 | 17.88 | 17.60 | 17.50 | 17.19 | 23.63 | 17.88 | 20.30 | 19.97 | 17.37 | 16.68 | 16.82 | 16.93 | |
| | 30–40% | 42.68 | 29.01 | 31.61 | 30.93 | 30.57 | 30.24 | 28.71 | | | 37.10 | 29.85 | 33.05 | 32.72 | 28.58 | 27.81 | 27.57 | 27.36 | |
| L2 − (10⁻³) | 0–10% | 0.66 | 0.43 | 0.52 | 0.52 | 0.46 | 0.45 | 0.45 | 0.43 | | 0.60 | 0.45 | 0.53 | 0.55 | 0.43 | 0.42 | 0.42 | 0.40 | |
| | 10–20% | 2.12 | 1.51 | 1.77 | 1.77 | 1.55 | 1.52 | 1.52 | 1.47 | | 2.08 | 1.58 | 1.86 | 1.94 | 1.53 | 1.52 | | 1.51 | 1.46 |
| | 20–30% | 4.14 | 3.08 | 3.58 | 3.62 | 3.15 | 3.06 | 3.01 | 2.96 | | 3.78 | 2.85 | 3.60 | 3.76 | 2.86 | 2.91 | 2.81 | 2.81 | |
| | 30–40% | 8.32 | 6.34 | 7.24 | 7.30 | 6.50 | 6.35 | 6.08 | 6.00 | | 6.61 | 5.67 | 6.92 | 7.18 | 5.53 | 5.42 | 5.51 | | 5.44 |
| FID − | 0–10% | 0.91 | 0.61 | 0.58 | 0.52 | 0.53 | 0.49 | 0.47 | 0.45 | | 7.58 | 5.93 | 5.32 | 4.95 | 4.72 | 3.88 | | 3.79 | 3.83 |
| | 10–20% | 2.99 | 1.96 | 1.89 | 1.61 | 1.68 | 1.51 | 1.46 | 1.45 | | 19.02 | 15.08 | 16.61 | 13.73 | 14.01 | 12.67 | 12.31 | 12.31 | |
| | 20–30% | 6.18 | 3.69 | 3.51 | 2.84 | 3.02 | 2.67 | 2.63 | 2.63 | | 30.95 | 28.48 | 28.86 | 23.89 | 25.29 | 21.49 | 20.55 | 20.21 | |
| | 30–40% | 11.15 | 6.53 | 5.98 | 4.60 | 4.98 | 4.48 | 4.10 | 4.33 | | 44.30 | 40.31 | 40.80 | 32.89 | 37.71 | 34.80 | 32.60 | 32.61 | 30.53 |