A multi-focus color image fusion algorithm based on low vision image reconstruction and focused feature extraction

https://doi.org/10.1016/j.image.2021.116533

Highlights

  • A new color multi-focus image fusion based on super-resolution is proposed.

  • The deep ResNet is used to enhance the detailed information of the image.

  • A new focus area detection method based on the structure gradient is proposed.

Abstract

Multi-focus image fusion generates a fused image by merging multiple images of the same scene with different degrees of focus. In multi-focus image fusion, the accuracy of the detected focus area is critical to the quality of the fused image. Combining with the structure gradient, we propose a multi-focus color image fusion algorithm based on low vision image reconstruction and focused feature extraction. First, the source images are input into a deep residual network (ResNet), which reconstructs the low vision images by a super-resolution method. Next, an end-to-end restoration model improves the image details while a rolling guidance filter maintains the image edges. Moreover, the difference image is obtained from the reconstructed image and the source image. Then, the fusion decision map is generated by a focus area detection method based on the structure gradient. Finally, the source images and the fusion decision map are combined in a weighted fusion to generate the fused image. Experimental results show that our algorithm accurately detects the edges of the focus area. Compared with other algorithms, the proposed algorithm improves the recognition accuracy of focused and defocused areas in the decision map, and retains the detailed texture features and edge structure of the source images well.

Introduction

Recently, image fusion has gradually developed into one of the hot sub-fields of image processing. Its primary purpose is to integrate different information from multiple images of the same scene into one image [1]. Due to the limited depth of field (DOF) of optical devices, it is difficult for a camera to capture a single image in which objects at different depths are all clearly focused. Therefore, multi-focus image fusion obtains a clearer image by fusing multiple images of the same scene taken with different depths of field. Image fusion algorithms can be divided into four categories: methods based on the spatial domain, methods based on the transform domain, methods based on hybrid transformation and methods based on deep learning [2]. Spatial-domain methods can be further divided into pixel-based, block-based and region-based methods.

In block-based methods, the source images are decomposed into fixed-size blocks, and the focused blocks are selected by comparing the focus characteristics of corresponding blocks across the source images. Region-based methods are similar, except that the image is divided into irregularly sized regions by a segmentation technique; the focus areas are detected in these regions, and then image fusion is conducted. Spatial-domain methods include the fusion algorithm based on guided filtering [3], multi-focus image fusion based on a local binary pattern metric [4], enhanced random walk in dual-scale [5], multi-focus image fusion based on probabilistic filtering and area correction [6], multi-focus image fusion based on content-adaptive blurring [7], image fusion based on human visual perception [8], multi-focus image fusion based on conditional random field optimization [9], multi-focus image fusion based on low-rank representation [10] and image fusion based on distributed compressed sensing [11]. Region-based and block-based methods tend to produce blocking effects at focus boundaries, which ultimately degrades fusion quality. Based on the above studies, we adopt a pixel-based method to avoid the artifacts and blocking effects generated by block-based and region-based methods. The proposed method obtains the fused image by estimating the weight of each pixel in each source image.
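As an illustration of the pixel-based idea, the per-pixel weighted fusion step can be sketched in a few lines of NumPy. This is a minimal sketch, not the paper's implementation: `pixel_weighted_fusion` and the hand-made weight map are hypothetical, and in practice the weights would come from a focus measure rather than being set by hand.

```python
import numpy as np

def pixel_weighted_fusion(img_a, img_b, w_a):
    """Fuse two registered source images with a per-pixel weight map.

    img_a, img_b : float arrays in [0, 1] with the same shape.
    w_a          : per-pixel weight of img_a in [0, 1]; img_b gets 1 - w_a.
    """
    return w_a * img_a + (1.0 - w_a) * img_b

# Toy example: the top row of the scene is sharp in img_a, the bottom in img_b.
img_a = np.array([[1.0, 1.0], [0.2, 0.2]])
img_b = np.array([[0.3, 0.3], [0.9, 0.9]])
w_a = np.array([[1.0, 1.0], [0.0, 0.0]])   # binary decision map: 1 -> take img_a

fused = pixel_weighted_fusion(img_a, img_b, w_a)
# fused keeps img_a's top row and img_b's bottom row
```

With fractional weights the same function blends the sources smoothly, which is how a soft decision map avoids hard seams at focus boundaries.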

Image fusion methods based on the transform domain use the wavelet transform or multi-scale geometric transformations to decompose the image, and then fuse the resulting high-frequency and low-frequency coefficients according to different criteria. For example, Liang et al. [12] proposed a fusion algorithm based on region mosaicking on Laplacian pyramids, which obtains more perceptive fused images while reducing noise. In [13], Aymaz et al. proposed a fusion algorithm based on the stationary wavelet transform, which applies well to color image fusion. Combining sparse representation, Aishwarya et al. [14] proposed an image fusion algorithm based on the discrete wavelet transform, which exploits the sparsity of wavelet coefficients and achieves good fusion results. To make better use of directionality, a fusion algorithm based on the dual-tree complex wavelet transform is given in [15]. With the continuous development of multi-scale geometric transformations, image fusion algorithms based on the Contourlet and Shearlet transforms have also achieved strong results [16], [17]. Because transform-domain algorithms are prone to image distortion and poor spatial continuity, hybrid algorithms that combine the spatial and transform domains have gradually become mainstream in image fusion. For instance, Liu et al. [18] proposed a multi-focus image fusion algorithm based on a dual-channel spiking cortical model in the non-subsampled Shearlet domain, and He et al. [19] proposed a multi-focus image fusion algorithm based on multi-scale transformation and focus discovery.
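To illustrate the transform-domain idea (not the specific transforms used in [12], [13], [14], [15]), the sketch below performs a one-level Haar decomposition in NumPy, averages the low-frequency band and applies a max-absolute rule to the detail band — a common coefficient fusion rule. All function names are illustrative.

```python
import numpy as np

def haar_1d(x):
    # One-level Haar analysis along the last axis: low = pair averages, high = pair differences.
    low = (x[..., ::2] + x[..., 1::2]) / 2.0
    high = (x[..., ::2] - x[..., 1::2]) / 2.0
    return low, high

def ihaar_1d(low, high):
    # Exact inverse of haar_1d.
    x = np.empty(low.shape[:-1] + (2 * low.shape[-1],))
    x[..., ::2] = low + high
    x[..., 1::2] = low - high
    return x

def fuse_transform(a, b):
    la, ha = haar_1d(a)
    lb, hb = haar_1d(b)
    low = (la + lb) / 2.0                               # average the approximation band
    high = np.where(np.abs(ha) >= np.abs(hb), ha, hb)   # max-abs rule keeps the sharper detail
    return ihaar_1d(low, high)

a = np.array([10.0, 12.0, 5.0, 5.0])   # strong detail on the left (in focus there)
b = np.array([11.0, 11.0, 2.0, 8.0])   # strong detail on the right
fused = fuse_transform(a, b)
# fused takes a's left-hand detail and b's right-hand detail: [10, 12, 2, 8]
```

The max-abs rule embodies the assumption that larger detail coefficients indicate in-focus content; the paper's cited methods use more sophisticated transforms and criteria, but follow the same decompose–fuse–reconstruct pattern.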

Image fusion methods based on deep learning mainly use convolutional neural networks (CNNs) to fuse multi-focus images. Liu et al. [20] proposed a multi-focus image fusion method based on a CNN. It treats the generation of the focus map as a classification problem, so the fusion rule plays a role similar to the classifier in general classification tasks. This algorithm effectively integrates the judgment of the focus area with the design of the fusion rules. In [21], Mostafa et al. proposed an integrated CNN fusion framework. They first established three training datasets, including the original dataset, which were input to three network channels; the three channels classify the focused and defocused areas of the source images, and the final decision map is generated for fusion. An end-to-end multi-level dense network was proposed in [22], which better extracts the global and local features of images. All of the foregoing methods have improved the quality of image fusion to varying degrees.

As mentioned above, the essence of an image fusion algorithm is to generate a decision map by comparing the corresponding focus characteristics of different source images. After post-processing such as consistency verification, the final decision map is used for weighted fusion of the source images [23], [24], [25]. The accuracy of the decision map depends on the detection of the focus area. Since multi-focus image fusion is essentially an effective synthesis of the focus areas of multiple source images of the same scene, the accuracy of the focus detection method and the quality of the source images both affect the final fusion result. Therefore, finding a method to accurately detect the focus area has become the core task of image fusion. Commonly used focus measures include the spatial frequency (SF), the energy of the image gradient (EOG), the energy of the Laplacian of the image (EOL) and the sum-modified Laplacian (SML). Many researchers have also proposed new focus detection methods. Liu et al. [26] proposed an algorithm based on dense scale-invariant features to detect the focused and defocused areas of the source images; it first registers the source images and then fuses them. A multi-focus image fusion algorithm based on the Laplacian energy in the DCT domain was proposed in [27]. Ma et al. [5] proposed a multi-focus image fusion algorithm based on a two-scale random walk: they used the structure gradient as the focus measure, and a random walk from a probabilistic perspective generates a decision map that better identifies focus areas. Because the structure gradient has unique advantages in distinguishing the flat, edge and corner regions of an image, we use it to detect the focus area.
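Two of the classical focus measures mentioned above, EOG and SML, are simple enough to sketch directly. The discretizations below (forward differences for EOG, a step-1 modified Laplacian for SML) are one common choice, not necessarily the exact variants used in the cited papers.

```python
import numpy as np

def eog(img):
    """Energy of image gradient: sum of squared forward differences."""
    gx = np.diff(img, axis=1)
    gy = np.diff(img, axis=0)
    return float((gx ** 2).sum() + (gy ** 2).sum())

def sml(img):
    """Sum-modified Laplacian with step 1: |2f - f_left - f_right| + |2f - f_up - f_down|,
    summed over interior pixels."""
    c = img[1:-1, 1:-1]
    mlx = np.abs(2 * c - img[1:-1, :-2] - img[1:-1, 2:])
    mly = np.abs(2 * c - img[:-2, 1:-1] - img[2:, 1:-1])
    return float((mlx + mly).sum())

sharp = np.array([[0., 9., 0., 9.],
                  [9., 0., 9., 0.],
                  [0., 9., 0., 9.],
                  [9., 0., 9., 0.]])
blurred = np.full((4, 4), 4.5)   # same mean intensity, all detail removed
# A sharp patch yields large responses; a flat (defocused) patch yields zero.
```

In practice these measures are evaluated over local windows of each source image, and the window with the larger response is declared in focus.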

The quality of the fusion result is also affected by the quality of the source images. To obtain the best fusion result, image super-resolution reconstruction is used to restore the detail information lost during acquisition of low-resolution source images. In this paper, to improve the performance of multi-focus color image fusion, we propose a multi-focus color image fusion algorithm based on low vision image reconstruction and focused feature extraction. First, the color source images are input into ResNet for super-resolution reconstruction. A rolling guidance filter then enables a more accurate calculation of the structure gradient of the image. Next, the difference images are obtained from the reconstructed images and the source images, and the focus area detection method based on the structure gradient generates the initial decision map. Finally, to further enhance fusion performance, a small-area removal strategy refines the initial decision map into the final decision map.
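A much-simplified stand-in for this detection step can be sketched as follows. The gradient-energy response and the sign convention (a pixel is declared focused in the source whose difference image carries the larger gradient energy there) are assumptions for illustration only; the paper's actual measure is the structure gradient, and the resulting map would additionally be refined by small-area removal.

```python
import numpy as np

def grad_energy(img):
    """Per-pixel squared gradient magnitude via central differences."""
    gy, gx = np.gradient(img)
    return gx ** 2 + gy ** 2

def decision_map(src_a, src_b, rec_a, rec_b):
    """Binary map: 1 where src_a is taken as focused, 0 where src_b is.

    Simplified stand-in for the structure-gradient detector: each source is
    compared with its super-resolved reconstruction, and the per-pixel gradient
    energy of the difference image serves as the focus response.
    """
    diff_a = np.abs(src_a - rec_a)
    diff_b = np.abs(src_b - rec_b)
    return (grad_energy(diff_a) >= grad_energy(diff_b)).astype(np.uint8)

# Toy scene: reconstruction adds detail to the top rows of image A
# and to the bottom rows of image B.
ramp = np.tile(np.array([0., 3., 6., 9.]), (2, 1))
src_a = np.zeros((4, 4)); rec_a = np.zeros((4, 4)); rec_a[:2] = ramp
src_b = np.zeros((4, 4)); rec_b = np.zeros((4, 4)); rec_b[2:] = ramp
d = decision_map(src_a, src_b, rec_a, rec_b)
# d is 1 along the top row (take A) and 0 along the bottom row (take B)
```

The map `d` would then drive the per-pixel weighted fusion of the source images.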

The main contributions of the proposed method can be briefly described as follows: (1) Combining the structure gradient, we propose a focused feature extraction algorithm based on super-resolution reconstruction of low-resolution images for color multi-focus image fusion. (2) To enhance the detail information of the fused image, a deep residual network with a new degradation model is used for image super-resolution reconstruction. (3) The focus area detection method based on the structure gradient accurately distinguishes the focused and defocused areas of the image, while the rolling guidance filter maintains good spatial consistency in the edge areas of the fused image. Compared with other algorithms, the proposed method effectively avoids artificial artifacts in edge areas.

The rest of this paper is arranged as follows: Section 2 reviews related work. Section 3 introduces some basic theories. Section 4 describes the proposed color image fusion algorithm in detail. Section 5 presents experiments demonstrating the effectiveness of the proposed algorithm. Finally, Section 6 gives some conclusions.

Section snippets

Related work

Recently, neural networks have been widely used in image fusion, image super-resolution, image de-blurring, image de-raining and image de-noising thanks to their powerful feature-learning ability [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32]. Consequently, CNNs are also used in multi-focus image fusion, starting from Liu et al. [20]. Zhang et al. [33] proposed an image fusion framework based on convolutional neural networks, which implemented an

Residual block

Because CNNs can effectively extract image features, they are widely used in image processing. A shallow network can only learn local features, while deep features such as detailed texture information can be learned by increasing the number of layers. However, as the number of layers grows, training becomes more difficult and may suffer from vanishing gradients, so the model's performance can deteriorate with increasing depth. We use the
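The identity skip connection that mitigates this problem can be illustrated with a toy one-dimensional residual block in NumPy. This is only a sketch: `conv1d_same` stands in for a learned convolutional layer, and the zero-kernel example merely shows that the block defaults to the identity mapping, so each block only has to learn the residual F(x).

```python
import numpy as np

def conv1d_same(x, k):
    # 'same'-size 1-D convolution standing in for a conv layer.
    return np.convolve(x, k, mode='same')

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, k1, k2):
    """y = x + F(x), with F = conv -> ReLU -> conv and an identity skip.

    The skip connection lets the signal (and, in training, the gradient)
    bypass F entirely, which is what makes very deep ResNets trainable.
    """
    return x + conv1d_same(relu(conv1d_same(x, k1)), k2)

x = np.array([1.0, 2.0, 3.0, 4.0])
zero = np.zeros(3)                 # zero kernels => residual branch F(x) == 0
y = residual_block(x, zero, zero)
# With a zero residual branch the block reduces to the identity: y == x
```

In a real ResNet the kernels are learned 2-D filters with batch normalization between them, but the algebraic structure — output equals input plus a learned correction — is exactly the one shown here.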

Proposed algorithm

The fusion result of multi-focus images depends on the quality of the source images and the accuracy of the focus area obtained by the detection method. Therefore, combining super-resolution reconstruction and structure gradient, we propose a multi-focus image fusion algorithm based on low vision image reconstruction and a new focused feature extraction method. The specific framework of the proposed algorithm is shown in Fig. 5.


Experiment setting

To evaluate the performance of the proposed algorithm effectively and comprehensively in multi-focus image fusion, we compared it with thirteen other representative algorithms on public datasets. The comparison algorithms are as follows: (1) Multi-focus image fusion algorithm based on the convolutional neural network (CNN) proposed in [20]; (2) Multi-focus image fusion algorithm based on the dense scale-invariant feature transform (DSIFT) proposed in [26]; (3) Multi-focus image

Conclusion

This paper first uses a degradation model that is more suitable for multi-focus images to conduct end-to-end mapping learning for high-quality image super-resolution reconstruction. The model obtains more detailed texture information from the source image and improves the image contrast. Secondly, noise smoothing and edge preservation are conducted with a rolling guidance filter, and then the structure gradient focus area detection algorithm is used to extract the

CRediT authorship contribution statement

Shuaiqi Liu: Conceptualization, Writing – review & editing, Supervision, Funding acquisition. Jian Ma: Methodology, Software, Validation, Writing – original draft, Formal analysis, Data curation. Yang Yang: Visualization, Formal analysis. Tian Qiu: Investigation, Writing – review & editing. Hailiang Li: Writing – review & editing. Shaohai Hu: Project administration. Yu-dong Zhang: Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 62172139 and 62172030, the Natural Science Foundation of Hebei Province, China under Grants F2020201025, F2019201151 and F2018210148, the Science Research Project of Hebei Province under Grant BJ2020030, the Foundation of the President of Hebei University under Grant XZJJ201909, the Natural Science Foundation of Hebei University under Grants 2014-303 and 8012605, and the Open Foundation of Guangdong Key Laboratory of Digital Signal

References (54)

  • Liu Y. et al., Multi-focus image fusion with a deep convolutional neural network, Inf. Fusion (2017)

  • Bai X. et al., Quadtree-based multi-focus image fusion using a weighted focus-measure, Inf. Fusion (2015)

  • Zhang Y. et al., Boundary finding based multi-focus image fusion through multi-scale morphological focus-measure, Inf. Fusion (2017)

  • Liu Y. et al., Multi-focus image fusion with dense SIFT, Inf. Fusion (2015)

  • Zhang Y. et al., IFCNN: A general image fusion framework based on convolutional neural network, Inf. Fusion (2020)

  • Gai D. et al., Multi-focus image fusion method based on two stage of convolutional neural network, Signal Process. (2020)

  • Liu Y. et al., Entropy-based image fusion with joint sparse representation and rolling guidance filter, Entropy (2020)

  • Qiu X. et al., Guided filter-based multi-focus image fusion through focus region detection, Signal Process.-Image Commun. (2019)

  • Zhou Z. et al., Multi-scale weighted gradient-based fusion for multi-focus images, Inf. Fusion (2014)

  • Liu Y. et al., A general framework for image fusion based on multi-scale transform and sparse representation, Inf. Fusion (2015)

  • Tang H. et al., Pixel convolutional neural network for multi-focus image fusion, Inform. Sci. (2018)

  • Li W. et al., Structure-aware image fusion, Optik-Int. J. Light Electron. Opt. (2018)

  • Li S. et al., Image fusion with guided filtering, IEEE Trans. Image Process. (2013)

  • Farid M. et al., Multi-focus image fusion using content adaptive blurring, Inf. Fusion (2018)

  • Bouzos O. et al., Conditional random field model for robust multi-focus image fusion, IEEE Trans. Image Process. (2019)

  • Zhang Q. et al., Exploring a unified low rank representation for multi-focus image fusion, Pattern Recognit. (2020)

  • Liang K. et al., A multi-focus image fusion method via region mosaicking on Laplacian pyramids, PLoS One (2018)