Abstract

This paper proposes an effective and robust method for image alignment and recovery on a set of linearly correlated data via the Frobenius and L2,1 norms. The most popular and successful approach is to model the robust PCA problem as a low-rank matrix recovery problem in the presence of sparse corruption. Existing algorithms, however, still fall short in dealing with the potential impact of outliers and heavy sparse noise in image alignment and recovery. The new algorithm tackles these effects by incorporating affine transformations and the Frobenius and L2,1 norms into the decomposition process, making it more resilient to errors, outliers, and occlusions. To solve the resulting convex optimization problem, an alternating iterative process is adopted to alleviate the complexity. Simulations on the recovery of face images and handwritten digits demonstrate the effectiveness of the new approach compared with the main state-of-the-art works.

1. Introduction

Image alignment and recovery [1] has found applications in a variety of areas such as medical imaging, wireless sensor networks, surveillance, batch image denoising, and computational imaging. Image recovery can also be used in background extraction, where the low-rank component corresponds to the background and the sparse component captures the foreground. However, this problem faces some severe challenges such as illumination variation, occlusion, outliers, and heavy sparse noises. It is thus important to develop robust image recovery algorithms to tackle the abovementioned adverse effects.

A variety of algorithms have been reported for the image alignment and recovery problem. For example, Peng et al. [2] considered a robust algorithm for sparse and low-rank decomposition (RASL) to remove the potential impact of outliers and sparse errors incurred by corruption and occlusion, but it performs poorly when outliers and heavy sparse noise are severe across a large number of images. To tackle this problem, Likassa et al. [3] addressed a modified RASL by incorporating affine transformations with rank prior information, which boosted the performance of the algorithm. Ebadi and Izquierdo [4] proposed an efficient robust principal component analysis using a low-rank approximation for image recovery; however, it cannot remove the impact of outliers in big data. A robust principal component analysis (RPCA) algorithm based on a convex program was addressed in [5], which is guaranteed to recover the low-rank matrix despite gross sparse errors; however, this RPCA method is known to be extremely fragile to the presence of gross corruptions. To tackle this dilemma, Chen et al. [6] proposed a nonconvex plus quadratic penalized low-rank and sparse decomposition (NQLSD) method to fit the low-rank model and used a robust fitting function to reduce the influence of corruption and occlusion on image alignment, though its large time complexity remains questionable. Song et al. [7] addressed an online robust image alignment approach that incorporates geometric transformations: it directly linearizes the objective function through a warp update and exploits a closed-form solution together with a stochastic gradient descent updating scheme, which corresponds to an efficient inverse composition algorithm, besides tackling the image alignment performance problem faced by Wu et al. [8]. Liu et al. [9] considered an improved algorithm based on a convex program to mitigate the subspace clustering problem, which is guaranteed to exactly recover the row space of the original data and to perform robust subspace clustering and error correction efficiently and effectively. Oh et al. [10] proposed an effective algorithm that uses the partial sum of singular values (PSSV) instead of the nuclear norm to recover the image. This approach, inspired by a modification of the objective function, guarantees a better low-rank solution and convergence and is more robust to outliers even when the number of observations is small; however, it fails to perform well when the total number of observations is large. He et al. [11] considered a similar convex relaxation algorithm with guaranteed efficiency. Though several RPCA algorithms exist to deal with the potential impact of outliers and heavy sparse noise, more effective and efficient algorithms still need to be developed. To mitigate this issue, the authors of [12, 13] developed robust algorithms that can handle grossly corrupted data well. However, in very high-dimensional cases such as image feature extraction, recovery, and alignment, they lack both good performance and low computational complexity.

In this paper, we propose a novel robust algorithm for image recovery and alignment via affine transformations and the Frobenius and L2,1 norms. To be robust against miscellaneous adverse effects such as occlusions, outliers, and heavy sparse noise, the new algorithm integrates affine transformations with the low-rank plus sparse decomposition, where the low-rank component lies in a union of disjoint subspaces, so the distorted or misaligned images can be rectified to render a more faithful image representation. Inspired by [2, 14], the Frobenius norm and the L2,1 norm are incorporated into the decomposition process so as to be more resilient to errors, outliers, and occlusions in the recovery of images. As such, the parameters to be iteratively updated in our algorithm during the decomposition process differ from those in [2, 6, 10]. In addition, the L2,1 norm, which enjoys the advantages of the L1 and L2 norms, is employed together with the Frobenius norm to remove the correlated samples across the images, enabling the new approach to be more resilient to outliers and large variations in the images. The determination of the variables involved and the affine transformations is cast as a convex optimization problem. Consequently, the distorted or misaligned images can be rectified by the affine transformations and the Frobenius and L2,1 norms to render a more accurate image decomposition.

The affine transformations are aggregated with the low-rank plus sparse representation, where the low-rank component lies in a union of subspaces instead of one single subspace. These transformations can fix the distortion or misalignment in a batch of corrupted images to render a more faithful image decomposition, thereby being more robust against heavy sparse errors and outliers. Because large errors may occur in images and impact the accuracy of image recovery, the L2,1 norm is utilized; combined with the affine transformations, it further enhances the performance and mitigates the adverse effects of outliers and heavy sparse noise in big data. The search for the optimal parameters and affine transformations is first cast as a convex optimization program. Afterwards, the alternating direction method of multipliers (ADMM) is employed, and a newly developed set of equations is established to update the parameters involved and the affine transformations iteratively. Conducted simulations show that the new algorithm outperforms the state-of-the-art works in terms of accuracy of image alignment and recovery on some public datasets.
The major contributions of this paper include the following:
(1) The affine transformations are incorporated in the new model to fix the distorted or misaligned images so as to be robust against heavy sparse errors and outliers.
(2) The ADMM approach is employed to solve the new convex optimization problem, and a set of updating equations is developed to iteratively solve this problem.
(3) In the new method, a set of affine transformations together with the Frobenius and L2,1 norms is considered to boost the performance.
(4) A novel modified RASL based on a convex program is proposed for highly linearly correlated data, which can prune out occlusions and illuminations by utilizing the partial column low rank as prior information through an extra term added during the decomposition process.
(5) To the best of the authors' knowledge, this is the first time a modified robust image alignment sparse low-rank decomposition is attempted in such a decomposition process to tackle the image alignment problem in the wider applications of face recognition, video surveillance, and health care.
(6) To solve the convex optimization problem, an alternating iterative process is addressed to reduce the complexity and simultaneously enhance the recovery performance in the image alignment problem.

2. Related Work

Considerable research has been carried out in the areas of image alignment and image recovery with rank minimization. For instance, Waters et al. [15] proposed a new greedy algorithm via affine rank minimization to prune out the potential impact of sparse errors. Lia and Fang [16] suggested an image alignment method that explicitly considers spatially varying illumination by modeling the multiplicative and biased error factors with low-order polynomials. However, it does not work well when there are severe outliers and sparse errors in the data.

To relax the standard nuclear norm, Gu et al. [17] addressed the weighted nuclear norm minimization problem, which adaptively assigns weights to different singular values to minimize the final rank. Moreover, Kang et al. [18] proposed a nonconvex rank approximation to further reduce the rank. To tackle the dilemma of overestimated ranks, the authors of [19, 20] suggested RPCA algorithms that decompose the original images into two broad components. Likassa et al. [12, 13] considered a novel algorithm to tackle the misalignment dilemma, designed to find the low-rank component from illuminated data. Oh et al. [21] presented a rank minimization algorithm that simultaneously aligns low-dynamic-range images and detects outliers. To improve the performance of [21], Erichson et al. [22] addressed a randomized algorithm for finding the low-rank part after decomposition via rank minimization. Podosinnikova et al. [23] developed a robust PCA to minimize the reconstruction error. Shahid et al. [24] extended RPCA with spectral graph regularization terms. Shakeri and Zhang [25] proposed an online sequential framework to find the clean part by pruning out sparse corruptions. Hu et al. [26] introduced an approximation of the low-rank assumption for the matrix via low-rank regularization to solve face image denoising problems. Wright et al. [5] proposed an RPCA that decomposes an image into low-rank and sparse error components; however, it lacks scalability. Kang et al. [18] addressed a robust method via nonconvex rank approximation. Zhang and Lerman [27] and Rahmani and Atia [28] addressed robust subspace recovery to tackle the influence of annoying effects; however, its complexity deteriorates when the outliers and sparse noise in the data are heavy.
The authors of [29, 30] addressed robust subspace learning and RPCA with rank minimization to tackle the potential impact of occlusions, illuminations, outliers, and heavy sparse noise. Shang et al. [31] proposed a novel method for rank minimization using double nuclear norm and hybrid nuclear norm penalties to alleviate these dilemmas.

Extensive approaches have also been proposed to solve low-rank subspace decomposition problems. Zhang and Yang [32] addressed a linear subspace clustering method via low-rank decomposition. Zhao et al. [33] addressed a robust discriminant low-rank representation to obtain the multiple subspace structures. Ma et al. [34] addressed a generalized algorithm in which the information lying in high-dimensional data is extracted from low-dimensional subspaces. Lerman and Maunu [35] addressed a subspace recovery approach to obtain the low-rank part from large data. Liu et al. [9] addressed a representation of images to pinpoint the low-rank structures in illuminated data. Recently, Rao et al. [36] introduced a compressed sensing technique for subspace segmentation. Elhamifar and Vidal [37] considered sparse subspace clustering (SSC), which uses the sparse representation produced by the reduction in [5] to build the weight matrix, called an affinity matrix. Subspace segmentation has also been performed via subspace clustering methods, for instance [38] and [39]. They, however, are not robust against occlusions and illuminations. To tackle these setbacks, Li et al. [40] proposed a transformation-dependent approach via joint alignment of corrupted samples and learning of the subspace representation. Shen et al. [41] addressed a subspace clustering approach via dictionary pursuit to reduce the complexity while retaining satisfactory performance. Wu et al. [42] suggested a new decomposition-based method to mitigate the potential impact of noise in motion segmentation. Li et al. [43] organized the original images into a 3-dimensional tensor; however, this approach suffers from distortion and is very time consuming. Ding and Fu [44] addressed a multiview low-rank subspace method to obtain clean low-dimensional subspaces free of corruption from high-dimensional data.

3. Problem Formulation

Given a set of well-aligned images I_1^0, I_2^0, ..., I_n^0 of the same object which are linearly correlated, where w and h denote the width and height of each image, respectively. More precisely, if we let vec(·) denote the operator that selects a w × h pixel region of interest from an image and stacks it as a vector, then we can create a matrix A = [vec(I_1^0) | vec(I_2^0) | ... | vec(I_n^0)], which is a low-rank matrix. If the images are misaligned because of partial corruption and occlusions, such errors usually occur in a small region of an image and have arbitrarily large magnitudes; these errors can be modelled as sparse errors and are denoted by E. To tackle the problem of misalignment, we employ the domain transformations τ = {τ_1, τ_2, ..., τ_n}. The transformed images can be constructed in the form of the following matrix: D ∘ τ = [vec(I_1 ∘ τ_1) | ... | vec(I_n ∘ τ_n)], where I_i ∘ τ_i is a well-aligned version of image I_i and the operator ∘ denotes transformation [2]. The solution is intractable due to the nonlinearity and complicated dependence of D ∘ τ on the transformations τ. This can be solved by linearizing about the current estimate of τ when the change Δτ in τ is small [2, 6, 45]. For p transformation parameters and Δτ = [Δτ_1 | Δτ_2 | ... | Δτ_n], we can write D ∘ (τ + Δτ) ≈ D ∘ τ + Σ_i J_i Δτ ε_i ε_i^T, where J_i denotes the Jacobian of the i-th image with respect to the transformation τ_i and ε_i denotes the i-th standard basis vector for R^n.
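Following [2], the linearization about the current transformation estimate and the per-image Jacobian can be written as follows (notation reconstructed under the assumption of standard RASL conventions: D ∘ τ is the matrix of warped, vectorized images, J_i the Jacobian of the i-th image, and ε_i the i-th standard basis vector of R^n):

```latex
D \circ (\tau + \Delta\tau)
\approx D \circ \tau + \sum_{i=1}^{n} J_{i}\,\Delta\tau\,\varepsilon_{i}\varepsilon_{i}^{\top},
\qquad
J_{i} = \frac{\partial}{\partial \zeta}\,
\operatorname{vec}\!\left(I_{i} \circ \zeta\right)\Big|_{\zeta = \tau_{i}} .
```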

Each affine transformation τ_i can be represented by a vector of p parameters, yielding the parameter matrix τ ∈ R^(p×n). Specifically, if the initial transformations are known, we can iteratively refine them via the update τ ← τ + Δτ. This allows the problem to be relaxed to the following convex optimization problem, in which we seek A, E, and Δτ by incorporating a new term of affine transformations and the Frobenius and L2,1 norms.
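Under the notation above, and with λ1 and λ2 as assumed regularization weights (the paper's exact objective and weighting are not reproduced here), one plausible form of the relaxed problem is:

```latex
\min_{A,\,E,\,\Delta\tau}\;
\|A\|_{*} + \lambda_{1}\,\|A\|_{F}^{2} + \lambda_{2}\,\|E\|_{2,1}
\quad \text{s.t.} \quad
D \circ \tau + \sum_{i=1}^{n} J_{i}\,\Delta\tau\,\varepsilon_{i}\varepsilon_{i}^{\top} = A + E .
```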

To make the new approach more resilient to outliers and heavy sparse noise, the L2,1 norm, which combines the advantages of the L1 and L2 norms, is used here. The L2,1 regularizer is considered as the rotational invariant of the L1 norm and can effectively handle outliers [46]. Also, as observed in [13, 47], this regularizer can achieve better sparsity promotion than the L1 norm. The L1 norm may yield a biased estimation as it ignores the extreme values and cannot handle the collinearity of features. In contrast, the L2,1 norm is more stable and better preserves the spatial information than the L1 regularizer, as demonstrated in [13, 47]. Additionally, the L2,1 norm is superior to the nonconvex norms when the signals are not sparse or when the matrix is not strictly low rank [48, 49]. The overall problem can thus be posed as the optimization problem in (1), where D represents the original data matrix, A is the low-rank component, and E is the sparse error matrix. The L2,1 norm of E is denoted by ||E||_{2,1}. To handle the nonlinearity of the constraint, which arises from the complicated dependence of D ∘ τ on the transformations τ, a linearization procedure is summarized in Algorithm 1 [2].
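As a concrete illustration of the regularizer discussed above, the following minimal numpy sketch (not the authors' implementation) computes the L2,1 norm of a matrix, i.e., the sum of the Euclidean norms of its columns, and contrasts it with the entrywise L1 and Frobenius norms:

```python
import numpy as np

def l21_norm(E):
    """L2,1 norm: the sum of the Euclidean (L2) norms of the columns of E.

    A column of outliers contributes only through its overall magnitude,
    which is why this norm promotes column-wise sparsity of the error term.
    """
    return float(np.sum(np.linalg.norm(E, axis=0)))

E = np.array([[3.0, 0.0, 0.0],
              [4.0, 0.0, 1.0]])
print(l21_norm(E))               # column norms 5, 0, 1 -> 6.0
print(np.abs(E).sum())           # entrywise L1 norm: 8.0
print(np.linalg.norm(E, "fro"))  # Frobenius norm: sqrt(26)
```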

Input: images D and initial transformations τ
 0: while Not Converged do
(1) Step 1: normalize the images
(2) Step 2: solve the linearized convex optimization
(3) Step 3: compute the Jacobian matrices with respect to τ
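Step 3 of Algorithm 1 requires the Jacobian of each warped, vectorized image with respect to its transformation parameters. A minimal finite-difference sketch is shown below; the warp function and its two-parameter form are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def numerical_jacobian(warp, image, tau, eps=1e-5):
    """Forward-difference Jacobian of vec(warp(image, tau)) with respect to
    the transformation parameter vector tau."""
    base = warp(image, tau).ravel()
    J = np.zeros((base.size, tau.size))
    for k in range(tau.size):
        t = tau.copy()
        t[k] += eps
        J[:, k] = (warp(image, t).ravel() - base) / eps
    return J

# Toy warp: a smooth synthetic "image" whose sampling grid is shifted by the
# two parameters in tau (a stand-in for a real bilinear affine warp).
def toy_warp(_img, tau):
    xs, ys = np.meshgrid(np.arange(8.0), np.arange(8.0), indexing="ij")
    return np.sin(0.3 * (xs + tau[0])) * np.cos(0.2 * (ys + tau[1]))

J = numerical_jacobian(toy_warp, None, np.array([0.0, 0.0]))
print(J.shape)  # (64, 2): one row per pixel, one column per parameter
```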

4. Proposed Algorithm

To solve the constrained optimization problem in (1), we use the method of augmented Lagrange multipliers [2, 50], which iteratively estimates both the Lagrange multiplier and the optimal solution by minimizing the augmented Lagrangian function. The basic idea of the ADMM method [51] is to search for a saddle point of the augmented Lagrangian function instead of directly solving the original constrained optimization problem. In the resulting Lagrangian in (2), Y is a Lagrange multiplier matrix, μ is a positive penalty parameter, ⟨·,·⟩ denotes the matrix inner product, and ||·||_F is the Frobenius norm. Directly solving (2) in the first iteration is difficult, so as in [2, 50], we solve the problem iteratively in an alternating manner: the unknowns in the augmented Lagrangian function are iteratively minimized one by one.
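In the standard form used by [2, 50], the augmented Lagrangian of the linearized constraint can be sketched as follows (a reconstruction under the notation above, with f(A, E) standing in for the paper's combination of nuclear, Frobenius, and L2,1 terms and h denoting the constraint residual):

```latex
\mathcal{L}_{\mu}(A, E, \Delta\tau, Y)
= f(A, E)
+ \big\langle Y,\; h(A, E, \Delta\tau) \big\rangle
+ \frac{\mu}{2}\,\big\| h(A, E, \Delta\tau) \big\|_{F}^{2},
\quad
h(A, E, \Delta\tau)
= D \circ \tau + \sum_{i=1}^{n} J_{i}\,\Delta\tau\,\varepsilon_{i}\varepsilon_{i}^{\top} - A - E .
```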

First, to update A in (3), we fix E, Δτ, and Z as constants. It is difficult to solve the above function directly, so we minimize the augmented Lagrangian function approximately by adopting an alternating strategy: minimize the function against only one of the four unknowns A, E, Δτ, and Z at a time:

We note that the problem in (3) is completely separable and involves solving a convex program. Thus, following [31, 52–54], using the procedures of [55] and the soft-thresholding operator, which can be applied efficiently within the augmented Lagrange multiplier framework, the update of A can be given as
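The soft-thresholding operator invoked above has the closed form S_τ[x] = sign(x) · max(|x| − τ, 0); a minimal numpy sketch (not the authors' code):

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise soft-thresholding: sign(X) * max(|X| - tau, 0)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

X = np.array([-3.0, -0.5, 0.2, 2.0])
# entries with |x| <= tau collapse to 0; the rest shrink toward 0 by tau
print(soft_threshold(X, 1.0))
```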

Then, the sparse error matrix E can be determined by

Again, by ignoring all the irrelevant terms of E, it can be simplified as

By using the lemma in [56], the update of the i-th column of E, denoted e_i, is given by the following, where ||·||_2 denotes the Euclidean norm.
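The column-wise update above is the proximal operator of the L2,1 norm: each column g_i is rescaled by max(1 − τ/||g_i||_2, 0). A sketch, with G as a hypothetical matrix collecting the columns to be thresholded:

```python
import numpy as np

def l21_shrink(G, tau):
    """Column-wise shrinkage: e_i = max(1 - tau/||g_i||_2, 0) * g_i."""
    norms = np.linalg.norm(G, axis=0)
    # guard against division by zero for all-zero columns
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return G * scale

G = np.array([[3.0, 0.1],
              [4.0, 0.1]])
# first column (norm 5) is scaled by 4/5; second (norm ~0.14 < tau) vanishes
print(l21_shrink(G, 1.0))
```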

Finally, we need to update Δτ, so we keep all other parameters in (8) constant and apply the augmented Lagrange multiplier method to the following:

To update the parameter Δτ in (8), we keep all other parameters constant to obtain the optimal solution. By applying the augmented Lagrangian function and the singular-value threshold, we can get the final update of Δτ as follows, where (·)† denotes the Moore–Penrose pseudoinverse.
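The singular-value thresholding operation mentioned above soft-thresholds the singular values of its argument; a minimal sketch (not the paper's implementation):

```python
import numpy as np

def svt(X, tau):
    """Singular-value thresholding: soft-threshold the singular values of X,
    i.e., the proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

X = np.diag([5.0, 1.0, 0.2])
# singular values 5, 1, 0.2 become 4.5, 0.5, 0; the result stays diagonal
print(svt(X, 0.5))
```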

The Lagrange multiplier Y is updated by using the following equation:

For easy reference, the updates of the optimal parameters are summarized in Algorithm 2.

Input: D, τ, L, S, Z, Y, μ, and λ;
 maximum iteration and tolerance
 0: while Not Converged
(1) Step 1: update L by (4) and (5)
(2) Step 2: update S by (7)
(3) Step 3: update Δτ by (9)
(4) Step 4: update Z by (10)
(5) Step 5: update the Lagrange multiplier Y
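Putting the pieces together, the low-rank plus sparse core of Algorithm 2 can be sketched as a plain RPCA-style ADMM loop. This is a minimal sketch only: it omits the affine transformation update Δτ and the auxiliary variable Z, uses plain L1 shrinkage for the sparse term, and its default parameters follow common RPCA practice rather than the paper's settings:

```python
import numpy as np

def svt(X, tau):
    """Singular-value thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def shrink(X, tau):
    """Entrywise soft-thresholding (proximal operator of the L1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_admm(D, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Decompose D into a low-rank L plus a sparse S by ADMM.

    Y is the Lagrange multiplier and mu the penalty parameter.
    """
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else (m * n) / (4.0 * np.abs(D).sum() + 1e-12)
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    Y = np.zeros_like(D)
    for _ in range(max_iter):
        L = svt(D - S + Y / mu, 1.0 / mu)     # low-rank update (cf. Step 1)
        S = shrink(D - L + Y / mu, lam / mu)  # sparse update (cf. Step 2)
        R = D - L - S                         # constraint residual
        Y = Y + mu * R                        # multiplier update (cf. Step 5)
        if np.linalg.norm(R) <= tol * np.linalg.norm(D):
            break
    return L, S
```

On a synthetic rank-1 matrix corrupted by a few large sparse entries, this loop recovers the low-rank part to within a few percent relative error.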

5. Simulations and Discussions

This section conducts simulations to demonstrate the effectiveness of our approach for the recovery of face images and handwritten digits. Four baselines are employed: RASL [2], NQLSD [6], PSSV [10], and MRASL [3]. First, we examine the effectiveness of the newly proposed method visually, in removing occlusions and illuminations from highly linearly correlated data. Second, we check the image similarity quantitatively using the peak signal-to-noise ratio (PSNR) [57], defined in the following, where both the original image and the recovered image are of the same size.
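The PSNR used above has a standard closed form; the sketch below assumes an 8-bit peak intensity of 255, which is a common convention rather than a value stated in the text:

```python
import numpy as np

def psnr(original, recovered, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    diff = original.astype(np.float64) - recovered.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

a = np.zeros((4, 4))
b = np.ones((4, 4))          # every pixel off by 1 -> MSE = 1
print(round(psnr(a, b), 2))  # 48.13
```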

Datasets. To implement the new method, we considered two different public datasets: handwritten digits taken from the MNIST database [58] and dummy face images taken from the Labeled Faces in the Wild database [59].

5.1. Face Image Recovery Alignment

First, we consider a dataset containing 30 face images, taken from 100 images of a dummy head that are perturbed and occluded, from the Labeled Faces in the Wild database [59]. These are real-world face images with uncontrolled misalignment under varying illuminations. Figure 1 shows the recovered images based on our algorithm and the baselines: the dummy face images with different corruptions are depicted in Figure 1(a), and the images recovered by the aforementioned algorithms are shown in Figures 1(b)–1(e), from which we can see that the visual quality of the proposed method is better than that of all the baselines. This is consistent with the numerical results in Table 1 and justifies the effectiveness of the new algorithm in removing perturbations, occlusions, and illuminations from highly linearly correlated data.

5.2. Handwritten Digit Image Recovery Alignment

Next, we conduct simulations on the handwritten digits. 30 images of "3" taken from the MNIST database are used to verify the effectiveness of our algorithm. The results of our proposed method are shown in Figure 2(f), and those of the other baselines are given in Figures 2(b)–2(e). The results achieved by our method are better than those of the previous works because of the inclusion of an extra term in the form of the partial column low rank as rank prior information. The low-rank components of the different approaches are depicted and compared with the original images: our method performs better than NQLSD, PSSV, and RASL, providing better alignment and recovering the corrupted handwritten images with clearer visual quality by properly removing adverse effects such as outliers and heavy sparse noise. This is in agreement with the results in Table 1 and further justifies that the proposed approach is more resilient to outliers and heavy sparse noise. To further validate the performance of our method, we also compare the PSNRs of the aforementioned methods in Table 1, from which we find that the new algorithm indeed provides the largest PSNR among all the baselines.

We observe that, by adding an extra term in the form of affine transformations and the Frobenius and L2,1 norms, our approach attains a larger mean PSNR than the approaches proposed in [2, 3, 6, 10], indicating better image recovery and a stronger capability to remove errors. Relatively speaking, adding a new term in the form of the partial column low rank as rank prior information to our model enhances the performance of the newly proposed algorithm, as the mean PSNR obtained is better than that of the baselines. The advantage of our method is that it obtains more stable estimates in image recovery and is more robust to errors, outliers, and occlusions.

6. Conclusions

In this work, we considered a new algorithm for robust image alignment and recovery with rank minimization via the Frobenius and L2,1 norms. The search for the affine transformations together with the Frobenius and L2,1 norm terms is cast in the optimization formulation as a convex constrained optimization problem, which is used to alleviate the potential impact of annoying effects by correcting the distorted images. The ADMM approach is then employed, and a new set of equations is established to alternately update the optimization parameters and the affine transformations. Moreover, the convergence of these new updating equations is scrutinized as well. Conducted simulations show that the new method performs better than other methods in terms of precision on public databases.

Data Availability

The data used in this article are freely available for the user.

Conflicts of Interest

The author declares that there are no conflicts of interest.