1 Introduction

Censor and Elfving [12] introduced the following Split Convex Feasibility Problem (SCFP), see also [11],

$$\begin{aligned} \mathrm{find}~ x \in C ~\mathrm{such~ that}~ Ax\in Q, \end{aligned}$$
(1)

where \(A:\mathbb {R}^k \rightarrow \mathbb {R}^m\) is a bounded and linear operator, \(C\subseteq \mathbb {R}^k\) and \(Q\subseteq \mathbb {R}^m\) are nonempty, closed and convex sets. Hereafter, we let S represent the set of solutions to SCFP (1).

The SCFP was originally introduced in Euclidean spaces and later extended to infinite-dimensional spaces; it has been applied successfully in the field of intensity-modulated radiation therapy (IMRT) treatment planning, see [11,12,13, 15].

Since the introduction of the SCFP, many authors have introduced various iterative methods for solving it, see, for example, [10, 14, 17,18,19, 23, 29, 30, 38, 41,42,43, 49, 52]. It is known that (see, e.g., [9]) \(x \in C\) solves the SCFP (1) if and only if x solves the fixed point problem:

$$\begin{aligned} { x=P_C(x- \lambda A^t(I-P_Q)Ax), \lambda >0} \end{aligned}$$

and consequently, Byrne [9] introduced the following CQ method:

$$\begin{aligned} { x_{n+1} = P_C(x_{n} - \lambda A^t(I-P_Q)Ax_n), \ n\ge 1} \end{aligned}$$
(2)

where \(A^t\) denotes the transpose of A.
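To make the iteration (2) concrete, here is a minimal NumPy sketch of the CQ method. The box-shaped sets \(C=[-1,1]^k\) and \(Q=[0,2]^m\), the random data, and the step \(\lambda = 1/\Vert A\Vert ^2\) are illustrative assumptions (chosen so that both projections have closed forms), not choices made in the paper.

```python
import numpy as np

def proj_box(x, lo, hi):
    # closed-form Euclidean projection onto the box [lo, hi]
    return np.clip(x, lo, hi)

def f_val(A, x):
    # proximity measure f(x) = 0.5*||(I - P_Q)Ax||^2 with Q = [0, 2]^m
    r = A @ x - proj_box(A @ x, 0.0, 2.0)
    return 0.5 * float(r @ r)

def cq_method(A, lam, x0, n_iter=500):
    """Byrne's CQ iteration (2): x <- P_C(x - lam * A^T (I - P_Q) A x),
    with the illustrative choices C = [-1, 1]^k and Q = [0, 2]^m."""
    x = x0.copy()
    for _ in range(n_iter):
        r = A @ x - proj_box(A @ x, 0.0, 2.0)      # (I - P_Q) A x
        x = proj_box(x - lam * (A.T @ r), -1.0, 1.0)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 8))
lam = 1.0 / np.linalg.norm(A, 2) ** 2               # lam in (0, 2/||A||^2)
x0 = rng.standard_normal(8)
x = cq_method(A, lam, x0)
print(f_val(A, x) <= f_val(A, x0))                  # the proximity measure did not increase
```

Note that \(x=0\) is feasible for these toy sets, so the iteration drives the proximity measure toward zero.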

Weak convergence of the CQ method is guaranteed under the assumption that \(\lambda \in (0, 2/\Vert A\Vert ^2)\). Hence, an implementation of (2) requires an estimate of the norm of the bounded linear operator A, or of the spectral radius of the matrix \(A^tA\) in the finite-dimensional setting. This fact might affect the applicability of the method in practice, see [26, Theorem 2.3]. In order to circumvent this difficulty, López et al. [30] introduced a modification of the CQ method (2) by replacing the step-size \(\lambda \) in (2) with the following adaptive step:

$$\begin{aligned} \tau _n=\frac{\rho _nf(x_n)}{\Vert \nabla f(x_n)\Vert ^{2}}, \ \ \ n\ge 1, \end{aligned}$$
(3)

where \(\rho _n\in (0,4)\), \(f(x_n)=\frac{1}{2}\Vert (I-P_Q)Ax_n\Vert ^{2}\) and \(\nabla f(x_n)=A^{t}(I-P_Q)Ax_n\) for all \(n\ge 1\). There exist many other modifications of the CQ algorithm, see, for example, [20, 23, 24, 44, 49].
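The self-adaptive step (3) needs only the current iterate and matrix-vector products with A and \(A^t\), and no estimate of \(\Vert A\Vert \). A short sketch, in which the box choice of Q and the test data are illustrative assumptions:

```python
import numpy as np

def adaptive_step(A, x, proj_Q, rho=1.0):
    """Step (3) of Lopez et al.: tau = rho * f(x) / ||grad f(x)||^2, with
    f(x) = 0.5*||(I - P_Q)Ax||^2 and grad f(x) = A^T (I - P_Q) A x."""
    r = A @ x - proj_Q(A @ x)          # (I - P_Q) A x
    g = A.T @ r                         # gradient of f at x
    gn2 = float(g @ g)
    if gn2 == 0.0:                      # then x is already a stationary point
        return 0.0
    return rho * 0.5 * float(r @ r) / gn2

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))
x = rng.standard_normal(6)
tau = adaptive_step(A, x, proj_Q=lambda y: np.clip(y, 0.0, 1.0))  # Q = [0,1]^m
print(tau >= 0.0)
```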

Following the heavy ball method of Polyak [39], Nesterov [37] introduced the following iterative step:

$$\begin{aligned} y_{n}= & {} x_{n}+\theta _n(x_{n}-x_{n-1}),\nonumber \\ x_{n+1}= & {} y_{n}-\lambda _{n}\nabla f(y_{n}), \ \ \ n\ge 1, \end{aligned}$$
(4)

where \(\theta _{n}\in [0,1)\) is an inertial factor and \(\lambda _{n}\) is a positive sequence. Numerical experiments in the field of image reconstruction show that (4) and other associated methods, such as [1, 2, 6,7,8, 18, 21, 31, 32, 34], greatly improve the performance of their non-inertial counterparts, that is, the case \(\theta _n=0\). Hence such schemes are also referred to as inertial algorithms.
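A minimal sketch of the inertial iteration (4) on a toy smooth problem; the quadratic objective and the parameter values below are illustrative assumptions only.

```python
import numpy as np

def inertial_gradient(grad, x0, lam=0.1, theta=0.9, n_iter=200):
    """Inertial iteration (4):
    y_n = x_n + theta*(x_n - x_{n-1});  x_{n+1} = y_n - lam * grad(y_n)."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(n_iter):
        y = x + theta * (x - x_prev)          # inertial extrapolation step
        x_prev, x = x, y - lam * grad(y)      # gradient step from y
    return x

# Toy problem: minimize 0.5*||x - c||^2, whose gradient is x - c
c = np.array([1.0, -2.0, 3.0])
x_star = inertial_gradient(lambda x: x - c, np.zeros(3), lam=0.5, theta=0.3)
print(np.allclose(x_star, c, atol=1e-6))
```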

In this spirit, several inertial-type methods for solving SCFPs have been proposed recently, see [16, 44,45,46,47, 51], just to name a few. In particular, Dang et al. [18] (see also [17]) proposed the following inertial relaxed CQ algorithms for solving SCFPs:

$$\begin{aligned} x_{n+1}=P_{C_n}(I-\lambda A^t(I-P_{Q_n})Ay_n), \end{aligned}$$
(5)

and

$$\begin{aligned} x_{n+1}=(1-\alpha _n)y_n+\alpha _nP_{C_n}(I-\lambda A^t(I-P_{Q_n})Ay_n), \end{aligned}$$
(6)

where \({ y_n=x_n+\theta _n(x_n-x_{n-1})}, \alpha _n \in (0,1)\), \(\lambda \in (0, 2/\Vert A\Vert ^2)\) and \(\theta _n \in [0,\overline{\theta }_n]\) with \(\overline{\theta }_n:= \min \{\theta , (\max \{n^2\Vert x_n-x_{n-1}\Vert ^2, n^2\Vert x_n-x_{n-1}\Vert \})^{-1} \}, \theta \in [0,1)\).

An important observation regarding the above inertial methods [16,17,18, 44,45,46,47, 51] is that the sequence \(\{x_n\}\) generated by these inertial-type methods does not behave monotonically with respect to \(x^* \in S\) and can move or swing back and forth around S, see, for example, [5, 31]. This could explain why such inertial extrapolation steps sometimes fail to converge faster than their non-inertial counterparts, see, e.g., [33].

In order to resolve the above issue, an alternated inertial method was introduced recently in [36]. This alternated inertial method has been shown to exhibit attractive performance in practice, including monotonicity of \(\{\Vert x_{2n}-x^*\Vert \}\); see [27, 28] for more details.

Motivated by the above works, we propose a new relaxed CQ method with an alternated inertial procedure for solving SCFPs. We establish global convergence of our scheme under some easy-to-verify assumptions. Moreover, the parameter controlling the inertial factor, that is, \(\theta _n\), can be chosen arbitrarily close to 1 (when \(\mu \) tends to zero in (10)). This is in contrast to many other related methods that restrict it to be less than 1, see, e.g., [16,17,18, 44,45,46,47, 51].

The outline of the paper is as follows. Definitions, basic concepts and useful results are presented in Sect. 2. The method and its analysis are given in Sect. 3, and some numerical experiments in the field of signal processing, which illustrate the effectiveness and applicability of our proposed scheme, are presented in Sect. 4. Final remarks are presented in Sect. 5.

2 Preliminaries

We start by recalling some definitions and basic results.

A mapping \( T: \mathbb {R}^k \rightarrow \mathbb {R}^k \) is called

  1. (a)

    nonexpansive if \(\Vert Tx-Ty\Vert \le \Vert x-y\Vert \), for all \(x,y \in \mathbb {R}^k\);

  2. (b)

    firmly nonexpansive if \(\Vert Tx-Ty\Vert ^2 \le \Vert x-y\Vert ^2-\Vert (I-T)x-(I-T)y \Vert ^2\) for all \(x, y\in \mathbb {R}^k\). Equivalently, \(\Vert Tx-Ty\Vert ^2 \le \langle x-y, Tx-Ty \rangle \) for all \(x, y\in \mathbb {R}^k\).

It is shown in [25] that T is firmly nonexpansive if and only if \(I-T\) is firmly nonexpansive.

Let C be a nonempty, closed and convex subset of \(\mathbb {R}^k\). For any point \(u \in \mathbb {R}^k\), there exists a unique point \(P_C u \in C\) such that

$$\begin{aligned} \Vert u-P_C u\Vert \le \Vert u-y\Vert ~~\forall y \in C, \end{aligned}$$

The mapping \(P_C\) is called the metric projection of \(\mathbb {R}^k\) onto C. Some important properties of the metric projection are listed next; for these and more, see [4]. It is known that \(P_C\) is a firmly nonexpansive mapping of \(\mathbb {R}^k\) onto C and that \(P_C\) satisfies

$$\begin{aligned} \langle x-y, P_C x-P_C y \rangle \ge \Vert P_C x-P_C y\Vert ^2~~\forall x, y \in \mathbb {R}^k. \end{aligned}$$
(7)

Furthermore, \(P_C x\) is characterized by the property

$$\begin{aligned} P_Cx \in C \quad \text {and} \quad \langle x-P_C x, P_C x-y \rangle \ge 0~~\forall y \in C. \end{aligned}$$
(8)

This characterization implies that

$$\begin{aligned} \Vert x-y\Vert ^2\ge \Vert x-P_Cx\Vert ^2+\Vert y-P_Cx\Vert ^2~~\forall x \in \mathbb {R}^k, \forall y \in C. \end{aligned}$$
(9)
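The characterization (8) and its consequence (9) can be checked numerically. In the sketch below, the choice of C as a Euclidean ball is an illustrative assumption, made because its projection has the simple closed form used here.

```python
import numpy as np

def proj_ball(u, radius=1.0):
    """Metric projection onto the Euclidean ball C = {x : ||x|| <= radius}."""
    nrm = np.linalg.norm(u)
    return u if nrm <= radius else (radius / nrm) * u

rng = np.random.default_rng(2)
u = rng.standard_normal(5) * 3.0
p = proj_ball(u)

# Characterization (8): <u - P_C u, P_C u - y> >= 0 for every y in C
for _ in range(100):
    y = proj_ball(rng.standard_normal(5))    # an arbitrary point of C
    assert np.dot(u - p, p - y) >= -1e-10

# Consequence (9): ||u - y||^2 >= ||u - P_C u||^2 + ||y - P_C u||^2
y = proj_ball(rng.standard_normal(5))
lhs = np.linalg.norm(u - y) ** 2
rhs = np.linalg.norm(u - p) ** 2 + np.linalg.norm(y - p) ** 2
print(lhs >= rhs - 1e-10)
```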

Let \(f: \mathbb {R}^k \rightarrow \mathbb {R}\) be a function. An element \(g \in \mathbb {R}^k\) is said to be a subgradient of f at x if

$$\begin{aligned} f(y)\ge f(x)+\langle y-x,g\rangle , ~~\forall y \in \mathbb {R}^k. \end{aligned}$$

The subdifferential of f at x, \(\partial f(x)\), is defined by

$$\begin{aligned} \partial f(x):=\{g \in \mathbb {R}^k: f(y)\ge f(x)+\langle y-x,g\rangle , ~~\forall y \in \mathbb {R}^k\}. \end{aligned}$$

Lemma 2.1

([10]) Let C be a nonempty, closed and convex subset of \(\mathbb {R}^k\) and \(x \in \mathbb {R}^k\). Consider the function \(f(x):=\frac{1}{2}\Vert (I-P_Q)Ax\Vert ^2\). Then

  1. (i)

    the function f is convex and differentiable.

  2. (ii)

    The gradient of f at x is given by \(\nabla f(x)=A^t(I-P_Q)Ax\).

  3. (iii)

    \(\nabla f\) is \(\Vert A\Vert ^2\)-Lipschitz continuous.

The next basic lemma is useful for our analysis.

Lemma 2.2

Let \(x, y \in \mathbb {R}^k \). Then

  1. (i)

    \( \Vert x+y\Vert ^2=\Vert x\Vert ^2+2\langle x,y\rangle +\Vert y\Vert ^2\);

  2. (ii)

    \(\Vert x+y\Vert ^2 \le \Vert x\Vert ^2+2 \langle y,x+y\rangle \);

  3. (iii)

    \( \Vert \alpha x+\beta y\Vert ^2=\alpha (\alpha +\beta )\Vert x\Vert ^2+\beta (\alpha +\beta )\Vert y\Vert ^2-\alpha \beta \Vert x-y\Vert ^2, \quad \forall \alpha , \beta \in \mathbb {R}.\)
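These identities are elementary and can be verified numerically for arbitrary vectors and scalars, e.g.:

```python
import numpy as np

rng = np.random.default_rng(3)
x, y = rng.standard_normal(4), rng.standard_normal(4)
a, b = 0.7, -1.3

# (i) ||x + y||^2 = ||x||^2 + 2<x, y> + ||y||^2
lhs1 = np.linalg.norm(x + y) ** 2
rhs1 = np.linalg.norm(x) ** 2 + 2 * np.dot(x, y) + np.linalg.norm(y) ** 2
assert np.isclose(lhs1, rhs1)

# (ii) ||x + y||^2 <= ||x||^2 + 2<y, x + y>
assert lhs1 <= np.linalg.norm(x) ** 2 + 2 * np.dot(y, x + y) + 1e-9

# (iii) ||a x + b y||^2
#     = a(a+b)||x||^2 + b(a+b)||y||^2 - a b ||x - y||^2
lhs3 = np.linalg.norm(a * x + b * y) ** 2
rhs3 = (a * (a + b) * np.linalg.norm(x) ** 2
        + b * (a + b) * np.linalg.norm(y) ** 2
        - a * b * np.linalg.norm(x - y) ** 2)
print(np.isclose(lhs3, rhs3))
```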

3 The algorithm

In light of [22], we consider a relaxed CQ method with an alternated inertial extrapolation step, in which C and Q in (1) are level sets of convex functions given by

$$\begin{aligned} C:=\{x \in \mathbb {R}^k: c(x)\le 0\} \end{aligned}$$

and

$$\begin{aligned} Q:=\{x \in \mathbb {R}^m: q(x)\le 0\} \end{aligned}$$

where \(c:\mathbb {R}^k\rightarrow \mathbb {R}\) and \(q:\mathbb {R}^m\rightarrow \mathbb {R}\) are convex functions. By [3, Fact 7.2 (iii)], c and q are subdifferentiable on C and Q, respectively, and the subdifferentials \(\partial c\) and \(\partial q\) are bounded on bounded sets.

For \(n \ge 1\), define

$$\begin{aligned} C_n:=\{x\in \mathbb {R}^k:c(w_n)\le \langle \xi _n, w_n-x\rangle \} \end{aligned}$$

and

$$\begin{aligned} Q_n:=\{y\in \mathbb {R}^m:q(Aw_n)\le \langle \zeta _n, Aw_n-y\rangle \} \end{aligned}$$

with \(\xi _n \in \partial c(w_n)\) and \(\zeta _n \in \partial q(Aw_n)\), respectively. It can easily be seen that \(C_n\supset C\) and \(Q_n\supset Q\) for all n. Consequently, since \(C_n\) and \(Q_n\) are half-spaces, the projections onto these sets have closed formulas and hence are easy to compute. From now on we define, for all \(x \in \mathbb {R}^k\), \(f_n(x):=\frac{1}{2}\Vert (I-P_{Q_n})Ax\Vert ^2\) and \(\nabla f_n(x)=A^t(I-P_{Q_n})Ax\).
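The closed-form projection that \(P_{C_n}\) and \(P_{Q_n}\) reduce to is a one-line half-space projection. A sketch with hypothetical test values:

```python
import numpy as np

def proj_halfspace(x, xi, beta):
    """Projection onto H = {x : <xi, x> <= beta} (closed formula).
    C_n above has this form with xi = xi_n and beta = <xi_n, w_n> - c(w_n)."""
    viol = np.dot(xi, x) - beta
    if viol <= 0:
        return x                                  # x already lies in H
    return x - (viol / np.dot(xi, xi)) * xi       # assumes xi != 0

xi = np.array([1.0, 2.0])
p = proj_halfspace(np.array([3.0, 3.0]), xi, beta=4.0)
# the projected point lands on the boundary hyperplane <xi, p> = beta
print(np.isclose(np.dot(xi, p), 4.0))
```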

Algorithm 1 (Relaxed CQ method with alternated inertial extrapolation step and Armijo line search)

Remark 3.1

  1. (a)

    As mentioned in the introduction, by adding the inertial extrapolation step in (5) and (6) to the classical CQ algorithm (2), the generated sequence \(\{x_n\}\) can move or swing back and forth around S, and hence we lose monotonicity of \(\{\Vert x_n-x^*\Vert \}, x^* \in S\). This can affect the convergence speed of CQ methods with an inertial extrapolation step, which sometimes do not even converge faster than the original CQ methods. In order to circumvent this scenario and regain monotonicity to some extent (see Lemma 3.2 below), we introduce the inertial extrapolation step (11).

  2. (b)

    Observe that if \(\theta _n =0\), then Algorithm 1 reduces to the methods proposed in [23, 40, 50].

  3. (c)

    Our scheme allows one to choose the parameter controlling the inertial factor \(\theta _n\) arbitrarily close to 1, when \(\mu \) tends to zero in (10). This is more flexible than the methods in [16,17,18, 44,45,46,47, 51]. In general, a wise choice of \(\theta _n\) in Step 2 of Algorithm 1 enables acceleration of our method.

  4. (d)

    Observe that we make use of an Armijo-type line search rule in Algorithm 1, which is similar to [23]; hence, following [23, Lemma 3.1], the search rule in Algorithm 1 terminates after a finite number of iterations. Furthermore,

    $$\begin{aligned} \frac{\mu l}{\Vert A\Vert ^2} <\tau _n \le \gamma , \forall n \ge 1. \end{aligned}$$

3.1 Convergence Analysis

We give the convergence analysis of Algorithm 1 under the assumption that the solution set of the SCFP (1) is nonempty.

Lemma 3.2

Suppose that the solution set of the SCFP (1) is nonempty, that is, \(S\ne \emptyset \) and \(\{x_n\}\) is any sequence generated by Algorithm 1. Then \(\{x_{2n}\}\) is Fejér monotone with respect to S (i.e., \(\Vert x_{2n+2}-z\Vert \le \Vert x_{2n}-z\Vert , \forall z \in S\)).

Proof

Pick a point z in S. Then

$$\begin{aligned}&\Vert x_{2n+2}-z\Vert ^2=\Vert P_{C_{2n+1}}(w_{2n+1}-\tau _{2n+1} \nabla f_{2n+1}(\bar{x}_{2n+1}))-z\Vert ^2\nonumber \\&\quad \le \Vert (w_{2n+1}-z)-\tau _{2n+1} \nabla f_{2n+1}(\bar{x}_{2n+1})\Vert ^2-\Vert x_{2n+2}-w_{2n+1}\nonumber \\&\qquad +\tau _{2n+1} \nabla f_{2n+1}(\bar{x}_{2n+1})\Vert ^2 \nonumber \\&\quad =\Vert w_{2n+1}-z\Vert ^2-2\tau _{2n+1}\langle \nabla f_{2n+1}(\bar{x}_{2n+1}),w_{2n+1}-z\rangle -\Vert x_{2n+2}-w_{2n+1} \Vert ^2\nonumber \\&\qquad -2\tau _{2n+1}\langle \nabla f_{2n+1}(\bar{x}_{2n+1}),x_{2n+2}-w_{2n+1}\rangle \nonumber \\&\quad =\Vert w_{2n+1}-z\Vert ^2-2\tau _{2n+1}\langle \nabla f_{2n+1}(\bar{x}_{2n+1}),w_{2n+1}-\bar{x}_{2n+1}\rangle \nonumber \\&\qquad -2\tau _{2n+1}\langle \nabla f_{2n+1}(\bar{x}_{2n+1}),\bar{x}_{2n+1}-z\rangle -\Vert x_{2n+2}-w_{2n+1} \Vert ^2 \nonumber \\&\qquad -2\tau _{2n+1}\langle \nabla f_{2n+1}(\bar{x}_{2n+1}),x_{2n+2}-w_{2n+1}\rangle \nonumber \\&\quad =\Vert w_{2n+1}-z\Vert ^2-2\tau _{2n+1}\langle \nabla f_{2n+1}(\bar{x}_{2n+1}),\bar{x}_{2n+1}-z\rangle \nonumber \\&\qquad -2\tau _{2n+1}\langle \nabla f_{2n+1}(\bar{x}_{2n+1}),x_{2n+2}-\bar{x}_{2n+1}\rangle \nonumber \\&\qquad - \Vert x_{2n+2}-\bar{x}_{2n+1}+\bar{x}_{2n+1}-w_{2n+1}\Vert ^2\nonumber \\&\quad =\Vert w_{2n+1}-z\Vert ^2-2\tau _{2n+1}\langle \nabla f_{2n+1}(\bar{x}_{2n+1}),\bar{x}_{2n+1}-z\rangle - \Vert x_{2n+2}-\bar{x}_{2n+1}\Vert ^2 \nonumber \\&\qquad - \Vert \bar{x}_{2n+1}-w_{2n+1}\Vert ^2-2\langle \bar{x}_{2n+1}-w_{2n+1}\nonumber \\&\qquad +\tau _{2n+1} \nabla f_{2n+1}(\bar{x}_{2n+1}),x_{2n+2}-\bar{x}_{2n+1}\rangle . \end{aligned}$$
(14)

By the fact that \(I-P_{Q_{2n+1}}\) is firmly-nonexpansive and \(\nabla f_{2n+1}(z)=0\), we get

$$\begin{aligned}&2\tau _{2n+1}\langle \nabla f_{2n+1}(\bar{x}_{2n+1}),\bar{x}_{2n+1}-z\rangle = 2\tau _{2n+1}\langle \nabla f_{2n+1}(\bar{x}_{2n+1})\nonumber \\&\qquad -\nabla f_{2n+1}(z),\bar{x}_{2n+1}-z\rangle \nonumber \\&\quad = 2\tau _{2n+1}\langle A^t(I-P_{Q_{2n+1}})A\bar{x}_{2n+1}-A^t(I-P_{Q_{2n+1}})Az,\bar{x}_{2n+1}-z\rangle \nonumber \\&\quad = 2\tau _{2n+1}\langle (I-P_{Q_{2n+1}})A\bar{x}_{2n+1}-(I-P_{Q_{2n+1}})Az,A\bar{x}_{2n+1}-Az\rangle \nonumber \\&\quad \ge \frac{2\mu l}{\Vert A\Vert ^2}\Vert (I-P_{Q_{2n+1}})A\bar{x}_{2n+1} \Vert ^2. \end{aligned}$$
(15)

By (8) and the fact that \(x_{2n+2} \in C_{2n+1}\), we get

$$\begin{aligned} \langle \bar{x}_{2n+1}-w_{2n+1}+\tau _{2n+1} \nabla f_{2n+1}(w_{2n+1}),x_{2n+2}-\bar{x}_{2n+1}\rangle \ge 0. \end{aligned}$$
(16)

Consequently,

$$\begin{aligned}&-2\langle \bar{x}_{2n+1}-w_{2n+1}+\tau _{2n+1} \nabla f_{2n+1}(\bar{x}_{2n+1}),x_{2n+2}-\bar{x}_{2n+1}\rangle \nonumber \\&\quad \le 2 \langle w_{2n+1}-\bar{x}_{2n+1}-\tau _{2n+1} \nabla f_{2n+1}(\bar{x}_{2n+1}),x_{2n+2}-\bar{x}_{2n+1}\rangle \nonumber \\&\qquad +2 \langle \bar{x}_{2n+1}-w_{2n+1}+\tau _{2n+1} \nabla f_{2n+1}(w_{2n+1}),x_{2n+2}-\bar{x}_{2n+1}\rangle \nonumber \\&\quad = 2 \tau _{2n+1} \langle \nabla f_{2n+1}(w_{2n+1})-\nabla f_{2n+1}(\bar{x}_{2n+1}),x_{2n+2}-\bar{x}_{2n+1}\rangle \nonumber \\&\quad \le 2 \tau _{2n+1}\Vert \nabla f_{2n+1}(w_{2n+1})-\nabla f_{2n+1}(\bar{x}_{2n+1})\Vert \Vert x_{2n+2}-\bar{x}_{2n+1}\Vert \nonumber \\&\quad \le \tau _{2n+1}^2 \Vert \nabla f_{2n+1}(w_{2n+1})-\nabla f_{2n+1}(\bar{x}_{2n+1})\Vert ^2+\Vert x_{2n+2}-\bar{x}_{2n+1}\Vert ^2 \nonumber \\&\quad \le \mu ^2 \Vert w_{2n+1}-\bar{x}_{2n+1}\Vert ^2+\Vert x_{2n+2}-\bar{x}_{2n+1}\Vert ^2 \end{aligned}$$
(17)

Using (15), (16) and (17) in (14):

$$\begin{aligned} \Vert x_{2n+2}-z\Vert ^2\le & {} \Vert w_{2n+1}-z\Vert ^2-\frac{2\mu l}{\Vert A\Vert ^2}\Vert (I-P_{Q_{2n+1}})A\bar{x}_{2n+1} \Vert ^2 \nonumber \\&-(1-\mu ^2) \Vert w_{2n+1}-\bar{x}_{2n+1}\Vert ^2. \end{aligned}$$
(18)

Now,

$$\begin{aligned} \Vert w_{2n+1}-z\Vert ^2= & {} \Vert x_{2n+1}+\theta _{2n+1}(x_{2n+1}-x_{2n})-z\Vert ^2 \nonumber \\= & {} \Vert (1+\theta _{2n+1})(x_{2n+1}-z)-\theta _{2n+1}(x_{2n}-z)\Vert ^2 \nonumber \\= & {} (1+\theta _{2n+1})\Vert x_{2n+1}-z\Vert ^2-\theta _{2n+1}\Vert x_{2n}-z\Vert ^2 \nonumber \\&+\theta _{2n+1}(1+\theta _{2n+1})\Vert x_{2n+1}-x_{2n}\Vert ^2. \end{aligned}$$
(19)

Using similar arguments in showing (18), one can get

$$\begin{aligned} \Vert x_{2n+1}-z\Vert ^2\le & {} \Vert w_{2n}-z\Vert ^2-\frac{2\mu l}{\Vert A\Vert ^2}\Vert (I-P_{Q_{2n}})A\bar{x}_{2n} \Vert ^2 \nonumber \\&-(1-\mu ^2) \Vert w_{2n}-\bar{x}_{2n}\Vert ^2\nonumber \\= & {} \Vert x_{2n}-z\Vert ^2-\frac{2\mu l}{\Vert A\Vert ^2}\Vert (I-P_{Q_{2n}})A\bar{x}_{2n} \Vert ^2 \nonumber \\&-(1-\mu ^2) \Vert w_{2n}-\bar{x}_{2n}\Vert ^2. \end{aligned}$$
(20)

Putting (20) and (19) into (18):

$$\begin{aligned} \Vert x_{2n+2}-z\Vert ^2\le & {} \Vert x_{2n}-z\Vert ^2-\frac{2\mu l}{\Vert A\Vert ^2}(1+\theta _{2n+1})\Vert (I-P_{Q_{2n}})A\bar{x}_{2n} \Vert ^2\nonumber \\&-(1-\mu ^2)(1+\theta _{2n+1})\Vert w_{2n}-\bar{x}_{2n}\Vert ^2\nonumber \\&+\theta _{2n+1}(1+\theta _{2n+1})\Vert x_{2n+1}-x_{2n}\Vert ^2\nonumber \\&-\frac{2\mu l}{\Vert A\Vert ^2}\Vert (I-P_{Q_{2n+1}})A\bar{x}_{2n+1} \Vert ^2 \nonumber \\&-(1-\mu ^2) \Vert w_{2n+1}-\bar{x}_{2n+1}\Vert ^2. \end{aligned}$$
(21)

Observe that

$$\begin{aligned} \Vert x_{2n+1}-x_{2n}\Vert\le & {} \Vert x_{2n+1}-\bar{x}_{2n}\Vert +\Vert x_{2n}-\bar{x}_{2n}\Vert \nonumber \\\le & {} \Vert w_{2n}-\tau _{2n} \nabla f_{2n}(\bar{x}_{2n})-w_{2n}+\tau _{2n} \nabla f_{2n}(w_{2n})\Vert +\Vert x_{2n}-\bar{x}_{2n}\Vert \nonumber \\\le & {} \tau _{2n} \Vert \nabla f_{2n}(\bar{x}_{2n})-\nabla f_{2n}(w_{2n})\Vert +\Vert x_{2n}-\bar{x}_{2n}\Vert \nonumber \\\le & {} (1+\mu )\Vert x_{2n}-\bar{x}_{2n}\Vert . \end{aligned}$$
(22)

Using (22) in (21):

$$\begin{aligned} \Vert x_{2n+2}-z\Vert ^2\le & {} \Vert x_{2n}-z\Vert ^2-\frac{2\mu l}{\Vert A\Vert ^2}(1+\theta _{2n+1})\Vert (I-P_{Q_{2n}})A\bar{x}_{2n} \Vert ^2\nonumber \\&-\Big [(1-\mu ^2)(1+\theta _{2n+1})-\theta _{2n+1}(1+\theta _{2n+1})(1+\mu )^2\Big ]\Vert w_{2n}-\bar{x}_{2n}\Vert ^2\nonumber \\&-\frac{2\mu l}{\Vert A\Vert ^2}\Vert (I-P_{Q_{2n+1}})A\bar{x}_{2n+1} \Vert ^2 -(1-\mu ^2) \Vert w_{2n+1}-\bar{x}_{2n+1}\Vert ^2\nonumber \\\le & {} \Vert x_{2n}-z\Vert ^2. \end{aligned}$$
(23)

Therefore,

$$\begin{aligned} \Vert x_{2n+2}-z\Vert \le \Vert x_{2n}-z\Vert . \end{aligned}$$

\(\square \)

Theorem 3.3

Suppose that \(S\ne \emptyset \) and \(\{x_n\}\) is any sequence generated by Algorithm 1. Then \( \{ x_n \} \) converges to a point in S.

Proof

By Lemma 3.2, we have that \(\underset{n\rightarrow \infty }{\lim }\Vert x_{2n}-z\Vert \) exists and this implies that \(\{x_{2n}\}\) is bounded. Furthermore, we get from (23) that

$$\begin{aligned} \underset{n\rightarrow \infty }{\lim }\Vert (I-P_{Q_{2n}})A\bar{x}_{2n} \Vert =0. \end{aligned}$$
(24)

From (23)

$$\begin{aligned} \underset{n\rightarrow \infty }{\lim }\Vert x_{2n}-\bar{x}_{2n}\Vert =0. \end{aligned}$$
(25)

Now, since \((I-P_{Q_{2n}})\) is nonexpansive,

$$\begin{aligned} \Vert (I-P_{Q_{2n}})A\bar{x}_{2n} \Vert\le & {} \Vert (I-P_{Q_{2n}})Ax_{2n}-(I-P_{Q_{2n}})A\bar{x}_{2n} \Vert \nonumber \\&+\Vert (I-P_{Q_{2n}})A\bar{x}_{2n} \Vert \nonumber \\\le & {} \Vert Ax_{2n}-A\bar{x}_{2n} \Vert +\Vert (I-P_{Q_{2n}})A\bar{x}_{2n} \Vert \nonumber \\\le & {} \Vert A\Vert \Vert x_{2n}-\bar{x}_{2n} \Vert +\Vert (I-P_{Q_{2n}})A\bar{x}_{2n} \Vert . \end{aligned}$$
(26)

By (24) and (25), we get from (26) that

$$\begin{aligned} \underset{n\rightarrow \infty }{\lim }\Vert (I-P_{Q_{2n}})Ax_{2n} \Vert =0. \end{aligned}$$
(27)

Similarly, just like (27), one can show that

$$\begin{aligned} \underset{n\rightarrow \infty }{\lim }\Vert (I-P_{Q_{2n+1}})Ax_{2n+1} \Vert =0. \end{aligned}$$
(28)

Since \(\partial q\) is bounded on bounded sets, there exists \(\delta >0\) such that \(\Vert \zeta _{2n}\Vert \le \delta \) for all n. Since \(P_{Q_{2n}}Ax_{2n} \in Q_{2n}\), we obtain from Algorithm 1 that

$$\begin{aligned} q(Ax_{2n})\le & {} \langle \zeta _{2n}, Ax_{2n}-P_{Q_{2n}}Ax_{2n}\rangle \nonumber \\\le & {} \delta \Vert (I-P_{Q_{2n}})Ax_{2n} \Vert \rightarrow 0, n\rightarrow \infty . \end{aligned}$$
(29)

Since \(\{x_{2n}\}\) is bounded, there exists \(\{x_{2n_j}\}\subset \{x_{2n} \}\) such that \(x_{2n_j}\rightarrow x^* \in \mathbb {R}^k\). Then continuity of q and (29) imply

$$\begin{aligned} q(Ax^*) \le \underset{j\rightarrow \infty }{\liminf }\,q(Ax_{2n_j}) \le 0. \end{aligned}$$

Thus, \(Ax^* \in Q\).

By (22) and (23), we get

$$\begin{aligned} \underset{n\rightarrow \infty }{\lim }\Vert x_{2n+1}-x_{2n} \Vert =0. \end{aligned}$$
(30)

Since \(x_{2n_j+1} \in C_{2n_j} \), by the definition of \(C_{2n_j}\),

$$\begin{aligned} c(w_{2n_j}) + \langle \xi _{2n_j}, x_{2n_j+1}-w_{2n_j}\rangle \le 0, \end{aligned}$$

where \(\xi _{2n_j} \in \partial c(w_{2n_j})\). By the boundedness of \(\{\xi _{2n_j}\}\) and (30), we get

$$\begin{aligned} c(x_{2n_j})=c(w_{2n_j})\le & {} \langle \xi _{2n_j},w_{2n_j}-x_{2n_j+1}\rangle \nonumber \\\le & {} \Vert \xi _{2n_j}\Vert \Vert w_{2n_j}-x_{2n_j+1}\Vert \nonumber \\= & {} \Vert \xi _{2n_j}\Vert \Vert x_{2n_j}-x_{2n_j+1}\Vert \rightarrow 0, j\rightarrow \infty . \end{aligned}$$
(31)

By continuity of c and \(x_{2n_j}\rightarrow x^*\), we get from (31) that

$$\begin{aligned} c(x^*) \le \underset{j\rightarrow \infty }{\liminf }c(x_{2n_j})\le 0. \end{aligned}$$

Thus, \(x^* \in C\). Therefore, \(x^* \in S\).

We next show that the sequence of odd terms \(\{x_{2n+1} \}\) also converges to \(x^*\). First note that, since \(\underset{n\rightarrow \infty }{\lim }\Vert x_{2n}-x^*\Vert \) exists and \(\underset{j\rightarrow \infty }{\lim }\Vert x_{2n_j}-x^*\Vert =0\), we get \(\underset{n\rightarrow \infty }{\lim }\Vert x_{2n}-x^*\Vert =0\); that is, the whole even subsequence \(\{x_{2n}\}\) converges to \(x^*\).

Following the same arguments as in (14)-(18), one can show that

$$\begin{aligned} \Vert x_{2n+1}-x^*\Vert ^2= & {} \Vert P_{C_{2n}}(w_{2n}-\tau _{2n} \nabla f_{2n}(\bar{x}_{2n}))-x^*\Vert ^2\nonumber \\\le & {} \Vert w_{2n}-x^*\Vert ^2-2\tau _{2n}\langle \nabla f_{2n}(\bar{x}_{2n}),\bar{x}_{2n}-x^*\rangle - \Vert x_{2n+1}-\bar{x}_{2n}\Vert ^2 \nonumber \\&- \Vert \bar{x}_{2n}-w_{2n}\Vert ^2-2\langle \bar{x}_{2n}-w_{2n}\nonumber \\&+\tau _{2n} \nabla f_{2n}(\bar{x}_{2n}),x_{2n+1}-\bar{x}_{2n}\rangle \\\le & {} \Vert w_{2n}-x^*\Vert ^2-\frac{2\mu l}{\Vert A\Vert ^2}\Vert (I-P_{Q_{2n}})A\bar{x}_{2n} \Vert ^2 \nonumber \\&-(1-\mu ^2) \Vert w_{2n}-\bar{x}_{2n}\Vert ^2\\\le & {} \Vert w_{2n}-x^*\Vert ^2. \end{aligned}$$

Therefore,

$$\begin{aligned} \Vert x_{2n+1}-x^*\Vert \le \Vert w_{2n}-x^*\Vert =\Vert x_{2n}-x^*\Vert . \end{aligned}$$
(32)

Thus,

$$\begin{aligned} \underset{n\rightarrow \infty }{\lim }\Vert x_{2n+1}-x^*\Vert =0. \end{aligned}$$

Therefore, \(\underset{n\rightarrow \infty }{\lim }x_n=x^*\) and the desired result is obtained. \(\square \)

We give the following remark on our results.

Remark 3.4

  1. (a)

    When vanilla inertial extrapolation step (the case when \(w_n\) in (11) is computed as \(w_n= x_n+\theta _n(x_n-x_{n-1}), \forall n \ge 1\)) is added to methods for solving SCFP (1), the Fejér monotonicity of the generated sequence \(\{x_n\}\) with respect to S is lost. Here in our results in Lemma 3.2, we recover the Fejér monotonicity of \(\{x_{2n}\}\) with respect to S. This is one of the interesting properties of methods with alternated extrapolation step for solving SCFP (1).

  2. (b)

    Our methods of proof in Lemma 3.2 and Theorem 3.3 are simpler and different from the methods of proof given in other papers (see, e.g., [16,17,18, 44,45,46,47, 51]) which solve SCFP (1) using methods with vanilla inertial extrapolation step.

\(\Diamond \)

4 Numerical experiments

In this section, we use the SCFP (1) to model two real problems, the first is the recovery of a sparse signal and the second is image deblurring.

We make use of the well-known LASSO problem [48], which reads as follows:

$$\begin{aligned} \min \left\{ \frac{1}{2}\Vert Ax-b\Vert ^2_2: x\in \mathbb {R}^k, \Vert x\Vert _1 \le t \right\} , \end{aligned}$$
(33)

where \(A \in \mathbb {R}^{m\times k}, m < k, b \in \mathbb {R}^m\) and \(t > 0\). Problem (33) has the potential of yielding a sparse solution of the SCFP (1) due to the \(\ell _1\) constraint.

Example 4.1

The first problem focuses on finding a sparse solution of the SCFP (1). We illustrate the advantages of our proposed scheme by comparing it with some related results in the literature, such as the methods in [23, 40, 50]. For the experiments, the matrix A is generated from a normal distribution with zero mean and unit variance. The true sparse signal \(x^*\) is generated from a uniform distribution on the interval \([-2,2]\), with K randomly chosen nonzero positions, while the remaining entries are kept zero. The sample data are \(b = Ax^*\) (no noise is assumed).

In the implementations we choose the parameters \(\gamma =1\), \(l=\mu = 0.5\), and the constant stepsize \(0.9*(2/L)\) for the relaxed CQ algorithm [50]. These parameter choices are arbitrary but theoretically valid; the goal here is simply to illustrate the performance of the methods. Clearly, in a real-world scenario, a deeper investigation involving intensive numerical simulations would be required to guarantee optimal performance. We limit the number of iterations to 1000 and report “Err”, defined as \(\Vert x_{n+1} -x_n \Vert \), as well as the objective function value (“Obj”).

Table 1 Numerical results obtained by all 4 CQ variants with \(m =120, n =512\)
Fig. 1

The recovered sparse signal versus the original for the 4 CQ variants with \(m=120, n = 512\) and \(K=10\)

Fig. 2

The recovered sparse signal versus the original for the 4 CQ variants with \(m=120, n = 512\) and \(K=20\)

Under certain conditions on the matrix A, the solution of the minimization problem (33) is equivalent to the minimal \(\ell _0\)-norm solution of the underdetermined linear system. For the considered SCFP (1), we define \(C = \{x\in \mathbb {R}^k :\Vert x\Vert _1\le t \}\) and \(Q= \{b\}\). Instead of projecting onto the closed and convex set C (for which no closed formula exists), we use a subgradient projection. So, define the convex function \(c(x):=\Vert x\Vert _1-t\) and let \(C_n\) be defined by

$$\begin{aligned} C_n=\{x \in \mathbb {R}^k:c(w_n)+\langle \xi _n, x-w_n\rangle \le 0\}, \end{aligned}$$

where \(\xi _n \in \partial c(w_n)\). It can easily be seen that a subgradient \(\xi \in \partial c(x)\) at \(x\in \mathbb {R}^k\) can be chosen element-wise as

$$\begin{aligned} \xi _i= \left\{ \begin{array}{ll} 1, &{} x_i>0,\\ \text {any } s\in [-1,1], &{} x_i=0,\\ -1, &{} x_i<0. \end{array} \right. \end{aligned}$$

Now, the orthogonal projection of a point \(x\in \mathbb {R}^k\) onto \(C_n\) can be calculated as follows:

$$\begin{aligned} P_{C_n}(x)= \left\{ \begin{array}{lllll} &{} x, &{}c(w_n)+\langle \xi _n, x-w_n\rangle \le 0,\\ &{} x-\frac{c(w_n)+\langle \xi _n, x-w_n\rangle }{\Vert \xi _n\Vert ^2}\xi _n, &{}\mathrm{otherwise}. \end{array} \right. \end{aligned}$$
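Putting the two previous displays together, the relaxed projection used in this example can be sketched in a few lines; the vectors w, x and the radius t below are hypothetical test values.

```python
import numpy as np

def subgrad_l1(w):
    """A subgradient of c(x) = ||x||_1 - t at w: sign(w) element-wise
    (np.sign gives 0 where w_i = 0, a valid choice in [-1, 1])."""
    return np.sign(w)

def proj_Cn(x, w, t):
    """Relaxed projection P_{C_n}(x), C_n = {x : c(w) + <xi, x - w> <= 0}."""
    xi = subgrad_l1(w)
    val = (np.linalg.norm(w, 1) - t) + np.dot(xi, x - w)
    if val <= 0:
        return x                                  # x already lies in C_n
    return x - (val / np.dot(xi, xi)) * xi        # assumes xi != 0

w = np.array([2.0, -1.0, 0.5])
x = np.array([3.0, -2.0, 1.0])
p = proj_Cn(x, w, t=1.0)
# the projected point satisfies the defining inequality of C_n
xi = subgrad_l1(w)
val = (np.linalg.norm(w, 1) - 1.0) + np.dot(xi, p - w)
print(val <= 1e-10)
```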
Fig. 3

The recovered sparse signal versus the original for the 4 CQ variants with \(m=120, n = 512\) and \(K=30\)

Fig. 4

The objective function value for different values of \(\{\theta _n\}_{n=1}^\infty \)

In Table 1 we summarize the results, and in Figs. 1, 2 and 3 we plot the exact K-sparse signal against the recovered signals and the objective function values obtained by the different methods. One can clearly see that the inertial term plays a significant role in achieving a better solution with respect to a lower objective value and CPU time for the same number of iterations.

Next, for \(K=20\), we illustrate the influence of the inertial parameter \(\theta \) as it approaches 1, as a function of \(\mu \rightarrow 0\) (taken as \(\frac{1}{n}\)). In Fig. 4 we plot the value of the objective function \(\frac{1}{2}\Vert Ax-b\Vert ^2_2\) after 1000 iterations for different values of \(\{\theta _n\}_{n=1}^\infty \); the other parameters are chosen as above.

Fig. 5

Recovered images via the different algorithms

Example 4.2

In this example we apply our algorithm to an image deblurring problem. Given a convolution matrix \(A\in \mathbb {R}^{m\times k}\) and an unknown original image \(x\in \mathbb {R}^k\), we observe the known degraded image \(b\in \mathbb {R}^m\). Including unknown additive random noise \(v\in \mathbb {R}^m\), we obtain the following image recovery problem.

$$\begin{aligned} Ax=b+v. \end{aligned}$$
(34)

This problem clearly fits into the setting of the SCFP with \(C=\mathbb {R}^k\); if no noise is included in the observed image b, then \(Q=\{b\}\) is a singleton, and otherwise \(Q=\{y\in \mathbb {R}^m \mid \Vert y-(b+v)\Vert \le \varepsilon \}\) for small enough \(\varepsilon >0\).

We illustrate the effectiveness and performance of our proposed Algorithm 1 compared with [40, Alg. 4.1.] and the very recent result of Padcharoen et al. [35, Alg. 1], which is an inertial Tseng method. The test image is the Lenna image (https://en.wikipedia.org/wiki/Lenna), degraded by a \(9\times 9\) Gaussian random blur and random noise. Although this problem's structure differs from Example 4.1, for simplicity we use the same parameter settings for Algorithm 1 and [40, Alg. 4.1.], while for [35, Alg. 1] we use the same choices as its authors, that is, the inertial term \(\alpha _n=0.9\) and the step size \(\lambda _n=0.5-\frac{150n}{1000n+100}\). In Figs. 5 (a)-(k) we report all results, including the recovered images produced by the different algorithms, the difference between successive iterations, and the signal-to-noise ratio (SNR\(=10\log \frac{\Vert x\Vert _2^2}{\Vert x-x_n\Vert _2^2}\)) with respect to the number of iterations.

The CPU time in seconds of the tested algorithms is reported in Table 2.

Table 2 Execution time of the different algorithms

From Fig. 5 and Table 2 it can be seen that the inertial methods, Algorithm 1 and [35, Alg. 1], generate reasonable and comparable results after only 30 iterations compared with the non-inertial method [40, Alg. 4.1.]. The two major advantages of our proposed Algorithm 1 over the other two algorithms are the higher SNR value and the lower CPU time for generating the recovered image.

5 Final remarks

In this paper, we give a global convergence result for the split convex feasibility problem using a relaxed CQ method with an alternated inertial extrapolation step. Our results extend and generalize some existing results in the literature, and the preliminary numerical results indicate that our proposed method outperforms several existing relaxed CQ methods for solving the SCFP.