Abstract

In this paper, we reformulate the gridless direction of arrival (DoA) estimation problem as a novel reweighted covariance fitting (CF) method. The proposed method promotes joint sparsity among different snapshots by means of a nonconvex Schatten-p quasi-norm penalty. Furthermore, to obtain a more tractable and scalable optimization problem, we apply the unified surrogate for the Schatten-p quasi-norm based on two-factor matrix norms. A locally convergent iterative reweighted minimization method is then derived and solved efficiently as a semidefinite program using an off-the-shelf optimization toolbox. Finally, numerical simulations are carried out under an unknown nonuniform noise background and a coprime array (CPA) structure. The results illustrate the superiority of the proposed method in terms of resolution, robustness against nonuniform noise and source correlations, and applicability with a limited number of snapshots.

1. Introduction

DoA estimation is a major task in array signal processing, with wide applications in radar, sonar, wireless communications, underwater acoustics, and seismology [1–3]. The main goal of DoA estimation is to retrieve the direction information of several sources from the outputs of an antenna array. The study of DoA estimation has a long history. One outstanding class of methods comprises the subspace-based methods, such as root multiple signal classification (RootMUSIC) [4], estimation of signal parameters via rotational invariance techniques (ESPRIT) [5], and their variants. In general, these methods offer high estimation accuracy at low computational complexity, but they suffer from considerable limitations. For example, they require a priori knowledge of the number of sources, which may be difficult to acquire in practice. Moreover, they need a sufficient number of data snapshots to estimate the data covariance matrix precisely. In addition, they are sensitive to source correlations, which may lead to a rank deficiency in the sample data covariance matrix.

With the development of sparse representation and compressed sensing (CS) [6, 7], the last decade has seen remarkable advances, resulting in a number of sparse methods for DoA estimation. Sparse methods can be applied in several challenging scenarios, including cases with no knowledge of the number of sources, a limited number of snapshots, and highly correlated sources. The early grid-based sparse methods focused on discrete linear systems, meaning that the DoAs have to lie on a set of predefined grid points so that the collected snapshots can be sparsely represented under a finite discrete dictionary. However, in reality, the DoAs are continuously valued and related nonlinearly to the measured data; hence, there is no guarantee that the true DoAs lie on the grid. This results in a grid or basis mismatch problem [8, 9]. To address this issue, several off-grid methods have been proposed [10–12]. However, these methods introduce additional complexity, while grid selection remains a major problem since gridding still exists. Recently, gridless sparse methods have been developed [13, 14]. These methods eliminate the requirement of gridding by working directly with a continuous, infinite dictionary and recover the DoAs by solving a semidefinite programming (SDP) problem.

The gridless sparse methods for DoA estimation fall into two methodologies. The first is based on the concept of atomic norm minimization (ANM) [15], while the second makes use of a sparse representation of the second-order statistics of the array output and is designated gridless CF.

ANM has been successfully applied to estimating a spectrally sparse signal without any statistical assumptions in [12, 16]. It has been shown that the frequencies can be exactly recovered if they are sufficiently separated, with the minimum required separation inversely proportional to the size of the full data; this separation condition was later relaxed in [17]. Initially, the atomic norm soft-thresholding (AST) method was introduced in [18] for the full-data case in the presence of stochastic noise. Then, the study was extended to the case of compressive data for a single measurement vector (SMV), where the problem was referred to as continuous CS (CCS) in [12]. Next, CCS was extended to the case of multiple measurement vectors (MMVs) in [19]. In order to alleviate the resolution limit imposed by the separation condition, the reweighted atomic-norm minimization (RAM) was proposed in [20], with applicability to DoA estimation.

The gridless CF technique in DoA estimation is not new. The covariance matching estimation technique (COMET) in [21] and the asymptotic maximum likelihood (AML) technique in [22] are both consistent gridless DoA estimators as the signal-to-noise ratio (SNR) increases. In addition, the continuous sparse recovery (CSR) method [23] and the nuclear norm minimization method followed by MUSIC (NNM-MUSIC) [13] do not require the number of sources to estimate the DoAs precisely, although knowledge of error bounds is essential. The gridless sparse iterative covariance-based estimation (GL-SPICE or GLS) method [24] (also referred to as the sparse and parametric approach (SPA) [25]) has many merits: it is a hyperparameter-free method that can estimate the noise variance(s) consistently, it is applicable to an arbitrary number of snapshots, and it is robust against moderate source correlations. However, in the case of finite snapshots, GLS suffers from the separation condition due to its theoretical relation with ANM. Recently, the covariance matrix reconstruction approach (CMRA) was developed in [26] as a super-resolution method. Nevertheless, CMRA faces difficulties in scenarios with highly correlated sources. Furthermore, the study in [27] proposed an improved version of CMRA (ICMRA) by applying a family of nonconvex penalties to the singular values of the covariance matrix. ICMRA has shown an enhancement in sparsity and resolution compared with CMRA. On the other hand, ICMRA suffers from unsatisfactory performance at low SNR and with a small number of snapshots. A comparison of the abovementioned methods is provided in Table 1.

Concerning noise, many sparse methods assume that the antenna array noises are spatially uniform and white with equal variances. This assumption helps in choosing a suitable regularization parameter in the sparse representation optimization formula. In several practical applications, this assumption may be violated, since the noise levels among the antennas exhibit perturbations due to imperfect array calibration, nonidealities of the communication channels, or mutual coupling between antennas. In many sparse methods, the antennas are sparsely placed, so their noises are spatially uncorrelated and modeled by a diagonal covariance matrix [28, 29]; however, the antenna noise variances are no longer identical (i.e., the noise is nonuniform). Such a noise model is well handled by CF-based methods [14], while ANM-based methods are favored when the noise is uniform with bounded power or when the sources are highly correlated. In fact, nonuniform noise is a challenging problem for sparse estimation methods, since the sample covariance matrix is no longer low-rank. To deal with this problem, recent research removes the noise in two main ways. The first applies a linear transform and then solves a noise-free optimization problem, as in [30, 31]. However, this approach requires designing a linear transform able to separate the noise from the signal, which is infeasible in general, especially when the transformation requires prior knowledge of the noise parameters. The second treats the noise variances as variables to be optimized within the optimization problem [14, 25]. In this paper, the latter approach is adopted.

Despite the convexity of the preliminary gridless optimization models, it has been shown that the estimation performance of such convex relaxations degrades in the presence of measurement noise or when some assumptions are violated (e.g., the spatial source separation limit and signal correlation). Hence, the solution may significantly deviate from that of the original problem (i.e., rank minimization). In order to achieve a better approximation of the original solution, many nonconvex surrogate functions have been proposed, including the $\ell_p$-norm, Logarithm, Laplace, and Schatten-$p$ norm penalties, which are well studied in the literature [32, 33]. In [27], the DoA problem was formulated as a nonconvex minimization problem using a family of nonconvex penalties on the singular values of the covariance matrix, resulting in an enhancement in both sparsity and resolution. Recently, we proposed a high-resolution gridless method for noiseless joint sparse frequency estimation named modified reweighted atomic norm minimization (MRAM) [34]. MRAM is based on a nonconvex relaxation of ANM by the Schatten-p norm. This approximation offered improved performance compared with the trace norm relaxation. Hence, owing to the one-to-one relation between frequency and direction information, we aim to achieve better gridless DoA estimation using the Schatten-p quasi-norm penalty.

3. Main Contributions

Since gridless CF is preferable in the presence of a nonuniform noise background, in this paper we develop a new gridless sparse method for DoA estimation based on recently developed gridless CF methods. By exploiting the positive semidefinite (PSD) and Toeplitz structure of the parameterized data covariance matrix, a gridless CF criterion is formulated as an efficient joint sparsity optimization problem. More specifically, we propose a novel objective function that introduces the nonconvex Schatten-p quasi-norm as a sparsity metric. Then, for a more tractable and scalable optimization problem, the unified surrogate for the Schatten-p quasi-norm based on two-factor matrix norms is applied. Consequently, an iterative reweighted minimization method is derived with an SDP formulation. Given the major role of the array geometry, and in order to acquire a high number of degrees of freedom with less mutual coupling and a reasonable number of physical antennas, the coprime array (CPA) structure [23, 35] is adopted. The new method has been tested numerically, and the results show improved performance compared with state-of-the-art methods.

Notations 1. Throughout this paper, bold letters are reserved for matrices and vectors. The transpose is denoted by $(\cdot)^T$, and the complex conjugate transpose (Hermitian) is denoted by $(\cdot)^H$. $\mathrm{tr}(\cdot)$ and $\mathrm{rank}(\cdot)$ represent the matrix trace and rank, respectively. $E[\cdot]$ stands for the expectation. $\|\cdot\|_1$, $\|\cdot\|_*$, and $\|\cdot\|_F$ refer to the $\ell_1$, nuclear, and Frobenius norms, respectively. $\mathbf{A} \succeq 0$ means that $\mathbf{A}$ is a positive semidefinite matrix.

4. Problem Formulation

Assume that $K$ uncorrelated narrow-band far-field source signals $s_k(t)$, $k = 1, \dots, K$, impinge from directions $\boldsymbol{\theta} = [\theta_1, \dots, \theta_K]^T$ on a sparse linear CPA. In particular, consider a coprime number pair $(M, N)$ with $M < N$. The CPA consists of two uniform linear subarrays: the first consists of $M$ elements with interelement spacing $Nd$, whereas the other consists of $N$ elements with interelement spacing $Md$, where the unit spacing $d$ is chosen to be half of the wavelength, $\lambda/2$. According to the properties of coprime integers, the elements of the two subarrays do not overlap with each other when they are aligned, except for the first element. Therefore, the CPA consists of $M + N - 1$ physical antennas in total. The CPA output data matrix can be considered as the output of a virtual uniform linear array (ULA) in which only the antennas indexed by the set $\Omega$ are retained. The observation model of the CPA at snapshot $t$ can be written as

$$\mathbf{y}(t) = \mathbf{A}(\boldsymbol{\theta})\,\mathbf{s}(t) + \mathbf{n}(t), \quad t = 1, \dots, L, \tag{1}$$

where $L$ is the number of snapshots, which are jointly sparse in the sense that they share the same support in which the DoAs are encoded, and $\mathbf{y}(t)$, $\mathbf{s}(t)$, and $\mathbf{n}(t)$ are the array output, the source signal vector, and the noise vector, respectively. $\mathbf{A}(\boldsymbol{\theta}) = [\mathbf{a}(\theta_1), \dots, \mathbf{a}(\theta_K)]$ denotes the steering matrix, where $\mathbf{a}(\theta_k)$ is the steering vector of the $k$th source, which is determined by the geometry of the antenna array and is given by

$$\mathbf{a}(\theta_k) = \left[e^{j\frac{2\pi}{\lambda} d_1 \sin\theta_k}, \dots, e^{j\frac{2\pi}{\lambda} d_{|\Omega|} \sin\theta_k}\right]^T, \tag{2}$$

where $d_i$ is the physical position of the $i$th CPA element. Furthermore, it is assumed that the sources $s_k(t)$ are spatially and temporally uncorrelated, with independently and uniformly distributed phases. It follows that $E[\mathbf{s}(t)\mathbf{s}^H(t)] = \mathrm{diag}(\mathbf{p})$, where $\mathbf{p} = [p_1, \dots, p_K]^T$ is the vector of the source powers.

Moreover, suppose the noise is independent of the sources and satisfies $E[\mathbf{n}(t)\mathbf{n}^H(t)] = \mathrm{diag}(\boldsymbol{\sigma})$, where $\boldsymbol{\sigma} = [\sigma_1^2, \dots, \sigma_{|\Omega|}^2]^T$ denotes the noise variance vector, whose elements are assumed to be unequal in this study. Briefly, the array output for $L$ snapshots can be modeled as

$$\mathbf{Y} = \mathbf{A}(\boldsymbol{\theta})\,\mathbf{S} + \mathbf{N}, \tag{3}$$

where $\mathbf{Y} = [\mathbf{y}(1), \dots, \mathbf{y}(L)]$, $\mathbf{S}$ is the source signal matrix, and $\mathbf{N}$ is the measurement noise matrix.
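To make the data model concrete, the following MATLAB sketch (not the paper's code) generates a CPA and simulates snapshots according to equations (1)–(3). It assumes the common coprime construction with one $M$-element subarray spaced $Nd$ and one $N$-element subarray spaced $Md$ sharing the first sensor; the pair $M = 3$, $N = 5$, the source directions, and the noise variances are illustrative choices.

```matlab
% Illustrative CPA data model (assumed construction; values are not from the paper).
M = 3; N = 5;                                   % assumed coprime pair
pos   = unique([(0:M-1)*N, (0:N-1)*M]);         % physical positions in units of d = lambda/2
Omega = pos + 1;                                % 1-based indices on the virtual ULA grid

K = 2; L = 100;                                 % number of sources and snapshots
theta = [-10; 12] * pi/180;                     % true DoAs in radians
p     = [1; 1];                                 % source powers
sig   = 0.5 + rand(numel(pos), 1);              % unequal (nonuniform) noise variances

A  = exp(1j*pi * pos(:) * sin(theta)');         % steering matrix, equation (2), d = lambda/2
S  = sqrt(p/2)   .* (randn(K, L) + 1j*randn(K, L));                   % uncorrelated sources
Nz = sqrt(sig/2) .* (randn(numel(pos), L) + 1j*randn(numel(pos), L)); % nonuniform noise
Y  = A*S + Nz;                                  % array output, equation (3)
Rhat = (Y*Y')/L;                                % sample covariance matrix
```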

Concerning the main goal of DoA estimation, in this paper we seek to estimate the unknown parameters (the DoAs and the source powers) given the measurement matrix $\mathbf{Y}$.

5. Gridless CF with Nonconvex Schatten-p Minimization

In this section, we propose a new gridless method to estimate these parameters by introducing the nonconvex Schatten-p norm regularization into the same CF criterion as in the sparse iterative covariance-based estimation (SPICE) method [36] and its gridless version (GL-SPICE or GLS) [24, 25]. The proposed method inherits the merits of GLS with enhanced performance, as verified by the numerical simulations in Section 6.

5.1. Gridless CF

The key idea of gridless CF is to exploit the structure of the data covariance matrix and perform the estimation in a gridless, parameterized domain under specific statistical considerations (i.e., uncorrelated sources and a reasonable number of snapshots). Based on the assumptions in Section 4, the uncorrelated data have the data covariance matrix

$$\mathbf{R} = E\left[\mathbf{y}(t)\mathbf{y}^H(t)\right] = \mathbf{A}(\boldsymbol{\theta})\,\mathrm{diag}(\mathbf{p})\,\mathbf{A}^H(\boldsymbol{\theta}) + \mathrm{diag}(\boldsymbol{\sigma}). \tag{4}$$

Note that $\mathbf{R}^{-1}$ exists in the presence of noise, since $\boldsymbol{\sigma} > 0$. However, $\mathbf{R}$ is a highly nonlinear function of the DoA parameters. To linearize the problem, $\mathbf{R}$ is reparameterized using a PSD Toeplitz matrix. Hence, according to the Vandermonde decomposition lemma (see, e.g., [3]), $\mathbf{R}$ can be linearly reparameterized as

$$\mathbf{R} = \boldsymbol{\Gamma}\, T(\mathbf{u})\, \boldsymbol{\Gamma}^H + \mathrm{diag}(\boldsymbol{\sigma}), \tag{5}$$

where $T(\mathbf{u}) \succeq 0$ is a PSD Toeplitz matrix formed from the generator vector $\mathbf{u}$, and $\boldsymbol{\Gamma}$ denotes a selection matrix such that the $i$th row of $\boldsymbol{\Gamma}$ contains all zeros except a single 1 at the $\Omega_i$th position. The importance of the parameterization in equation (5) is that $\mathbf{R}$ is linear in the vector $[\mathbf{u}^T, \boldsymbol{\sigma}^T]^T$, which comprises the parameters of interest.
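As a quick illustration of equation (5), the sketch below builds the selection matrix and a PSD Toeplitz matrix generated by the source parameters, under the same assumed CPA construction as the earlier snippet; the variable names are illustrative.

```matlab
% Sketch of the reparameterization R = Gamma*T(u)*Gamma' + diag(sigma) in equation (5).
M = 3; N = 5;                                    % assumed coprime pair (as before)
pos = unique([(0:M-1)*N, (0:N-1)*M]); Omega = pos + 1;
theta = [-10; 12]*pi/180; p = [1; 1]; sig = 0.5 + rand(numel(pos), 1);

nv    = max(pos) + 1;                            % size of the underlying virtual ULA
Iv    = eye(nv);
Gamma = Iv(Omega, :);                            % i-th row: a single 1 at the Omega(i)-th position

Av = exp(1j*pi * (0:nv-1)' * sin(theta)');       % virtual ULA steering matrix
T  = Av * diag(p) * Av';                         % PSD Hermitian Toeplitz (Vandermonde structure)
u  = T(:, 1);                                    % generator vector of T(u)
R  = Gamma*T*Gamma' + diag(sig);                 % covariance: linear in (u, sigma)
```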

Recall that the sample covariance matrix is $\widehat{\mathbf{R}} = \frac{1}{L}\mathbf{Y}\mathbf{Y}^H$. In this paper, we are mainly interested in the case where the number of snapshots is at least the number of physical antennas (so that $\widehat{\mathbf{R}}^{-1}$ exists with probability one) and the number of sources is smaller than the size of the virtual ULA (so that $T(\mathbf{u})$ is rank-deficient and $\mathbf{u}$ can be uniquely determined [25]). As well, we consider the same CF criterion as in the SPICE method, which is given by

$$f(\mathbf{u}, \boldsymbol{\sigma}) = \left\| \mathbf{R}^{-1/2} \left( \widehat{\mathbf{R}} - \mathbf{R} \right) \widehat{\mathbf{R}}^{-1/2} \right\|_F^2. \tag{6}$$

By using the linear parameterization in equation (5), the CF can be accomplished by minimizing $f$, leading to the following equivalences:

$$\min_{\mathbf{u},\boldsymbol{\sigma}} f \;\Longleftrightarrow\; \min_{\mathbf{u},\boldsymbol{\sigma}} \; \mathrm{tr}\big(\mathbf{R}^{-1}\widehat{\mathbf{R}}\big) + \mathrm{tr}\big(\widehat{\mathbf{R}}^{-1}\mathbf{R}\big) \tag{7}$$

$$\Longleftrightarrow\; \min_{\mathbf{u},\boldsymbol{\sigma}} \; \mathrm{tr}\big(\widehat{\mathbf{R}}^{1/2}\mathbf{R}^{-1}\widehat{\mathbf{R}}^{1/2}\big) + \mathrm{tr}\big(\widehat{\mathbf{R}}^{-1}\mathbf{R}\big) \tag{8}$$

$$\Longleftrightarrow\; \min_{\mathbf{X},\mathbf{u},\boldsymbol{\sigma}} \; \mathrm{tr}(\mathbf{X}) + \mathrm{tr}\big(\widehat{\mathbf{R}}^{-1}\mathbf{R}\big) \quad \text{s.t.} \quad \begin{bmatrix} \mathbf{X} & \widehat{\mathbf{R}}^{1/2} \\ \widehat{\mathbf{R}}^{1/2} & \mathbf{R} \end{bmatrix} \succeq 0, \;\; T(\mathbf{u}) \succeq 0, \;\; \boldsymbol{\sigma} \geq 0, \tag{9}$$

with $\mathbf{R} = \boldsymbol{\Gamma}\,T(\mathbf{u})\,\boldsymbol{\Gamma}^H + \mathrm{diag}(\boldsymbol{\sigma})$ as in equation (5).

Hence, the minimization of $f$ can be cast as an SDP, and the problem is convex. Problem (9) was derived in the development of the GLS method. In particular, the GLS method transforms the direction estimation problem into the estimation of a PSD Toeplitz matrix in which the direction parameters are encoded. Therefore, once the SDP in problem (9) is solved, the parameters of interest can be obtained from the estimated covariance matrix, either by using conventional subspace methods or by applying the Vandermonde decomposition lemma.
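For concreteness, the following CVX sketch sets up a GLS/SPA-type covariance-fitting SDP of the kind referred to in problem (9). It assumes the standard formulation of [24, 25] (an epigraph variable for the first trace term via a Schur complement); the function name, interface, and solver flags are illustrative rather than the paper's exact code.

```matlab
% Hedged sketch of a GLS/SPA-type covariance-fitting SDP (assumed standard form).
% Inputs: Rhat (sample covariance), Gamma (CPA selection matrix).
function [u, sig] = gls_type_sdp(Rhat, Gamma)
    [m, nv] = size(Gamma);                  % m physical antennas, nv virtual ULA size
    Rh = sqrtm(Rhat); Rh = (Rh + Rh')/2;    % Hermitian square root of Rhat
    cvx_begin sdp quiet
        variable T(nv, nv) hermitian toeplitz
        variable sig(m) nonnegative
        variable X(m, m) hermitian
        R = Gamma*T*Gamma' + diag(sig);     % parameterized covariance, equation (5)
        minimize( real(trace(X)) + real(trace(Rhat \ R)) )
        subject to
            [X, Rh; Rh, R] >= 0;            % epigraph of tr(Rhat^(1/2)*inv(R)*Rhat^(1/2))
            T >= 0;
    cvx_end
    u = T(:, 1);                            % generator of the estimated Toeplitz matrix
end
```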

In spite of the strong theoretical guarantees of the GLS method, it suffers from considerable limitations, as summarized in Table 1. In fact, we believe that its performance can be further improved by employing a nonconvex penalty, which, to the best of our knowledge, has not been studied in the literature. Note that the nonconvex Schatten-p quasi-norm regularization is able to achieve a more accurate and sparser solution than the trace norm. In this paper, we reformulate the minimization of $f$ as a new objective function by introducing the Schatten-p quasi-norm. Then, the resulting optimization problem is modeled as an SDP.

5.2. Schatten-p Norm

Let $\|\mathbf{X}\|_{S_p}$ denote the Schatten-$p$ norm of a matrix $\mathbf{X}$, defined as the $\ell_p$ norm of its singular values $\sigma_i(\mathbf{X})$, i.e.,

$$\|\mathbf{X}\|_{S_p} = \left( \sum_{i} \sigma_i^p(\mathbf{X}) \right)^{1/p}. \tag{10}$$

Recently, the Schatten-p norm has received significant attention from researchers in various domains [27, 33] due to its outstanding properties, which make it a good choice in many optimization problems (e.g., rank function relaxation). Specifically, $\|\mathbf{X}\|_{S_p}$ equals the nuclear norm when $p = 1$. Moreover, as $p \to 0$, the quasi-norm (for $0 < p < 1$, $\|\cdot\|_{S_p}$ is a quasi-norm because it does not satisfy the triangle inequality) tends toward the rank function, in the sense that $\|\mathbf{X}\|_{S_p}^p \to \mathrm{rank}(\mathbf{X})$. Furthermore, Zhang et al. [37] verified theoretically that Schatten-$p$ minimization with small $p$ requires significantly fewer measurements and outperforms the conventional nuclear norm in the majority of problems.
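A tiny numeric illustration of definition (10) and the limiting behavior just described, using generic test data rather than the paper's matrices:

```matlab
% Schatten-p quasi-norm of a generic matrix: p = 1 gives the nuclear norm, and
% sum(sigma_i^p) approaches rank(X) as p -> 0 (tiny numerical singular values dropped).
schatten_p = @(X, p) sum(svd(X).^p)^(1/p);      % definition (10)

X = randn(6, 4) * randn(4, 6);                  % rank-4 test matrix
s = svd(X); s = s(s > 1e-10);                   % discard numerically zero singular values
fprintf('nuclear norm      : %.4f\n', sum(s));
fprintf('Schatten-1 norm   : %.4f\n', schatten_p(X, 1));   % matches the nuclear norm
for p = [0.5, 0.1, 0.01]
    fprintf('sum(s.^p), p=%.2f : %.4f\n', p, sum(s.^p));    % tends to rank(X) = 4
end
```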

5.3. The Proposed Method

Motivated by the properties of the Schatten-p norm and the GLS method, we develop a new method by replacing the trace (nuclear-norm-like) term in problem (8) with its Schatten-p quasi-norm counterpart, leading to the optimization problem (11), in which a regularization parameter weights the Schatten-p penalty applied to a PSD matrix.

According to equation (10), the Schatten-p penalty in Problem (11) can be written as in equation (12), in terms of the descendingly sorted singular values of the penalized PSD matrix.

Since $0 < p < 1$, for any $p_1, p_2 > 0$ satisfying

$$\frac{1}{p} = \frac{1}{p_1} + \frac{1}{p_2}, \tag{13}$$

the bi-Schatten-p norm surrogate can be applied [38]. In other words, the Schatten-p quasi-norm of a matrix equals the minimum, over all two-factor decompositions of that matrix, of the product of the Schatten-$p_1$ norm (or quasi-norm) and the Schatten-$p_2$ norm (or quasi-norm) of the two factor matrices. Therefore, as an alternative to the nonconvex quasi-norm in (12), the efficient factorization in Problem (14) is used.

Notice that one of the two factor matrices is fixed; therefore, whatever its norm value is when minimizing Problem (14), one only needs to minimize the Schatten norm of the remaining free factor. Substituting the spectral regularization in Problem (11) with the surrogate in Problem (14) and choosing the exponents accordingly, we obtain Problem (15). With respect to equation (13), for some choices of the exponents Problem (15) is strongly convex and can thus be easily solved by off-the-shelf algorithms; otherwise, Problem (15) becomes nonconvex and must be relaxed. However, we empirically found that the latter (nonconvex) problem has superior properties in terms of estimation performance and solution sparsity. Therefore, in the following, we consider only the nonconvex model and work on the relaxation of Problem (15).
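The two-factor surrogate can be checked numerically. The snippet below verifies, for generic data and the choice $p_1 = p_2 = 1$ (so $p = 1/2$ by equation (13)), that the product of the factor norms attains the Schatten-$p$ quasi-norm when the factors are built from the SVD; this is a standard construction for the check, not necessarily the specific factorization used in Problem (14).

```matlab
% Numeric check of the product form of the bi-Schatten surrogate under 1/p = 1/p1 + 1/p2.
p1 = 1; p2 = 1; p = 1/(1/p1 + 1/p2);                 % here p = 1/2 (a quasi-norm)
spnorm = @(A, q) sum(svd(A).^q)^(1/q);

X = randn(8, 5) * diag([5 3 2 1 0.5]) * randn(5, 6); % generic test matrix
[Ux, D, Vx] = svd(X, 'econ');
U = Ux * D.^(p/p1);                                  % factor penalized by the Schatten-p1 norm
V = D.^(p/p2) * Vx';                                 % factor penalized by the Schatten-p2 norm
% U*V equals X, and ||U||_{Sp1} * ||V||_{Sp2} equals ||X||_{Sp} for this split:
fprintf('||X||_Sp             : %.6f\n', spnorm(X, p));
fprintf('||U||_Sp1 * ||V||_Sp2: %.6f\n', spnorm(U, p1) * spnorm(V, p2));
```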

Let us simplify the notation. Since the proposed objective function in Problem (15) is nonconvex, it must be linearized. A popular, locally convergent approach to the minimization of such a problem is the majorization-minimization (MM) method [39]. Accordingly, at the $l$th iteration, the Schatten-p quasi-norm term is replaced by its tangent plane, given in equation (16), which involves a smoothing parameter $\epsilon > 0$ and a term that is constant with respect to the optimization variables. Consequently, by applying equation (16) to Problem (15), the resulting optimization problem is rewritten as Problem (17).

In Problem (17), the Schatten-p quasi-norm term reduces to a weighted nuclear norm (WNN) with a reweighting matrix at the $l$th iteration. Therefore, we name our method reweighted p-GLS (Rwp-GLS). The reweighting matrix is updated based on the latest solution of the problem and is obtained through the eigendecomposition of that solution.

Since the Schatten-p term is a concave function of its argument for $0 < p < 1$, at each iteration its value decreases at least as fast as its tangent plane does. It follows that, by iteratively solving Problem (17), the objective function monotonically decreases and converges to a local minimum. Following the same steps used to derive Problem (9), at each iteration of Rwp-GLS we need to solve the SDP given in Problem (18).

By solving the SDP in Problem (18) with off-the-shelf SDP solvers, e.g., SDPT3 [40], the covariance estimate is obtained from the solution $(\mathbf{u}, \boldsymbol{\sigma})$. Once this solution is obtained, the next task is to estimate the parameters $\{\theta_k, p_k\}_{k=1}^{K}$. Note that Rwp-GLS does not assume any knowledge of the number of sources $K$, which must be determined before the final estimation step. Recall that $\mathrm{rank}(T(\mathbf{u})) = K$; hence, the determination of $K$ can be formulated as rank estimation. In Rwp-GLS, based on the eigenvalue decomposition of $T(\mathbf{u})$, $K$ is set to the number of eigenvalues that are greater than a predefined threshold. Considering the assumptions in Section 5.1, we recover the parameters $\{\theta_k, p_k\}_{k=1}^{K}$ from $T(\mathbf{u})$ uniquely by applying the Vandermonde decomposition lemma (readers are referred to [25] for more information).
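The final retrieval step can be sketched as follows: count the eigenvalues of the recovered Toeplitz matrix above a threshold to obtain $K$, then extract the DoAs. Here the Vandermonde decomposition is realized via root-MUSIC applied to $T(\mathbf{u})$, which is one standard implementation consistent with the subspace option mentioned after problem (9); the threshold value and function name are illustrative.

```matlab
% Sketch of model-order selection and DoA retrieval from the estimated Toeplitz matrix T
% (root-MUSIC used as one standard realization of the Vandermonde decomposition).
function [theta_hat, K_hat] = retrieve_doas(T)
    [E, Lam]   = eig((T + T')/2);                 % enforce Hermitian symmetry numerically
    [lam, idx] = sort(real(diag(Lam)), 'descend');
    K_hat = sum(lam > 1e-2*lam(1));               % eigenvalues above a (relative) threshold
    En = E(:, idx(K_hat+1:end));                  % noise subspace of T
    C  = En*En';
    nv = size(T, 1);
    c  = zeros(2*nv - 1, 1);
    for k = -(nv-1):(nv-1)
        c(k + nv) = sum(diag(C, k));              % coefficient of z^k in the root-MUSIC polynomial
    end
    rts = roots(flipud(c));
    rts = rts(abs(rts) < 1);                      % keep roots inside the unit circle
    [~, ord]  = sort(abs(abs(rts) - 1));          % pick the K_hat roots closest to the circle
    z         = rts(ord(1:K_hat));
    theta_hat = asind(angle(z)/pi);               % with d = lambda/2: angle(z) = pi*sin(theta)
end
```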

5.4. Convergence Analysis

As shown above, the Rwp-GLS method converges to a local minimum; here, we analyze its convergence from another perspective. It is well known that the WNN minimization problem is nonconvex in general, and no efficient method can theoretically guarantee the global minimum. According to [41], WNN minimization converges weakly to a solution through a weighted soft-thresholding operator if the weights are nondescendingly ordered. By observing that the singular values are nonnegative and monotonically decreasing, the ascending order of the resulting weights is preserved throughout the reweighting process. Hence, we conclude that the Rwp-GLS method converges weakly to the solution.

5.5. Computational Complexity

The overall computational complexity of Rwp-GLS has two dominant parts: the first is related to the computation of the reweighted CF, which is an iterative process, and the second is the cost of the final parameter estimation. For the iterative part, the overall computational complexity equals the cost per iteration times the number of iterations. In each iteration, Rwp-GLS requires computing the reweighting matrix and the associated weighted terms in a number of floating-point operations (flops) that is small compared with the SDP solve. The SDP must then be solved, where the interior-point-based SDPT3 solver [40] dominates the per-iteration cost, with a complexity polynomial in the number of variables and in the dimension of the PSD constraint. Finally, applying the Vandermonde decomposition to estimate the DoAs adds a comparatively small cost.

6. Simulations and Results

In this section, the performance of the Rwp-GLS method is tested and compared with state-of-the-art methods via numerical simulations. Rwp-GLS is implemented in MATLAB (R2014a) using the SDPT3 solver of the CVX toolbox [42] on a PC with a 2 GHz CPU.

6.1. Simulation Setups

All the simulations are carried out for two equal-power uncorrelated signals impinging onto a 7-element CPA with half-wavelength unit element spacing. Both the signals and the noise are i.i.d. Gaussian. The root mean squared error (RMSE) of the DoA estimation is computed as

$$\mathrm{RMSE} = \sqrt{\frac{1}{K N_{mc}} \sum_{n=1}^{N_{mc}} \sum_{k=1}^{K} \left(\hat{\theta}_k^{(n)} - \theta_k\right)^2},$$

where $\theta_k$ are the true directions, $\hat{\theta}_k^{(n)}$ are the directions estimated in the $n$th run, and $N_{mc}$ is the number of Monte Carlo runs. The SNR is defined for equal-power sources as in [28], where the noise components are spatially uncorrelated white Gaussian processes with unequal variances. Furthermore, to measure the variation of the noise power across the antenna array, we use the worst noise power ratio $\mathrm{WNPR} = \sigma_{\max}^2/\sigma_{\min}^2$, as in [28].

First, we discuss the choice of the exponent $p$ and the smoothing parameter $\epsilon$ and study the influence of different values of both on the performance of the proposed method. Since Problem (15) puts a penalty on the singular values, we plot the resulting scalar penalty function together with the $\ell_0$ and $\ell_1$ norms in Figure 1. As shown in Figure 1(a), the penalty converges faster toward the $\ell_0$ norm as $\epsilon$ decreases and yields a sparser optimizer. Meanwhile, by fixing $\epsilon$ as small as 0.1 and using different values of $p$, the penalty gets close to the $\ell_0$ norm as $p \to 0$ and approaches the $\ell_1$ norm as $p \to 1$, as depicted in Figure 1(b).

However, after the linearization, we expect the resulting reweighting function in Problem (18) to play the same role as the original penalty when both $p$ and $\epsilon$ are small. Hence, throughout the iterative process, $\epsilon$ is initialized to a relatively large value and reduced by a constant factor at each iteration. Regarding the iterative process, the iterations are stopped when the relative change of the solution between two consecutive iterations falls below a small tolerance, or when the maximum number of iterations, set to 10, is reached. Finally, the results of each experiment are evaluated over 100 Monte Carlo runs.
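The iterative procedure just described can be summarized by the following loop skeleton. Here, solve_weighted_sdp, initial_weights, and update_weights are placeholders for the SDP of Problem (18) and the weight updates, Rhat and Gamma are the sample covariance and selection matrix from the earlier sketches, and the numerical values of the initial smoothing parameter, reduction factor, and tolerance are illustrative rather than the exact settings used in the paper.

```matlab
% Skeleton of the Rwp-GLS outer loop: weighted first iteration, SDP solve per iteration,
% weight update, epsilon reduction, and the stopping rule described in the text.
eps_l   = 1;        % initial smoothing parameter (illustrative)
factor  = 0.1;      % reduction factor applied to eps_l each iteration (illustrative)
tol     = 1e-4;     % relative-change tolerance (illustrative)
maxIter = 10;       % maximum number of iterations (as in the text)

Wl = initial_weights(Rhat);                                % weighted first iteration (Figure 2(b))
u_prev = [];
for l = 1:maxIter
    [u, sig] = solve_weighted_sdp(Rhat, Gamma, Wl, eps_l); % placeholder for Problem (18)
    if ~isempty(u_prev) && norm(u - u_prev)/norm(u_prev) < tol
        break;                                             % relative change below tolerance
    end
    Wl     = update_weights(u, sig, eps_l);                % reweight from the latest solution
    eps_l  = factor*eps_l;                                 % shrink the smoothing parameter
    u_prev = u;
end
```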

6.2. DoA Estimation Performance Metrics

Consider two uncorrelated sources whose spatial separation limit is defined as in [23].

As an example, take two directions separated by roughly this limit, a moderate SNR with nonuniform noise, and a limited number of snapshots.

Initially, the success probability (SP) as a function of the number of iterations is examined. In Figure 2(a), we plot the SP for two different values of $p$, each repeated for three values of the reduction factor applied to $\epsilon$. As shown, the SP of Rwp-GLS is the best over the whole range of iterations for the smaller $p$, regardless of the value of the reduction factor. In contrast, for the larger $p$, the SP varies slightly with the reduction factor and degrades as the number of iterations grows, although using a larger factor is somewhat better. This again indicates the advantage of Rwp-GLS, with remarkable convergence characteristics for small values of $p$, where the impact of the reduction factor becomes negligible.

Figure 2(b) shows the influence of the choice of reweighting function applied in the first iteration. In the considered case, it is observed that the SP is better when a nontrivial weighting based on the sample covariance matrix of the measurements is used, rather than an unweighted first iteration.

Further, we test the performance of the Rwp-GLS method for different values of $p$ under different settings of WNPR and SNR. Figure 3(a) shows that the RMSE of Rwp-GLS is lowest over the considered WNPR range for a small value of $p$, at SNR = 0 dB. In Figure 3(b), we fix the WNPR and repeat the test over a range of SNR values. The results again show that the best performance is achieved for a small value of $p$.

Based on the results of the previous experiments, in the following simulations we adopt a small $p$ together with the corresponding $\epsilon$ schedule, and we use the weighted first iteration tested in Figure 2(b).

6.3. Comparisons with Prior Arts

In this section, the DoA estimation performance of Rwp-GLS is evaluated and compared with the conventional RootMUSIC method applied directly to the aforementioned CPA. In addition, we consider two remarkable gridless CF methods: SPA (we compare with SPA because it uses the same gridless CF procedure as GLS and agrees with the comparative methods in the model-order estimation technique, while GLS adopts a different technique) and ICMRA with the Schatten-p penalty (ICMRAlp). These methods are derived under similar assumptions on the sources and noise and are applied with their best settings. We fix the number of snapshots and the WNPR in all of the following simulations unless otherwise mentioned. Moreover, to provide a benchmark for the evaluation, the stochastic Cramér–Rao lower bound (CRLB) for nonuniform noise given in [30] is included.

Suppose two signals impinge onto the CPA from two fixed directions. By varying the SNR from −15 dB to 25 dB, Figure 4 shows that the performance of Rwp-GLS is the best; it improves persistently with increasing SNR and follows the CRLB when SNR ≥ −5 dB. Thus, Rwp-GLS can separate the two incident signals precisely at an SNR as low as −5 dB, in contrast to RootMUSIC, which exhibits a fixed error floor at high SNR due to the limited number of array elements.

Figure 5 presents the CPU times for the experiment associated with Figure 4. Rwp-GLS is an iterative method with good convergence characteristics owing to its initialization, and its per-iteration computational complexity is similar to that of SPA; hence, Rwp-GLS consumes a CPU time similar to that of SPA, differing only in the weight computation. Note that the time usage of both Rwp-GLS and SPA becomes slightly greater as the SNR increases because of the final estimation step. It is worth mentioning that ICMRAlp is an iterative method with convergence characteristics similar to Rwp-GLS, but without the ability to estimate the noise variances, which explains its slightly lower CPU time.

In the following simulation, we evaluate the estimation performance of Rwp-GLS versus the number of snapshots, varied from 10 to 250, with SNR = 0 dB and the directions and nonuniform noise level held fixed. Figure 6 shows that Rwp-GLS outperforms the other methods and follows the CRLB well as the number of snapshots increases, in contrast to ICMRAlp, whose performance degrades significantly when the number of snapshots is small.

Next, we examine the DoA estimation performance versus the angle separation. Assume two signals impinge onto the CPA from two directions whose angular separation is varied over a range of small values, at a fixed SNR and number of snapshots. The RMSEs of the comparative methods are shown in Figure 7. It can be observed that Rwp-GLS surpasses the other methods in localizing spatially adjacent signals. In particular, Rwp-GLS coincides with the CRLB even for very small angular separations.

In Figure 8, the RMSEs of DoA estimation using the aforementioned methods are plotted against the source correlation factor. This simulation is carried out for two closely spaced sources at a fixed SNR and number of snapshots. We observe that, as the correlation factor increases, the performance of the compared methods degrades; in particular, ICMRAlp deteriorates significantly. However, the Rwp-GLS method retains the best performance and exhibits immunity to the correlation among the sources.

To further validate the robustness of Rwp-GLS against nonuniform noise, we present the RMSEs versus the WNPR in Figure 9, where the WNPR is varied from 20 to 200, with SNR = 0 dB and two uncorrelated signals. It can be seen that, as the WNPR increases, Rwp-GLS maintains the lowest RMSE over the studied WNPR range. This again indicates that the proposed method is robust against nonuniform noise.

6.4. Power Estimation

Finally, we attempt to simultaneously estimate the DoAs and powers of 7 equal-power signals impinging onto the CPA from distinct directions. The SNR is set to 10 dB with nonuniform noise (WNPR = 10). We carry out 500 independent trials and show the results in Figure 10. Black circles denote the true powers and the corresponding DoAs, while the colored dots denote the estimates. As shown, the Rwp-GLS method correctly locates both the DoAs and the powers of all sources.

7. Conclusions

In this paper, we have proposed a gridless sparse method for DoA estimation named Rwp-GLS. The new method reformulates the DoA estimation by introducing the unified surrogate of the nonconvex Schatten-p quasi-norm to penalize the parameterized covariance matrix in the gridless CF criterion. An iterative reweighted minimization method was then derived with an SDP formulation. The effectiveness of Rwp-GLS is verified through numerical simulations under a CPA structure and an unknown nonuniform noise background. The results show the superiority of Rwp-GLS in terms of resolution, applicability with a limited number of snapshots, and immunity against nonuniform noise and source correlations. In future work, we plan to pursue a more computationally efficient implementation and to incorporate arbitrary array geometries.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.