Sequential dimensionality reduction for extracting localized features
Introduction
Linear dimensionality reduction (LDR) techniques are powerful tools for the representation and analysis of high-dimensional data. The most well-known and widely used LDR technique is principal component analysis (PCA) [14]. When dealing with nonnegative data, it is sometimes crucial to take the nonnegativity into account in the decomposition in order to interpret the LDR meaningfully. For this reason, nonnegative matrix factorization (NMF) was introduced and has been shown to be very useful in several applications such as document classification, air emission control and microarray data analysis; see, e.g., [7] and the references therein. Given a nonnegative input data matrix M ∈ ℝ^{m×n}_+ and a factorization rank r, NMF looks for two nonnegative matrices U ∈ ℝ^{m×r}_+ and V ∈ ℝ^{r×n}_+ such that M ≈ UV. Hence each row of the input matrix M is approximated via a linear combination of the rows of V: for i = 1, …, m, M(i,:) ≈ ∑_{k=1}^{r} U(i,k) V(k,:). In other words, the rows of V form an approximate basis for the rows of M, and the weights needed to reconstruct each row of M are given by the entries of the corresponding row of U. The advantage of NMF over PCA (which does not impose nonnegativity constraints on the factors U and V) is that the basis elements in V can be interpreted in the same way as the data (e.g., as vectors of pixel intensities; see Section 3 for some illustrations), while the nonnegativity of the weights in U makes them easily interpretable as activation coefficients. In this paper, we focus on imaging applications and, in particular, on blind hyperspectral unmixing, which we describe in the next section.
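As a concrete illustration (not part of the original paper), a rank-r NMF can be computed with the classical Lee–Seung multiplicative updates; the data matrix, rank and iteration count below are arbitrary choices for the sketch:

```python
import numpy as np

def nmf_mu(M, r, n_iter=200, seed=0, eps=1e-9):
    """Rank-r NMF M ≈ U V via the classical Lee-Seung multiplicative updates
    (Frobenius loss). Both factors stay entrywise nonnegative by construction."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.random((m, r)) + eps
    V = rng.random((r, n)) + eps
    for _ in range(n_iter):
        U *= (M @ V.T) / (U @ V @ V.T + eps)  # update weights, stays >= 0
        V *= (U.T @ M) / (U.T @ U @ V + eps)  # update basis rows, stays >= 0
    return U, V

rng = np.random.default_rng(1)
M = rng.random((20, 10))                      # arbitrary nonnegative data
U, V = nmf_mu(M, r=3)
err = np.linalg.norm(M - U @ V) / np.linalg.norm(M)
print(U.shape, V.shape)                       # each row of M ≈ row of U times V
```

Each row of M is reconstructed as a nonnegative combination of the r rows of V, which is what makes the factors directly interpretable.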
A hyperspectral image (HSI) is a three-dimensional data cube providing the electromagnetic reflectance of a scene at varying wavelengths, as measured by hyperspectral remote sensors. Reflectance varies with wavelength for most materials because energy at certain wavelengths is scattered or absorbed to different degrees; this is referred to as the spectral signature of a material; see, e.g., [21]. Some materials reflect the light at certain wavelengths, while others absorb it at the same wavelengths. This property of hyperspectral images is used to uniquely identify the constitutive materials in a scene, referred to as endmembers, and to classify pixels according to the endmembers they contain. A hyperspectral data cube can be represented by a two-dimensional pixel-by-wavelength matrix M ∈ ℝ^{m×n}_+, where m is the number of pixels and n the number of wavelengths. The columns of M (M(:,j) for j = 1, …, n) are the original images converted into m-dimensional column vectors (stacking the columns of each image matrix into a single vector), while the rows of M (M(i,:) for i = 1, …, m) are the spectral signatures of the pixels (see Fig. 1). Each entry M(i,j) represents the reflectance of the i-th pixel at the j-th wavelength. Under the linear mixing model, the spectral signature of each pixel results from the additive linear combination of the nonnegative spectral signatures of the endmembers it contains. In that case, NMF allows us to model hyperspectral images because of the nonnegativity of the spectral signatures and the abundances: given a hyperspectral data cube represented by the matrix M, NMF approximates it with the product of two nonnegative factor matrices U ∈ ℝ^{m×r}_+ and V ∈ ℝ^{r×n}_+ such that the spectral signature of each pixel (a row of M) is approximated by the additive linear combination of the spectral signatures of the endmembers (rows of V), weighted by the coefficients U(i,k) representing the abundance of the k-th endmember in the i-th pixel. For all i, we have M(i,:) ≈ ∑_{k=1}^{r} U(i,k) V(k,:), where r is the number of endmembers in the image.
The matrix U is called the abundance matrix while the matrix V is the endmember matrix. Fig. 1 illustrates this decomposition on the urban hyperspectral data cube.
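The linear mixing model can be illustrated on toy data (all sizes and values below are arbitrary assumptions, not taken from the Urban data cube):

```python
import numpy as np

# Toy linear mixing model: each pixel's spectrum is a nonnegative linear
# combination of the endmember signatures, weighted by its abundances.
rng = np.random.default_rng(0)
n_pixels, n_bands, r = 100, 50, 3       # sizes are arbitrary

V = rng.random((r, n_bands))            # endmember matrix: one signature per row
U = rng.random((n_pixels, r))
U /= U.sum(axis=1, keepdims=True)       # abundances of each pixel sum to 1

M = U @ V                               # pixel-by-wavelength matrix
# Pixel i's spectrum is exactly sum_k U[i, k] * V[k, :]
i = 7
print(np.allclose(M[i], sum(U[i, k] * V[k] for k in range(r))))  # True
```

Blind unmixing is the inverse problem: given only M, recover plausible factors U and V.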
Unfortunately, as opposed to PCA, solving NMF exactly is a difficult (NP-hard) problem [22]. Moreover, the decomposition is in general non-unique and has to be recomputed from scratch when the factorization rank r is modified. For these reasons, a variant of NMF, referred to as nonnegative matrix underapproximation, which allows the factors to be computed sequentially, was recently proposed; it is presented in the next section.
Nonnegative matrix underapproximation (NMU) [8] was introduced in order to solve NMF sequentially, that is, to compute one rank-one factor at a time: first compute u_1 v_1^T, then u_2 v_2^T, etc. In other words, NMU tries to identify sparse and localized features sequentially. In order to preserve nonnegativity throughout the sequential decomposition, it is natural to impose the following upper bound constraint on each rank-one factor of the decomposition: u_k v_k^T ≤ M − ∑_{j<k} u_j v_j^T entrywise. Hence, given a data matrix M ∈ ℝ^{m×n}_+, NMU solves, at the first step, the following optimization problem, referred to as rank-one NMU: min_{u ≥ 0, v ≥ 0} ‖M − u v^T‖²_F such that u v^T ≤ M. Then, the nonnegative residual matrix R = M − u v^T ≥ 0 is computed, and the same procedure can be applied to the residual matrix R. After r steps, NMU provides a rank-r NMF of the data matrix M. Compared to NMF, NMU has the following advantages:
1. Like PCA, the solution is unique (under some mild assumptions) [10].
2. Like PCA, the solution can be computed sequentially, and hence the factorization rank does not need to be chosen a priori.
3. Like NMF, it leads to a separation by parts. Moreover, the additional underapproximation constraints enhance this property, leading to better decompositions into parts [8] (see also Section 3 for some numerical experiments).
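This sequential scheme can be sketched numerically as follows: a Lagrangian-style heuristic for rank-one NMU (alternating nonnegative fits on a penalized matrix, with subgradient updates of the multipliers), followed by peeling off nonnegative residuals. This is a simplified illustration under assumed step sizes and iteration counts, not the exact algorithm of [8]:

```python
import numpy as np

def rank_one_nmu(R, n_outer=50, n_inner=10):
    """Heuristic for rank-one NMU:  min ||R - u v^T||_F^2  s.t.  u v^T <= R,
    u, v >= 0, handled via Lagrangian multipliers Lam updated by subgradient
    steps. A simplified illustration, not the exact algorithm of [8]."""
    m, n = R.shape
    Lam = np.zeros((m, n))                     # multipliers for u v^T <= R
    u = np.maximum(R.mean(axis=1), 1e-9)
    v = np.maximum(R.mean(axis=0), 1e-9)
    for t in range(1, n_outer + 1):
        A = R - Lam                            # penalized data matrix
        for _ in range(n_inner):               # alternating nonnegative fits
            u = np.maximum(0, A @ v) / (v @ v + 1e-12)
            v = np.maximum(0, A.T @ u) / (u @ u + 1e-12)
        # raise multipliers where u v^T overshoots R (step size 1/t)
        Lam = np.maximum(0, Lam + (np.outer(u, v) - R) / t)
    return u, v

def nmu(M, r):
    """Sequential NMU: peel off r rank-one layers, keeping residuals nonnegative."""
    R, factors = M.astype(float), []
    for _ in range(r):
        u, v = rank_one_nmu(R)
        factors.append((u, v))
        R = np.maximum(0, R - np.outer(u, v))  # nonnegative residual
    return factors

rng = np.random.default_rng(0)
M = rng.random((15, 12))                       # arbitrary nonnegative data
factors = nmu(M, r=3)
print(len(factors))                            # 3 rank-one layers
```

Because each residual stays nonnegative, additional rank-one layers can be extracted without revisiting the earlier ones, which is what makes the rank a posteriori.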
Modifications of the original NMU algorithm were made by adding prior information into the model, as is also often done with NMF; see, e.g., [4] and the references therein. More precisely, two variants of NMU have been proposed:
1. One adding sparsity constraints on the abundance matrix, dubbed sparse NMU [11].
2. One adding spatial information about the pixels, dubbed spatial NMU [12].
Algorithm for NMU with priors
In this section, we describe our proposed technique that will incorporate both spatial and sparsity priors into the NMU model. This will allow us to extract more localized and more coherent features in images.
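To give an idea of how such priors can enter the updates, the sketch below penalizes a plain nonnegative least-squares abundance step with a soft-threshold (sparsity prior) and an averaging over grid neighbours (spatial prior). The weights `sparsity` and `smooth` and the grid size are hypothetical; this is not the PNMU update derived in the paper:

```python
import numpy as np

def penalized_u_update(A, v, sparsity=0.1, smooth=0.1, grid=(5, 5)):
    """Illustrative penalized update of an abundance vector u: a plain
    nonnegative least-squares step, soft-thresholded toward zero (sparsity
    prior) and averaged with each pixel's grid neighbours (spatial prior).
    Hypothetical weights; NOT the PNMU update derived in the paper."""
    u_ls = np.maximum(0, A @ v) / (v @ v + 1e-12)  # unpenalized step
    u_sp = np.maximum(0, u_ls - sparsity)          # soft-threshold: sparsity
    img = u_sp.reshape(grid)                       # pixels on their 2-D grid
    nb = (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
          np.roll(img, 1, 1) + np.roll(img, -1, 1)) / 4.0
    return ((1 - smooth) * img + smooth * nb).ravel()

rng = np.random.default_rng(0)
A = rng.random((25, 8))            # 5x5-pixel image with 8 spectral bands
v = rng.random(8)                  # current spectral signature
u = penalized_u_update(A, v)
print(u.shape, u.min() >= 0)
```

The sparsity term drives small abundances to exactly zero, while the neighbour averaging encourages spatially coherent, blob-like features rather than isolated pixels.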
Experimental results
In this section, we conduct several experiments to show the effectiveness of PNMU.
In the first part, we use synthetic data sets for which the ground truth is known, allowing us to quantify very precisely the quality of the solutions obtained by the different algorithms.
In the second part, we validate the performance of PNMU on real data, comparing the algorithms on two widely used data sets, namely the Cuprite hyperspectral image and the CBCL face data set.
Conclusions
In this paper, a variant of NMU, namely PNMU, was proposed that takes into account both the sparsity and the spatial coherence of the abundance elements. Numerical experiments have shown the effectiveness of PNMU in generating sparse and localized features in images (in particular, synthetic, hyperspectral and facial images), with a better trade-off between sparsity, spatial coherence and reconstruction error.
Acknowledgment
We would like to thank the reviewers for their insightful feedback that helped us improve the paper significantly. NG acknowledges the support of the F.R.S.-FNRS (incentive grant for scientific research no. F.4501.16) and of the ERC (starting grant no. 679515).
References (23)
- N. Gillis, F. Glineur, Using underapproximations for sparse nonnegative matrix factorization, Pattern Recognit. (2010)
- N. Gillis, R.J. Plemmons, Sparse nonnegative matrix underapproximation and its application to hyperspectral image analysis, Linear Algebra Appl. (2013)
- Rational variety mapping for contrast-enhanced nonlinear unsupervised segmentation of multispectral images of unstained specimen, Am. J. Pathol. (2011)
- Chance-constrained robust minimum-volume enclosing simplex algorithm for hyperspectral unmixing, IEEE Trans. Geosci. Rem. Sens. (2011)
- K.M. Anstreicher, L.A. Wolsey, Two well-known properties of subgradient optimization, Math. Program. (2009)
- M.W. Berry, N. Gillis, F. Glineur, Document classification using nonnegative matrix factorization and...
- A. Cichocki, S. Amari, R. Zdunek, A.H. Phan, Non-Negative Matrix and Tensor Factorizations: Applications to Exploratory...
- I. Daubechies, R. DeVore, M. Fornasier, C.S. Güntürk, Iteratively reweighted least squares minimization for sparse recovery, Commun. Pure Appl. Math. (2010)
- N. Gillis, Sparse and unique nonnegative matrix factorization through data preprocessing, J. Mach. Learn. Res. (2012)
- N. Gillis, The why and how of nonnegative matrix factorization, in Regularization, Optimization, Kernels, and Support...
- N. Gillis, F. Glineur, Accelerated multiplicative updates and hierarchical ALS algorithms for nonnegative matrix factorization, Neural Comput.
Gabriella Casalino received her Ph.D. degree in Computer Science in 2015, from University of Bari (Italy), the topic of her thesis being “Non-negative factorization methods for extracting semantically relevant features in Intelligent Data Analysis”. From University of Bari she also received the M.Sc. degree and the B.Sc. degree in Computer Science in 2013 and 2008, respectively. She is currently a postdoctoral research fellow at the Department of Pharmaceutical Sciences, University of Bari, working on microarray data analysis. Her research interests include computational intelligence, intelligent data analysis, and non-negative matrix factorization.
Nicolas Gillis received a master's degree and a Ph.D. degree in Applied Mathematics from the Université Catholique de Louvain (Belgium) in 2007 and 2011, respectively. He is currently an associate professor at the Department of Mathematics and Operational Research, Faculté Polytechnique, Université de Mons, Belgium. His research interests lie in optimization, numerical linear algebra, machine learning, data mining and hyperspectral imaging.