A novel multi-view clustering approach via proximity-based factorization targeting structural maintenance and sparsity challenges for text and image categorization

https://doi.org/10.1016/j.ipm.2021.102546

Highlights

  • We propose a proximity-based factorization model for multi-view clustering.

  • The proposed model is robust to sparse data.

  • The algorithm constructs proximity matrices for each view.

  • These matrices are used to model distribution of data points in the common subspace.

  • The performance of our algorithm is shown both analytically and experimentally.

Abstract

Multi-view data consists of several feature sets, each representing a different perspective on the same data, and is common in real-world applications. Multi-view clustering of text and image data faces two substantial challenges: structure preservation and sparsity. Existing methods do not preserve the structure of the data space, and recent improvements address only the local structure. Preserving the local structure alone is not sufficient to handle sparsity in these data. In this paper, we propose a novel clustering approach, called Proximity-based Multi-View Non-negative Matrix Factorization (PMVNMF), which uses the local and global structure of the data space jointly to handle sparsity in real-world multimedia (text and image) data. For each view, the 1-step and 2-step transition probability matrices are constructed as the first-order and second-order proximity matrices to uncover the latent local and global geometric structures, respectively. Then, view-specific proximity matrices are constructed by integrating these two types of proximity matrices. Finally, Non-negative Matrix Factorization (NMF) is applied with graph regularization and consensus regularization, so as to incorporate the integrated graph structures and to reveal the latent common structure shared by all representations. The algorithm captures the underlying structure of the data space and is robust to sparse data. We conduct experiments on six real-world datasets, two text and four image, and compare the proposed algorithm with eight baseline approaches. Six evaluation metrics (accuracy, F-score, precision, recall, NMI, and entropy) are used to evaluate performance. The results show that the proposed algorithm outperforms the baselines.

Introduction

Data collected from diverse information sources for multi-feature subjects is termed multi-view data, where each view represents a distinct perspective on the same data and a corresponding feature set characterizes that perspective (Wang et al., 2018a). For instance, language may be considered one view in document categorization, because the same incident is reported by news articles in various languages. Similarly, based on publication records, research communities can be formed on the basis of research areas, keywords, or co-authorship links. Multi-view data is prevalent in many real-world applications (Chen et al., 2017, Schöps et al., 2017, Sun, 2013, Zhang, Fu et al., 2018). Clustering is one of the most popular approaches for analyzing multi-view data, with the aim of integrating the available feature sets to identify group structures that are consistent across views. Clustering algorithms facilitate better data representation in many applications, such as image categorization and segmentation in computer vision, event detection on social multimedia, document categorization in natural language processing, and genetic association in bioinformatics. Conventional multi-view clustering methods concatenate the features of the different views into a single set and then apply a single-view clustering method. However, these methods cannot make effective use of multi-view information, and nothing guarantees that the resulting clusters optimally represent the heterogeneity among views. Multi-view clustering has thus received considerable attention from the research community. Several researchers have proposed multi-view clustering algorithms that summarize the consistent and complementary properties of multi-view data, typically based on factorization models.
However, most of these methods rely on shallow factorization models that handle sparse data poorly and cannot capture the complex underlying structure of the data. Thus, multi-view clustering faces two important challenges: (i) Structure-preserving: the clustering solution needs to preserve the intrinsic structure of the data space. However, the underlying structure of real-world data depends on both local and global structures, and it is hard to preserve both simultaneously in multi-view clustering. (ii) Sparsity: because real-world data is sparse, exploiting its limited number of links is not sufficient to achieve the optimum clustering solution. Liu et al. (2013) proposed a multi-view NMF model for multi-view clustering, but it failed to conserve the underlying structure of the data. Cai et al. (2011) introduced graph regularization into the NMF framework to preserve the local geometric structure of a single-view data space, and Wang, Kong et al. (2015) extended this to multi-view data. The method is based on the local invariance assumption, which states that nearby points are likely to have similar embeddings. This assumption has been exploited successfully in many machine learning problems to enhance embedding performance.

Recent studies in data embedding (Luqman et al., 2013, Wang et al., 2016a) and machine learning (Goyal and Ferrara, 2018, Kunegis et al., 2010) have illustrated that respecting the underlying structure of a data set is essential to achieving an optimal clustering solution. Specifically, in the context of graphs, the pairwise similarity between data points characterizes the local structure of the data and is known as first-order proximity. Second-order proximity captures the similarity between neighborhood structures and characterizes the global structure. The former describes similarity between data points connected by an edge, while the latter describes similarity between all points, even those not connected by an edge. Because the data is sparse, first-order proximity alone is insufficient to represent the underlying structure. Therefore, we present an unsupervised multi-view factorization method that finds clusters jointly characterizing the first-order and second-order proximity of each view. The algorithm extends the Multi-NMF model (Liu et al., 2013) by constructing view-specific proximity matrices that represent the intrinsic structure of each view. These structures are then incorporated into the Multi-NMF framework using graph regularization to exploit high-order proximity in the clustering solution.
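
To make the two proximities concrete, the following minimal Python sketch builds a 1-step transition probability matrix by row-normalizing an affinity matrix, squares it to obtain the 2-step matrix, and then mixes the two. The function names, the toy graph, and the mixing weight `alpha` are illustrative assumptions; in particular, the text does not state how the two proximity matrices are integrated, so the weighted sum here is only a placeholder.

```python
def row_normalize(weights):
    """Turn a non-negative affinity matrix into a 1-step transition matrix."""
    result = []
    for row in weights:
        total = sum(row)
        # Isolated vertices keep an all-zero row instead of dividing by zero.
        result.append([w / total if total > 0 else 0.0 for w in row])
    return result


def matmul(a, b):
    """Dense matrix product, adequate for small illustrative inputs."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def proximity_matrices(weights, alpha=0.5):
    """Return (first_order, second_order, combined) proximity matrices.

    alpha is a hypothetical mixing weight (an assumption of this sketch).
    """
    first = row_normalize(weights)   # 1-step transition probabilities
    second = matmul(first, first)    # 2-step transition probabilities
    combined = [[alpha * a + (1 - alpha) * b for a, b in zip(r1, r2)]
                for r1, r2 in zip(first, second)]
    return first, second, combined


# Toy path graph 0 - 1 - 2: vertices 0 and 2 share a neighbor but no edge.
W = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
first, second, combined = proximity_matrices(W)
```

On this path graph the first-order proximity between vertices 0 and 2 is zero because no edge joins them, while their second-order proximity is positive because both are adjacent to vertex 1. This is the effect that lets second-order proximity compensate for missing links in sparse data.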

The main contributions in this work are as follows:

  • We propose a novel proximity-based factorization model for multi-view clustering which captures the latent local and global structures of the data space and is robust to sparse data.

  • The proposed algorithm first constructs integrated structures representing the latent local and global structures of the respective views, and then incorporates these structures into the NMF framework using graph regularization and consensus regularization to model the distribution of data points in the common subspace.

  • We derive an iterative updating algorithm to solve the optimization problem and demonstrate the scalability and convergence of our algorithm analytically as well as by running it over a variety of real-world data.

  • We extensively evaluate the performance of the proposed algorithm on six different real-world multi-view datasets. Experimental results on six evaluation metrics show a substantial gain over the baselines across all datasets.

Section snippets

Problem definition

In this subsection, we state the definitions of nearest-neighbor graph, first-order proximity, and second-order proximity followed by problem definition.

Definition 1

Nearest-Neighbor Graph - A nearest-neighbor graph is denoted as $G(V,E)$, where $V=\{v_1,v_2,\ldots,v_n\}$ represents the vertices and $E=\{e_{ij}\}_{i,j=1}^{n}$ represents the edges. When $v_i$ and $v_j$ are linked by an edge $e_{ij}$, the edge weight is $w_{ij}>0$; otherwise $w_{ij}=0$.

Here, vertices represent data points of real-world dataset and edges represent similarity between data

Related work

Multi-view clustering algorithms can be broadly classified into five categories: (i) Algorithms that obtain a concise view of data via certain loss function optimization, and incorporate the integrated view into the clustering process (Bickel and Scheffer, 2004, Cai et al., 2013); (ii) Canonical Correlation Analysis (CCA) (Chaudhuri et al., 2009) based algorithms that rely on the assumption of uncorrelated views. These algorithms project multi-view data into a consensus lower-dimensional

Overview of MultiNMF

This section describes MultiNMF, the algorithm on which the proposed algorithm builds. MultiNMF (Liu et al., 2013) is an NMF-based clustering approach for multi-view data. NMF (Xu et al., 2003) is a matrix factorization method that finds two non-negative matrix factors, $W \in \mathbb{R}_+^{M \times K}$ and $H \in \mathbb{R}_+^{N \times K}$, called the basis matrix and the coefficient matrix respectively, such that their product closely approximates a single-view dataset $X$, i.e. $X \approx WH^T$. Here, $K$ is the desired reduced dimension.
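
As an illustration of the single-view factorization $X \approx WH^T$, here is a minimal sketch in plain Python. It uses the standard multiplicative update rules for the squared Frobenius error, which is an assumption about the solver and not necessarily the exact procedure used in MultiNMF or PMVNMF. The helper names, the toy matrix, the random initialization, and the small constant `eps` are all illustrative.

```python
import random


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def squared_error(x, w, h):
    """Squared Frobenius norm of X - W H^T."""
    approx = matmul(w, transpose(h))
    return sum((xv - av) ** 2
               for x_row, a_row in zip(x, approx)
               for xv, av in zip(x_row, a_row))


def scale(factor, num, den, eps):
    """Element-wise factor * num / (den + eps); keeps entries non-negative."""
    return [[f * n / (d + eps) for f, n, d in zip(fr, nr, dr)]
            for fr, nr, dr in zip(factor, num, den)]


def nmf(x, k, iterations=200, seed=0, eps=1e-9):
    """Factorize an M x N matrix x into W (M x k) and H (N x k)."""
    rng = random.Random(seed)
    m, n = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    x_t = transpose(x)
    for _ in range(iterations):
        # W <- W * (X H) / (W H^T H)
        w = scale(w, matmul(x, h), matmul(w, matmul(transpose(h), h)), eps)
        # H <- H * (X^T W) / (H W^T W)
        h = scale(h, matmul(x_t, w), matmul(h, matmul(transpose(w), w)), eps)
    return w, h


# Toy data with two obvious blocks, so K = 2 is a natural choice.
X = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
w0, h0 = nmf(X, k=2, iterations=0)      # random initialization only
w, h = nmf(X, k=2, iterations=200)      # same seed, after the updates
error_before = squared_error(X, w0, h0)
error_after = squared_error(X, w, h)
```

Because every update multiplies a non-negative entry by a non-negative ratio, both factors stay non-negative without any explicit projection. In a clustering setting, each data point is typically assigned to the component with the largest entry in its row of `h`.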

Proposed algorithm

In this section, we introduce the proposed algorithm as a four-step framework:

Step-1: Affinity matrix construction - Recent studies in spectral graph theory and manifold learning theory have illustrated that the nearest neighbor graph effectively models the local geometric structure of scattered data points (Cai et al., 2011). We construct a t-nearest neighbor graph G with N vertices, where each vertex corresponds to a data point and for each data point xi, we acquire its t nearest neighbors
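
A t-nearest neighbor affinity matrix of this kind can be sketched as follows. The snippet above is truncated before the weighting scheme is given, so this sketch assumes Euclidean distance, binary edge weights, and a symmetric graph in which an edge exists if either point is among the other's t nearest neighbors. The function name and the sample points are hypothetical.

```python
import math


def knn_affinity(points, t):
    """Build a symmetric binary t-nearest-neighbor affinity matrix.

    Assumptions of this sketch: Euclidean distance, weight 1.0 for an edge,
    and an edge whenever i is among j's neighbors or j is among i's.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(points):
        # Rank every other point by distance to point i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        for j in others[:t]:
            weights[i][j] = 1.0
            weights[j][i] = 1.0
    return weights


# Two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
A = knn_affinity(points, t=1)
```

With `t=1`, each point is linked only to its closest companion, so the matrix has two disconnected pairs and no edge between the two groups. Larger values of `t` make the graph denser at the cost of linking less similar points.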

Experimental analysis

In this section, we evaluate the performance of proposed algorithm which utilizes the proximity matrix and its Graph Laplacian as an exploratory tool to study the correlation between intrinsic geometric structure and hidden semantics across different views. Experiments on different real-world datasets illustrate the effectiveness of proposed algorithm in discovering optimal clusters in multi-view data. All the experiments are conducted on an Intel Core i7-10700 machine of 2.90 GHz frequency and

Conclusion

In this work, we introduced a novel proximity-based algorithm, PMVNMF for multi-view clustering, aiming to preserve the geometric structure of individual representations while dealing with sparsity in real-world text and image data. We proposed a clustering framework which utilizes first-order and second-order proximities conjointly in a multi-view NMF model. The Laplacian regularization is performed on integrated proximity matrices of respective views such that the obtained clustering solution

CRediT authorship contribution statement

Monika Bansal: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing - original draft. Dolly Sharma: Conception and design of study, Analysis and/or interpretation of data, Writing - original draft, Writing - review & editing.

Acknowledgment

All authors approved the version of the manuscript to be published.

References (53)

  • Vo D.-T. et al. Feature-enriched matrix factorization for relation extraction. Information Processing & Management (2019)

  • Xu C. A novel recommendation method based on social network using matrix factorization technique. Information Processing & Management (2018)

  • Yang Z. et al. MMED: A multi-domain and multi-modality event dataset. Information Processing & Management (2020)

  • Zhang X. et al. Multi-view clustering based on graph-regularized nonnegative matrix factorization for object recognition. Information Sciences (2018)

  • Zong L. et al. Multi-view clustering via multi-manifold regularized non-negative matrix factorization. Neural Networks (2017)

  • Amini M.R. et al. Learning from multiple partially observed views - an application to multilingual text categorization

  • Bickel S. et al. Multi-view clustering

  • Cai D. et al. Graph regularized nonnegative matrix factorization for data representation. IEEE Transactions on Pattern Analysis and Machine Intelligence (2011)

  • Cai X. et al. Multi-view K-means clustering on big data

  • Cao S. et al. GraRep: Learning Graph representations with global structural information

  • Chaudhuri K. et al. Multi-view clustering via canonical correlation analysis

  • Chen X. et al. Multi-view 3D object detection network for autonomous driving

  • Chua T.-S. et al. NUS-WIDE: A real-world web image database from national university of Singapore

  • Greene D. et al. A matrix factorization approach for integrating multiple data views

  • Greene D. et al. Producing a unified graph representation from multiple social network views

  • Kumar A. et al. A co-training approach for multi-view spectral clustering
