A novel multi-view clustering approach via proximity-based factorization targeting structural maintenance and sparsity challenges for text and image categorization

doi:10.1016/j.ipm.2021.102546

Information Processing & Management

Volume 58, Issue 4, July 2021, 102546

https://doi.org/10.1016/j.ipm.2021.102546

Highlights

• We propose a proximity-based factorization model for multi-view clustering.
• The proposed model is robust to sparse data.
• The algorithm constructs proximity matrices for each view.
• These matrices are used to model the distribution of data points in the common subspace.
• The performance of our algorithm is shown both analytically and experimentally.

Abstract

Multi-view data contains sets of features that represent different perspectives on the same data, a phenomenon commonly observed in real-world applications. Multi-view clustering of text and image data faces two substantial challenges: structure preservation and sparsity. Existing methods do not preserve the structure of the data space, and recent improvements address only the local structure. Preserving the local structure alone is not sufficient to handle sparsity in these data. In this paper, we propose a novel clustering approach, called Proximity-based Multi-View Non-negative Matrix Factorization (PMVNMF), which jointly utilizes the local and global structure of the data space to handle sparsity in real-world multimedia (text and image) data. For each view, the 1-step and 2-step transition probability matrices are constructed as the first-order and second-order proximity matrices to uncover the latent local and global geometric structures, respectively. Then, view-specific proximity matrices are constructed by integrating these two types of proximity matrices. Finally, Non-negative Matrix Factorization (NMF) is applied with graph regularization and consensus regularization, to take the integrated graph structures into account and to reveal the latent common structure shared by all representations. The algorithm captures the underlying structure of the data space and is robust to sparse data. We conduct experiments on six real-world datasets, two text and four image datasets, and compare the performance of the proposed algorithm with eight baseline approaches. Six evaluation metrics (accuracy, F-score, precision, recall, NMI, and entropy) are employed to evaluate the performance of the algorithm. The results show that the proposed algorithm outperforms the baselines.

Introduction

Data collected from diverse information sources for multi-feature subjects is termed multi-view data, where individual views represent distinct perspectives of the same data and the respective feature sets characterize each of these perspectives (Wang et al., 2018a). For instance, language may be considered as one view in document categorization, because the same incident is reported by news articles in various languages. Similarly, based on publication records, research communities can be formed on the basis of research areas, keywords, or co-authorship links. Multi-view data is prevalent in many real-world applications (Chen et al., 2017, Schöps et al., 2017, Sun, 2013, Zhang, Fu et al., 2018). Clustering is one of the most popular approaches to analyzing multi-view data; it aims to integrate the available feature sets to identify group structures that are consistent across views. Clustering algorithms facilitate better data representation for many applications, such as image categorization and segmentation in computer vision, event detection on social multimedia, document categorization in natural language processing, and genetic association in bioinformatics. Conventional multi-view clustering methods concatenate the features of different views into a single representation and then apply a single-view clustering method. However, these methods cannot make effective use of multi-view information, and they offer no mechanism to guarantee that the resulting clusters optimally represent the heterogeneity among views. Multi-view clustering has thus received considerable attention from the research community. Several researchers have proposed multi-view clustering algorithms that summarize the consistent and complementary properties of multi-view data, typically based on factorization models.

However, most of these methods rely on shallow factorization models that handle sparse data poorly and fail to capture its complex underlying structure. Thus, multi-view clustering faces two important challenges: (i) Structure preservation: the clustering solution needs to preserve the intrinsic structure of the data space. However, the underlying structure of real-world data depends on both local and global structures, and it is hard to preserve both simultaneously in multi-view clustering. (ii) Sparsity: because real-world data is sparse, exploiting the limited number of observed links is not sufficient to achieve the optimal clustering solution. Liu et al. (2013) proposed a multi-view NMF model for multi-view clustering, but it was observed to fail to preserve the underlying structure of the data. Cai et al. (2011) imposed graph regularization on the NMF framework to preserve the local geometric structure of a single-view data space; this was further extended by Wang, Kong et al. (2015) to multi-view data. The method is based on the local invariance assumption, which states that nearby points are likely to have similar embeddings. This assumption has been exploited successfully in many machine learning problems to enhance embedding performance.

Recent studies in data embedding (Luqman et al., 2013, Wang et al., 2016a) and machine learning (Goyal and Ferrara, 2018, Kunegis et al., 2010) have illustrated that respecting the underlying structure of a dataset is essential to achieving an optimal clustering solution. Specifically, in the context of graphs, the pairwise similarity between data points characterizes the local structure of the data and is known as first-order proximity. Second-order proximity captures the similarity between neighborhood structures and characterizes the global structure. The former describes similarity between data points connected by an edge, while the latter describes similarity between all points, even those that are not connected by an edge. Because the data is sparse, first-order proximity alone is insufficient to represent the underlying structure. Therefore, we present an unsupervised multi-view factorization method that forms clusters jointly characterizing the first-order and second-order proximity of each view. The algorithm extends the MultiNMF model (Liu et al., 2013) by constructing view-specific proximity matrices that represent the intrinsic structure of each view. These structures are then incorporated into the MultiNMF framework using graph regularization to exploit high-order proximity in the clustering solution.
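As an illustration of the two proximity notions, the sketch below row-normalizes a small affinity matrix into a 1-step transition matrix and squares it to obtain the 2-step transition matrix, following the construction summarized in the abstract. The function names and the mixing weight `alpha` are hypothetical; the source does not state how the two matrices are integrated, so the weighted sum used here is only an assumption.

```python
import numpy as np

def transition_matrix(W):
    """Row-normalize a non-negative affinity matrix into 1-step transition probabilities."""
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1, keepdims=True)
    # Rows without any edge stay all-zero instead of triggering a division by zero.
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

def integrated_proximity(W, alpha=0.5):
    """Blend first-order and second-order proximity.

    alpha is a hypothetical mixing weight (an assumption, not taken from the paper).
    """
    P1 = transition_matrix(W)  # first-order: 1-step transition probabilities
    P2 = P1 @ P1               # second-order: 2-step transition probabilities
    S = alpha * P1 + (1.0 - alpha) * P2
    return S, P1, P2

# Toy path graph 0 - 1 - 2: vertices 0 and 2 are not joined by an edge.
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
S, P1, P2 = integrated_proximity(W)
```

In this toy graph, vertices 0 and 2 share no edge, so their first-order proximity is zero, while the 2-step matrix assigns them a positive value through their common neighbor. This is the sense in which second-order proximity can compensate for sparse links.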

The main contributions in this work are as follows:

• We propose a novel proximity-based factorization model for multi-view clustering that captures the latent local and global structures of the data space and is robust to sparse data.
• The proposed algorithm first constructs integrated structures that represent the latent local and global structures of the respective views, and then incorporates these structures into the NMF framework using graph regularization and consensus regularization to model the distribution of data points in the common subspace.
• We derive an iterative updating algorithm to solve the optimization problem and demonstrate the scalability and convergence of our algorithm both analytically and by running it on a variety of real-world data.
• We extensively evaluate the performance of the proposed algorithm on six real-world multi-view datasets. Experimental results on six evaluation metrics show a substantial gain over the baselines across all datasets.
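For orientation, the three ingredients named in the contributions (per-view NMF reconstruction, graph regularization, and consensus regularization) can be combined in a generic objective of the following form. This is a sketch assembled from the standard graph-regularized NMF and multi-view NMF formulations, not the authors' exact objective. All notation is assumed: $X^{(v)}$ is the data matrix of view $v$, $W^{(v)}$ and $H^{(v)}$ are its non-negative factors, $H^{*}$ is a consensus coefficient matrix, $S^{(v)}$ is the integrated proximity matrix of view $v$ with degree matrix $D^{(v)}$, and $\lambda_{v}$, $\beta$ are trade-off weights.

$$
\min_{W^{(v)},\, H^{(v)},\, H^{*} \ge 0} \; \sum_{v=1}^{n_v} \Big( \big\| X^{(v)} - W^{(v)} {H^{(v)}}^{T} \big\|_F^2 + \lambda_{v} \big\| H^{(v)} - H^{*} \big\|_F^2 + \beta \, \mathrm{tr}\big( {H^{(v)}}^{T} L^{(v)} H^{(v)} \big) \Big), \qquad L^{(v)} = D^{(v)} - S^{(v)}
$$

The first term fits each view, the second pulls the view-specific coefficients toward a shared representation, and the trace term penalizes assigning different coefficients to points that are close under $S^{(v)}$.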

Section snippets

Problem definition

In this subsection, we state the definitions of nearest-neighbor graph, first-order proximity, and second-order proximity followed by problem definition.

Definition 1

Nearest-Neighbor Graph - A nearest-neighbor graph is denoted as $G(V, E)$, where $V = \{v_{1}, v_{2}, \dots, v_{n}\}$ represents the vertices and $E = \{e_{ij}\}_{i, j = 1}^{n}$ represents the edges. When $v_{i}$ and $v_{j}$ are linked by an edge $e_{ij}$, the edge weight satisfies $w_{ij} > 0$; otherwise, $w_{ij} = 0$.

Here, vertices represent data points of real-world dataset and edges represent similarity between data

Related work

Multi-view clustering algorithms can be broadly classified into five categories: (i) Algorithms that obtain a concise view of data via certain loss function optimization, and incorporate the integrated view into the clustering process (Bickel and Scheffer, 2004, Cai et al., 2013); (ii) Canonical Correlation Analysis (CCA) (Chaudhuri et al., 2009) based algorithms that rely on the assumption of uncorrelated views. These algorithms project multi-view data into a consensus lower-dimensional

Overview of MultiNMF

In this section, we explain MultiNMF, the algorithm on which the proposed algorithm builds. MultiNMF (Liu et al., 2013) is an NMF-based clustering approach for the multi-view setting. NMF (Xu et al., 2003) is a matrix factorization method that finds two non-negative matrix factors, $W \in \mathbb{R}_{+}^{M \times K}$ and $H \in \mathbb{R}_{+}^{N \times K}$, called the basis matrix and the coefficient matrix respectively, to obtain a good approximation of a single-view dataset $X$, i.e. $X \approx W H^{T}$. Here, $K$ is the desired reduced dimension.
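The single-view factorization $X \approx W H^{T}$ can be sketched with the standard multiplicative update rules for the Frobenius-norm objective. This is a minimal illustration of plain NMF, not of MultiNMF or of the proposed algorithm; the function name `nmf`, the iteration count, and the random test matrix are assumptions made for the example.

```python
import numpy as np

def nmf(X, K, n_iter=200, eps=1e-10, seed=0):
    """Approximate a non-negative M x N matrix X as W @ H.T with W (M x K), H (N x K)."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    W = rng.random((M, K)) + eps
    H = rng.random((N, K)) + eps
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative at every step.
        W *= (X @ H) / (W @ (H.T @ H) + eps)
        H *= (X.T @ W) / (H @ (W.T @ W) + eps)
        errors.append(np.linalg.norm(X - W @ H.T, "fro"))
    return W, H, errors

# Hypothetical data: 6 features, 8 data points, reduced dimension K = 2.
rng = np.random.default_rng(1)
X = rng.random((6, 8))
W, H, errors = nmf(X, K=2)
```

Each row of `H` is the $K$-dimensional representation of one data point; a clustering can then be read off it, for example by running k-means on the rows or by taking the index of the largest entry in each row.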

Proposed algorithm

In this section, we introduce the proposed algorithm as a four-step framework:

Step-1: Affinity matrix construction - Recent studies in spectral graph theory and manifold learning theory have illustrated that the nearest-neighbor graph effectively models the local geometric structure of scattered data points (Cai et al., 2011). We construct a $t$-nearest neighbor graph $G$ with $N$ vertices, where each vertex corresponds to a data point, and for each data point $x_{i}$, we acquire its $t$ nearest neighbors
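A $t$-nearest-neighbor affinity matrix of the kind described in Step-1 might be built as in the following sketch. The Euclidean distance, the binary 0/1 edge weights, and the symmetrization rule are assumptions chosen for simplicity, since the source does not specify the weighting scheme; the name `knn_affinity` is hypothetical, and data points are stored as rows here.

```python
import numpy as np

def knn_affinity(points, t):
    """Return a symmetric N x N 0/1 affinity matrix of a t-nearest-neighbor graph.

    points: N x d array with one data point per row (assumed layout).
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    sq = (points ** 2).sum(axis=1)
    # Squared Euclidean distances between all pairs of points.
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(d2, np.inf)        # a point is not its own neighbor
    nn = np.argsort(d2, axis=1)[:, :t]  # indices of the t closest points per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), t), nn.ravel()] = 1.0
    # Keep an edge if either endpoint lists the other among its neighbors.
    return np.maximum(A, A.T)

# Four points on a line forming two well-separated pairs.
pts = np.array([[0.0], [1.0], [10.0], [11.0]])
A = knn_affinity(pts, t=1)
```

With `t=1`, each point links only to its partner in the same pair, so the graph splits into two components. Larger values of `t` add edges and make the graph less sparse.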

Experimental analysis

In this section, we evaluate the performance of the proposed algorithm, which utilizes the proximity matrix and its graph Laplacian as an exploratory tool to study the correlation between intrinsic geometric structure and hidden semantics across different views. Experiments on different real-world datasets illustrate the effectiveness of the proposed algorithm in discovering optimal clusters in multi-view data. All the experiments are conducted on an Intel Core i7-10700 machine of 2.90 GHz frequency and

Conclusion

In this work, we introduced a novel proximity-based algorithm, PMVNMF, for multi-view clustering, aiming to preserve the geometric structure of individual representations while dealing with sparsity in real-world text and image data. We proposed a clustering framework that jointly utilizes first-order and second-order proximities in a multi-view NMF model. Laplacian regularization is performed on the integrated proximity matrices of the respective views such that the obtained clustering solution

CRediT authorship contribution statement

Monika Bansal: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing - original draft. Dolly Sharma: Conception and design of study, Analysis and/or interpretation of data, Writing - original draft, Writing - review & editing.

Acknowledgment

All authors approved the version of the manuscript to be published.

References (53)

Chen, Y. et al. Graph-regularized least squares regression for multi-view subspace clustering. Knowledge-Based Systems (2020)
Chikhi, N.F. Multi-view clustering via spectral partitioning and local refinement. Information Processing & Management (2016)
Fang, Y. et al. Unsupervised cross-modal retrieval via multi-modal graph regularized smooth matrix factorization hashing. Knowledge-Based Systems (2019)
Goyal, P. et al. Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems (2018)
Huang, S. et al. Auto-weighted multi-view clustering via deep matrix decomposition. Pattern Recognition (2020)
Jiang, B. et al. Bi-level weighted multi-view clustering via hybrid particle swarm optimization. Information Processing & Management (2016)
Kang, Z. et al. Multi-graph fusion for multi-view spectral clustering. Knowledge-Based Systems (2020)
Luqman, M.M. et al. Fuzzy multilevel graph embedding. Pattern Recognition (2013)
Nguyen, M.-T. et al. Web document summarization by exploiting social context with matrix co-factorization. Information Processing & Management (2019)
Qiu, X. et al. Unsupervised multi-view non-negative for law data feature learning with dual graph-regularization in smart Internet of Things. Future Generation Computer Systems (2019)
Vo, D.-T. et al. Feature-enriched matrix factorization for relation extraction. Information Processing & Management (2019)
Xu, C. A novel recommendation method based on social network using matrix factorization technique. Information Processing & Management (2018)
Yang, Z. et al. MMED: A multi-domain and multi-modality event dataset. Information Processing & Management (2020)
Zhang, X. et al. Multi-view clustering based on graph-regularized nonnegative matrix factorization for object recognition. Information Sciences (2018)
Zong, L. et al. Multi-view clustering via multi-manifold regularized non-negative matrix factorization. Neural Networks (2017)
Amini, M.R. et al. Learning from multiple partially observed views - an application to multilingual text categorization
Bickel, S. et al. Multi-view clustering
Cai, D. et al. Graph regularized nonnegative matrix factorization for data representation. IEEE Transactions on Pattern Analysis and Machine Intelligence (2011)
Cai, X. et al. Multi-view K-means clustering on big data
Cao, S. et al. GraRep: Learning Graph representations with global structural information
Chaudhuri, K. et al. Multi-view clustering via canonical correlation analysis
Chen, X. et al. Multi-view 3D object detection network for autonomous driving
Chua, T.-S. et al. NUS-WIDE: A real-world web image database from national university of Singapore
Greene, D. et al. A matrix factorization approach for integrating multiple data views
Greene, D. et al. Producing a unified graph representation from multiple social network views
Kumar, A. et al. A co-training approach for multi-view spectral clustering

