Elsevier

Pattern Recognition Letters

Volume 111, 1 August 2018, Pages 101-108

Algorithms for two dimensional multi set canonical correlation analysis

https://doi.org/10.1016/j.patrec.2018.04.038

Highlights

  1. Novel algorithms to implement two dimensional multi-set canonical correlation analysis, for multiple image datasets.

  2. Pre-processing step of vectorizing the image data, needed by regular multiset canonical correlation analysis, is eliminated.

  3. Preserves the spatial structure of the image data compared to the regular multiset canonical correlation analysis.

  4. It has useful applications in multi subject medical image analysis.

Abstract

Multi-set canonical correlation analysis (mCCA), which extends canonical correlation analysis (CCA) to more than two datasets, is a data driven technique for jointly analyzing the relationships among multiple datasets. However, conventional mCCA is directly applicable only to multivariate vector data and requires image data to be reshaped into vectors. This approach fails to consider the spatial structure of the images and, in addition, increases the computational complexity. In this paper, we propose new two dimensional mCCA algorithms that operate directly on the image data instead of vectorizing them. Face recognition experiments are presented to compare the performance of conventional mCCA and the proposed two dimensional mCCA techniques. Additionally, experiments on fMRI data are conducted to demonstrate the applicability of the proposed approach in multisubject fMRI analysis.

Introduction

Canonical correlation analysis (CCA) [1] is a data driven technique that has been successfully applied to capture the linear relationship among various types of multivariate data. The basic objective of CCA is to determine the coordinate system that represents the best possible linear relationship between the given multivariate datasets [1] by maximizing the mutual correlation between them. Formally, CCA is defined as the problem of finding two sets of projection vectors a and b for the two sets of multivariate data X and Y, such that the correlation between the projections of X onto a and of Y onto b is maximized. Multi-set canonical correlation analysis (mCCA) extends the applicability of CCA to more than two datasets [2]. Various approaches to compute mCCA have been developed; they aim to optimize an objective function of the correlation matrix of the datasets to be analyzed. mCCA has been successfully used in many applications, such as genomic data integration to identify the relationship among multiple phenotypic measures [3] and cross-language document retrieval [4]. The equivalence between linear discriminant analysis (LDA) and CCA is proved in [5]. A variant called within-class coupling CCA has also been proposed for data whose samples are implicitly indicative of their class membership. An investigation of CCA and generalized CCA for a text document classification task is presented in [6].
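The formal definition of two-set CCA given above can be illustrated with a minimal sketch. It is not the implementation used in the paper: it assumes one sample per row, uses the textbook route of whitening each dataset and taking an SVD of the whitened cross-covariance, and the function name `cca_first_pair` and the small ridge term `reg` are hypothetical choices made here for illustration.

```python
import numpy as np

def cca_first_pair(X, Y, reg=1e-8):
    """Return (a, b, rho): the first pair of canonical vectors and their correlation.

    X is (N, p) and Y is (N, q), one sample per row.
    `reg` is a small ridge term added for numerical stability (an assumption).
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Whiten both blocks with their Cholesky factors: T = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T

    # The leading singular triplet of T gives the maximal correlation.
    U, s, Vt = np.linalg.svd(T)
    a = np.linalg.solve(Lx.T, U[:, 0])   # map back from whitened coordinates
    b = np.linalg.solve(Ly.T, Vt[0])
    return a, b, s[0]
```

Projecting the centered data onto `a` and `b` should give two score vectors whose sample correlation matches the returned `rho` up to the effect of the ridge term.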

mCCA finds applications in image data as well. Rapid advancements in imaging devices in the last decade have enabled technologies such as remote sensing and medical imaging to generate large amounts of image data. mCCA has, for instance, been used in the analysis of image data in applications such as feature extraction and classification [7] and the analysis of remote sensing data [8]. Kernel canonical correlation analysis (KCCA) has been used in fMRI analysis to estimate correlated subspaces of datasets from multiple subjects of a particular medical imaging modality [9]. Conventional mCCA theory is directly applicable only to multivariate vector data. Therefore, image data has to be vectorized before it can be analysed using mCCA. However, CCA on vectorized data does not consider the spatial structure of the images. In addition, vectorization results in a large covariance matrix that may be ill-conditioned, which makes the solution unstable or non-existent, and it also increases the computational complexity. A two dimensional CCA, directly applicable to image data, was first proposed in [10]. It defines two separate projection vectors that operate along the row and column directions of the image data and therefore does not require the image to be vectorized. However, it is limited to two datasets. To overcome these drawbacks, we propose a 2DmCCA algorithm that can be used to analyze multiple (more than two) image datasets simultaneously [11].
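The cost of vectorization mentioned above can be made concrete with a small sketch. The image sizes and sample count below are hypothetical, and the helper `covariance_sizes` is introduced here only for illustration; it assumes that a two dimensional method works with one covariance-like matrix per image mode (rows and columns) rather than with a single matrix over all pixels.

```python
import numpy as np

def covariance_sizes(m, n):
    """Entries in the vectorized covariance vs. the two per-mode matrices."""
    return (m * n) ** 2, m * m + n * n

rng = np.random.default_rng(0)
N, m, n = 20, 8, 8                       # hypothetical: 20 images of 8x8 pixels
images = rng.standard_normal((N, m, n))

vecs = images.reshape(N, m * n)          # vectorize each image into one row
cov = np.cov(vecs, rowvar=False)         # (mn x mn) sample covariance
rank = np.linalg.matrix_rank(cov)        # at most N - 1, so singular when N <= mn
```

For 64 x 64 images, `covariance_sizes(64, 64)` gives 16,777,216 entries for the vectorized covariance against 8,192 for the two per-mode matrices. The rank bound shows why the vectorized covariance becomes singular whenever there are fewer samples than pixels.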

The paper is organized as follows. Section 2 describes the background work on mCCA and 2DCCA. Section 3 presents the proposed 2DmCCA framework and describes a new iterative procedure to implement the proposed 2DmCCA algorithm. Section 4 compares the performance of the proposed 2DmCCA algorithm with conventional mCCA [2] in face recognition experiments. Experiments on block paradigm right finger tapping fMRI data are also included in that section to demonstrate the applicability of the proposed approach in multisubject medical image analysis. Concluding remarks are given in Section 5.

Section snippets

Two dimensional canonical correlation analysis

Canonical correlation analysis directly applicable to matrix data, known as two dimensional CCA (2DCCA), was first developed in [10]. Given two sets of matrices $X=[X_1,\ldots,X_N]$, where $X_i \in \mathbb{R}^{m_x \times n_x}$, and $Y=[Y_1,\ldots,Y_N]$, where $Y_i \in \mathbb{R}^{m_y \times n_y}$, 2DCCA defines left linear transforms and right linear transforms that operate along the rows and columns of the matrices, respectively. Let $m_X=\frac{1}{N}\sum_{i=1}^{N}X_i$ and $m_Y=\frac{1}{N}\sum_{i=1}^{N}Y_i$ be the mean matrices of datasets X and Y, respectively. Centered datasets $\tilde{X}$ and $\tilde{Y}$ can be obtained as $\tilde{X}_i=X_i-m$
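The centering by mean matrices and the left and right transforms described in this snippet can be sketched as follows. This is an illustration under assumptions, not the paper's algorithm: it assumes the usual bilinear form in which a left vector l and a right vector r reduce each matrix to the scalar l^T X_i r, and the function names `center`, `project`, and `projected_correlation` are hypothetical.

```python
import numpy as np

def center(mats):
    """Subtract the mean matrix from a stack of matrices of shape (N, m, n)."""
    return mats - mats.mean(axis=0)

def project(mats, l, r):
    """Bilinear projection l^T X_i r for every matrix X_i in the stack."""
    return np.einsum("i,kij,j->k", l, mats, r)

def projected_correlation(X, Y, lx, rx, ly, ry):
    """Correlation between the projected, centered matrices of two datasets."""
    u = project(center(X), lx, rx)
    v = project(center(Y), ly, ry)
    return (u @ v) / np.sqrt((u @ u) * (v @ v))
```

A two dimensional method would search for the left and right vectors that maximize this quantity; the sketch only evaluates it for given vectors, so no image is ever reshaped into a vector.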

Proposed work

This section details the development of the proposed 2DmCCA framework that can be directly applied to more than two matrix datasets. 2DmCCA objectives are defined and optimization problems are formulated, which are then solved by placing constraints on the canonical coefficient values. A new iterative algorithm is described to compute the canonical coefficient vectors.

Results

A face recognition experiment is carried out to compare the performance of the proposed 2DmCCA with that of conventional mCCA [2]. Results for conventional mCCA under constraint 2 are not provided because of its high memory and time requirements: we were unable to run it on our current hardware (a 64-bit system equipped with an Intel® i7-4790 CPU running at 3.60 GHz), where it produced an out-of-memory error. An application of the

Conclusion

Two dimensional CCA is analyzed in the context of mCCA to develop new 2DmCCA algorithms. The conventional mCCA requires image data to be reshaped into vectors. The proposed algorithms overcome this limitation and can be efficiently applied to image data. One of the future extensions is to induce sparsity into this framework to develop a sparse 2DmCCA framework that helps account for the parsimony in the biomedical data [18].

Cited by (13)

  • Clustering adaptive canonical correlations for high-dimensional multi-modal data

    2020, Journal of Visual Communication and Image Representation
    Citation Excerpt :

    In different researches, MCCA is also called multi-view CCA or multi-set CCA. With the efforts of many scholars, MCCA has been applied to a great deal of real-world applications, such as human emotion recognition [9], image classification [10], temporal alignment [11], and multi-subject fMRI analysis [12]. For better catering to these applications, scholars have also proposed some variants of MCCA, and the improvement under the multi-modal correlation analysis framework mainly focuses on discriminative information embedding, local structure preserving, kernel technique, and projection direction orthogonality and so on.

  • Two dimensional CCA via penalized matrix decomposition for structure preserved fMRI data analysis

    2019, Digital Signal Processing: A Review Journal
    Citation Excerpt :

    To address the above mentioned issues, a two dimensional CCA (2DCCA) was proposed in [32], which directly used image data. 2DCCA is based on the 2D image representation of data and it defines two separate canonical projection vectors corresponding to the row and column directions of the image data, therefore, image-to-vector transformation is not required [12]. In the case of fMRI data, 2DCCA suffers from high dimensionality associated with the large number of volumized pixels (i.e., voxels).

  • Self-balanced multi-view orthogonality correlation analysis for image feature learning

    2019, Infrared Physics and Technology
    Citation Excerpt :

    Thus many scholars have transferred their researches from CCA to MCCA. Up to now, MCCA has been widely applied into many real-world applications, such as temporal alignment [5], multi-subject fMRI analysis [9], and prediction of recurrent prostate cancer [10]. Additionally, for suiting various real-world applications, many variants of MCCA have also proposed, which focus on class information embedding, graph theory, and kernel technique.

  • An Improved Canonical Correlation Analysis Method with Adaptive Graph Learning

    2022, Lecture Notes on Data Engineering and Communications Technologies

This work was supported by the Australian Research Council through Grant FT. 130101394.
