Neural Networks

Volume 13, Issues 4–5, June 2000, Pages 411-430

Invited article
Independent component analysis: algorithms and applications

A. Hyvärinen, E. Oja

https://doi.org/10.1016/S0893-6080(00)00026-5

Abstract

A fundamental problem in neural network research, as well as in many other disciplines, is finding a suitable representation of multivariate data, i.e. random vectors. For reasons of computational and conceptual simplicity, the representation is often sought as a linear transformation of the original data. In other words, each component of the representation is a linear combination of the original variables. Well-known linear transformation methods include principal component analysis, factor analysis, and projection pursuit. Independent component analysis (ICA) is a recently developed method in which the goal is to find a linear representation of non-Gaussian data so that the components are statistically independent, or as independent as possible. Such a representation seems to capture the essential structure of the data in many applications, including feature extraction and signal separation. In this paper, we present the basic theory and applications of ICA, and our recent work on the subject.


Motivation

Imagine that you are in a room where two people are speaking simultaneously. You have two microphones, which you hold in different locations. The microphones give you two recorded time signals, which we could denote by x1(t) and x2(t), with x1 and x2 the amplitudes, and t the time index. Each of these recorded signals is a weighted sum of the speech signals emitted by the two speakers, which we denote by s1(t) and s2(t). We could express this as a linear equation:

x_1(t) = a_{11} s_1 + a_{12} s_2
x_2(t) = a_{21} s_1 + a_{22} s_2

where a_{11}, a_{12}, a_{21}, and a_{22} are parameters that depend on the distances of the microphones from the speakers.

Definition of ICA

To rigorously define ICA (Comon, 1994, Jutten and Herault, 1991), we can use a statistical “latent variables” model. Assume that we observe n linear mixtures x1, …, xn of n independent components:

x_j = a_{j1} s_1 + a_{j2} s_2 + ⋯ + a_{jn} s_n,   for all j.

We have now dropped the time index t; in the ICA model, we assume that each mixture xj as well as each independent component sk is a random variable, instead of a proper time signal. The observed values xj(t), e.g. the microphone signals in the cocktail party problem, are then a sample of this random variable.
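It is convenient to use vector-matrix notation: collecting the mixtures into a vector x, the independent components into a vector s, and the coefficients a_{ij} into a matrix A, the mixing model above can be written compactly as

x = A s = ∑_{i=1}^{n} a_i s_i,

where a_i denotes the i-th column of A.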

Definition and fundamental properties

To define the concept of independence, consider two scalar-valued random variables y1 and y2. Basically, the variables y1 and y2 are said to be independent if information on the value of y1 does not give any information on the value of y2, and vice versa. Above, we noted that this is the case with the variables s1, s2 but not with the mixture variables x1, x2.

Technically, independence can be defined by the probability densities. Let us denote by p(y1,y2) the joint probability density function (pdf) of y1 and y2.
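Independence then means that the joint pdf factorizes into the product of the marginal densities:

p(y_1, y_2) = p_1(y_1) p_2(y_2).

An equivalent and often more practical characterization is that expectations of products factorize for any (measurable) functions h_1 and h_2:

E{h_1(y_1) h_2(y_2)} = E{h_1(y_1)} E{h_2(y_2)}.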

“Non-Gaussian is independent”

Intuitively speaking, the key to estimating the ICA model is non-Gaussianity. Actually, without non-Gaussianity the estimation is not possible at all, as mentioned in Section 3.3. This is at the same time probably the main reason for the rather late resurgence of ICA research: In most of classical statistical theory, random variables are assumed to have Gaussian distributions, thus precluding any methods related to ICA.

The Central Limit Theorem, a classical result in probability theory, tells that the distribution of a sum of independent random variables tends toward a Gaussian distribution, under certain conditions. Thus, a sum of two independent random variables usually has a distribution that is closer to Gaussian than either of the two original variables.
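This effect is easy to check numerically. The following small sketch (an illustration, not from the paper) uses excess kurtosis, which is zero for a Gaussian variable, to show that a mixture of two independent uniform variables is closer to Gaussian than the sources themselves:

```python
import numpy as np

rng = np.random.default_rng(0)

def excess_kurtosis(y):
    """Excess kurtosis: zero for a Gaussian, negative for sub-Gaussian signals."""
    y = (y - y.mean()) / y.std()
    return np.mean(y**4) - 3.0

# Two independent, uniformly distributed (hence non-Gaussian) unit-variance sources.
s1 = rng.uniform(-np.sqrt(3), np.sqrt(3), 100_000)
s2 = rng.uniform(-np.sqrt(3), np.sqrt(3), 100_000)

# A mixture of the two sources is closer to Gaussian than either source,
# as the Central Limit Theorem suggests.
x = 0.6 * s1 + 0.8 * s2

print(excess_kurtosis(s1))  # approx -1.2 (uniform distribution)
print(excess_kurtosis(x))   # closer to 0 than the sources
```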

Preprocessing for ICA

In the preceding section, we discussed the statistical principles underlying ICA methods. Practical algorithms based on these principles will be discussed in the next section. However, before applying an ICA algorithm on the data, it is usually very useful to do some preprocessing. In this section, we discuss some preprocessing techniques that make the problem of ICA estimation simpler and better conditioned.
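As a concrete illustration, here is a minimal numpy sketch of the two most common preprocessing steps, centering and whitening by eigenvalue decomposition of the covariance matrix; the data layout (one mixture signal per row) and full-rank covariance are assumptions of this sketch:

```python
import numpy as np

def center_and_whiten(X):
    """Center each mixture and whiten the data so that E{x x^T} = I.

    X: array of shape (n_mixtures, n_samples), one mixture signal per row.
    Assumes the covariance matrix of X is full rank.
    """
    # Centering: subtract the mean of each row (each observed mixture).
    Xc = X - X.mean(axis=1, keepdims=True)
    # Whitening by eigenvalue decomposition of the covariance matrix:
    # x_tilde = E D^{-1/2} E^T x, where cov = E D E^T.
    d, E = np.linalg.eigh(np.cov(Xc))
    V = E @ np.diag(d ** -0.5) @ E.T
    return V @ Xc, V
```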

The FastICA algorithm

In the preceding sections, we introduced different measures of non-Gaussianity, i.e. objective functions for ICA estimation. In practice, one also needs an algorithm for maximizing the contrast function, for example the one in Eq. (25). In this section, we introduce a very efficient method of maximization suited for this task. It is here assumed that the data is preprocessed by centering and whitening as discussed in the preceding section.
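For concreteness, here is a minimal sketch of the one-unit FastICA fixed-point iteration with the nonlinearity g(u) = tanh(u), assuming the input Z has already been centered and whitened as above; estimating several components would additionally require a decorrelation (deflation) step between units:

```python
import numpy as np

def fastica_one_unit(Z, max_iter=200, tol=1e-6, seed=0):
    """One-unit FastICA fixed-point iteration on centered, whitened data Z
    of shape (n_mixtures, n_samples), using the nonlinearity g(u) = tanh(u)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(Z.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(max_iter):
        wz = w @ Z                               # projections w^T z
        g = np.tanh(wz)                          # g(w^T z)
        g_prime = 1.0 - g ** 2                   # g'(w^T z)
        # Fixed-point update: w+ = E{z g(w^T z)} - E{g'(w^T z)} w
        w_new = (Z * g).mean(axis=1) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)
        if abs(abs(w_new @ w) - 1.0) < tol:      # converged up to sign
            return w_new
        w = w_new
    return w
```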

Applications of ICA

In this section we review some applications of ICA. The most classical application of ICA, the cocktail-party problem, was already explained in Section 1 of this paper.
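As a present-day usage example (not part of the original paper), the cocktail-party setting can be simulated and solved with an off-the-shelf FastICA implementation such as scikit-learn's; the sources and mixing matrix below are hypothetical:

```python
import numpy as np
from sklearn.decomposition import FastICA

# Hypothetical two-source "cocktail party": a sinusoid and a square wave,
# mixed by a (here simulated, in practice unknown) mixing matrix A.
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(3 * t), np.sign(np.sin(5 * t))]   # sources, shape (n_samples, 2)
A = np.array([[1.0, 0.5],
              [0.7, 1.0]])                         # mixing matrix
X = S @ A.T                                        # observed mixtures

ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)   # estimated sources (up to order, sign and scale)
A_est = ica.mixing_            # estimated mixing matrix
```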

Conclusion

ICA is a very general-purpose statistical technique in which observed random data are linearly transformed into components that are maximally independent from each other, and simultaneously have “interesting” distributions. ICA can be formulated as the estimation of a latent variable model. The intuitive notion of maximum non-Gaussianity can be used to derive different objective functions whose optimization enables the estimation of the ICA model. Alternatively, one may use more classical notions like maximum likelihood estimation or minimization of mutual information to estimate ICA; somewhat surprisingly, these approaches are (approximately) equivalent.

References (45)

  • J.-F. Cardoso, Infomax and maximum likelihood for source separation, IEEE Letters on Signal Processing (1997)
  • J.-F. Cardoso et al., Equivariant adaptive source separation, IEEE Transactions on Signal Processing (1996)
  • A. Cichocki et al., Robust neural networks with on-line learning for blind identification and blind separation of sources, IEEE Transactions on Circuits and Systems (1996)
  • T.M. Cover et al., Elements of information theory (1991)
  • Cristescu, R., Ristaniemi, T., Joutsensalo, J., & Karhunen, J. (2000). Delay estimation in CDMA communications using a...
  • D.L. Donoho et al., Wavelet shrinkage: asymptopia?, Journal of the Royal Statistical Society, Series B (1995)
  • J. Friedman, Exploratory projection pursuit, Journal of the American Statistical Association (1987)
  • J.H. Friedman et al., A projection pursuit algorithm for exploratory data analysis, IEEE Transactions on Computers (1974)
  • Giannakopoulos, X., Karhunen, J., & Oja, E. (1998). Experimental comparison of neural ICA algorithms. Proceedings of...
  • R. Gonzalez et al., Digital image processing (1987)
  • P. Huber, Projection pursuit, The Annals of Statistics (1985)
  • A. Hyvärinen, New approximations of differential entropy for independent component analysis and projection pursuit (1998)