EEG signal classification using PCA, ICA, LDA and support vector machines

https://doi.org/10.1016/j.eswa.2010.06.065

Abstract

In this work, we propose a versatile signal processing and analysis framework for the electroencephalogram (EEG). Within this framework, the signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT), and a set of statistical features was extracted from the sub-bands to represent the distribution of wavelet coefficients. Principal component analysis (PCA), independent component analysis (ICA) and linear discriminant analysis (LDA) were used to reduce the dimension of the data. These features were then used as input to a support vector machine (SVM) with two discrete outputs: epileptic seizure or not. The classification performance obtained with the different methods is presented and compared to demonstrate the quality of the classification process. These findings are presented as an example of a method for training and testing a seizure prediction method on data from individual petit mal epileptic patients. Given the heterogeneity of epilepsy, it is likely that methods of this type will be required to configure intelligent devices for treating epilepsy to each individual’s neurophysiology prior to clinical operation.

Introduction

Electroencephalograms (EEGs) are recordings of the electrical potentials produced by the brain. Since Hans Berger’s recording of rhythmic electrical activity from the human scalp, analysis of EEG activity has been carried out principally in clinical settings to identify pathologies and epilepsies. In the past, interpretation of the EEG was limited to visual inspection by a neurophysiologist, an individual trained to distinguish qualitatively between normal EEG activity and abnormalities contained within EEG records. Advances in computers and related technologies have made it possible to apply a host of methods to quantify EEG changes (Bronzino, 2000).

Compared with other biomedical signals, the EEG is extremely difficult for an untrained observer to understand, partly as a consequence of the spatial mapping of functions onto different regions of the brain and of electrode placement. In addition, data processing may consist of determining a reduced feature set that includes only the data needed for quantification, as in evoked response recordings, or of feature extraction and subsequent pattern recognition, as in automated spike detection during monitoring for epileptic seizure activity. In early attempts to show a relationship between the EEG and behavior, analog frequency analyzers were used to examine the EEG data. This approach is based on the earlier interpretation that the EEG spectrum contains characteristic waveforms that fall primarily within four frequency bands: δ (<4 Hz), θ (4–8 Hz), α (8–13 Hz), and β (13–30 Hz). Although unsatisfactory, these initial efforts did introduce frequency analysis to the study of brain wave activity. Power spectral analysis provides a quantitative measure of the frequency distribution of the EEG, but it does so at the expense of other details such as the amplitude distribution and information relating to the presence of particular EEG patterns. Hence time–frequency signal-processing algorithms such as discrete wavelet transform (DWT) analysis are necessary to describe the behavior of the EEG in both the time and the frequency domain. It should also be emphasized that the DWT is suitable for the analysis of non-stationary signals, which is a major advantage over spectral analysis. The DWT is therefore well suited to locating transient events, such as the spikes that can occur during epileptic seizures (Adeli et al., 2003, Bronzino, 2000, D’Alessandro et al., 2003, Subasi, 2007).
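As an illustration of how a DWT splits a signal into frequency sub-bands, the following is a minimal sketch in plain Python. It uses the Haar wavelet only because it is the simplest to write out; this excerpt does not state which wavelet or how many decomposition levels the study used. The function names (`haar_dwt_step`, `haar_wavedec`, `subband_edges`), the synthetic test signal and the sampling rate of 128 Hz are all assumptions chosen for illustration, not values taken from the data set.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        signal = signal[:-1]  # drop a trailing sample so that pairs line up
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_wavedec(signal, levels):
    """Multi-level decomposition: returns ([D1, ..., DL], AL)."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return details, approx


def subband_edges(fs, levels):
    """Nominal frequency range in Hz of each sub-band for sampling rate fs."""
    bands = {}
    for j in range(1, levels + 1):
        bands[f"D{j}"] = (fs / 2 ** (j + 1), fs / 2 ** j)
    bands[f"A{levels}"] = (0.0, fs / 2 ** (levels + 1))
    return bands


# Hypothetical sampling rate and synthetic two-tone signal (not EEG data).
fs = 128.0
signal = [
    math.sin(2 * math.pi * 6 * n / fs) + 0.5 * math.sin(2 * math.pi * 20 * n / fs)
    for n in range(256)
]
details, approx = haar_wavedec(signal, levels=4)
for name, (low, high) in subband_edges(fs, 4).items():
    print(f"{name}: {low:.1f}-{high:.1f} Hz")
```

With the assumed 128 Hz rate and four levels, the nominal ranges of D4 (4–8 Hz) and A4 (0–4 Hz) line up with the θ and δ bands, while D3 (8–16 Hz) and D2 (16–32 Hz) only approximate α and β. For a different sampling rate the band edges shift accordingly.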

An exciting application of seizure prediction technology is its potential for use in therapeutic epilepsy devices to trigger intervention to prevent seizures before they begin. Seizure prediction has been investigated by type to include prediction by studying preictal features, prediction by fast detection, prediction by classification, and prediction by probability estimation. Studies in seizure prediction vary widely in their theoretical approaches to the problem, validation of results, and the amount of data analyzed. Two relative weaknesses in this literature are the lack of extensive testing on baseline data free from seizures and the lack of technically rigorous validation and quantification of algorithm performance in many studies (Adeli et al., 2003, D’Alessandro et al., 2003, Subasi, 2006, Subasi, 2007).

Principal component analysis (PCA), independent component analysis (ICA) and linear discriminant analysis (LDA) are well-known methods for feature extraction (Cao et al., 2003, Wang and Paliwal, 2003, Widodo and Yang, 2007). Feature extraction means transforming the existing features into a lower-dimensional space, which reduces the number of features and avoids the redundancy of high-dimensional data. In this work, the DWT was applied for the time–frequency analysis of EEG signals, and the resulting wavelet coefficients were used for classification. EEG signals were decomposed into frequency sub-bands using the DWT. A set of statistical features was then extracted from these sub-bands to represent the distribution of wavelet coefficients. PCA, ICA and LDA were used to reduce the dimension of the data. The reduced features were then used as input to a support vector machine (SVM) with two discrete outputs: epileptic seizure or not. The accuracy of the various classifiers is assessed and cross-compared, and the advantages and limitations of each technique are discussed. The simulations show that an SVM with feature extraction using PCA, ICA or LDA always performs better than one without feature extraction. Among the three methods, LDA feature extraction achieves the best performance.
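The two middle steps of this pipeline, computing statistical features from sub-band coefficients and projecting them onto a lower-dimensional space, can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the three statistics in `subband_features` are a hypothetical choice (this excerpt does not list the features used), the projection is the classical two-class Fisher discriminant as one instance of LDA, the toy feature vectors are invented, and the PCA, ICA and SVM stages are left out.

```python
import math


def subband_features(coeffs):
    """Assumed summary statistics of one sub-band's wavelet coefficients."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    mean_abs = sum(abs(c) for c in coeffs) / n
    avg_power = sum(c * c for c in coeffs) / n
    std = math.sqrt(sum((c - mean) ** 2 for c in coeffs) / n)
    return [mean_abs, avg_power, std]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def class_mean(rows):
    """Component-wise mean of a list of feature vectors."""
    d = len(rows[0])
    return [sum(r[k] for r in rows) / len(rows) for k in range(d)]


def fisher_direction(class0, class1):
    """Two-class LDA direction w, obtained from Sw w = m1 - m0."""
    d = len(class0[0])
    m0, m1 = class_mean(class0), class_mean(class1)
    sw = [[0.0] * d for _ in range(d)]  # within-class scatter matrix
    for rows, m in ((class0, m0), (class1, m1)):
        for r in rows:
            diff = [r[k] - m[k] for k in range(d)]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += diff[i] * diff[j]
    return solve(sw, [m1[k] - m0[k] for k in range(d)])


def project(rows, w):
    """Reduce each feature vector to a single discriminant score."""
    return [sum(x * wk for x, wk in zip(r, w)) for r in rows]


# Invented two-dimensional feature vectors for two classes (not EEG data).
non_seizure = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0]]
seizure = [[6.0, 5.0], [7.0, 8.0], [8.0, 7.0], [7.0, 6.0]]
w = fisher_direction(non_seizure, seizure)
z0, z1 = project(non_seizure, w), project(seizure, w)
```

In a full pipeline, the one-dimensional scores `z0` and `z1` would take the place of the original feature vectors as classifier input. The scatter matrix must be non-singular for `solve` to work, which in practice requires more samples than features or some form of regularization.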

Section snippets

Subjects and data recording

We used the publicly available data described in Andrzejak et al. (2001). The complete data set consists of five sets (denoted A–E) each containing 100 single-channel EEG segments. Sets A and B consisted of segments taken from surface EEG recordings that were carried out on five healthy volunteers using a standardized electrode placement scheme. Volunteers were relaxed in an awake state

Results and discussion

In this study, we used EEG signals of normal and epileptic patients in order to perform a comparison between the PCA, ICA and LDA by using SVM. EEG recordings were divided into sub-band frequencies such as α, β, δ and θ by using DWT. Then a set of statistical features was extracted from the wavelet sub-band frequencies δ (1–4 Hz), θ (4–8 Hz), α (8–13 Hz) and β (13–30 Hz). After normalization, the EEG signals were decomposed using wavelet transform and the statistical features were extracted from

Conclusion

Diagnosing epilepsy is a difficult task requiring observation of the patient, an EEG, and gathering of additional clinical information. An SVM that classifies subjects as having or not having an epileptic seizure provides a valuable diagnostic decision support tool for physicians treating potential epilepsy, since differing etiologies of seizures result in different treatments. Conventional classification methods of EEG signals using mutually exclusive time and frequency domain representations

References (22)

  • J.D. Bronzino

    Principles of electroencephalography
