FASTER: Fully Automated Statistical Thresholding for EEG artifact Rejection☆
Research highlights
▶ We describe FASTER: Fully Automated Statistical Thresholding for EEG artifact Rejection.
▶ Artifacts in the electroencephalogram were detected and removed.
▶ FASTER had >90% sensitivity and specificity for detection of artifacts.
▶ FASTER aggregates the ERP across subject datasets, and detects outlier datasets.
Introduction
The event-related potential (ERP) is computed by aggregating across time-locked electroencephalogram (EEG) epochs. Artifacts – such as eye and muscle movements (measured by electro-oculograms (EOG) and electromyograms (EMG), respectively), and electrode displacement – can be orders of magnitude greater than the ERP, thereby greatly distorting the signal. For example, EEG signals are on the order of tens of μV whereas EOG and EMG signals are on the order of hundreds of μV. In addition to eye and muscle movement artifacts, poor scalp contact for a particular electrode will produce consistently bad data for the duration of the recording. Other artifacts include spurious electrical activity picked up by the EEG amplifier, and current drift.
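As a minimal sketch of why a single large artifact matters, the following plain-Python example averages time-locked epochs sample by sample. The data values, the function name `average_erp`, and the single-channel layout are illustrative assumptions, not part of the original study.

```python
from statistics import mean

def average_erp(epochs):
    """Average time-locked epochs sample by sample.

    epochs: list of equal-length lists of voltages (in microvolts),
    one inner list per epoch, all for a single channel.
    """
    n_samples = len(epochs[0])
    if any(len(e) != n_samples for e in epochs):
        raise ValueError("all epochs must have the same number of samples")
    # zip(*epochs) groups the i-th sample of every epoch together.
    return [mean(samples) for samples in zip(*epochs)]

# Hypothetical data: a fixed ERP-like shape plus an epoch-specific offset.
template = [0.0, 2.0, 5.0, 2.0, 0.0]
offsets = [-3.0, 1.0, 2.0]  # these offsets sum to zero, so they average out
clean_epochs = [[v + off for v in template] for off in offsets]
erp_clean = average_erp(clean_epochs)

# One extra epoch carrying a 300 microvolt artifact (hypothetical value).
contaminated_epochs = clean_epochs + [[v + 300.0 for v in template]]
erp_contaminated = average_erp(contaminated_epochs)
```

In this toy case the three clean epochs recover the template, whereas adding one epoch with a 300 μV offset shifts every sample of the four-epoch average by 75 μV, far more than the 5 μV peak of the underlying waveform.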
One simple and computationally inexpensive approach to eye and muscle movement artifact detection and rejection involves deleting portions of the data with artifacts (e.g., EOG data with amplitudes exceeding ±75 μV). However, this can potentially lead to a large loss of data, consequently reducing the quality of the ERP. Electrodes with consistently poor signal quality are typically removed and then recreated using interpolation from the remaining electrodes, which effectively reduces the spatial resolution of the EEG. Therefore, several methods of removing eye and muscle movement artifacts while retaining EEG data have been proposed (Croft and Barry, 2000, Moretti et al., 2003, Schlögl et al., 2007).
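The simple rejection rule mentioned above can be sketched as follows. This is an illustrative example only: the function name `reject_by_amplitude`, the example data, and the choice to reject whole epochs are assumptions; only the ±75 μV criterion comes from the text.

```python
def reject_by_amplitude(eog_epochs, threshold_uv=75.0):
    """Split epoch indices into kept and rejected lists.

    An epoch is rejected if any EOG sample exceeds +/- threshold_uv.
    eog_epochs: list of lists of EOG voltages (microvolts), one per epoch.
    """
    kept, rejected = [], []
    for index, epoch in enumerate(eog_epochs):
        if max(abs(v) for v in epoch) > threshold_uv:
            rejected.append(index)
        else:
            kept.append(index)
    return kept, rejected

# Hypothetical EOG epochs (microvolts).
eog_epochs = [
    [5.0, -12.0, 8.0],
    [10.0, 140.0, 60.0],   # blink-like positive deflection
    [-20.0, 15.0, -3.0],
    [-90.0, -30.0, 4.0],   # large negative excursion
]
kept, rejected = reject_by_amplitude(eog_epochs)
```

Here half of the epochs are discarded, which illustrates the data loss that motivates correction methods that retain the EEG.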
There are extant methods for detection of artifacts in high-density EEG data, many of which are applicable only to specific artifact types (e.g., eye movement artifacts). Some methods for artifact detection have a broader scope, however. For example, the Statistical Control of Artifacts in Dense Arrays Studies (SCADS) method, fully described in Junghöfer et al. (2000), used thresholding methods to detect artifacts. In this approach, several editing matrices, containing parameters such as standard deviation (SD), maximum gradient, and maximum amplitude value, are constructed for each channel within each epoch. Thresholds are calculated for each parameter across whole epochs, whole channels, and single channels in single epochs using a non-parametric formula to measure the spread of the distribution. Whole channels, whole epochs, or single channels within single epochs whose parameters exceeded the thresholds are removed (epochs) or interpolated (channels).
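To make the idea of per-channel, per-epoch parameters concrete, the sketch below computes the three parameters named above for one channel across several epochs and flags outlying epochs. The threshold rule used here (median plus a multiple of the median absolute deviation) is a stand-in assumption; it is not the non-parametric formula of Junghöfer et al. (2000), and the function names and data are hypothetical.

```python
from statistics import median, pstdev

def epoch_channel_params(signal):
    """Summary parameters for one channel within one epoch."""
    return {
        "sd": pstdev(signal),
        # Largest absolute difference between consecutive samples.
        "max_gradient": max(abs(b - a) for a, b in zip(signal, signal[1:])),
        "max_amplitude": max(abs(v) for v in signal),
    }

def flag_outliers(values, k=3.0):
    """Indices whose value exceeds median + k * MAD (assumed rule)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    limit = med + k * mad
    return [i for i, v in enumerate(values) if v > limit]

# Hypothetical single-channel data: five epochs of four samples each.
channel_epochs = [
    [1.0, -2.0, 3.0, -1.0],
    [2.0, -4.0, 1.0, 2.0],
    [-2.0, 5.0, 1.0, -2.0],
    [1.0, 60.0, -40.0, 2.0],   # contaminated epoch
    [3.0, -2.0, 4.0, -3.0],
]
params = [epoch_channel_params(e) for e in channel_epochs]
# One "editing matrix" row per parameter: a value for every epoch.
max_amplitudes = [p["max_amplitude"] for p in params]
flagged_epochs = flag_outliers(max_amplitudes)
```

The same flagging step could be repeated for the SD and maximum-gradient values, and across channels rather than epochs, to mirror the whole-channel and whole-epoch checks described above.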
Another approach to artifact detection involves the use of independent component analysis (ICA), which is widely available through the EEGLAB software suite (Delorme and Makeig, 2004). ICA is a computational method that separates time series data into statistically independent component (IC) waveforms. ICA outputs a matrix that transforms EEG data to IC data, and its inverse matrix to transform IC data back to EEG data. These matrices give information about an IC's spatial properties, and the data gives information about the IC's temporal activity. Data recorded from scalp electrodes can be considered summations of EEG data and artifact, which are independent of each other: ICA is therefore potentially a useful methodology to separate artifact from EEG signal (Jung et al., 2000, Vorobyov and Cichocki, 2002). There is, however, a need to classify the resulting components as either artifactual or neural (Bian et al., 2006). If detected, artifactual ICs can then be subtracted from the recorded data and the remaining data can be remixed. Several methods for detecting and rejecting artifacts based on ICA have been described previously in the literature. Some of these approaches can be limited by the requirement for the detection process to be trained from predefined artifacts (i.e. supervised), which are not always available, or may not be generalizable (Delorme et al., 2007, Jung et al., 1998, Schlögl et al., 2007, Yandong et al., 2006). For example, eye movement artifacts vary in shape, amplitude and length between subjects. The training procedure is generally carried out manually from visually identified artifacts, and consequently full unsupervised automation is not possible with these methods.
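The subtract-and-remix step can be illustrated with a two-channel toy example. This sketch does not perform ICA: the mixing matrix is assumed known, whereas a real analysis would estimate the unmixing matrix from the data (for example with EEGLAB). All names and values are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def invert_2x2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical sources: one neural component, one artifact component.
neural = [1.0, -1.0, 2.0]
artifact = [0.0, 50.0, 0.0]

# Assumed mixing matrix (IC data -> EEG data); its inverse maps EEG -> IC.
mixing = [[1.0, 2.0], [1.0, 4.0]]
unmixing = invert_2x2(mixing)

eeg = matmul(mixing, [neural, artifact])   # recorded two-channel data
ics = matmul(unmixing, eeg)                # component time courses

# Zero the component classified as artifactual, then remix.
artifact_ics = {1}
ics_clean = [
    [0.0] * len(row) if i in artifact_ics else row
    for i, row in enumerate(ics)
]
eeg_clean = matmul(mixing, ics_clean)
```

Each column of `mixing` describes how strongly one component projects to each electrode (its spatial property), and each row of `ics` is that component's temporal activity. The difficult part in practice, deciding which components belong in `artifact_ics`, is the classification problem discussed above.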
In short, methods have been developed for removing various types of artifacts from EEG. However, as many are application dependent, or focused on a single type of artifact, it can be difficult to choose which approach(es) to take. Furthermore, all approaches involve at least some degree of supervision for classification of artifacts. Given the trend towards ever-denser EEG arrays, such artifact rejection methods are time consuming. We describe here a method called Fully Automated Statistical Thresholding for EEG artifact Rejection (FASTER) in which raw data are imported, bad channels removed, epochs extracted, artifacts detected and removed using ICA, subjects’ data aggregated, and data sets from subjects with unacceptably artifact-contaminated data are detected and removed. This fully automated, unsupervised approach – with raw EEG data as the input and epoched, artifact-attenuated data as the output – would therefore be of use to the many researchers who collect EEG data.
With any new method of processing EEG data, and in particular for a fully automated, unsupervised method, it is essential to quantify the improvement in signal-to-noise ratio and the rate of artifact detection against other established methods. In addition, methods suitable for dense EEG arrays may not be applicable to lower density arrays, and this applicability also needs to be tested. For example, the ability to detect outliers is improved with increasing sample size and the number of independent components that can be estimated robustly is a function of the number of data points. Therefore, we tested FASTER in a number of scenarios. We compared the FASTER and SCADS methods on simulated EEG data with simulated artifacts, using 128-, 64-, and 32-scalp-electrode arrays. The advantage of testing an artifact rejection method on simulated data is that the sensitivity and specificity of the detection algorithms can be quantified. While simulated data are useful, however, they often contain artifacts with known properties (e.g., EOG amplitude of at least 80 μV). In contrast, real EEG data contain a variety of artifacts whose properties are unknown. Therefore, we also analyzed real 128-channel EEG data from 47 subjects, comparing FASTER with artifacts detected visually by trained individuals and with SCADS. FASTER was also compared with SCADS on 64- and 32-scalp-electrode array subsets of the real data to test the effect of using fewer data points.
Simulated data
Forty-seven sets of simulated data were created. These consisted of 200 epochs of data simulated from P3 dipoles using the BESA Dipole Simulator program (which is found at http://www.besa.de/updates/tools/), which then had artifacts added at random. The procedure for creating artifacts was derived from Delorme et al. (2007). In order to create contaminated channels, white noise was added to a random number of channels (range 0–5). The white noise was of RMS amplitude randomly selected to be
Method overview
We describe here the general approach to artifact detection and removal, in order to give the reader an overview of the procedure. The details of each particular method are described in subsequent sections. The real datasets were converted to EEGLAB format and then referenced to Fz – this was chosen as it was common to the 128-, 64- and 32-channel datasets. The EEG data were filtered offline using equiripple filters between 1 Hz and 95 Hz with a notch filter at 50 Hz (bandwidth 6 Hz) to remove
Artifact removal quantification
The simulated data results were calculated from 47 files. We attempted to quantify the performance of each method by using metrics of sensitivity and specificity. The first, sensitivity, was the percentage of true artifacts that were detected. For example, if there were 100 artifact-contaminated epochs in a dataset, and 50 of these contaminated epochs were detected then the sensitivity would be 50%. The second, specificity, is the percentage of artifact-free EEG data that was wrongly classified
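The sensitivity example above can be reproduced with a short sketch. The specificity function below uses the conventional definition (the percentage of artifact-free items that are correctly left unflagged), which is an assumption here because the definition in the text is cut off; the function names and the 300-epoch dataset are hypothetical.

```python
def sensitivity(true_artifact, detected):
    """Percentage of truly contaminated items that were detected."""
    hits = sum(1 for t, d in zip(true_artifact, detected) if t and d)
    return 100.0 * hits / sum(1 for t in true_artifact if t)

def conventional_specificity(true_artifact, detected):
    """Percentage of artifact-free items that were not flagged."""
    correct = sum(1 for t, d in zip(true_artifact, detected) if not t and not d)
    return 100.0 * correct / sum(1 for t in true_artifact if not t)

# Hypothetical dataset: 100 contaminated epochs followed by 200 clean ones.
true_artifact = [True] * 100 + [False] * 200
# Detector finds 50 of the contaminated epochs and wrongly flags 10 clean ones.
detected = [True] * 50 + [False] * 50 + [True] * 10 + [False] * 190

sens = sensitivity(true_artifact, detected)
spec = conventional_specificity(true_artifact, detected)
```

With 50 of 100 contaminated epochs detected, `sens` is 50%, matching the worked example in the text; `spec` is 95% under the conventional definition assumed here.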
Simulated data
Due to the non-normal distribution of the detection rates, Wilcoxon Signed Ranks tests were conducted. Table 1 displays the results of the simulated data analysis.
Discussion
The aim of this study was to quantify the utility of FASTER – a fully automated statistical thresholding method for EEG artifact rejection, which also incorporates ICA. In order to quantify the performance of FASTER, simulated data were analyzed using FASTER and a variant of SCADS, which is a similar statistical method of artifact detection. Furthermore, real data were analyzed using FASTER, SCADS, and by supervised detection. FASTER was also tested across different numbers of scalp electrodes
References
- et al. Removal of ocular artifact from the EEG: a review. Clin Neurophys (2000)
- et al. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods (2004)
- et al. Enhanced detection of artifacts in EEG data using higher-order statistics and independent component analysis. Neuroimage (2007)
- et al. Estimation of Hurst exponent revisited. Comput Stat Data Anal (2007)
- et al. Computerized processing of EEG-EOG-EMG artifacts for multi-centric studies in EEG oscillations and event-related potentials. Int J Psychophysiol (2003)
- et al. Information-based modeling of event-related brain dynamics. Prog Brain Res (2006)
- et al. The five percent electrode system for high-resolution EEG and ERP measurements. Clin Neurophys (2001)
- et al. A fully automated correction method of EOG artifacts in EEG recordings. Clin Neurophys (2007)
- et al. Semi-parametric estimation of the long-range dependence parameter: a survey
☆ This study was partly funded by an Enterprise Ireland grant to R.B. Reilly (eBiomed: eHealthCare based on Biomedical Signal Processing and ICT for Integrated Diagnosis and Treatment of Disease), an Irish Research Council for Science Engineering and Technology Postgraduate scholarship to H. Nolan, and by Science Foundation Ireland (09/RFP/NE2382).
1 These two authors contributed equally to this manuscript.