Abstract

Epileptic patients suffer seizures caused by temporary, unpredictable electrical disturbances in the brain. Conventionally, electroencephalogram (EEG) signals, which record the brain's electrical activity, are studied manually by medical practitioners. This process consumes a lot of time, and its outputs are unreliable. To address this problem, a new framework for detecting epileptic seizures is proposed in this study. EEG signals obtained from the University of Bonn, Germany, and real-time clinical records from the Senthil Multispecialty Hospital, India, were used. These signals were decomposed into six frequency subbands using the discrete wavelet transform (DWT), and twelve statistical features were extracted. The seven best features were identified and fed into k-Nearest Neighbor (kNN), naïve Bayes, Support Vector Machine (SVM), and Decision Tree classifiers for two-class and three-class classification. Six statistical parameters were employed to measure the performance of these classifications. Different combinations of features and classifiers were found to produce different results. Overall, the study is a first attempt to find the best feature-classifier combination for 16 different 2-class and 3-class classification problems on the Bonn and Senthil real-time clinical datasets.

1. Introduction

Epilepsy is a brain disorder marked by repeated seizures due to uncontrolled electrical activity in the brain. It results in uninhibited jerking movements and momentary loss of consciousness. It is potentially life-threatening, as it can cause brain and lung malfunction, heart failure, and accidental death. Therefore, it is imperative to diagnose epilepsy [1]. The electroencephalogram (EEG) is the signal that records electrical activity in the brain. During this procedure, electrodes are placed on various parts of the scalp, producing multichannel data. Since it is a noninvasive and inexpensive method, it serves as a vital data resource in neurological diagnoses such as seizure detection [1, 2]. Typically, medical personnel interpret recordings by visually inspecting long-term EEG. This method is time-consuming, cumbersome, prone to error, and dependent on a certain level of human expertise. Thus, an automated epileptic seizure detection framework is needed.

The tedious nature of reading EEG recordings to determine epileptic conditions has prompted research into simpler, quicker, and more efficient detection methods. In [3], a pattern recognition study for detecting epileptic seizures used time-domain (TD) features, including waveform length (WL), number of slope sign changes (SSC), and number of zero-crossings (ZC), derived both from filtered EEG data directly and from its discrete wavelet transform (DWT). The performance of these time-domain features was studied with support vector machine (SVM) and naïve Bayes (NB) classifiers. With both direct and DWT-based TD features, the results revealed that the suggested technique reached a best accuracy of 100 percent on the normal eyes-open and epileptic datasets.

The study in [4] highlighted the value of combining electroencephalogram (EEG) recordings with deep learning to diagnose epileptic seizures. It focused on designing and evaluating seizure detection using deep convolutional neural network-based classifiers, compared across three methods. The FT-VGG16 classifier achieved the highest average classification accuracy of 99.21%, surpassing the vast majority of earlier investigations on the same dataset, including other signal-to-image conversion approaches. Furthermore, the SHapley Additive exPlanations (SHAP) analysis approach was used to identify the EEG frequency features that most improved classification accuracy.

In [5], the Bonn dataset was used to evaluate a newly suggested technique for automatically classifying epileptic electroencephalograms, based on approximate entropy and recurrence quantification analysis combined with a convolutional neural network. The results revealed that approximate entropy and recurrence quantification effectively detect epileptic seizures, attaining 92.17%, 91.75%, and 92.00% in sensitivity, specificity, and accuracy, respectively. When the convolutional neural network was combined with the approximate entropy and recurrence quantification analysis features, the results reached 98.84%, 99.35%, and 99.26%. Several other works in this domain also indicate that automatic detection of epileptic conditions is feasible, relieving doctors of the tedious task of inspecting EEGs. Such tools would be helpful in clinical epilepsy diagnosis and therapy.

To provide an accurate solution to the problem of epileptic seizure detection, several algorithms have been proposed, offering various levels of accuracy. In [6–8], several time-frequency domain algorithms were introduced for the accurate characterization of epileptic seizures from collected EEG signals, among them the short-time Fourier transform and the multiwavelet transform; both provided satisfactory results upon validation. The discrete wavelet transform was employed in [9] for epileptic seizure detection, primarily to extract features from the EEG signals. Principal component analysis, independent component analysis, and linear discriminant analysis were then applied to reduce the dimensionality of the signals for straightforward representation. A support vector regression-based machine learning model was then employed to classify the mixed EEG signals in the multidimensional plane.

As introduced, the Support Vector Machine is efficient for signal classification, but it poses the challenge of selecting optimal parameter values. Setting the parameters properly is crucial to achieving high accuracy in detecting epileptic seizures, and particle swarm optimization and genetic algorithms have proved highly effective for this hyperparameter tuning and selection. Epileptic seizure detection must be both accurate and efficient, which is why machine learning algorithms have recently been introduced: they can accurately process the large volumes of data involved in EEG signals.

Moreover, the robust architectures that ML algorithms provide make them scalable and useful for characterizing EEG signals. Epileptic seizure detection must be performed with the lowest possible false-negative and false-positive rates, and ML algorithms have been introduced to ensure this efficiency and accuracy in feature characterization. The discrete wavelet transform (DWT) introduced in [9, 10] was able to handle the problem of spikes in epileptic seizures by localizing these transient occurrences. The algorithm prevents the generalization of spike occurrences and thus minimizes errors at those particular instants during the signal characterization process.

Hybrid methods for detecting epileptic seizures were introduced in [11]. A genetic algorithm was embedded into fuzzy logic to characterize both epileptic and nonepileptic signals; fusing the data into the genetic algorithm provided risk assessments for both signal classes and enabled accurate predictions. Another hybrid method was introduced in [12], in which computational intelligence was integrated with a genetic algorithm to ensure optimal characterization of EEG signals. The dataset was divided into training and validation sets; features extracted from the training set were used to train the genetic algorithm, and the validation set was then used to validate the trained model. The genetic algorithm-based model detected epileptic seizures accurately. Hybrid models work efficiently and can compensate for the deficiencies of each base model, producing a single model with high accuracy. For this particular model, accuracy largely depends on proper tuning of the genetic algorithm's parameters.

In [13], the stationary wavelet transform was introduced and employed to detect epileptic seizures. This algorithm properly captures points at the edges of the signal that remain stationary throughout, which reduces the probability of error: points on the wavelet that would normally be left unaccounted for are adequately represented. It was applied to both epileptic and nonepileptic signals under varying conditions to determine its level of optimization. The stationary wavelet transform is also efficient at handling data points along rough edges that could otherwise degrade the seizure detection algorithm. Although the stationary points of nonepileptic signals present some complexity, they too can be handled appropriately, for both linear and nonlinear signals.

An algorithm for detecting epileptic seizures was given in [14]. The study used data from both epileptic and nonepileptic patients to develop the framework. The authors classified the sample features using linear and nonlinear classifiers: the linear classifier handled measurements taken from nonepileptic patients, while the nonlinear classifiers were used for epileptic patients. The discrete wavelet transform was then used to analyze seizure detection. The data were divided into training and test sets to ensure proper characterization in all cases: the detection algorithm was developed with the training set, and testing was carried out with the test set to prevent overfitting and reduce the number of outliers. Several other algorithms have been developed for epileptic seizure detection, but most present limitations of one form or another. The concern is not merely the development of an algorithm; the focus should be on optimization and accurate characterization so that seizures can be detected with the lowest possible error.

This work combines the wavelet domain and a machine learning approach to identify epileptic seizures. Time-frequency analysis is carried out because EEG signals are nonlinear, nonstationary, and complex. Among the many methods for time-frequency analysis, this study adopts the discrete wavelet transform. Feature extraction reveals the hidden characteristics of the signal and aids its inspection. The derived features are fed into classifiers that differentiate between healthy and seizure signals: k-Nearest Neighbor (kNN), Support Vector Machine (SVM), naïve Bayes, and Decision Tree [15, 16]. The performance of these classifiers is measured using statistical parameters.

The remainder of this work is structured as follows. Section 2 presents the methodology. The results and discussions are given in Section 3. The comparison with other existing state-of-the-art developments is given in Section 4, and the conclusion is given in Section 5.

2. Methods

The EEG data is decomposed into six frequency subbands using the discrete wavelet transform. Essential features, such as mean absolute value, average power, and the minimum and maximum coefficients, are extracted and fed into the naïve Bayes, SVM, Decision Tree, and kNN classifiers. Performance is computed for each classifier and feature. The proposed framework is shown in Figure 1.

2.1. Bonn University Dataset

This data is available on the website of the Department of Epileptology, Bonn University, Germany [15]. It is single-channel data provided solely for research purposes. The record contains five datasets, named set A to set E, each with 100 samples: 100 single-channel EEG segments of 23.6 s duration recorded on the surface of the head. Sets A and B were recorded from five healthy volunteers under careful observation, with eyes open and eyes closed, respectively. Sets C and D were recorded from epileptic patients during seizure-free intervals: set C from the hippocampal formation of the hemisphere opposite the epileptogenic zone, and set D from within the epileptogenic zone. Set E was recorded while the patients were experiencing an epileptic attack. A 12-bit A/D converter with a 173.61 Hz sampling frequency was used to digitize the data, so each EEG segment contains 4096 sampling points. The EEG signals are plotted in Figure 2.

2.2. Real-Time Clinical Dataset

Real-time multichannel clinical data were acquired from six healthy subjects and six epileptic patients at Senthil Multispecialty Hospital in Erode, Tamil Nadu, India. The data comprise 21-channel EEG recordings. The epileptic signals were recorded during the preseizure period, and the sampling frequency was maintained at 256 Hz.

2.3. Preprocessing of Signal Using DWT

The Fast Fourier Transform (FFT) is applied for frequency-domain analysis in many applications. But for biomedical signals such as EEG, the usage of FFT is restricted, since these signals contain irregular patterns and are nonstationary. Time-domain analysis alone will not yield information about the frequency content of the pattern. Therefore, time-frequency analysis is applied for preprocessing the signal. The discrete wavelet transform is a widely used time-frequency analyzer for biomedical signals because it offers variable window sizes. The DWT algorithm uses low-pass (LP) and high-pass (HP) quadrature mirror filters.

The input signal is routed through the low- and high-pass filters: the low-pass output yields the approximation (A1) coefficients, and the high-pass output yields the detail (D1) coefficients. The approximation output is fed to another quadrature mirror filter pair, and the process is repeated to determine the coefficients of the subsequent level. Each decomposition level doubles the frequency resolution through filtering, while the time resolution is halved through downsampling.

This work uses the Daubechies order-4 wavelet function due to its orthogonality and filtering efficiency. The necessary statistical features are acquired from the frequency subbands.
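The filter-bank cascade described above can be sketched with plain NumPy. This is a minimal illustration, not the paper's code: the Haar filter pair is used to keep the sketch dependency-free, whereas the paper uses the Daubechies order-4 wavelet (available, for example, as `pywt.Wavelet("db4")` in PyWavelets), and the function names are ours.

```python
import numpy as np

def dwt_step(x, lp, hp):
    """One decomposition level: filter with the low-/high-pass QMF pair,
    then downsample by 2, halving the band and the sample count."""
    a = np.convolve(x, lp)[1::2]  # approximation coefficients
    d = np.convolve(x, hp)[1::2]  # detail coefficients
    return a, d

def wavedec(x, lp, hp, levels):
    """Multilevel DWT: the approximation branch is re-decomposed at each
    level, yielding details D1..Dn and a final approximation An."""
    details = []
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        a, d = dwt_step(a, lp, hp)
        details.append(d)
    return a, details

# Haar QMF pair, standing in for the paper's db4 filters.
lp = np.array([1.0, 1.0]) / np.sqrt(2)
hp = np.array([1.0, -1.0]) / np.sqrt(2)

a5, (d1, d2, d3, d4, d5) = wavedec(np.random.randn(4096), lp, hp, levels=5)
# A 4096-sample segment yields 2048 D1 coefficients, 256 D5 coefficients,
# and 128 A5 coefficients.
```

With a 173.61 Hz sampling rate, five such levels produce exactly the D1 to D5 and A5 subbands listed in Section 3.1.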

2.4. Statistical Features from Discrete Wavelet Transform (DWT) Coefficients

The following features were extracted.

Mean Absolute Value (MAV). MAV relates to the information frequency of the signal and, in its standard form, is determined as

MAV = (1/N) Σ_{n=1}^{N} |x(n)|,

where x(n) is the coefficient sequence and N its length.

Maximum Coefficient. The maximum coefficient is the largest DWT coefficient value in the subband for a given sample.

Minimum Coefficient. The minimum coefficient is the smallest DWT coefficient value in the subband for a given sample.

Standard Deviation (STD). Standard deviation relates to the proportionate changes in the frequency signal and is given by

STD = √( (1/(N − 1)) Σ_{n=1}^{N} (x(n) − μ)² ),

where μ is the mean of the coefficients.

Average Power. Average power represents information about the frequency content of the signal and is determined as

P_avg = (1/N) Σ_{n=1}^{N} |x(n)|².

Shannon Entropy. Shannon entropy offers an easy way to determine the average number of bits necessary to encode a string of symbols. It is given by

H = − Σ_i p_i log2 p_i,

where p_i is the probability of the i-th symbol.

Approximate Entropy (ApEn). The extent of regularity and unpredictability of fluctuations over time-series data can be quantified by approximate entropy, ApEn(m, r, N) = Φ^m(r) − Φ^{m+1}(r). The parameters m and r are the run length and the tolerance window, respectively, and N is the number of points in the time series.
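The seven statistics above can be computed per subband as sketched below. This is an illustration under stated assumptions: Shannon entropy is taken over the normalized coefficient energies (one common convention; the paper does not specify its probability estimate), the ApEn defaults m = 2 and r = 0.2 × STD are a common choice rather than the paper's values, and the function names are ours.

```python
import numpy as np

def apen(x, m=2, r=None):
    """Approximate entropy ApEn(m, r, N); r defaults to 0.2 * std."""
    x = np.asarray(x, dtype=float)
    r = 0.2 * x.std() if r is None else r
    def phi(mm):
        n = len(x) - mm + 1
        pat = np.array([x[i:i + mm] for i in range(n)])
        # fraction of patterns within tolerance r (Chebyshev distance)
        c = (np.abs(pat[:, None] - pat[None, :]).max(axis=2) <= r).mean(axis=1)
        return np.log(c).mean()
    return phi(m) - phi(m + 1)

def subband_features(c):
    """The seven statistics of Section 2.4 for one vector of DWT coefficients."""
    p = c ** 2 / np.sum(c ** 2)  # normalized energies (one Shannon convention)
    return {
        "MAV": np.mean(np.abs(c)),
        "max": np.max(c),
        "min": np.min(c),
        "STD": np.std(c),
        "avg_power": np.mean(c ** 2),
        "shannon": -np.sum(p * np.log2(p + 1e-12)),
        "apen": apen(c),
    }
```

Applying `subband_features` to each of D3, D4, D5, and A5 yields the 4 × 7 = 28 features per segment used in Section 3.1.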

2.5. K-Fold Cross-Validation

10-fold cross-validation is used in this study to obtain reliable results. The original sample is divided into 10 subsamples: nine are used for training and one for testing. The process is repeated 10 times so that each subsample serves as the test set exactly once and is used for training nine times. The 10 results are then averaged to give a single accuracy estimate. This validation is applied to the SVM, kNN, naïve Bayes, and Decision Tree classifiers.
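The scheme above can be sketched as a small NumPy routine. The nearest-class-mean classifier used to demonstrate it is a hypothetical stand-in, not one of the paper's four classifiers, and all names here are ours.

```python
import numpy as np

def kfold_accuracy(X, y, fit_predict, k=10, seed=0):
    """Shuffle, split into k folds, train on k-1 folds, test on the held-out
    fold, and average the k accuracies."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        y_pred = fit_predict(X[train], y[train], X[test])
        accs.append(np.mean(y_pred == y[test]))
    return float(np.mean(accs))

def nearest_mean(X_train, y_train, X_test):
    """Toy plug-in classifier: assign each test point to the class whose
    training mean is closest."""
    means = {c: X_train[y_train == c].mean(axis=0) for c in np.unique(y_train)}
    classes = np.array(sorted(means))
    M = np.stack([means[c] for c in classes])
    d = np.linalg.norm(X_test[:, None] - M[None], axis=2)
    return classes[np.argmin(d, axis=1)]
```

Any of the four classifiers of Sections 2.6 to 2.9 can be dropped in via the `fit_predict` callable.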

2.6. Support Vector Machine (SVM) Classifier

SVM is a binary classifier based on a machine learning algorithm. The training dataset is divided into two groups such that the separation between them is as wide as possible. The algorithm [46] generates a hyperplane that separates the two groups of the training dataset. A hyperplane that lies close to the data points is sensitive to noise, so the optimal hyperplane is chosen as the one farthest from the data points. This optimal hyperplane is then used to classify the testing dataset.

The equation of the hyperplane is

w · x + b = 0,

where w is the weight vector and b is the bias. An infinite number of hyperplanes can be obtained by varying these two parameters. The condition for the optimal hyperplane is

|w · x + b| = 1,

where x represents the support vectors, the training samples closest to the hyperplane.
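The decision rule implied by the hyperplane equation can be sketched as follows. The weight vector and bias here are hypothetical values, not parameters trained on the paper's data.

```python
import numpy as np

# Hypothetical trained parameters of a linear SVM (illustrative only).
w = np.array([2.0, -1.0])  # weight vector
b = 0.5                    # bias

def classify(x):
    """The side of the hyperplane w . x + b = 0 determines the class."""
    return 1 if w @ x + b >= 0 else -1

def margin(x):
    """Signed distance of x from the hyperplane; support vectors satisfy
    |w . x + b| = 1, i.e. a distance of 1 / ||w||."""
    return (w @ x + b) / np.linalg.norm(w)
```

Maximizing the separation in Section 2.6 amounts to maximizing 2 / ||w|| subject to every training sample satisfying |w · x + b| ≥ 1.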

2.7. k-Nearest Neighbor (kNN) Classifier

kNN is a nonparametric, nonlinear classifier suited to relatively large training sets. The similarity between the training and testing [47] samples is the measure considered: the class holding the majority among the nearest k training samples is assigned to the test (unknown) sample. "Nearness" is measured using the Euclidean distance

d(x, y) = √( Σ_{i=1}^{n} (x_i − y_i)² ),

where x = (x_1, …, x_n) and y = (y_1, …, y_n).

The value of k should be a positive integer. In this study, the value of k is 3.
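The majority vote over the k = 3 nearest neighbors can be sketched in a few lines; this is a minimal illustration with names of our choosing, not the paper's implementation.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Majority class among the k training points nearest to x
    (Euclidean distance), matching the k = 3 used in this study."""
    d = np.linalg.norm(X_train - x, axis=1)   # distances to all training points
    nearest = y_train[np.argsort(d)[:k]]      # labels of the k nearest
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]
```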

2.8. Naïve Bayes (NB) Classifier

This probabilistic classifier is based on Bayes' theorem and on the assumption that each feature of a class is independent of every other feature. The NB classifier needs comparatively little training data.

Assuming D is a training set with n classes, and Y is an attribute vector with associated class labels, attribute Y is assigned to the class C_i with the highest posterior probability, that is,

P(C_i | Y) > P(C_j | Y) for all j ≠ i,

where, by Bayes' theorem,

P(C_i | Y) = P(Y | C_i) P(C_i) / P(Y).

Here P(C_i) represents the class prior probabilities, P(Y) is the prior probability of Y, P(C_i | Y) is the posterior probability of class C_i given Y, and P(Y | C_i) is the probability of Y conditioned on class C_i.
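The MAP rule above can be sketched with Gaussian class-conditional densities. The Gaussian assumption and the variance floor are our choices for illustration (the paper does not state which NB variant it uses), and the function names are ours.

```python
import numpy as np

def gnb_fit(X, y):
    """Per-class mean, variance, and prior P(C), assuming independent
    Gaussian features within each class."""
    return {c: (X[y == c].mean(axis=0),
                X[y == c].var(axis=0) + 1e-9,  # small floor avoids divide-by-zero
                float(np.mean(y == c)))
            for c in np.unique(y)}

def gnb_predict(model, x):
    """MAP class: argmax over log P(C) + sum_j log N(x_j; mu_j, var_j),
    since P(Y) is the same for every class and can be dropped."""
    def log_posterior(mu, var, prior):
        return np.log(prior) - 0.5 * np.sum(
            np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
    return max(model, key=lambda c: log_posterior(*model[c]))
```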

2.9. Decision Tree (DT) Classifier

DT is a predictive modelling approach widely used in data mining, statistics, and machine learning. Classification trees, used here, have target variables that take discrete class labels; regression trees handle continuous targets. The Decision Tree leaves represent the class labels, and the branches represent the combinations of features that lead to those class labels.
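A single branch of such a tree is a threshold test on one feature. The decision stump below, a one-node tree of our own construction, illustrates how each split is chosen; a full tree repeats this greedily on the resulting subsets.

```python
import numpy as np

def best_stump(X, y):
    """Exhaustively pick the single split (feature, threshold, polarity)
    that minimizes training misclassification: one node of a decision tree."""
    best = (np.inf, None, None, None)
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            for pol in (0, 1):  # which class sits on the '>' side
                pred = np.where(X[:, f] > t, pol, 1 - pol)
                err = np.mean(pred != y)
                if err < best[0]:
                    best = (err, f, t, pol)
    return best  # (error, feature index, threshold, class on the '>' side)
```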

2.10. Statistical Parameter

The performances of the four classifiers are evaluated using six parameters, namely, accuracy, specificity, sensitivity, positive predictive value (PPV), negative predictive value (NPV), and Matthews correlation coefficient (MCC) [4, 5]. These parameters are mathematically defined as follows:

Accuracy: Accuracy = CCP / TPT = (TP + TN) / (TP + TN + FP + FN)

Sensitivity: Sensitivity = TP / (TP + FN)

Specificity: Specificity = TN / (TN + FP)

Positive predictive value (PPV): PPV = TP / (TP + FP)

Negative predictive value (NPV): NPV = TN / (TN + FN)

Matthews correlation coefficient (MCC): MCC = (TP · TN − FP · FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))

Here CCP denotes correctly classified patterns and TPT denotes total patterns; TP, FN, FP, and TN denote true positives, false negatives, false positives, and true negatives, respectively.
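The six measures follow directly from the confusion-matrix counts; a minimal sketch (function and key names are ours):

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """The six performance measures of Section 2.10, from the
    confusion-matrix counts TP, TN, FP, and FN."""
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv":         tp / (tp + fp),
        "npv":         tn / (tn + fn),
        "mcc": (tp * tn - fp * fn) /
               math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }
```

For example, 50 true positives, 40 true negatives, 10 false positives, and no false negatives give an accuracy of 0.9 and a sensitivity of 1.0.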

3. Results and Discussions

3.1. Results from the University of Bonn Dataset

The datasets from sets A, B, C, D, and E are decomposed into different subbands: D1 (43.4–86.8 Hz), D2 (21.7–43.4 Hz), D3 (10.85–21.7 Hz), D4 (5.42–10.85 Hz), D5 (2.71–5.42 Hz), and A5 (0–2.71 Hz). Since the most useful information lies in subbands D3–D5 and A5, only these are considered [10]. Features, namely, Mean Absolute Value, maximum coefficient, minimum coefficient, Standard Deviation, average power, Shannon entropy, and approximate entropy, are derived from subbands D3, D4, D5, and A5 for the five datasets A, B, C, D, and E.

Sixteen cases are considered, comprising 2-class and 3-class classifications on the data available from the University of Bonn, Germany. Seven features were obtained per subband, so the number of features generated for each EEG signal is 4 × 7 = 28. For every 100 signals of datasets A to E, 28 features were generated. In each case, 10-fold cross-validation is applied, dividing the whole data into 10 equal parts, where 9 parts are used for training and 1 for testing. The SVM, kNN, naïve Bayes, and Decision Tree classifiers are then fed these training and testing sets, and performance measures such as accuracy, specificity, sensitivity, positive predictive value, negative predictive value, and Matthews correlation coefficient are obtained.

From Table 1, we can observe the results for 16 different classifications for the SVM classifier. For A-E classification, approximate entropy from the D5 frequency subband provides the highest accuracy of 100%. Similarly, for B-E classification, the minimum coefficients extracted from the D3, D4, and A5 frequency subbands give only 90%, 91%, and 92% accuracy, respectively, but approximate entropy extracted from the D5 frequency subband gives the highest accuracy of 99.5%. For C-E classification, in the D4 frequency subband, the maximum coefficient gives the highest accuracy of 98%. For D-E classification, in the D3 frequency subband, MAV gives the best accuracy of 96%. For AB-E classification, in the D5 frequency subband, approximate entropy provides the highest accuracy of 99.67%. For AC-E classification, in the D5 frequency subband, MAV provides the highest accuracy of 98.66%. For AD-E classification, in the D3 frequency subband, MAV provides the highest accuracy of 98%. For BC-E classification, in the D3 frequency subband, the minimum coefficient gives an accuracy of 92.6%. For BD-E classification, in the D5 frequency subband, MAV provides the highest accuracy of 96.3%. For CD-E classification, in the D3 frequency subband, MAV gives the highest accuracy of 98%. For ABC-E classification, in the D5 frequency subband, MAV gives the highest accuracy of 99%. For ABD-E classification, in the D5 frequency subband, MAV gives the highest accuracy of 97%. For ACD-E classification, in the D3 frequency subband, MAV gives the highest accuracy of 98.25%. For BCD-E classification, in the D5 frequency subband, MAV gives the highest accuracy of 97%. For ABCD-E classification, in the D3 frequency subband, the minimum coefficient gives the best accuracy of 100%. For AB-CD classification, in the D3 frequency subband, approximate entropy gives an accuracy of 80%. The highest accuracy, 100%, is achieved for the A-E and ABCD-E classifications, and the lowest accuracy is obtained for the AB-CD classification. Moreover, we can infer that the feature that gives SVM the best results is the Mean Absolute Value.

From Table 2, we can observe the results for 17 different classifications for the kNN classifier. For A-E classification, approximate entropy from the D5 frequency subband gives the highest accuracy of 100%. For B-E classification, the maximum and minimum coefficients extracted from the D3, D4, and A5 frequency subbands give poor results, but MAV extracted from D5 gives the best result of 100%. For C-E classification, in the D3, D4, and D5 frequency subbands, STD gives the highest accuracy of 98%, and in the A5 frequency subband, ApEn gives an accuracy of 79.5%. For D-E classification, in the D3 frequency subband, MAV gives the best accuracy of 97%. For AB-E classification, in the D5 frequency subband, MAV gives the highest accuracy of 100%. For AC-E classification, in the D3 frequency subband, STD gives the highest accuracy of 98.7%. For AD-E classification, in the D3 frequency subband, MAV gives the highest accuracy of 98%. For BC-E classification, in the D5 frequency subband, MAV gives the best result. For BD-E classification, in the D5 frequency subband, Shannon entropy gives the highest accuracy of 95.33%. For CD-E classification, in the D3 frequency subband, MAV gives the highest accuracy of 97.66%. For ABC-E classification, in the D5 frequency subband, MAV gives the highest accuracy of 99%. For ABD-E classification, in the D5 frequency subband, MAV provides the highest accuracy of 96.5%. For ACD-E classification, in the D3 frequency subband, Shannon entropy provides the highest accuracy of 98%. For BCD-E classification, in the D5 frequency subband, MAV provides the highest accuracy of 97%. For ABCD-E classification, in the D3 frequency subband, the minimum coefficient gives the best accuracy of 100%. For AB-CD classification, in the D5 frequency subband, MAV gives the best accuracy of 75%. For AB-CD-E classification, in the D5 frequency subband, Shannon entropy gives the best accuracy of 75%. Overall, the highest accuracy, 100%, is achieved for the A-E, B-E, AB-E, and ABCD-E classifications, and the lowest accuracies are obtained for the AB-CD and AB-CD-E classifications. Moreover, we can infer that Mean Absolute Value and Shannon entropy are the features that offer the best results for kNN.

From Table 3, we can observe the results for 16 different classifications for the naïve Bayes classifier. For A-E classification, MAV from the D5 frequency subband gives the highest accuracy of 100%. For B-E classification, the minimum coefficient extracted from D3, the average power extracted from D4, and ApEn extracted from the A5 frequency subband give poor results, but MAV extracted from D5 gives the best result of 99.5%. For C-E classification, STD has been extracted in the D3, D4, D5, and A5 frequency subbands, and the highest accuracy of 100% is attained only in the D5 frequency subband. For D-E classification, in the D3 frequency subband, MAV gives the best accuracy of 97.5%. For AB-E classification, in the D5 frequency subband, STD provides the highest accuracy of 99.7%. For AC-E classification, in the D3 frequency subband, STD provides the highest accuracy of 98.7%. For AD-E classification, in the D3 frequency subband, MAV provides the highest accuracy of 98%. For BC-E classification, in the D5 frequency subband, ApEn provides the highest accuracy of 98.7%. For BD-E classification, in the D5 frequency subband, ApEn gives the maximum accuracy of 96%. For CD-E classification, in the D3 and D4 frequency subbands, MAV gives the maximum accuracy of 97.7%. For ABC-E classification, in the D5 frequency subband, MAV gives the maximum accuracy of 99%. For ABD-E classification, in the D5 frequency subband, ApEn gives the maximum accuracy of 96.25%. For ACD-E classification, in the D5 frequency subband, ApEn gives the maximum accuracy of 97.6%. For ABCD-E classification, in the D3 frequency subband, the minimum coefficient gives the best accuracy of 97.4%. For AB-CD classification, in the D3 frequency subband, approximate entropy gives the maximum accuracy of 82.5%. The maximum accuracy, 100%, is achieved for the A-E and C-E classifications, and the lowest accuracy is obtained for the AB-CD classification. Moreover, from Table 3, we can infer that the features which show the best results for naïve Bayes are the Mean Absolute Value and approximate entropy.

From Table 4, we can observe the results for 16 different classifications for the Decision Tree classifier. For A-E classification, MAV from the D5 frequency subband gives the highest accuracy of 100%. For B-E classification, the maximum coefficient extracted from D3 and D4 and STD extracted from the A5 frequency subband give poor results, but MAV extracted from D5 gives the best result of 100%. For C-E classification, Shannon entropy extracted from D4 gives the highest accuracy of 98%. For D-E classification, in the D3 frequency subband, MAV gives the best accuracy of 96.5%. For AB-E classification, in the D5 frequency subband, MAV gives the maximum accuracy of 100%. For AC-E classification, STD in the D3 frequency subband and ApEn in the D5 frequency subband both provide the highest accuracy of 98.67%. For AD-E classification, in the D3 frequency subband, MAV provides the highest accuracy of 98%. For BC-E classification, in the D5 frequency subband, MAV provides the highest accuracy of 98.33%. For BD-E classification, in the D5 frequency subband, MAV provides the highest accuracy of 94.67%. For CD-E classification, in the D3 frequency subband, MAV provides the highest accuracy of 96.67%. For ABC-E classification, in the D3 frequency subband, the minimum coefficient gives the best accuracy of 94.5%. For ABD-E classification, in the D4 frequency subband, the maximum coefficient gives the highest accuracy of 95.75%. For ACD-E classification, in the D3 frequency subband, MAV gives the maximum accuracy of 98%. For BCD-E classification, in the D5 frequency subband, ApEn gives the maximum accuracy of 96.25%. For ABCD-E classification, in the D3 frequency subband, the minimum coefficient gives the best accuracy of 100%. For AB-CD classification, in the D3 frequency subband, MAV gives the maximum accuracy of 79.75%. Moreover, we can infer that the features that show the best results for the Decision Tree are the Mean Absolute Value and Shannon entropy. The highest accuracy, 100%, is achieved for the A-E, B-E, AB-E, and ABCD-E classifications, and the lowest accuracy is obtained for the AB-CD classification.

3.2. Results from Clinical Real-Time Dataset

In the real-time clinical dataset, healthy signals are distinguished from epileptic patient signals. We applied the DWT and generated features for the different subbands. This work considered all 21 channels of data recorded over 24 s. Features were generated from subbands D3–D5 and A5 and used for classification [15, 16, 18]. The best result was obtained only for the average power feature derived from the D5 subband using the SVM classifier; Table 5 shows the results. Here, 10-fold cross-validation was applied for classification. Most of the useful information [48] required to distinguish healthy from seizure patient signals appears to lie in subband D5.

4. Comparison with Existing State of the Art

Several techniques have been proposed by different researchers to detect epileptic seizures from EEG signals. Their works are compared with this work in Table 6; only methods that used the same dataset are shown. It can be seen that several strategies, such as DTCWT, empirical mode decomposition, CNN, and fuzzy neural networks, have been used to examine EEG and distinguish epileptic seizures from normal conditions.

The majority of the researchers classified set A with set E for the two-class classification and obtained classification accuracies from 94.8% to 100% [19–31, 39–42]. Many researchers also classified set B with set E and achieved accuracies from 82.88% to 99.25% [19, 25, 29, 31, 41]. On classifying set C with set E, researchers obtained accuracies from 88% to 99.6% [23, 25, 29, 31, 41]. For classification of set D with set E, researchers obtained accuracies from 79.94% to 95.85% [23, 25, 29, 31, 41]. We, too, achieved 100% in A-E classification in our work, and we obtained better results for the B-E, C-E, and D-E classifications: 100% for B-E, whereas the maximum accuracy to date had been only 99.25%; 100% for C-E, whereas the maximum to date had been only 99.6%; and 97.5% for D-E, whereas the maximum to date had been 95.85%.

Researchers have also combined two of the sets A to D and classified them against set E. For AB-E classification, the maximum accuracy reported to date is 99.2% [29]; in our work, we achieved 100%. Our accuracy for AC-E classification is 98.6%, whereas the maximum reported is 99.5% [23]. For AD-E classification, researchers obtained accuracies of 85.9% [30] and 97.08% [29], whereas we obtained a better accuracy of 98%. For BC-E classification, researchers reported 98.25% [31], whereas we obtained a better accuracy of 98.67%. For BD-E classification, we obtained 96.33%, slightly below the maximum reported accuracy of 96.5% [28]. Similarly, for CD-E classification, we obtained 98%, whereas the maximum reported is 100% [26]. Researchers have likewise combined three sets and classified them against set E. For ABC-E, the best reported accuracy is 98.68% [31], whereas we achieved a better accuracy of 99%. For ACD-E classification, reported accuracies range from 96.65% to 98.15% [20, 25, 31, 34]; we achieved a slightly better 98.25%. We also classified ABD-E, which has not been computed in any previous work, and achieved an accuracy of 97%. For BCD-E classification, we obtained 97%, whereas the highest reported accuracy is 97.72% [31].

The other two-class classifications are ABCD-E and AB-CD. For ABCD-E classification, our result is comparable with those of other researchers, who have reported accuracies ranging from 97.1% [28] to 100% [26]; we also reached 100%. Additionally, we combined set A with set B and set C with set D and classified these two combined datasets against each other. This type of classification has not been attempted previously, and we obtained a classification accuracy of 82.5%.

Furthermore, for the three-class classification AB-CD-E, researchers have reported accuracies ranging from 95.6% to 98.8% [22, 28, 31, 35, 36]. Our work achieved a lower accuracy of 95%.

Many studies have applied ensemble techniques to various areas of science and engineering. Ensemble models combine several base models to obtain an overall model with higher predictive ability; examples across science and engineering are provided in [5, 43–45].
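As an illustrative direction only (the present work evaluates the four base classifiers separately rather than as an ensemble), the kNN, NB, DT, and SVM classifiers used here could be combined by majority voting with scikit-learn's `VotingClassifier`. The feature matrix below is synthetic, with seven features echoing the seven selected in this study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in feature matrix: 200 samples, seven features, two classes.
X, y = make_classification(n_samples=200, n_features=7, random_state=0)

# Majority vote over the paper's four base classifiers.
ensemble = VotingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier()),
        ("nb", GaussianNB()),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("svm", SVC(random_state=0)),
    ],
    voting="hard",
)
score = cross_val_score(ensemble, X, y, cv=10).mean()
print(f"ensemble mean CV accuracy: {score:.3f}")
```

Hard voting needs no probability calibration from the base models; soft voting (averaging predicted probabilities) would require `SVC(probability=True)`.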

4.1. Discussion of Key Findings

For A-E classification, MAV, STD, and average power extracted from the D5 frequency subband and fed into the kNN, NB, and DT classifiers give the best result of 100% accuracy; the SVM classifier also gives 100% accuracy for the approximate entropy feature from the D5 subband. For B-E classification, MAV, STD, and average power from the D5 subband fed into the kNN and DT classifiers give the best result of 100% accuracy. For C-E classification, STD and average power from the D3 subband fed into the NB classifier give the best result of 100% accuracy. For D-E classification, MAV from the D3 subband fed into the NB classifier gives the best result of 97.5% accuracy. For AB-E classification, MAV and Shannon entropy from the D5 subband fed into the kNN and DT classifiers give the best result of 100% accuracy. For AC-E classification, STD from the D3 and D5 subbands fed into the kNN and DT classifiers gives the best result of 98.67% accuracy. For AD-E classification, MAV from the D3 subband fed into all classifiers gives the best result of 98% accuracy. For BC-E classification, MAV from the D5 subband fed into the kNN, NB, and DT classifiers gives the best result of 98.67% accuracy. For BD-E classification, MAV from the D5 subband fed into the SVM classifier gives the best result of 96.33% accuracy. For CD-E classification, MAV from the D3 subband fed into the SVM classifier gives the best result of 98% accuracy. For ABC-E classification, MAV from the D5 subband fed into the SVM, kNN, and NB classifiers gives the best result of 99% accuracy. For ABD-E classification, MAV from the D5 subband fed into the SVM classifier gives the best result of 97% accuracy.
For ACD-E classification, MAV from the D3 subband fed into the SVM and NB classifiers gives the best result of 98.25% accuracy. For BCD-E classification, MAV from the D5 subband fed into the SVM and NB classifiers gives the best result of 97% accuracy. For ABCD-E classification, the minimum coefficient from the D5 subband fed into the SVM, kNN, and DT classifiers gives the best result of 100% accuracy. For AB-CD classification, approximate entropy (ApEn) from the D3 subband fed into the kNN and NB classifiers gives the best result of 82.5% accuracy. For AB-CD-E classification, Shannon entropy from the D5 subband fed into the kNN classifier gives the best result of 95% accuracy. From Table 6, we can conclude that we achieved 100% accuracy for the A-E, B-E, C-E, AB-E, and ABCD-E classifications, matching or exceeding the best previously reported results. Additionally, we achieved better accuracy for the dataset combinations D-E, BC-E, and ABC-E in the detection of epileptic seizures.
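Several of the statistical features recurring above can be written down directly. The following is a minimal NumPy sketch; the function names are ours, and since the paper does not specify its Shannon entropy estimator, a simple histogram-based estimate over the subband coefficients is assumed for illustration.

```python
import numpy as np

def mav(x):
    """Mean absolute value of the subband coefficients."""
    return float(np.mean(np.abs(x)))

def std_dev(x):
    """Standard deviation of the subband coefficients."""
    return float(np.std(x))

def average_power(x):
    """Mean squared amplitude of the subband coefficients."""
    return float(np.mean(np.asarray(x, dtype=float) ** 2))

def shannon_entropy(x, bins=16):
    """Histogram-based Shannon entropy (bits); estimator choice is assumed."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / len(x)           # nonzero bin probabilities
    return float(-np.sum(p * np.log2(p)))
```

Each function maps one subband (e.g., the D5 detail coefficients) to a scalar, so a signal yields one feature vector per selected subband, which is then fed to the classifiers.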

5. Conclusions

A new approach for identifying epileptic seizures using time-frequency domain features and various classifiers is proposed in this work. On the dataset developed by the University of Bonn, Germany, the method achieves the highest classification rate of 100%, and an overall accuracy of 91.67% is obtained on real-time data from the Senthil Multispecialty Hospital, India. The proposed method was validated by comparing its accuracies with those of existing methods reported in the literature. The work also presented different base-classifier machine learning models to characterize EEG signals and accurately detect epileptic seizures. One of the significant contributions of this work is that, by evaluating wide-ranging machine learning models on the dataset, the proposed method could ensure accurate prediction, whereas other existing works on EEG signals do not examine as many models as are explored here. Examining the different base-classifier models is vital for generalization, and this is one of the strengths and critical contributions of this paper. As the validity of the results can be tested clinically, we plan to develop this algorithm further and implement it in hospitals in future work.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors thank Senthil Multispecialty Hospital, India, for the real-time data provided and the University of Bonn, Germany, for making their dataset available online for this study. The authors express gratitude to the Vellore Institute of Technology, Vellore, India, for providing the SEED Grant Fund to conduct the research. The work of Agbotiname Lucky Imoize was supported by the Nigerian Petroleum Technology Development Fund (PTDF) and the German Academic Exchange Service (DAAD) through the Nigerian-German Postgraduate Program under Grant 57473408. This study was also partially supported by Prince Sattam bin Abdulaziz University, Saudi Arabia.