An efficient and effective ensemble of support vector machines for anti-diabetic drug failure prediction
Introduction
Diabetes is one of the most prevalent chronic diseases today. As the number of people with diabetes continues to grow worldwide, research on the treatment of diabetes is becoming increasingly important. In particular, type 2 diabetes is the most common form, accounting for 85–90% of diabetes cases (Bennett, Guo, & Dharmage, 2007). Most patients with type 2 diabetes are under medical care with mono- or combination therapy of oral hypoglycemic agents, aimed at lowering glucose levels. Glycated hemoglobin (HbA1c) is an effective and widely used measurement of glucose level for patients with type 2 diabetes (Bennett et al., 2007, Lu et al., 2010). According to the guideline of the American Diabetes Association (ADA) (American Diabetes Association, 2014), combination therapy is recommended for patients with type 2 diabetes whose glucose levels cannot be controlled by mono-therapy. In addition, the ADA recommends an HbA1c level of 7% or lower as a reasonable glycemic goal for most individuals.
Unfortunately, the majority of patients fail to achieve their glycemic goals (Brown, Nichols, & Perry, 2004). This is because the outcome of diabetic treatment depends on many factors. The efficacy of anti-diabetic drugs can be affected by patient characteristics such as age, gender, obesity, and blood pressure. Moreover, the efficacy can also be affected by interactions among drugs. Therapies for type 2 diabetes are generally based on a combination of 2–3 oral hypoglycemic agents in order to obtain better and more reliable results (American Diabetes Association, 2014, Yki-Järvinen, 2001). In addition, because diabetes often leads to complications, drugs to treat those complications are also administered to diabetic patients.
Predicting drug treatment failure is an important issue in the medical domain. Many studies have been conducted based on statistical analyses. However, it is difficult to predict failure accurately using only statistical analyses because failure is related to a variety of factors. The effectiveness of machine learning approaches for disease diagnosis has been reported by several studies (Hu et al., 2012, Sajda, 2006, Zeng and Liu, 2010), and more recently, some researchers have attempted to apply machine learning approaches to diabetes (Huang et al., 2007, Marinov et al., 2011). Most of them focus on prediction at the disease level. Machine learning approaches can also be effective for predicting drug treatment failure. However, to the best of our knowledge, there have been relatively few efforts to predict drug treatment failure using these approaches. The problem of anti-diabetic drug failure prediction can be defined as a classification problem, and is characterized by multivariate and complex relationships. Therefore, the support vector machine (SVM) is a good candidate as a classification algorithm.
SVM (Vapnik, 1995) is one of the most popular state-of-the-art classification algorithms, and shows superior generalization performance based on the structural risk minimization principle. The effectiveness of SVM has been verified in various applications such as text categorization, handwritten digit recognition, image segmentation, and financial forecasting (Burges, 1998). Moreover, SVM is also known to be very effective in the medical domain (Barakat et al., 2010, Yu et al., 2010).
However, training an SVM becomes difficult when the given dataset is very large, because the training time complexity of the SVM is commonly cited as $O(N^3)$ in the number of training data points $N$ (Kang and Cho, 2014). This cost arises because training an SVM involves solving a quadratic programming (QP) problem of that complexity. Therefore, it is practically undesirable to train an SVM directly on a large-scale dataset. Commonly used approaches to alleviate the complexity are improving the efficiency of the QP process (Fan et al., 2005, Joachims, 1999, Platt, 1999) and reducing the number of training data points by eliminating non-support vectors before the QP process (Li and Maguire, 2011, Shin and Cho, 2007).
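To make the benefit of reducing the training set concrete, the following sketch compares relative training costs under an assumed cubic scaling in the number of training data points. The cubic exponent is a commonly cited rule of thumb rather than a measured property of any particular solver, and the function names are hypothetical.

```python
def relative_training_cost(n_points: int, exponent: float = 3.0) -> float:
    """Cost proxy n**exponent; exponent=3.0 is an assumed cubic scaling."""
    return float(n_points) ** exponent


def speedup_from_reduction(n_full: int, n_reduced: int, exponent: float = 3.0) -> float:
    """Ratio of the cost proxy on the full set to that on the reduced set."""
    return relative_training_cost(n_full, exponent) / relative_training_cost(n_reduced, exponent)


if __name__ == "__main__":
    # Keeping one tenth of the points under the cubic assumption.
    print(speedup_from_reduction(100_000, 10_000))
```

Under this assumption, even a moderate reduction of the training set yields a disproportionately large saving, which is why eliminating non-support vectors before the QP process is attractive.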
When these approaches are insufficient, the training time of SVM can be further reduced by constructing an ensemble of SVMs that are trained with small bootstrap samples (Kim, Pang, Je, Kim, & Bang, 2003). The two typical ensemble methods, Bagging (Breiman, 1996) and Boosting (Freund & Schapire, 1997), can be employed to construct SVM ensembles (Kim et al., 2003, Wang et al., 2009). Although the classification accuracy of each individual SVM is lowered, comparable overall accuracy can be obtained by aggregating the SVMs properly. One major concern is that a bootstrap sample may consist largely of superfluous data points when the sample size is set very small, which results in an ill-formed SVM.
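The bagging idea described above can be sketched as follows. This is a minimal illustration, not the implementation used in the cited works: each base learner is a linear SVM trained by fixed-step subgradient descent on the regularized hinge loss (a stand-in for a QP solver), the data are synthetic, and all function names and parameter values are assumptions chosen for the example.

```python
import random
from typing import List, Sequence, Tuple


def train_linear_svm(X: Sequence[Sequence[float]], y: Sequence[int],
                     lam: float = 0.01, eta: float = 0.01,
                     epochs: int = 50, seed: int = 0) -> List[float]:
    """Subgradient descent on the regularized hinge loss (stand-in for QP)."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)  # last weight acts as the bias term
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xi = list(X[i]) + [1.0]
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, xi))
            w = [wj * (1.0 - eta * lam) for wj in w]  # regularization shrinkage
            if margin < 1.0:  # hinge loss is active for this point
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w


def predict(w: Sequence[float], x: Sequence[float]) -> int:
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score >= 0.0 else -1


def train_bagged_svms(X, y, n_models: int = 11, sample_size: int = 20,
                      seed: int = 0) -> List[List[float]]:
    """Train each SVM on a small bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    models = []
    for m in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(sample_size)]
        models.append(train_linear_svm([X[i] for i in idx],
                                       [y[i] for i in idx], seed=seed + m))
    return models


def predict_ensemble(models: Sequence[Sequence[float]], x: Sequence[float]) -> int:
    """Aggregate by majority vote (an odd n_models avoids ties)."""
    votes = sum(predict(w, x) for w in models)
    return 1 if votes >= 0 else -1


def make_toy_data(n_per_class: int = 100, seed: int = 1) -> Tuple[list, list]:
    """Two well-separated Gaussian clusters; purely synthetic."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((1, 2.0), (-1, -2.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(center, 0.5), rng.gauss(center, 0.5)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    models = train_bagged_svms(X, y)
    acc = sum(predict_ensemble(models, x) == t for x, t in zip(X, y)) / len(y)
    print(f"ensemble training accuracy: {acc:.2f}")
```

Each base learner sees only 20 of the 200 points, so its training cost is small; the vote over several such learners recovers the accuracy. On harder data, where most points lie far from the decision boundary, small samples of this kind may contain few informative points, which is the concern raised above.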
In this paper, we propose an efficient and effective ensemble of SVMs for large-scale datasets based on data selection methods, called E3-SVM. The proposed method is based on the fact that the SVM only uses support vectors to determine the decision boundary. In the proposed method, a reduced dataset is constructed by applying data selection methods (Shin and Cho, 2007, Li and Maguire, 2011) that select data points that are more likely to be the support vectors. That is, non-crucial data points of the original dataset are excluded in the reduced dataset. The ensemble of SVMs is constructed using bootstrap samples drawn from the reduced dataset. Consequently, the classification accuracy of the ensemble is improved by reducing the risk of using superfluous data points when training SVMs with small bootstrap samples. We investigated the efficiency and effectiveness of the proposed method through experiments on the anti-diabetic drug failure prediction problem.
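The data selection step can be illustrated with a simplified neighbourhood heuristic: a point is kept if at least one of its k nearest neighbours belongs to the other class, on the assumption that such points lie near the decision boundary and are therefore more likely to become support vectors. This is only a sketch of the general idea, not the exact procedure of the cited data selection methods; the function names and the choice of k are hypothetical.

```python
import random
from typing import List, Sequence


def select_boundary_points(X: Sequence[Sequence[float]], y: Sequence[int],
                           k: int = 5) -> List[int]:
    """Indices of points with an opposite-class point among their k nearest neighbours."""
    selected = []
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other point (O(N^2) overall).
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(xi, xj)), j)
            for j, xj in enumerate(X) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        if any(y[j] != y[i] for j in neighbours):
            selected.append(i)
    return selected


def draw_bootstrap(indices: Sequence[int], sample_size: int,
                   rng: random.Random) -> List[int]:
    """Bootstrap sample (with replacement) drawn from the reduced dataset only."""
    return [rng.choice(indices) for _ in range(sample_size)]


if __name__ == "__main__":
    # Eight points on a line: the class changes between x=3 and x=4.
    X = [(float(v),) for v in range(8)]
    y = [-1] * 4 + [1] * 4
    reduced = select_boundary_points(X, y, k=2)
    print(reduced)
    print(draw_bootstrap(reduced, 5, random.Random(0)))
```

Because every bootstrap sample is drawn from the reduced set, even a very small sample consists of points near the class boundary. Note that this simple heuristic also retains mislabeled or noisy points, which a practical selection method would need to handle.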
The rest of this paper is organized as follows. In section 2, we briefly review the related work. In section 3, we describe our proposed method and section 4 reports the experimental results on the anti-diabetic drug failure prediction problem. The conclusion and future work are given in section 5.
Support vector machines
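SVM (Vapnik, 1995) seeks the maximum-margin hyperplane that separates the positive data points from the negative data points. Given a training dataset $D = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, where $N$ is the number of training data points, $\mathbf{x}_i \in \mathbb{R}^d$ is an input feature vector and $y_i \in \{-1, +1\}$ is the corresponding target class label, an SVM can be formulated as the following optimization problem:

$$\min_{\mathbf{w}, b, \boldsymbol{\xi}} \ \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, N,$$

where $C$ is the parameter that controls the tradeoff between the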
E3-SVM: efficient and effective ensemble of SVMs
In this section, we describe our proposed method E3-SVM, short for efficient and effective ensemble of SVMs. To construct an SVM ensemble, several SVMs are trained using bootstrap samples drawn from a given dataset. When the size of the bootstrap sample is much smaller than the dataset, each SVM is likely to be trained on mostly superfluous data points. To address the problem, the proposed method reduces the original dataset by selecting crucial data points and draws bootstrap samples from
Anti-diabetic drug failure prediction
We demonstrated the effectiveness of the proposed method for the anti-diabetic drug failure prediction problem through experiments using real-world data. In this section, data collection, preprocessing, experimental settings, and results are described.
Conclusions and future work
The treatment of patients with type 2 diabetes is mostly based on drug therapies and aims at managing glucose levels appropriately. Predicting drug treatment failure is a major issue, but is very difficult because of the influence of a wide variety of factors and because the relationship between such factors is also complicated. Due to the complexity, SVM can be a good candidate as a classification algorithm for the anti-diabetic drug failure prediction problem. The major drawback is high
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korea government (MSIP) (No. 2011–0030814), Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2014R1A1A1004648), and the Brain Korea 21 PLUS Project in 2014. This work was also supported by the Engineering Research Institute of SNU.
References (27)
- Freund & Schapire (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences.
- Hu et al. (2012). Predicting warfarin dosage from clinical data: A supervised learning approach. Artificial Intelligence in Medicine.
- Huang et al. (2007). Feature selection and classification model construction on type 2 diabetic patients data. Artificial Intelligence in Medicine.
- Kang & Cho (2014). Approximating support vector machine with artificial neural network for fast prediction. Expert Systems with Applications.
- Kim, Pang, Je, Kim, & Bang (2003). Constructing support vector machine ensemble. Pattern Recognition.
- Wang et al. (2009). Empirical analysis of support vector machine ensemble classifiers. Expert Systems with Applications.
- Zeng & Liu (2010). Mixture classification model based on clinical markers for breast cancer prognosis. Artificial Intelligence in Medicine.
- American Diabetes Association (2014). Standards of medical care in diabetes–2014. Diabetes Care, 37, S14–S80....
- Barakat et al. (2010). Intelligible support vector machines for diagnosis of diabetes mellitus. IEEE Transactions on Information Technology in Biomedicine.
- Bennett, Guo, & Dharmage (2007). HbA1c as a screening tool for detection of type 2 diabetes: A systematic review. Diabetic Medicine.
- Breiman (1996). Bagging predictors. Machine Learning.
- Brown, Nichols, & Perry (2004). The burden of treatment failure in type 2 diabetes. Diabetes Care.
- Burges (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery.