Abstract
Density-based logistic regression (DLR) is a recently introduced classification technique that performs a one-to-one non-linear transformation of the original feature space into another feature space based on density estimation. This new feature space is particularly well suited for learning a logistic regression model. Whilst performance gains, good interpretability and time efficiency make DLR attractive, its formulation has some limitations. In this paper, we tackle these limitations and propose several extensions: 1) a more robust methodology for performing density estimation; 2) a method that can transform two or more features into a single target feature, based on higher-order kernel density estimation; and 3) an analysis of the utility of DLR in transfer learning scenarios. We evaluate our extensions on several synthetic and publicly available datasets, demonstrating that higher-order transformations have the potential to boost prediction performance and that DLR is a promising method for transfer learning.
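The abstract does not give the transformation itself, so the following is only a minimal sketch of what a density-based feature transformation might look like. It assumes each raw feature value is mapped to the log ratio of two class-conditional kernel density estimates, and it assumes a Gaussian kernel with Silverman's rule-of-thumb bandwidth. The function names (`silverman_bandwidth`, `kde`, `density_log_ratio`) and the toy data are hypothetical and are not taken from the paper.

```python
import math
import statistics


def silverman_bandwidth(values):
    """Rule-of-thumb bandwidth for a 1-D Gaussian kernel (an assumed choice)."""
    n = len(values)
    return 1.06 * statistics.stdev(values) * n ** (-1 / 5)


def kde(values, x, h):
    """Gaussian kernel density estimate at x, built from the given samples."""
    norm = len(values) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values) / norm


def density_log_ratio(xs, ys, x, eps=1e-12):
    """Map a raw feature value x to log(p(x | y=1) / p(x | y=0)).

    eps guards against log(0) where one class has almost no density.
    This log-ratio form is an assumption, not the paper's stated formula.
    """
    pos = [v for v, y in zip(xs, ys) if y == 1]
    neg = [v for v, y in zip(xs, ys) if y == 0]
    p1 = kde(pos, x, silverman_bandwidth(pos))
    p0 = kde(neg, x, silverman_bandwidth(neg))
    return math.log((p1 + eps) / (p0 + eps))


# Toy data (hypothetical): class 0 clusters near 0, class 1 near 3.
xs = [-0.5, 0.0, 0.3, 0.6, 2.4, 2.9, 3.1, 3.6]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

# One transformed value per original value: a one-to-one feature mapping.
transformed = [density_log_ratio(xs, ys, x) for x in xs]
```

Under these assumptions, the transformed values are negative where class 0 dominates and positive where class 1 dominates, which is the kind of feature a logistic regression model can separate with a single linear weight.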
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kankanige, Y., Bailey, J. (2014). Improved Feature Transformations for Classification Using Density Estimation. In: Pham, DN., Park, SB. (eds) PRICAI 2014: Trends in Artificial Intelligence. PRICAI 2014. Lecture Notes in Computer Science, vol 8862. Springer, Cham. https://doi.org/10.1007/978-3-319-13560-1_10
DOI: https://doi.org/10.1007/978-3-319-13560-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13559-5
Online ISBN: 978-3-319-13560-1
eBook Packages: Computer Science, Computer Science (R0)