Improved Feature Transformations for Classification Using Density Estimation

Kankanige, Yamuna; Bailey, James

doi:10.1007/978-3-319-13560-1_10

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8862)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

Density-based logistic regression (DLR) is a recently introduced classification technique that applies a one-to-one non-linear transformation, based on density estimates, to map the original feature space to a new one. The new feature space is particularly well suited to learning a logistic regression model. Although performance gains, good interpretability and time efficiency make DLR attractive, its formulation has some limitations. In this paper, we address these limitations and propose several extensions: 1) a more robust methodology for density estimation; 2) a method, based on higher order kernel density estimation, that transforms two or more features into a single target feature; and 3) an analysis of the utility of DLR in transfer learning scenarios. We evaluate our extensions on several synthetic and publicly available datasets, demonstrating that higher order transformations have the potential to boost prediction performance and that DLR is a promising method for transfer learning.
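To make the idea of a density-based feature transformation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single numeric feature, a Gaussian kernel, and the normal-reference rule-of-thumb bandwidth; the function names (`rule_of_thumb_bandwidth`, `kde`, `log_odds_feature`) are hypothetical. Each raw value is mapped to an estimated class log-odds, which could then serve as the input to a logistic regression model. The exact transformation and bandwidth selection used in DLR and in this paper may differ.

```python
import math
import random


def rule_of_thumb_bandwidth(values):
    """Normal-reference bandwidth for a 1-D Gaussian kernel (an assumption here)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 1.06 * std * n ** (-1 / 5)


def kde(x, values, h):
    """Gaussian kernel density estimate at point x from a 1-D sample."""
    norm = len(values) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values) / norm


def log_odds_feature(x, pos, neg, eps=1e-12):
    """Map a raw feature value to an estimated log-odds of the positive class.

    pos and neg hold the training values of this feature for each class.
    eps guards against taking the log of zero in low-density regions.
    """
    prior_pos = len(pos) / (len(pos) + len(neg))
    p1 = prior_pos * kde(x, pos, rule_of_thumb_bandwidth(pos))
    p0 = (1 - prior_pos) * kde(x, neg, rule_of_thumb_bandwidth(neg))
    return math.log((p1 + eps) / (p0 + eps))


# Synthetic two-class data: one feature, classes centred at +2 and -2.
random.seed(0)
pos = [random.gauss(2.0, 1.0) for _ in range(200)]
neg = [random.gauss(-2.0, 1.0) for _ in range(200)]

# Transform three query points; the values should increase from left to right.
transformed = [log_odds_feature(x, pos, neg) for x in (-2.0, 0.0, 2.0)]
print(transformed)
```

In this sketch the transformed value grows as the query point moves from the negative-class region to the positive-class region, so a linear model on the transformed feature can capture a relationship that need not be linear in the raw feature.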

Author information

Authors and Affiliations

Department of Computing and Information Systems, University of Melbourne, Melbourne, Australia
Yamuna Kankanige & James Bailey

Editor information

Editors and Affiliations

MIMOS Berhad Technology Park Malaysia, 57000, Bukit Jalil, KL, Malaysia
Duc-Nghia Pham
Kyungpook National University, Sankyuk-Dong, Buk-Gu, 702-701, Daegu, Korea
Seong-Bae Park

About this paper

Cite this paper

Kankanige, Y., Bailey, J. (2014). Improved Feature Transformations for Classification Using Density Estimation. In: Pham, DN., Park, SB. (eds) PRICAI 2014: Trends in Artificial Intelligence. PRICAI 2014. Lecture Notes in Computer Science, vol 8862. Springer, Cham. https://doi.org/10.1007/978-3-319-13560-1_10

DOI: https://doi.org/10.1007/978-3-319-13560-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13559-5
Online ISBN: 978-3-319-13560-1
eBook Packages: Computer Science, Computer Science (R0)
