Improved Feature Transformations for Classification Using Density Estimation

  • Conference paper
PRICAI 2014: Trends in Artificial Intelligence (PRICAI 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8862)

Abstract

Density-based logistic regression (DLR) is a recently introduced classification technique that performs a one-to-one non-linear transformation of the original feature space into another feature space, based on density estimates. This new feature space is particularly well suited to learning a logistic regression model. Whilst performance gains, good interpretability and time efficiency make DLR attractive, its formulation has some limitations. In this paper, we tackle these limitations and propose several new extensions: 1) a more robust methodology for performing density estimation; 2) a method that can transform two or more features into a single target feature, based on higher-order kernel density estimation; and 3) an analysis of the utility of DLR in transfer learning scenarios. We evaluate our extensions on several synthetic and publicly available datasets, demonstrating that higher-order transformations have the potential to boost prediction performance and that DLR is a promising method for transfer learning.
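
The abstract does not give the transformation itself, so the following is only an illustrative sketch of the general idea of a density-based feature transformation, not the formulation used in the paper or in the original DLR work. It assumes binary labels, a product-Gaussian kernel density estimate with a normal-reference rule-of-thumb bandwidth, and the log ratio of class-conditional densities as the transformed feature. The function names (`gaussian_kde_logpdf`, `rule_of_thumb_bandwidth`, `density_log_ratio`) are hypothetical. Passing one column gives a one-to-one transformation; passing two or more columns maps a group of features to a single target feature, in the spirit of the higher-order extension.

```python
import numpy as np


def gaussian_kde_logpdf(train, query, bandwidth):
    """Log-density of query points under a product-Gaussian KDE.

    train: (n, d) array, query: (m, d) array, bandwidth: (d,) array.
    """
    n, d = train.shape
    # Scaled differences between every query point and every training point.
    z = (query[:, None, :] - train[None, :, :]) / bandwidth  # shape (m, n, d)
    log_kernel = -0.5 * np.sum(z ** 2, axis=2)               # shape (m, n)
    log_norm = np.log(n) + np.sum(np.log(bandwidth)) + 0.5 * d * np.log(2 * np.pi)
    # Log-sum-exp over training points for numerical stability.
    peak = log_kernel.max(axis=1, keepdims=True)
    return peak[:, 0] + np.log(np.exp(log_kernel - peak).sum(axis=1)) - log_norm


def rule_of_thumb_bandwidth(train):
    """Normal-reference bandwidth per dimension (an assumed, simple choice)."""
    n, d = train.shape
    sigma = train.std(axis=0, ddof=1)
    return sigma * (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))


def density_log_ratio(X, y, X_query, columns):
    """Map the chosen feature columns to one feature: log p(x|y=1) - log p(x|y=0).

    Assumption: this log ratio stands in for the density-based transformation;
    the published formulation may differ (for example in how priors enter).
    """
    cols = list(columns)
    pos = X[y == 1][:, cols]
    neg = X[y == 0][:, cols]
    q = X_query[:, cols]
    log_p_pos = gaussian_kde_logpdf(pos, q, rule_of_thumb_bandwidth(pos))
    log_p_neg = gaussian_kde_logpdf(neg, q, rule_of_thumb_bandwidth(neg))
    return log_p_pos - log_p_neg


# Synthetic two-class data, used only to exercise the functions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 1.0, size=(200, 2)),  # class 0
    rng.normal(2.0, 1.0, size=(200, 2)),   # class 1
])
y = np.repeat([0, 1], 200)

single = density_log_ratio(X, y, X, columns=[0])   # one feature -> one feature
pair = density_log_ratio(X, y, X, columns=[0, 1])  # two features -> one feature
```

The transformed columns (`single`, `pair`) could then be passed to any logistic regression fitter. Estimating the densities separately per class keeps the sketch short; the bandwidth rule is a placeholder and is not the more robust estimation procedure the paper proposes.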

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kankanige, Y., Bailey, J. (2014). Improved Feature Transformations for Classification Using Density Estimation. In: Pham, DN., Park, SB. (eds) PRICAI 2014: Trends in Artificial Intelligence. PRICAI 2014. Lecture Notes in Computer Science, vol 8862. Springer, Cham. https://doi.org/10.1007/978-3-319-13560-1_10

  • DOI: https://doi.org/10.1007/978-3-319-13560-1_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13559-5

  • Online ISBN: 978-3-319-13560-1

  • eBook Packages: Computer Science, Computer Science (R0)
