A novel permission-based Android malware detection system using feature selection based on linear regression

  • S.I. : Machine Learning Applications for Security
Neural Computing and Applications

Abstract

With the developments in mobile and wireless technology, mobile devices have become an important part of our lives. Android is the leading operating system by market share, and it is also the platform most targeted by attackers. Although many solutions for detecting Android malware have been proposed in the literature, there is still a need for feature selection methods suited to Android malware detection systems. This study proposes a machine learning-based malware detection system that distinguishes Android malware from benign applications. At the feature selection stage, a linear regression-based approach removes unnecessary features. This reduces the dimension of the feature vector and the training time, so that the classification model can be used in real-time malware detection systems. In the experiments, the highest F-measure obtained is 0.961, using at least 27 features.
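The abstract does not say how the regression is turned into a feature ranking, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each permission is a 0/1 column, fits a simple least-squares line from that column to the malware label, and keeps the permissions with the highest coefficient of determination. The function names, the scoring rule, and the toy data are all hypothetical.

```python
from statistics import mean


def regression_score(x, y):
    """Fit y = a + b*x by least squares and return R^2.

    Returns 0.0 when x or y is constant, since no line can explain anything.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def select_permissions(rows, labels, names, k):
    """Rank permission columns by regression_score and keep the top k names."""
    scores = {
        name: regression_score([row[j] for row in rows], labels)
        for j, name in enumerate(names)
    }
    return sorted(names, key=lambda n: scores[n], reverse=True)[:k]


# Toy data, made up for illustration: 1 = permission requested, label 1 = malware.
names = ["SEND_SMS", "INTERNET", "READ_CONTACTS"]
rows = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
]
labels = [1, 1, 0, 0]

selected = select_permissions(rows, labels, names, k=1)
print(selected)
```

In this toy set, INTERNET is requested by every application and READ_CONTACTS is uncorrelated with the label, so both score zero and only SEND_SMS survives. A classifier would then be trained on the reduced feature vector.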

Author information

Corresponding author

Correspondence to Durmuş Özkan Şahin.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Şahin, D.Ö., Kural, O.E., Akleylek, S. et al. A novel permission-based Android malware detection system using feature selection based on linear regression. Neural Comput & Applic 35, 4903–4918 (2023). https://doi.org/10.1007/s00521-021-05875-1

  • DOI: https://doi.org/10.1007/s00521-021-05875-1
