Abstract
With advances in mobile and wireless technology, mobile devices have become an important part of daily life. Android is the leading operating system by market share, and it is also the platform most targeted by attackers. Although many solutions for detecting Android malware have been proposed in the literature, there is still a need for feature selection methods suited to Android malware detection systems. In this study, a machine learning-based malware detection system is proposed to distinguish Android malware from benign applications. In the feature selection stage of the proposed system, a linear regression-based feature selection approach is used to remove unnecessary features. This reduces the dimension of the feature vector and decreases the training time, so that the classification model can be used in real-time malware detection systems. The results show that the highest F-measure, 0.961, is obtained using at least 27 features.
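The abstract does not spell out how the linear regression-based selection is computed, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each permission is a binary feature, fits a one-variable least-squares line from that feature to the malware label, and ranks features by the resulting R². The function names, the permission columns, and the toy data are all hypothetical; the F-measure helper follows the standard definition (harmonic mean of precision and recall).

```python
from typing import List, Sequence


def univariate_r2(x: Sequence[float], y: Sequence[float]) -> float:
    """R^2 of the least-squares line y = a + b*x for a single feature."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        # A constant feature (or constant label) explains no variance.
        return 0.0
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def select_top_k(X: Sequence[Sequence[float]], y: Sequence[float], k: int) -> List[int]:
    """Return the column indices of the k features with the highest R^2.

    Ties are broken in favour of the lower column index (an assumption
    made here only to keep the sketch deterministic).
    """
    n_features = len(X[0])
    scores = [
        (univariate_r2([row[j] for row in X], y), j) for j in range(n_features)
    ]
    ranked = sorted(scores, key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in ranked[:k])


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-measure (F1) for the positive class, here label 1 = malware."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (illustrative only): rows are apps, columns are requested permissions.
PERMISSIONS = ["SEND_SMS", "CAMERA", "INTERNET", "READ_CONTACTS"]
X = [
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = malware, 0 = benign

if __name__ == "__main__":
    kept = select_top_k(X, y, k=2)
    print("kept features:", [PERMISSIONS[j] for j in kept])
```

Scoring each feature independently keeps the sketch small and cheap, but it ignores interactions between permissions; a multivariate fit that ranks features by coefficient magnitude would be another reasonable interpretation of the same idea.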
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Şahin, D.Ö., Kural, O.E., Akleylek, S. et al. A novel permission-based Android malware detection system using feature selection based on linear regression. Neural Comput & Applic 35, 4903–4918 (2023). https://doi.org/10.1007/s00521-021-05875-1
DOI: https://doi.org/10.1007/s00521-021-05875-1