Abstract
A Mixed Boolean-Arithmetic (MBA) expression mixes bitwise operations (e.g., AND, OR, and NOT) with arithmetic operations (e.g., ADD and IMUL). It enables a semantics-preserving program transformation that converts a simple expression into an equivalent but difficult-to-understand form. MBA expressions have been widely adopted as a highly effective and low-cost obfuscation scheme. However, state-of-the-art deobfuscation research poses substantial challenges to the MBA obfuscation technique. Attack methods such as bit-blasting, pattern matching, program synthesis, deep learning, and mathematical transformation can successfully simplify specific categories of MBA expressions. Existing MBA obfuscation must be enhanced to overcome these emerging challenges.
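As a small illustration of the transformation described above, the following sketch rewrites `x + y` and `x - y` using well-known linear MBA identities and checks equivalence on random inputs. The 32-bit word size and the helper names (`add_mba`, `sub_mba`, `check_equivalent`) are assumptions made for this example only; they are not taken from the paper or from any tool it describes.

```python
import random

MASK = (1 << 32) - 1  # assumed 32-bit machine word


def add_plain(x, y):
    """Reference: plain modular addition."""
    return (x + y) & MASK


def add_mba(x, y):
    """Linear MBA form of addition: x + y == (x ^ y) + 2*(x & y)."""
    return ((x ^ y) + 2 * (x & y)) & MASK


def sub_plain(x, y):
    """Reference: plain modular subtraction."""
    return (x - y) & MASK


def sub_mba(x, y):
    """Linear MBA form of subtraction: x - y == (x ^ y) - 2*(~x & y)."""
    return ((x ^ y) - 2 * (~x & y)) & MASK


def check_equivalent(f, g, trials=1000, seed=0):
    """Random testing only; this is evidence of equivalence, not a proof."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.getrandbits(32), rng.getrandbits(32)
        if f(x, y) != g(x, y):
            return False
    return True


if __name__ == "__main__":
    print(check_equivalent(add_plain, add_mba))
    print(check_equivalent(sub_plain, sub_mba))
```

Both identities follow from splitting each operand into disjoint bit sets: with `a = x & y`, `c = x & ~y`, and `d = ~x & y`, we have `x = a + c` and `y = a + d`, so `x + y = (c + d) + 2a` and `x - y = (c + d) - 2d`.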
In this paper, we first review existing MBA obfuscation methods and reveal that they are based on “linear MBA”, a simple subset of MBA transformation. This leaves the more complex “non-linear MBA” in its infancy. Therefore, we propose a new obfuscation method to unleash the power of non-linear MBA. Non-linear MBA expressions are generated by combining or transforming linear MBA rules, based on a solid theoretical underpinning. Compared with existing MBA obfuscation, our method can generate significantly more complex MBA expressions. To demonstrate the practicality of the non-linear MBA obfuscation scheme, we apply it to the Tiny Encryption Algorithm (TEA). We have implemented the method as a prototype tool, named MBA-Obfuscator, to produce a large-scale dataset. We run all existing MBA simplification tools on the dataset, and at most 147 out of 1,000 non-linear MBA expressions are successfully simplified. Our evaluation shows that MBA-Obfuscator is a practical obfuscation scheme with a solid theoretical cornerstone.
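One way to read "combining linear MBA rules" is to multiply two linear identities together, which yields products of bitwise terms and therefore a non-linear (polynomial) MBA expression. The sketch below multiplies the linear rules `x = (x & y) + (x & ~y)` and `y = (x & y) + (~x & y)` and regroups the result into a known identity for `x * y`. This is an illustrative assumption about how such a combination can work, not the construction used by MBA-Obfuscator; the function names and the 32-bit word size are hypothetical.

```python
import random

MASK = (1 << 32) - 1  # assumed 32-bit machine word


def mul_plain(x, y):
    """Reference: plain modular multiplication."""
    return (x * y) & MASK


def mul_nonlinear_mba(x, y):
    """Non-linear MBA form: x*y == (x & y)*(x | y) + (x & ~y)*(~x & y).

    Derivation: let a = x & y, c = x & ~y, d = ~x & y. Then x = a + c,
    y = a + d, and x | y = a + c + d, so
    x*y = (a + c)*(a + d) = a*(a + c + d) + c*d.
    """
    a = x & y
    b = x | y
    c = x & ~y & MASK
    d = ~x & y & MASK
    return (a * b + c * d) & MASK


def check_equivalent(f, g, trials=1000, seed=1):
    """Random testing only; this is evidence of equivalence, not a proof."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.getrandbits(32), rng.getrandbits(32)
        if f(x, y) != g(x, y):
            return False
    return True


if __name__ == "__main__":
    print(mul_nonlinear_mba(6, 7))
    print(check_equivalent(mul_plain, mul_nonlinear_mba))
```

Because the result contains products of bitwise subexpressions, it falls outside the linear MBA class that sums bitwise terms with constant coefficients.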
Acknowledgments
We would like to thank our shepherd Roland Yap and anonymous paper reviewers for their helpful feedback. We also thank team members from Anhui Province Key Laboratory of High Performance Computing (USTC) and the UNH SoftSec group for their valuable suggestions.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, B., Feng, W., Zheng, Q., Li, J., Xu, D. (2021). Software Obfuscation with Non-Linear Mixed Boolean-Arithmetic Expressions. In: Gao, D., Li, Q., Guan, X., Liao, X. (eds) Information and Communications Security. ICICS 2021. Lecture Notes in Computer Science, vol 12918. Springer, Cham. https://doi.org/10.1007/978-3-030-86890-1_16
DOI: https://doi.org/10.1007/978-3-030-86890-1_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86889-5
Online ISBN: 978-3-030-86890-1
eBook Packages: Computer Science (R0)