Abstract
Individuals’ actions, such as smartphone usage, internet shopping, bank card transactions, and movie watching, can all be represented as sequences. These sequences contain meaningful frequent temporal patterns that scientists and companies study to understand different phenomena and business processes. We therefore tend to believe that such patterns are dissociated from individuals’ identities and safe to share for studies. Nevertheless, we show, through unicity tests, that a combination of different patterns can act as a quasi-identifier, causing a privacy breach that reveals private patterns. To solve this problem, we propose applying \(\epsilon \)-differential privacy to the extracted patterns to add uncertainty to the association between individuals and their true patterns. Our results show that it is possible to significantly reduce the privacy risk while preserving data utility.
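To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the procedure used in the chapter. It assumes each individual is reduced to the set of frequent patterns they support, measures unicity as the fraction of individuals singled out by k of their own patterns, and perturbs pattern membership with randomized response, which is one standard way to obtain \(\epsilon \)-differential privacy per membership bit. The function names, the toy data, and the choice of randomized response are all illustrative assumptions.

```python
import math
import random


def unicity(pattern_sets, k, rng):
    """Fraction of individuals matched by nobody else given k of their patterns.

    pattern_sets: list of sets, one set of pattern labels per individual
    (a hypothetical representation chosen for this sketch).
    """
    unique = 0
    for patterns in pattern_sets:
        # The adversary is assumed to know k patterns of the target.
        known = set(rng.sample(sorted(patterns), min(k, len(patterns))))
        # Count every individual whose pattern set contains the known ones.
        matches = sum(1 for other in pattern_sets if known <= other)
        if matches == 1:
            unique += 1
    return unique / len(pattern_sets)


def randomized_response(pattern_sets, universe, epsilon, rng):
    """Flip each membership bit with probability 1 / (1 + e^epsilon).

    Each bit is reported truthfully with probability e^eps / (1 + e^eps),
    which gives epsilon-differential privacy for that single bit.
    """
    keep = 1.0 / (1.0 + math.exp(-epsilon))
    noisy = []
    for patterns in pattern_sets:
        reported = set()
        for p in sorted(universe):
            bit = p in patterns
            if rng.random() >= keep:
                bit = not bit  # lie about this membership bit
            if bit:
                reported.add(p)
        noisy.append(reported)
    return noisy


# Toy data: four individuals, patterns named by single letters.
toy = [{"a", "b"}, {"a", "c"}, {"a", "b"}, {"d"}]
toy_universe = {"a", "b", "c", "d"}
rng = random.Random(0)
risk_before = unicity(toy, 2, rng)
risk_after = unicity(randomized_response(toy, toy_universe, 1.0, rng), 2, rng)
```

In this sketch a smaller \(\epsilon \) flips more bits, so the patterns an adversary observes are less reliably tied to the true individual, at the cost of distorting the pattern frequencies that analysts rely on.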
Acknowledgements
This research was partly supported by the Spanish Government under projects RTI2018-095094-B-C21 and RTI2018-095094-B-C22 “CONSENT”.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Nunez-del-Prado, M., Salas, J., Alatrista-Salas, H., Maehara-Aliaga, Y., Megías, D. (2021). Are Sequential Patterns Shareable? Ensuring Individuals’ Privacy. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2021. Lecture Notes in Computer Science, vol 12898. Springer, Cham. https://doi.org/10.1007/978-3-030-85529-1_3
DOI: https://doi.org/10.1007/978-3-030-85529-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85528-4
Online ISBN: 978-3-030-85529-1
eBook Packages: Computer Science, Computer Science (R0)