Are Sequential Patterns Shareable? Ensuring Individuals’ Privacy

  • Conference paper
Modeling Decisions for Artificial Intelligence (MDAI 2021)

Abstract

Individuals’ actions, such as smartphone usage, internet shopping, bank card transactions, and watched movies, can all be represented as sequences. These sequences contain meaningful frequent temporal patterns that scientists and companies study to understand different phenomena and business processes. It is therefore tempting to believe that patterns are dissociated from individuals’ identities and safe to share for studies. Nevertheless, we show, through unicity tests, that the combination of different patterns can act as a quasi-identifier, causing a privacy breach and revealing private patterns. To solve this problem, we propose to use \(\epsilon\)-differential privacy over the extracted patterns to add uncertainty to the association between individuals and their true patterns. Our results show that it is possible to significantly reduce the privacy risk while preserving data utility.
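The two ideas in the abstract, unicity testing and adding uncertainty to the individual-pattern association, can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the abstract does not specify. It assumes a sampling-based unicity measure (the share of individuals singled out by k of their own patterns) and uses per-pattern randomized response as one possible \(\epsilon\)-differentially private mechanism. The names `unicity`, `randomized_response`, and the toy data are hypothetical.

```python
import math
import random


def unicity(patterns_by_user, k, rng):
    """Share of users singled out by k patterns drawn from their own set.

    Assumption: a user counts as unique when no other user's pattern set
    contains all k sampled patterns. Users with fewer than k patterns
    are skipped.
    """
    eligible = [u for u, pats in patterns_by_user.items() if len(pats) >= k]
    if not eligible:
        return 0.0
    unique = 0
    for user in eligible:
        # Patterns the adversary is assumed to know about this user.
        known = set(rng.sample(sorted(patterns_by_user[user]), k))
        matches = sum(1 for pats in patterns_by_user.values() if known <= pats)
        if matches == 1:  # only the user themself matches
            unique += 1
    return unique / len(eligible)


def randomized_response(patterns_by_user, universe, epsilon, rng):
    """Flip each user-pattern membership bit with probability 1 / (1 + e^epsilon).

    Each reported bit satisfies epsilon-differential privacy on its own;
    under basic composition, a user's whole vector spends epsilon per pattern.
    """
    keep = 1.0 / (1.0 + math.exp(-epsilon))  # equals e^eps / (1 + e^eps)
    noisy = {}
    for user, patterns in patterns_by_user.items():
        # Report the true bit with probability `keep`, the flipped bit otherwise.
        noisy[user] = {
            p for p in sorted(universe)
            if (p in patterns) == (rng.random() < keep)
        }
    return noisy


# Hypothetical toy data: each user maps to the frequent patterns they support.
toy = {
    "u1": {"food>transport", "food>leisure"},
    "u2": {"food>transport", "food>leisure"},
    "u3": {"health>food", "leisure>transport"},
}
universe = set().union(*toy.values())
rng = random.Random(0)
print(unicity(toy, 2, rng))  # only u3 is unique, so 1 of 3 users
print(unicity(randomized_response(toy, universe, 1.0, rng), 2, rng))
```

Smaller values of `epsilon` flip more bits, which lowers the confidence of any re-identification but also distorts the shared patterns. This is the privacy-utility trade-off the abstract refers to.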

Notes

  1. https://unstats.un.org/unsd/iiss/Classification-of-Individual-Consumption-According-to-Purpose-COICOP.ashx

Acknowledgements

This research was partly supported by the Spanish Government under projects RTI2018-095094-B-C21 and RTI2018-095094-B-C22 “CONSENT”.

Author information

Corresponding author

Correspondence to Miguel Nunez-del-Prado.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Nunez-del-Prado, M., Salas, J., Alatrista-Salas, H., Maehara-Aliaga, Y., Megías, D. (2021). Are Sequential Patterns Shareable? Ensuring Individuals’ Privacy. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2021. Lecture Notes in Computer Science, vol 12898. Springer, Cham. https://doi.org/10.1007/978-3-030-85529-1_3

  • DOI: https://doi.org/10.1007/978-3-030-85529-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85528-4

  • Online ISBN: 978-3-030-85529-1

  • eBook Packages: Computer Science (R0)
