Abstract
Mining of spatial data is an enabling technology for mobile services, Internet-connected cars and the Internet of Things. But the very distinctiveness of spatial data that drives utility can cost user privacy. Past work on differentially private release has focused on points and trajectories. In this work, we continue the tradition of privacy-preserving spatial analytics, focusing not on point or path data, but on planar spatial regions. Such data represent the area of a user’s most frequent visitation, such as “around home and nearby shops”. Specifically, we consider the differentially private release of data structures that support range queries for counting users’ spatial regions. Counting planar regions leads to unique challenges not faced in existing work. A user’s spatial region that straddles multiple data structure cells can lead to duplicate counting at query time. We provably avoid this pitfall by leveraging the Euler characteristic for the first time with differential privacy. To address the increased sensitivity of range queries to spatial region data, we calibrate privacy-preserving noise using bounded user region size and a constrained inference that uses robust least absolute deviations. Our novel constrained inference reduces noise and promotes covertness by (privately) imposing consistency. We provide a full end-to-end theoretical analysis of both differential privacy and high-probability utility for our approach using concentration bounds. A comprehensive experimental study on several real-world datasets establishes practical validity.
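The duplicate-counting problem and its Euler-characteristic remedy can be illustrated with a minimal sketch. It is not the paper's implementation: the class `EulerHistogram`, its method names, the unit-cell grid, the restriction to axis-aligned rectangular regions whose boundaries do not lie on grid lines, and the Laplace noise scale are all assumptions made for illustration. The structure keeps one count per cell (face), interior edge and interior vertex, and answers a range query as faces minus edges plus vertices, so a rectangle spanning several cells contributes exactly one. The noise step assumes each region spans at most `max_cells_per_side` cells per side, which bounds the number of entries one user can change by `(2k - 1)^2`. The constrained inference step is not shown.

```python
import math
import random


class EulerHistogram:
    """Illustrative Euler histogram over an n x n grid of unit cells.

    Entry (2i, 2j) is the face of cell (i, j). Entries with exactly one odd
    index are interior edges; entries with two odd indices are interior
    vertices.
    """

    def __init__(self, n):
        self.n = n
        self.size = 2 * n - 1
        self.counts = [[0.0] * self.size for _ in range(self.size)]

    def _cell_range(self, lo, hi):
        # Cells touched by the interval [lo, hi], clamped to the grid.
        first = max(0, int(math.floor(lo)))
        last = min(self.n - 1, int(math.floor(hi)))
        return first, last

    def add_region(self, x0, y0, x1, y1):
        # Increment every face, edge and vertex the rectangle covers.
        i0, i1 = self._cell_range(x0, x1)
        j0, j1 = self._cell_range(y0, y1)
        for u in range(2 * i0, 2 * i1 + 1):
            for v in range(2 * j0, 2 * j1 + 1):
                self.counts[u][v] += 1

    def range_count(self, a0, b0, a1, b1):
        # Faces - edges + vertices over cells a0..a1 by b0..b1 (inclusive).
        total = 0.0
        for u in range(2 * a0, 2 * a1 + 1):
            for v in range(2 * b0, 2 * b1 + 1):
                sign = 1 if (u + v) % 2 == 0 else -1
                total += sign * self.counts[u][v]
        return total

    def naive_count(self, a0, b0, a1, b1):
        # Summing cell counts only: regions spanning cells are over-counted.
        return sum(self.counts[2 * i][2 * j]
                   for i in range(a0, a1 + 1)
                   for j in range(b0, b1 + 1))

    def privatize(self, epsilon, max_cells_per_side, rng):
        # Assumed calibration: one region of at most k cells per side changes
        # at most (2k - 1)^2 entries by 1, so that is the L1 sensitivity.
        k = max_cells_per_side
        scale = (2 * k - 1) ** 2 / epsilon
        noisy = EulerHistogram(self.n)
        for u in range(self.size):
            for v in range(self.size):
                # Difference of two exponentials is Laplace with this scale.
                noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
                noisy.counts[u][v] = self.counts[u][v] + noise
        return noisy


hist = EulerHistogram(4)
hist.add_region(0.5, 0.5, 2.5, 1.5)  # spans 3 x 2 cells
hist.add_region(1.2, 1.2, 1.8, 1.8)  # inside one cell
hist.add_region(3.1, 3.1, 3.9, 3.9)  # inside one cell, far corner
exact = hist.range_count(0, 0, 1, 1)
naive = hist.naive_count(0, 0, 1, 1)
noisy_hist = hist.privatize(epsilon=1.0, max_cells_per_side=3,
                            rng=random.Random(0))
```

In this toy setup two regions intersect the query over cells 0..1 by 0..1, so `exact` is 2 while the cell-only sum `naive` is 5. Queries on `noisy_hist` use the same `range_count` formula; because noise from many entries accumulates in the alternating sum, a post-processing step such as the constrained inference described above is a natural complement.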
Notes
We use body and region interchangeably to refer to a user’s spatial area; the term body helps distinguish users’ regions from query regions.
In the literature, the terms multiple, double or distinct counting are used interchangeably. We suggest the term “duplicate” as it conveys that objects are over-counted.
This paper extends our ICDM’2016 conference paper [18].
References
Ács G, Castelluccia C, Chen R (2012) Differentially private histogram publishing through lossy compression. In: ICDM’12, pp 1–10
Andrés ME, Bordenabe NE, Chatzikokolakis K, Palamidessi C (2013) Geo-indistinguishability: differential privacy for location-based systems. In: CCS’13, pp 901–914
Barak B, Chaudhuri K, Dwork C, Kale S, McSherry F, Talwar K (2007) Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In: Proceedings of the twenty-sixth ACM SIGACT-SIGMOD-SIGART symposium on principles of database systems, June 11–13, 2007, Beijing, China, pp 273–282
Beigel R, Tanin E (1998) The geometry of browsing. In: LATIN ’98: theoretical informatics, third Latin American symposium, pp 331–340
Beresford AR, Stajano F (2003) Location privacy in pervasive computing. IEEE Pervasive Comput 2(1):46–55
Braz F, Orlando S, Orsini R, Raffaetà A, Roncato A, Silvestri C (2007) Approximate aggregations in trajectory data warehouses. In: Proceedings of the 23rd international conference on data engineering workshops, ICDE 2007, pp 536–545
Chawla S, Dwork C, McSherry F, Talwar K (2005) On the utility of privacy-preserving histograms. In: Proceedings of the 21st conference on uncertainty in artificial intelligence
Chen R, Fung BCM, Desai BC, Sossou NM (2012) Differentially private transit data publication: a case study on the Montreal transportation system. In: KDD’12, pp 213–221
Chow CY, Mokbel MF (2011a) Privacy of spatial trajectories. In: Zheng Y, Zhou X (eds) Computing with spatial trajectories. Springer, New York, pp 109–141
Chow CY, Mokbel MF (2011b) Trajectory privacy in location-based services and data publication. SIGKDD Explor 13(1):19–29
Cormode G, Procopiuc CM, Srivastava D, Shen E, Yu T (2012) Differentially private spatial decompositions. In: ICDE’12, pp 20–31
Dielman TE (2005) Least absolute value regression: recent contributions. J Stat Comput Simul 75(4):263–286
Dwork C (2008) Differential privacy: a survey of results. In: Theory and applications of models of computation, 5th international conference, TAMC, pp 1–19
Dwork C (2011) A firm foundation for private data analysis. Commun ACM 54(1):86–95
Dwork C, McSherry F, Nissim K, Smith A (2006) Calibrating noise to sensitivity in private data analysis. In: Theory of cryptography, third theory of cryptography conference, TCC, vol 3876. Lecture notes in computer science, Springer, Berlin, pp 265–284
Fan L, Xiong L, Sunderam VS (2013) Differentially private multi-dimensional time series release for traffic monitoring. In: IFIP’13. Proceedings, pp 33–48
Fanaeepour M, Kulik L, Tanin E, Rubinstein BIP (2015) The CASE histogram: privacy-aware processing of trajectory data using aggregates. GeoInformatica 19(4):747–798
Fanaeepour M, Rubinstein BIP (2016) Beyond points and paths: counting private bodies. In: ICDM, pp 131–140
Ghinita G (2013) Privacy for location-based services. In: Bertino E, Sandhu R (eds) Synthesis lectures on information security, privacy, and trust. Morgan & Claypool Publishers, San Rafael
Gruteser M, Liu X (2004) Protecting privacy in continuous location-tracking applications. IEEE Secur Priv 2(2):28–34
Gurobi Optimization, Inc. (2015) Gurobi optimizer reference manual. http://www.gurobi.com
Hay M, Rastogi V, Miklau G, Suciu D (2010) Boosting the accuracy of differentially private histograms through consistency. PVLDB 3(1):1021–1032
He X, Cormode G, Machanavajjhala A, Procopiuc CM, Srivastava D (2015) DPT: differentially private trajectory synthesis using hierarchical reference systems. PVLDB 8(11):1154–1165
Hsu J, Gaboardi M, Haeberlen A, Khanna S, Narayan A, Pierce BC, Roth A (2014) Differential privacy: an economic method for choosing epsilon. In: IEEE 27th computer security foundations symposium, CSF 2014, pp 398–410
Iliffe J, Lott R (2008) Datums and map projections for remote sensing, GIS and surveying. Whittles Publishing. https://books.google.com.au/books?id=u_4RAQAAIAAJ
Inan A, Kantarcioglu M, Ghinita G, Bertino E (2010) Private record matching using differential privacy. In: EDBT’10, pp 123–134
Jones E, Oliphant T, Peterson P et al (2001) SciPy: open source scientific tools for Python. http://www.scipy.org/
Karmarkar N (1984) A new polynomial-time algorithm for linear programming. In: STOC’84, pp 302–311
Kifer D, Lin B (2010) Towards an axiomatization of statistical privacy and utility. In: Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, PODS, pp 147–158
Krumm J (2007) Inference attacks on location tracks. In: 5th international conference on pervasive computing, PERVASIVE’07, pp 127–143
Krumm J (2009) A survey of computational location privacy. Pers Ubiquitous Comput 13(6):391–399
Leonardi L, Orlando S, Raffaetà A, Roncato A, Silvestri C, Andrienko GL, Andrienko NV (2014) A general framework for trajectory data warehousing and visual OLAP. GeoInformatica 18(2):273–312
Li C, Hay M, Miklau G, Wang Y (2014) A data- and workload-aware query answering algorithm for range queries under differential privacy. PVLDB 7(5):341–352
López IFV, Snodgrass RT, Moon B (2005) Spatiotemporal aggregate computation: a survey. IEEE Trans Knowl Data Eng TKDE 17(2):271–286
Marketos G, Frentzos E, Ntoutsi I, Pelekis N, Raffaetà A, Theodoridis Y (2008) Building real-world trajectory warehouses. In: Seventh ACM international workshop on data engineering for wireless and mobile access, Mobide 2008, pp 8–15
Mir DJ, Isaacman S, Cáceres R, Martonosi M, Wright RN (2013) DP-WHERE: differentially private modeling of human mobility. In: Proceedings of the 2013 IEEE international conference on big data, pp 580–588
Papadias D, Kalnis P, Zhang J, Tao Y (2001) Efficient OLAP operations in spatial data warehouses. In: 7th international symposium on advances in spatial and temporal databases, SSTD’01, pp 443–459
Papadias D, Tao Y, Kalnis P, Zhang J (2002) Indexing spatio-temporal data warehouses. In: Proceedings of the 18th international conference on data engineering, ICDE’02, pp 166–175
Piorkowski M, Sarafijanovic-Djukic N, Grossglauser M (2009) A parsimonious model of mobile partitioned networks with clustering. In: COMSNETS. http://www.comsnets.org
Primault V, Mokhtar SB, Lauradoux C, Brunie L (2014) Differentially private location privacy in practice. CoRR abs/1410.7744
Qardaji WH, Yang W, Li N (2013) Differentially private grids for geospatial data. In: ICDE’13, pp 757–768
Rubinstein BIP, Bartlett PL, Huang L, Taft N (2012) Learning in a large function space: privacy-preserving mechanisms for SVM learning. J Priv Confid 4(1):65–100
Sun C, Agrawal D, El Abbadi A (2002a) Exploring spatial datasets with histograms. In: Proceedings of the 18th international conference on data engineering, ICDE, pp 93–102
Sun C, Agrawal D, El Abbadi A (2002b) Selectivity estimation for spatial joins with geometric selections. In: EDBT’02, pp 609–626
Sun C, Bandi N, Agrawal D, El Abbadi A (2006) Exploring spatial datasets with histograms. Distrib Parallel Databases 20(1):57–88
Tao Y, Kollios G, Considine J, Li F, Papadias D (2004) Spatio-temporal aggregation using sketches. In: Proceedings of the 20th international conference on data engineering, ICDE 2004, pp 214–225
Tao Y, Papadias D, Zhang J (2002) Aggregate processing of planar points. In: 8th international conference on extending database technology, EDBT 2002, pp 682–700
Timko I, Böhlen MH, Gamper J (2009) Sequenced spatio-temporal aggregation in road networks. In: EDBT 2009, 12th international conference on extending database technology, pp 48–59
To H, Ghinita G, Shahabi C (2014) A framework for protecting worker location privacy in spatial crowdsourcing. PVLDB 7(10):919–930
Trudeau R (1993) Introduction to graph theory. Dover books on mathematics series. Dover Publications, New York
Wang M, Zhang X, Meng X (2013) DiffR-tree: a differentially private spatial index for OLAP query. In: WAIM’13, pp 705–716
Xie H, Tanin E, Kulik L (2007) Distributed histograms for processing aggregate data from moving objects. In: 8th international conference on mobile data management (MDM 2007), pp 152–157
Xie H, Tanin E, Kulik L, Scheuermann P, Trajcevski G, Fanaeepour M (2014) Euler histogram tree: a spatial data structure for aggregate range queries on vehicle trajectories. In: 7th ACM SIGSPATIAL international workshop on computational transportation science, IWCTS 2014
Yuan J, Zheng Y, Zhang C, Xie W, Xie X, Sun G, Huang Y (2010) T-drive: driving directions based on taxi trajectories. In: 18th ACM SIGSPATIAL international symposium on advances in geographic information systems, ACM-GIS 2010, pp 99–108
Zhang J, Ghinita G, Chow C (2014) Differentially private location recommendations in geosocial networks. In: MDM’14, pp 59–68
Zheng Y, Xie X, Ma W (2010) Geolife: a collaborative social networking service among user, location and trajectory. IEEE Data Eng Bull 33(2):32–39
Acknowledgements
This work was supported in part by Australian Research Council DECRA Grant DE160100584.
Cite this article
Fanaeepour, M., Rubinstein, B.I.P. Differentially private counting of users’ spatial regions. Knowl Inf Syst 54, 5–32 (2018). https://doi.org/10.1007/s10115-017-1113-6
DOI: https://doi.org/10.1007/s10115-017-1113-6