ABSTRACT
Privacy-preserving document exchange among collaboration groups, both within an enterprise and across enterprises, requires techniques for sharing and searching access-controlled information through largely untrusted servers. In these settings, search systems need to provide confidentiality guarantees for shared information while offering IR properties comparable to those of ordinary search engines. Top-k retrieval is a standard IR technique that enables fast query execution on very large indexes and makes systems highly scalable. However, indexing access-controlled information for top-k retrieval is challenging because the term statistics used for ranking are themselves sensitive.
In this paper we present Zerber+R, a ranking model that allows privacy-preserving top-k retrieval from an outsourced inverted index. We propose a relevance score transformation function that makes the relevance scores of different terms indistinguishable, so that even when stored on an untrusted server they do not reveal information about the indexed data. Experiments on two real-world data sets show that Zerber+R uses bandwidth economically and offers retrieval properties comparable to those of an ordinary inverted index.
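The idea of a score transformation can be illustrated with a minimal sketch. The abstract does not specify the function Zerber+R actually uses, so the example below assumes one simple possibility: mapping each term's raw scores through that term's empirical cumulative distribution, which yields the same value set for every term while preserving the ranking within a term. The names `build_transform` and `top_k`, and all scores, are hypothetical.

```python
from bisect import bisect_right
import heapq


def build_transform(training_scores):
    """Return a function mapping a raw score to its empirical CDF value.

    Assumption for illustration only: this is not taken from the paper.
    """
    ordered = sorted(training_scores)
    n = len(ordered)

    def transform(score):
        # Fraction of the term's training scores that are <= score.
        return bisect_right(ordered, score) / n

    return transform


def top_k(postings, k):
    """Return the k postings with the highest transformed score."""
    return heapq.nlargest(k, postings, key=lambda p: p[1])


# Hypothetical raw relevance scores: a rare term with high scores and a
# common term with low scores. The raw ranges alone would tell an
# observer which posting list belongs to which kind of term.
raw = {
    "rare_term": [("d1", 9.1), ("d2", 7.4), ("d3", 8.2), ("d4", 6.9)],
    "common_term": [("d5", 0.4), ("d6", 0.1), ("d7", 0.3), ("d8", 0.2)],
}

transformed = {}
for term, postings in raw.items():
    f = build_transform([score for _, score in postings])
    transformed[term] = [(doc, f(score)) for doc, score in postings]

# After the transformation both lists carry the same set of values,
# yet the order of documents within each term is unchanged.
print(top_k(transformed["rare_term"], 2))
print(top_k(transformed["common_term"], 2))
```

In this toy setting the stored scores no longer distinguish the rare term from the common one, and top-k selection still returns the same documents as it would on the raw scores.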