A privacy‐preserving framework for ranked retrieval model
In this paper, we address privacy issues in the ranked retrieval model of web databases whose ranking functions take private attributes as part of their input. Many web databases keep private attributes invisible to the public and assume that an adversary cannot recover private attribute values from query results. However, prior research (Rahman et al. in Proc VLDB Endow 8:1106–17, 2015) studied rank-based inference of private attributes over web databases and found that an adversary can infer the private attribute values of a victim tuple by issuing well-designed queries through a top-k query interface. To address this issue, we propose a novel privacy-preserving framework that protects private attributes not only against inference attacks but also against arbitrary attack methods. In particular, we classify adversaries into two common categories: domain-ignorant and domain-expert adversaries. We then develop the equivalent set with virtual tuples (ESVT) for domain-ignorant adversaries and the equivalent set with true tuples (ESTT) for domain-expert adversaries; these two constructions are the primary parts of our framework. To evaluate performance, we define a measure of the privacy guarantee for private attributes and measures of utility loss. We prove that both ESVT and ESTT achieve the privacy guarantee, and we develop a heuristic algorithm for each that aims to minimize utility loss. We demonstrate the effectiveness of our techniques through theoretical analysis and extensive experiments over real-world data.