TY - GEN
T1 - Randomized bit vector
T2 - 27th ACM International Conference on Information and Knowledge Management, CIKM 2018
AU - Sun, Lin
AU - Zhang, Lan
AU - Ye, Xiaojun
N1 - Publisher Copyright: © 2018 Copyright held by the owner/author(s).
PY - 2018/10/17
Y1 - 2018/10/17
N2 - Many recent methods prevent privacy leakage in record linkage by encoding record-pair data into an anonymous space. Nevertheless, they perform poorly in some circumstances because of high computational complexity, weak privacy guarantees, or loss of data utility. In this paper, we propose distance-aware encoding mechanisms for comparing numerical values in the anonymous space. We first embed numerical values into Hamming space using a computationally inexpensive encoding algorithm based on randomized bit vectors. To provide rigorous privacy guarantees, we use randomized response based on differential privacy to preserve global indistinguishability of the original data, and Laplace noise via the Pufferfish mechanism to provide local indistinguishability. We also provide an approach for selecting the embedding and privacy-related parameters to improve data utility. Experiments on datasets drawn from different data distributions and application contexts show that our approaches can be used efficiently in privacy-preserving record linkage tasks compared with previous work and retain excellent performance even under very small privacy budgets.
KW - Data anonymization
KW - Differential privacy
KW - Privacy-preserving record linkage
KW - Pufferfish mechanism
UR - http://www.scopus.com/inward/record.url?scp=85058016484&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85058016484&partnerID=8YFLogxK
U2 - 10.1145/3269206.3271703
DO - 10.1145/3269206.3271703
M3 - Conference contribution
AN - SCOPUS:85058016484
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1263
EP - 1272
BT - CIKM 2018 - Proceedings of the 27th ACM International Conference on Information and Knowledge Management
A2 - Paton, Norman
A2 - Candan, Selcuk
A2 - Wang, Haixun
A2 - Allan, James
A2 - Agrawal, Rakesh
A2 - Labrinidis, Alexandros
A2 - Cuzzocrea, Alfredo
A2 - Zaki, Mohammed
A2 - Srivastava, Divesh
A2 - Broder, Andrei
A2 - Schuster, Assaf
PB - Association for Computing Machinery
Y2 - 22 October 2018 through 26 October 2018
ER -