AI Chat Paper
Note: Please note that the following content is generated by AMiner AI. SciOpen does not take any responsibility related to this content.
{{lang === 'zh_CN' ? '文章概述' : 'Summary'}}
{{lang === 'en_US' ? '中' : 'Eng'}}
Chat more with AI
PDF (4.9 MB)
Collect
Submit Manuscript AI Chat Paper
Show Outline
Outline
Show full outline
Hide outline
Outline
Show full outline
Hide outline
Open Access

PrivBV: Distance-Aware Encoding for Distributed Data with Local Differential Privacy

School of Software, Tsinghua University, Beijing 100084, China
Show Author Information

Abstract

Recently, local differential privacy (LDP) has been used as the de facto standard for data sharing and analyzing with high-level privacy guarantees. Existing LDP-based mechanisms mainly focus on learning statistical information about the entire population from sensitive data. For the first time in the literature, we use LDP for distance estimation between distributed data to support more complicated data analysis. Specifically, we propose PrivBV—a locally differentially private bit vector mechanism with a distance-aware property in the anonymized space. We also present an optimization strategy for reducing privacy leakage in the high-dimensional space. The distance-aware property of PrivBV brings new insights into complicated data analysis in distributed environments. As study cases, we show the feasibility of applying PrivBV to privacy-preserving record linkage and non-interactive clustering. Theoretical analysis and experimental results demonstrate the effectiveness of the proposed scheme.

References

[1]
W. Zhang, Z. Li, and X. Chen, Quality-aware user recruitment based on federated learning in mobile crowd sensing, Tsinghua Science and Technology, vol. 26, no. 6, pp. 869-877, 2021.
[2]
J. Y. Hua, G. Yue, and S. Zhong, Differentially private publication of general time-serial trajectory data, in 2015 IEEE Conf. Computer Communications, Hong Kong, China, 2015, pp. 549-557.
[3]
D. Vatsalan and P. Christen, Privacy-preserving matching of similar patients, J. Biomed. Inform., vol. 59, pp. 285-298, 2016.
[4]
D. Karapiperis, A. Gkoulalas-Divanis, and V. S. Verykios, FEDERAL: A framework for distance-aware privacy-preserving record linkage, IEEE Trans. Knowl. Data Eng., vol. 30, no. 2, pp. 292-304, 2018.
[5]
D. Karapiperis, A. Gkoulalas-Divanis, and V. S. Verykios, Distance-aware encoding of numerical values for privacy-preserving record linkage, in 2017 IEEE 33rd Int. Conf. Data Engineering, San Diego, CA, USA, 2017, pp. 135-138.
[6]
Y. Khazbak, J. Fan, S. Zhu and G. Cao, Preserving personalized location privacy in ride-hailing service, Tsinghua Science and Technology, vol. 25, no. 6, pp. 743-757, 2020.
[7]
P. Kairouz, S. Oh, and P. Viswanath, Extremal mechanisms for local differential privacy, J. Mach. Learning Res., vol. 17, no. 1, pp. 492-542, 2016.
[8]
M. Fanaeepour and B. I. P. Rubinstein, Histogramming privately ever after: Differentially-private data-dependent error bound optimisation, in 2018 IEEE 34th Int. Conf. Data Engineering, Paris, France, 2018, pp. 1204-1207.
[9]
R. Bassily and A. Smith, Local, private, efficient protocols for succinct histograms, in Proc. 47th Annu. ACM Symp. Theory of Computing, New York, NY, USA, 2015, pp. 127-135.
[10]
J. Hsu, S. Khanna, and A. Roth, Distributed private heavy hitters, in Int. Colloquium on Automata, Languages, and Programming, A. Czumaj, K. Mehlhorn, A. Pitts, and R. Wattenhofer, eds. Warwick, UK: Springer, 2012, pp. 461-472.
[11]
G. Cormode, T. Kulkarni, and D. Srivastava, Marginal release under local differential privacy, in Proc. 2018 Int. Conf. Management of Data, New York, NY, USA, 2018, pp. 131-146.
[12]
B. L. Ding, J. Kulkarni, and S. Yekhanin, Collecting telemetry data privately, in Advances in Neural Information Proc. Systems, Long Beach, CA, USA, 2017, pp. 3571-3580.
[13]
C. Dwork and A. Roth, The algorithmic foundations of differential privacy, Found. Trends Theoret. Comput. Sci., vol. 9, no. 3, pp. 211-407, 2014.
[14]
N. Wang, X. K. Xiao, Y. Yang, J. Zhao, S. C. Hui, H. Shin, J. Shin, and G. Yu, Collecting and Analyzing Multidimensional Data with Local Differential Privacy, in 2019 IEEE 35th Int. Conf. Data Engineering (ICDE), Macao, China, 2019, pp. 638-649.
[15]
D. Vatsalan, P. Christen, and V. S. Verykios, A taxonomy of privacy-preserving record linkage techniques, Informat. Syst., vol. 38, no. 6, pp. 946-969, 2013.
[16]
T. H. Wang, J. Blocki, N. H. Li, and S. Jha, Locally differentially private protocols for frequency estimation, in Proc. 26th USENIX Conf. Security Symp., Berkeley, CA, USA, 2017, pp. 729-745.
[17]
C. Dwork, Differential privacy, in Proc. 33rd Int. Conf. Automata, Languages and Programming, Venice, Italy, 2006, pp. 1-12.
[18]
B. Ding, H. Nori, P. Li, and J. Allen, Comparing population means under local differential privacy: With significance and power, in Proc. 32nd AAAI Conf. Artificial Intelligence, New Orleans, LA, USA, 2018, pp. 26-33.
[19]
T. H. Wang, N. H. Li, and S. Jha, Locally differentially private frequent itemset mining, in Proc. 2018 IEEE Symp. Security and Privacy, San Francisco, CA, USA, 2018, pp. 127-143.
[20]
Ú. Erlingsson, V. Pihur, and A. Korolova, RAPPOR: randomized aggregatable privacy-preserving ordinal response, in Proc. 2014 ACM SIGSAC Conf. Computer and Communications Security, Scottsdale, AZ, USA, 2014, pp. 1054-1067.
[21]
A. Lancichinetti, S. Fortunato, and J. Kertész, Detecting the overlapping and hierarchical community structure in complex networks, New J. Phys., vol. 11, no. 3, pp. 033015, 2009.
[22]
J. MacQueen, Some methods for classification and analysis of multivariate observations, in Proc. 5th Berkeley Symp. Mathematical Statistics and Probability, Berkeley, CA, USA, 1967, pp. 281-297.
[23]
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al., Scikit-learn: machine learning in python, J. Mach. Learn. Res., vol. 12, pp. 2825-2830, 2011.
[24]
S. R. M. Oliveira and O. R. Zaiane, Privacy preserving clustering by data transformation, J. Inf. Data Manag., vol. 1, no. 1, pp. 37-37, 2010.
[25]
K. Liu, H. Kargupta, and J. Ryan, Random projection-based multiplicative data perturbation for privacy preserving distributed data mining, IEEE Trans. Knowl. Data Eng., vol. 18, no. 1, pp. 92-106, 2006.
Tsinghua Science and Technology
Pages 412-421
Cite this article:
Sun L, Ping G, Ye X. PrivBV: Distance-Aware Encoding for Distributed Data with Local Differential Privacy. Tsinghua Science and Technology, 2022, 27(2): 412-421. https://doi.org/10.26599/TST.2021.9010027

691

Views

60

Downloads

12

Crossref

12

Web of Science

12

Scopus

0

CSCD

Altmetrics

Received: 26 October 2020
Accepted: 24 March 2021
Published: 29 September 2021
© The author(s) 2022

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).

Return