Cost-Efficient and Privacy-Preserving Synthesis of Complex Sensitive Data

Publisher:
Association for Computing Machinery (ACM)
Publication Type:
Conference Proceeding
Citation:
Proceedings of the 2025 Australasian Computer Science Week, 2025, pp. 30-39
Issue Date:
2025-02-10
Abstract:
This paper introduces a method for generating differentially private synthetic datasets that uses Bayesian networks to preserve essential statistical properties and referential integrity across linked tables. To maintain privacy while minimizing computational overhead, we propose a decomposition scheme for additive Laplacian noise that significantly reduces the computational cost of the differential privacy framework. The resulting synthetic datasets mimic the statistical characteristics of the original datasets while safeguarding sensitive information against inference attacks. Through comprehensive evaluations, we demonstrate the practicality and effectiveness of our approach, which achieves a significant speedup in noise injection and thereby facilitates real-time data analysis. This contribution makes complex data analysis more broadly accessible, particularly for sectors that handle sensitive information, by improving data privacy and security measures. Our findings represent an advance in statistical methodologies and software and underscore the ongoing need for innovation in data processing techniques.
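For readers unfamiliar with additive Laplacian noise, the following is a minimal sketch of the standard textbook Laplace mechanism applied to the counts of a small contingency table, the kind of statistic from which a Bayesian network's conditional distributions are typically estimated. It is not the decomposition scheme proposed in the paper, which the abstract does not specify. The function names, the example table, and the sensitivity of 1 (which assumes neighbouring datasets differ by adding or removing one record) are illustrative assumptions.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variates with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(counts, epsilon, sensitivity=1.0, seed=None):
    """Add Laplace(sensitivity / epsilon) noise to every cell count.

    `sensitivity=1.0` is an assumption: it holds for a histogram when
    neighbouring datasets differ by the presence of a single record.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return {cell: n + laplace_noise(scale, rng) for cell, n in counts.items()}


def to_distribution(noisy):
    # Post-processing: clip negative cells to zero and renormalise.
    # Post-processing does not weaken the differential privacy guarantee.
    clipped = {cell: max(value, 0.0) for cell, value in noisy.items()}
    total = sum(clipped.values())
    if total == 0:
        return {cell: 1.0 / len(clipped) for cell in clipped}
    return {cell: value / total for cell, value in clipped.items()}


# Hypothetical 2x2 contingency table over two categorical attributes.
example_counts = {
    ("A", "x"): 40,
    ("A", "y"): 10,
    ("B", "x"): 5,
    ("B", "y"): 45,
}
example_noisy = noisy_counts(example_counts, epsilon=1.0, seed=0)
example_dist = to_distribution(example_noisy)
```

In this sketch every cell receives an independent noise draw, so the cost of noise injection grows with the number of cells. That per-cell cost is the kind of overhead a more efficient noise-generation scheme would aim to reduce.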