
Packing Privacy Budget Efficiently

Pierre Tholoniat, Kelly Kostopoulou, Mosharaf Chowdhury, ... +3, Junfeng Yang
Dec 2022
Abstract
Machine learning (ML) models can leak information about users, and differential privacy (DP) provides a rigorous way to bound that leakage under a given budget. This DP budget can be regarded as a new type of compute resource in workloads of multiple ML models training on user data. Once it is used, the DP budget is forever consumed. Therefore, it is crucial to allocate it most efficiently to train as many models as possible. This paper presents the scheduler for privacy that optimizes for efficiency. We formulate privacy scheduling as a new type of multidimensional knapsack problem, called privacy knapsack, which maximizes DP budget efficiency. We show that privacy knapsack is NP-hard, hence practical algorithms are necessarily approximate. We develop an approximation algorithm for privacy knapsack, DPK, and evaluate it on microbenchmarks and on a new, synthetic private-ML workload we developed from the Alibaba ML cluster trace. We show that DPK: (1) often approaches the efficiency-optimal schedule, (2) consistently schedules more tasks compared to a state-of-the-art privacy scheduling algorithm that focused on fairness (1.3-1.7x in Alibaba, 1.0-2.6x in microbenchmarks), but (3) sacrifices some level of fairness for efficiency. Therefore, using DPK, DP ML operators should be able to train more models on the same amount of user data while offering the same privacy guarantee to their users.
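To make the knapsack framing concrete, here is a minimal sketch of a generic greedy heuristic for a multidimensional knapsack over privacy budget. It is not DPK and is not taken from the paper. It assumes a simplified model in which each data block has one scalar budget and each task requests a fixed amount from some blocks. The names `Task`, `greedy_pack`, `demand`, and `weight` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Task:
    """A training task (hypothetical model): budget requested per data block."""
    name: str
    demand: Dict[str, float]  # block id -> budget requested (assumed scalar)
    weight: float = 1.0       # value of scheduling the task (assumed)


def greedy_pack(tasks: List[Task],
                capacity: Dict[str, float]) -> Tuple[List[str], Dict[str, float]]:
    """Admit tasks in decreasing order of weight per unit of normalized demand.

    A task is admitted only if every block it touches still has enough
    budget; consumed budget is never returned.
    """
    remaining = dict(capacity)

    def efficiency(task: Task) -> float:
        # Normalized demand: fraction of each block's total budget requested.
        cost = sum(eps / capacity[b] for b, eps in task.demand.items())
        return task.weight / cost if cost > 0 else float("inf")

    scheduled = []
    for task in sorted(tasks, key=efficiency, reverse=True):
        if all(remaining[b] >= eps for b, eps in task.demand.items()):
            for b, eps in task.demand.items():
                remaining[b] -= eps
            scheduled.append(task.name)
    return scheduled, remaining


# Toy workload: two blocks, four tasks of equal weight.
example_tasks = [
    Task("A", {"b1": 0.6, "b2": 0.6}),
    Task("B", {"b1": 0.5}),
    Task("C", {"b2": 0.5}),
    Task("D", {"b1": 0.5}),
]
print(greedy_pack(example_tasks, {"b1": 1.0, "b2": 1.0}))
```

In this toy workload the heuristic admits the three small tasks B, C, and D and rejects A, which asks for budget on both blocks. Admitting A first would exhaust enough of `b1` that only one further task (C) could fit, so the example illustrates why packing order affects how many tasks get scheduled.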