On the Geometry of Differential Privacy

Kunal Talwar
Microsoft Research

We consider the noise complexity of differentially private mechanisms in the setting where the user asks d linear queries f: \mathbb{R}^n \to \mathbb{R}^d non-adaptively. Here, the database is represented by a vector in \mathbb{R}^n and proximity between databases is measured in the \ell_1 metric.

We show that the noise complexity is determined by two geometric parameters of a convex body associated with the set of queries. We use this connection to give tight upper and lower bounds on the noise complexity for any d between 1 and n. We show that for d random linear queries, each of sensitivity 1, it is necessary and sufficient to add total Euclidean error \Theta(d\sqrt{\log(n)}) to achieve differential privacy. Assuming the truth of a deep conjecture from convex geometry, known as the Hyperplane conjecture, we can extend our results to arbitrary linear queries, giving nearly matching upper and lower bounds.

Spread over the d answers (that is, dividing the total Euclidean error by \sqrt{d}), our bound translates to error O(\sqrt{d\log(n)}) per answer. The best previous upper bound (the Laplace mechanism) gives a bound of O(d) per answer, while the best known lower bound was \Omega(\sqrt{d}). Unlike that earlier lower bound, ours is strong enough to separate differential privacy from approximate differential privacy, where an upper bound of O(\sqrt{d}) can be achieved.
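
As a rough illustration of the O(d)-per-answer baseline mentioned above, the following is a minimal sketch of Laplace-noise answering for d linear queries. It is not the mechanism from the talk. The privacy parameter epsilon, the function name laplace_mechanism, and the choice of random +-1 query entries are assumptions made for the example; the abstract does not specify them.

```python
import numpy as np

def laplace_mechanism(F, x, epsilon, rng):
    """Answer the d linear queries F @ x with independent Laplace noise.

    Hypothetical helper for illustration; epsilon is an assumed privacy parameter.
    """
    # l1 sensitivity of the map x -> F x: the largest l1 norm of a column of F,
    # i.e. how far the answer vector can move when x changes by 1 in l1 distance.
    sensitivity = np.abs(F).sum(axis=0).max()
    scale = sensitivity / epsilon
    # One independent Laplace draw per query answer.
    noise = rng.laplace(loc=0.0, scale=scale, size=F.shape[0])
    return F @ x + noise, scale

rng = np.random.default_rng(0)
d, n, epsilon = 8, 64, 1.0
# Assumed query family: random +-1 entries, so each single query has sensitivity 1.
F = rng.choice([-1.0, 1.0], size=(d, n))
x = rng.integers(0, 10, size=n).astype(float)  # toy database vector in R^n

answers, scale = laplace_mechanism(F, x, epsilon, rng)
print(f"noise scale per answer: {scale}")
```

With +-1 entries every column of F has \ell_1 norm exactly d, so the noise scale per answer is d/epsilon. This linear growth in d is the baseline that the O(\sqrt{d\log(n)}) bound improves on.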

This talk is based on joint work with Moritz Hardt.


Back to Statistical and Learning-Theoretic Challenges in Data Privacy