Reconciling Confidentiality Risk Measures from Statistics and Computer Science

Jerry Reiter
Duke University

Statisticians and computer scientists have independently developed a variety of metrics for assessing the risks of confidentiality breaches in public-use, individual-level data. In general, computer science approaches offer formal guarantees of confidentiality under specified assumptions, whereas statistical approaches provide straightforward interpretations tailored to the specific dataset of interest. In this talk, I contrast two approaches from computer science, namely differential privacy and epsilon privacy, with an approach from statistical science, namely the Duncan and Lambert approach to estimating probabilities of re-identification. I present some strengths and weaknesses of each and suggest ways to synthesize features of both to improve confidentiality protection.

Presentation (PowerPoint File)

Back to Statistical and Learning-Theoretic Challenges in Data Privacy