Differentially private (DP) linear and logistic regression algorithms have been proposed for the private analysis of GWAS data. While DP provides an elegant mathematical framework for defining provable disclosure risk in the presence of arbitrary adversaries, numerous technical and practical subtleties limit its usability in statistical applications. We introduce the concept of the adjacent output space, whose structure is directly connected to the sensitivity analysis, and extend the optimal K-Norm mechanisms. We implement these mechanisms for linear and logistic regression and demonstrate the resulting improvements in data utility. We show that the choice of norm can result in a significant reduction in noise: by choosing one mechanism over another, the same statistical utility can be achieved using half the original privacy budget. These improvements result in higher data usability, more accurate results, and consequently better inference under DP. Time permitting, we will demonstrate this with GWAS data; otherwise, we will use another application of linear and penalized logistic regression.
(Jordan Awan and Aleksandra Slavkovic)
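The abstract's point that the choice of norm affects the amount of noise can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical d-dimensional statistic in which each coordinate can change by at most 2 between adjacent databases, so the l-infinity sensitivity is 2 and the l1 sensitivity is 2d. The function names are made up for this illustration, and the l-infinity sampler uses a standard construction (a Gamma-distributed radius times a uniform draw from the unit cube) that the abstract itself does not describe.

```python
import random


def l1_mechanism(stat, eps, delta_1, rng):
    """K-norm mechanism for the l1 norm: i.i.d. Laplace noise, scale delta_1/eps."""
    scale = delta_1 / eps
    # A Laplace draw is an exponential magnitude with a random sign.
    return [t + rng.choice((-1.0, 1.0)) * rng.expovariate(1.0 / scale) for t in stat]


def linf_mechanism(stat, eps, delta_inf, rng):
    """K-norm mechanism for the l-infinity norm.

    The noise density is proportional to exp(-(eps/delta_inf) * max_i |v_i|).
    It is sampled as radius * U, with radius ~ Gamma(d + 1, delta_inf/eps)
    and U uniform on the cube [-1, 1]^d.
    """
    d = len(stat)
    radius = rng.gammavariate(d + 1, delta_inf / eps)  # shape d+1, scale delta_inf/eps
    return [t + radius * rng.uniform(-1.0, 1.0) for t in stat]


def mean_abs_noise(mechanism, eps, sensitivity, d, trials, rng):
    """Monte Carlo estimate of the average absolute noise per coordinate."""
    zero = [0.0] * d
    total = 0.0
    for _ in range(trials):
        total += sum(abs(v) for v in mechanism(zero, eps, sensitivity, rng))
    return total / (trials * d)


if __name__ == "__main__":
    rng = random.Random(0)
    d, eps = 5, 1.0  # hypothetical dimension and privacy budget
    per_coordinate_change = 2.0  # assumed bound, not taken from the abstract
    noise_l1 = mean_abs_noise(l1_mechanism, eps, per_coordinate_change * d, d, 20000, rng)
    noise_linf = mean_abs_noise(linf_mechanism, eps, per_coordinate_change, d, 20000, rng)
    print("l1 mechanism, mean |noise| per coordinate:", noise_l1)
    print("l-infinity mechanism, mean |noise| per coordinate:", noise_linf)
```

Under these assumptions, the expected absolute noise per coordinate is 2d/eps for the l1 mechanism and (d + 1)/eps for the l-infinity mechanism, so the norm matched to the shape of the sensitivity set adds roughly half as much noise at the same privacy budget. The sensitivity sets that arise in regression are generally not cubes, so the sketch only conveys the flavor of the comparison.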
Back to Algorithmic Challenges in Protecting Privacy for Biomedical Data