Distributed Private Machine Learning

Abhradeep Thakurta
University of California, Santa Cruz (UC Santa Cruz)

Machine learning has fundamentally transformed the way we interact with many of the networked devices around us. However, its effectiveness also raises profound concerns about privacy, that is, about how we control the collection and use of our information. The tension between collecting users’ information to improve organizations' sales revenue (e.g., via targeted advertising) and the corresponding privacy concerns is growing at an alarming rate. The main focus of this talk will be on designing scalable distributed machine learning algorithms with local differential privacy guarantees.

The design of differentially private machine learning algorithms has primarily focused on balancing the trade-off between utility and privacy. In the distributed setting, where the data is spread across many devices (e.g., cellphones), I argue that the nature and degree of interaction these devices have with the aggregating server is as important as utility and privacy for a large-scale deployment. In this talk I will discuss learning and optimization algorithms under local differential privacy that use very little or no adaptive interaction while providing strong utility guarantees. Furthermore, I argue that for a large class of learning problems some degree of adaptive interaction is unavoidable. Finally, I discuss some of my ongoing work on differentially private on-device machine learning that allows effective personalization for users (e.g., personalized advertisements or movie recommendations).
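To make the notion of a non-interactive protocol under local differential privacy concrete, the sketch below uses classical randomized response on a single private bit per device. It is offered only as an illustration and is not one of the algorithms from the talk; the function names `randomize_bit` and `estimate_mean`, the privacy parameter `epsilon`, and the simulated population are all assumptions made for the example. Each device sends one randomized report and never hears back from the server, so there is no adaptive interaction.

```python
import math
import random


def randomize_bit(bit, epsilon, rng):
    """Device side: report the true bit with probability e^eps / (1 + e^eps),
    otherwise report the flipped bit. This satisfies epsilon-local DP."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def estimate_mean(reports, epsilon):
    """Server side: debias the average of the randomized reports.

    If mu is the true fraction of ones, then
    E[report] = p * mu + (1 - p) * (1 - mu) = (2p - 1) * mu + (1 - p),
    so solving for mu gives an unbiased estimate.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the simulation is repeatable
    epsilon = 1.0
    # Hypothetical population: 30% of devices hold the bit 1.
    true_bits = [1] * 30_000 + [0] * 70_000
    reports = [randomize_bit(b, epsilon, rng) for b in true_bits]
    print("estimated fraction of ones:", estimate_mean(reports, epsilon))
```

The server only ever sees the randomized reports, and the estimation error shrinks as the number of devices grows. Learning problems that need several rounds, with each round's query depending on earlier answers, are where the adaptive interaction discussed above comes into play.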

Back to Algorithmic Challenges in Protecting Privacy for Biomedical Data