Statistical learning methods developed over the last decades have proved very useful in many applications. However, most algorithms were designed under the assumption that the data is generated independently of the algorithm. This assumption no longer holds in applications where data is generated or provided by (human) strategic agents. As more and more modern applications (in particular online applications) learn from data generated or provided by external parties who can act strategically, accounting for the data providers' incentives becomes increasingly important for designing learning algorithms that perform well in practice.
In this tutorial, we first present the adversarial learning methods developed for security applications since the late 2000s. These methods aim to expose the vulnerabilities of learning algorithms to adversarially generated data and to develop algorithms that are robust to worst-case attacks (under various assumptions about the adversary's capabilities and information). We then present more recent game-theoretic methods that model the utilities of both the adversary and the learner in order to derive algorithms that are optimal at Nash equilibrium. These methods are more flexible and less pessimistic than worst-case analysis, and they also reveal interesting insights into the learning problem.
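To illustrate the contrast between the two viewpoints (using notation of our own, not taken from the tutorial), worst-case robust learning can be written as a minimax problem, while the game-theoretic approach seeks a Nash equilibrium of a general-sum game in which the players' utilities need not be exactly opposed:

```latex
% Worst-case (robust) learning: the learner picks parameters \theta to
% minimize loss L under the most damaging admissible attack A on the
% data D (a minimax, i.e. zero-sum, formulation).
\min_{\theta} \; \max_{A \in \mathcal{A}} \; L\big(\theta, A(D)\big)

% Game-theoretic view: learner and adversary each maximize their own
% utility (u_\ell and u_a, not necessarily opposite). A pair
% (\theta^*, a^*) is a Nash equilibrium if neither player can gain by
% deviating unilaterally:
u_\ell(\theta^*, a^*) \ge u_\ell(\theta, a^*) \quad \forall \theta,
\qquad
u_a(\theta^*, a^*) \ge u_a(\theta^*, a) \quad \forall a.
```

When $u_a = -u_\ell$, the equilibrium formulation recovers the minimax one; modeling a less-than-fully-hostile adversary is what makes the game-theoretic analysis less pessimistic.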
Finally, time permitting, we will present recent work on learning from data provided by strategic agents. This line of work, originally motivated by applications where agents obfuscate their data before revealing it in order to protect their privacy, develops general learning methods that incentivize agents to reveal high-quality data so as to improve the learned model.
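A minimal sketch of this setting, with hypothetical notation of our own: each agent $i$ holds a data point $y_i$ and reveals a noisy version of it, choosing the noise level to trade privacy against the quality of the model trained on everyone's reports; the learner's design problem is to choose an estimator under which revealing precise data is a best response:

```latex
% Agent i reports \tilde{y}_i = y_i + \varepsilon_i with
% \varepsilon_i \sim \mathcal{N}(0, \sigma_i^2), choosing the noise
% level \sigma_i to balance a privacy gain p_i against the cost of a
% less accurate estimator \hat{\theta}:
\sigma_i^* \in \arg\max_{\sigma_i \ge 0} \;
  p_i(\sigma_i)
  \;-\;
  c_i\,\mathrm{MSE}\big(\hat{\theta}(\sigma_1, \dots, \sigma_n)\big).

% The learner designs \hat{\theta} (e.g. a weighted aggregate of the
% reports) so that, at equilibrium, the chosen \sigma_i^* are small
% and the learned model is accurate.
```

The coupling through $\mathrm{MSE}(\hat{\theta}(\sigma_1,\dots,\sigma_n))$ is what makes this a game among the agents rather than $n$ independent decisions.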