Doubly-sequential experimentation

Aaditya Ramdas
Carnegie Mellon University
Machine Learning and Statistics

Very often in the sciences, a single-investigator lab or a multi-investigator research facility runs a sequence of experiments over an extended, possibly indefinite, period of time (say months or years). Frequently, samples are obtained one at a time, so each of these experiments is itself sequential (possibly testing one or more hypotheses, or estimating the effect sizes of one or a few treatments). We call this situation doubly-sequential experimentation, since it involves a sequence of sequential experiments. Largely, we have been myopic, thinking about one experiment at a time, but we suffer selection bias (or publication bias) when we choose to report or act upon only a subset of these experiments. We will discuss two problems that this setup creates for reproducibility from a statistical perspective: incorrect inference due to peeking at the data within one experiment (the inner sequential process), and an online version of the multiple testing problem across experiments (the outer sequential process). Our discussion will consider new notions of error, both within and across experiments, that are particularly suitable to this setting, along with new theoretical advances and algorithms that provably control them. Specifically, we will briefly highlight our recent work on constructing a sequence of confidence intervals that is valid throughout an experiment, including at data-dependent stopping times, and briefly describe new algorithms that control the false coverage rate of selectively constructed intervals (or the false discovery rate of rejected hypotheses) in a fully online manner. This is an ongoing line of research that raises as many questions as it answers.
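To make the two layers concrete, here is a minimal Python sketch. It is not the method presented in the talk: both routines are simple textbook stand-ins chosen for illustration. The inner routine assumes independent observations in [0, 1] and builds a crude time-uniform confidence sequence by applying Hoeffding's inequality with a union bound over time. The outer routine follows the LOND rule of Javanmard and Montanari, which is known to control the false discovery rate when the p-values are independent. The function names and the choice of the weights 6 / (pi^2 t^2) are assumptions made for this example.

```python
import math


def hoeffding_confidence_sequence(xs, alpha=0.05):
    """Inner process: intervals for the mean of independent [0, 1]-valued
    observations that hold simultaneously for all t with probability
    at least 1 - alpha (Hoeffding plus a union bound over time)."""
    intervals, total = [], 0.0
    for t, x in enumerate(xs, start=1):
        total += x
        # Error budget spent at time t; these budgets sum to alpha over all t.
        alpha_t = alpha * 6.0 / (math.pi ** 2 * t ** 2)
        radius = math.sqrt(math.log(2.0 / alpha_t) / (2.0 * t))
        mean = total / t
        intervals.append((max(0.0, mean - radius), min(1.0, mean + radius)))
    return intervals


def lond(p_values, alpha=0.05):
    """Outer process: LOND-style online testing. The level for test t is
    alpha * gamma_t * (number of earlier rejections + 1), with the
    gamma_t summing to one."""
    decisions, rejections = [], 0
    for t, p in enumerate(p_values, start=1):
        gamma_t = 6.0 / (math.pi ** 2 * t ** 2)
        level = alpha * gamma_t * (rejections + 1)
        reject = p <= level
        rejections += int(reject)
        decisions.append(reject)
    return decisions
```

Because the intervals hold at all times simultaneously, an experimenter may stop at any data-dependent time and report the current interval. In the outer routine, each decision is made before later p-values are seen, and earlier rejections raise the level available to later tests. The union bound makes the intervals wider than necessary; tighter constructions exist.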

Bio: Aaditya Ramdas is an assistant professor in the Department of Statistics and Data Science (75%) and the Machine Learning Department (25%) at Carnegie Mellon University. Previously, he was a postdoctoral researcher in Statistics and EECS at UC Berkeley from 2015-18, mentored by Michael Jordan and Martin Wainwright. He finished his PhD at CMU in Statistics and Machine Learning, advised by Larry Wasserman and Aarti Singh, winning the Best Thesis Award. His undergraduate degree was in Computer Science from IIT Bombay. A lot of his research focuses on modern aspects of reproducibility in science and technology, involving statistical testing and controlling false discoveries in static and dynamic settings. He also frequently works on problems in sequential decision-making and online uncertainty quantification. Outside of work, he is an avid traveler and outdoor enthusiast; here are five topics for conversation: traveling to 50+ countries (on all habitable continents), being an advanced scuba diver (while reefs exist), week-long bike rides in Zambia and California (to end AIDS), living a trash-free life (with a huge carbon footprint), and completing a full Ironman triathlon (alive).

Back to Workshop III: Validation and Guarantees in Learning Physical Models: from Patterns to Governing Equations to Laws of Nature