Many core problems in systems and hardware design are combinatorial optimization or decision-making tasks with state and action spaces that are orders of magnitude larger than those of common AI benchmarks in robotics and games. In this talk, I will go over some of our research on tackling such optimization problems. First, I will describe our work on deep reinforcement learning models that learn to perform computational resource allocation, a combinatorial optimization problem that appears repeatedly in systems. The proposed method is end-to-end and abstracts away the complexity of the underlying optimization space; the RL agent learns the implicit tradeoffs between computation and communication among the underlying resources and optimizes the allocation using only the true reward function (e.g., the runtime of the generated allocation). Next, I will talk about our work on generalizable graph clustering and partitioning with unsupervised learning. Finally, I will discuss our recent work on optimizing chip placement with reinforcement learning. Our approach can learn from past experience and improve over time, and the placement policy generalizes to unseen blocks. Our objective is to minimize PPA (power, performance, and area), and we show that, in under six hours, our method can generate placements for modern accelerator chips that are superior or comparable to those produced by human experts, whereas existing baselines require human experts in the loop and can take several weeks.