Very roughly, a human life contains a trillion actions. Hierarchical reinforcement learning is based on the idea that effective behaviour on anything like this scale requires some form of hierarchical structure. Such structure relates high-level actions to lower-level actions and simplifies the process of learning complete behaviours. It also yields a temporal decomposition of global value functions into small additive components that describe the local acquisition of value within each "subroutine". These components can then be learned effectively from small amounts of experience. The general picture is one in which reinforcement learning algorithms "fill in the details"
of partially specified hierarchical behaviours, creating new capabilities that can then be combined into yet more complex behaviours, rather than leaving the agent to learn everything from scratch. This approach allows agents to learn intelligent behaviours of surprising complexity.
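The additive decomposition mentioned above can be made concrete with a small sketch. The following is a hypothetical example in the spirit of MAXQ-style value decomposition, not an implementation from the text: on a one-dimensional corridor with a cost of -1 per step, the value of the whole task splits into the value earned inside a "go to subgoal" subroutine plus a completion value earned after the subroutine exits. The corridor task, the subgoal, and all function names are illustrative assumptions.

```python
def steps(start, goal):
    """Steps needed to walk from start to goal on a 1-D corridor."""
    return abs(goal - start)

def flat_value(state, goal, step_reward=-1.0):
    """Value of the whole task, computed without any hierarchy."""
    return step_reward * steps(state, goal)

def hierarchical_value(state, subgoal, goal, step_reward=-1.0):
    """Same value, decomposed into two additive components:
    the value acquired locally within the subroutine, and the
    completion value acquired after the subroutine terminates."""
    inside = step_reward * steps(state, subgoal)     # local component
    completion = step_reward * steps(subgoal, goal)  # remaining value
    return inside + completion

# The decomposition is exact here because the subgoal lies on the
# optimal path, so the hierarchy loses nothing.
print(flat_value(0, 4))             # -4.0
print(hierarchical_value(0, 2, 4))  # -4.0
```

Each component depends only on the states visited within its own subroutine, which is why such components can be learned from far less experience than a monolithic value function over complete behaviours.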