Effective neural network learning of sequence generating behavior while situated in a complex environment

Shaun Gittens
University of Maryland at College Park
Computer Science

Existing methods for training a learning agent to produce sequences of action vectors in order to accomplish a task while situated in a complex environment have met with limited success. Various reinforcement learning techniques are often applied to such learning tasks, but many of these methods are not guaranteed to converge to an optimal policy. Traditional supervised learning methods offer stronger convergence assurances than reinforcement learning, but they are not well suited to such tasks because the desired proximal outputs are rarely available to the learner from the proverbial "teacher". Rather, distal target outputs attainable through the environment are generally the only information the teacher can supply, leaving the learner to discover the proximal sequential behavior necessary to yield them. Distal supervised learning techniques have been devised so that the strengths of traditional supervised learning may now be brought to bear on learning tasks in complex external environments. However, the training of neural networks to produce sequences, or trajectories, that correctly yield target distal trajectories still has much room for improvement. This work investigates the benefits of using recurrent neural networks within the distal supervised learning framework to enhance the acquisition of time-varying sequential behavior. One advantage of recurrent neural networks in such domains is their ability to learn to generate action sequences even in the absence of current or previous state information. A methodology is presented that outlines how the existing distal learning framework may be modified to train a sequential neural network in a complex environment.
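To make the framework concrete, the following is a minimal sketch of distal supervised learning with a recurrent action generator, written in PyTorch. It is an illustration under stated assumptions, not the author's exact method: all dimensions, module names, and the random distal targets are hypothetical, and the forward model standing in for the environment is assumed to have been trained beforehand on (action, outcome) pairs. In the spirit of the Jordan-Rumelhart distal learning scheme, the distal target error is backpropagated through the frozen forward model into the recurrent learner.

import torch
import torch.nn as nn

# Hypothetical dimensions, chosen only for illustration.
ACTION_DIM, STATE_DIM, HIDDEN_DIM, SEQ_LEN = 4, 8, 32, 10

class ActionRNN(nn.Module):
    """Recurrent learner that emits a proximal action sequence.
    It receives no state input, so it must generate time-varying
    behavior from its own hidden dynamics alone."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=HIDDEN_DIM,
                          batch_first=True)
        self.out = nn.Linear(HIDDEN_DIM, ACTION_DIM)

    def forward(self, batch_size):
        # Constant (zero) input: actions come purely from recurrence.
        x = torch.zeros(batch_size, SEQ_LEN, 1)
        h, _ = self.rnn(x)
        return self.out(h)  # shape: (batch, SEQ_LEN, ACTION_DIM)

# Forward model approximating the environment's action -> distal-outcome
# map; assumed pre-trained on observed (action, outcome) pairs.
forward_model = nn.Sequential(
    nn.Linear(ACTION_DIM, HIDDEN_DIM), nn.Tanh(),
    nn.Linear(HIDDEN_DIM, STATE_DIM),
)
for p in forward_model.parameters():
    p.requires_grad_(False)  # freeze: it serves only to route gradients

learner = ActionRNN()
opt = torch.optim.Adam(learner.parameters(), lr=1e-3)

# The distal target trajectory is the only supervision the "teacher"
# provides; random placeholder data here.
distal_targets = torch.randn(1, SEQ_LEN, STATE_DIM)

for step in range(1000):
    actions = learner(batch_size=1)       # proximal action sequence
    predicted = forward_model(actions)    # predicted distal outcomes
    loss = nn.functional.mse_loss(predicted, distal_targets)
    opt.zero_grad()
    loss.backward()   # distal error flows through the frozen model
    opt.step()        # only the recurrent learner is updated

Note that the action network is driven by a constant input, reflecting the abstract's point that a recurrent learner can acquire sequential behavior even without access to current or previous state information.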

