Consider situations in which input-output mappings are learned by example, wherein knowledge is updated trial by trial, online. Such situations arise for humans, animals, and machines. In typical Bayesian approaches to learning, a learner begins with prior beliefs regarding candidate models and parameter values, and then, in consideration of the next example, re-allocates beliefs across models and parameter values. Crucially, the models bridge all the way from input to output, so that each model specifies the probabilities of output values given an input value. Suppose instead that the input-output mapping is mediated by a chain of successive sub-models. The first model in the chain maps an input value to probabilities of values of a first intermediate representation. The next model in the chain maps a value of the first intermediate representation to probabilities of values of the next intermediate representation. And so on, until the final model maps a value of the penultimate representation to probabilities of output values. It is possible, in principle, to do Bayesian updating simultaneously on the joint parameter space of the chain of models.
Instead, I consider a scheme in which each link does locally Bayesian learning, by using a target output value propagated back from the final output, and an input value propagated up from the initial input. I instantiate this general scheme in a specific associative learning model that first maps inputs to attentionally filtered inputs, and then maps the attentionally filtered inputs to outputs. The model is applied to various human learning phenomena, including one called highlighting, which is particularly challenging to other extant Bayesian models, including the Rational Model and the Kalman filter model. In general, the locally Bayesian model does not perform according to globally Bayesian standards, but it can better capture how humans actually behave. In general, locally Bayesian learning protects end-of-the-chain models from disconfirmation, at the cost of distorting beginning-of-the-chain models. Locally Bayesian learning changes its internal data to fit its beliefs, before it changes its beliefs to fit the external data. People seem to behave this way too.
Brief overview of background issues and approach:
Kruschke, J. K. (2006). Learned Attention. Presentation at the Fifth International Conference on Development and Learning, Indiana University May 31-June 3, 2006.
Full article regarding locally Bayesian learning:
Kruschke, J. K. (2006). Locally Bayesian learning with applications to retrospective revaluation and highlighting. Psychological Review, 113(4), 677-699.
More details about locally Bayesian learning, with summary regarding Kalman filter and Rational Model:
Kruschke, J. K. (2006). Locally Bayesian learning. Proceedings of the Annual Conference of the Cognitive Science Society.