The recently proposed Bayesian approach to online learning is applied to learning a rule defined as a noisy single-layer perceptron with either continuous or binary weights. In the Bayesian online approach, the exact posterior distribution is approximated by a simpler parametric posterior that is updated online as new examples are incorporated into the data set. In the case of continuous weights, the approximate posterior is chosen to be Gaussian. The computational complexity of the resulting online algorithm is found to be at least as high as that of the Bayesian offline approach, making the online approach less attractive. A Hebbian approximation based on casting the full covariance matrix into an isotropic diagonal form significantly reduces the computational complexity and yields a previously identified optimal Hebbian algorithm. In the case of binary weights, the approximate posterior is chosen to be a biased binary distribution. The resulting online algorithm is derived and shown to outperform several other online approaches to this problem.
Neural networks are adaptive systems characterized by a set of parameters w, the weights and biases that specify the connectivity among the neuronal computational elements. Of particular interest is the ability of these systems to learn from examples. Traditional formulations of the learning problem are based on a dynamical prescription for the adaptation of the parameters w. The learning process thus generates a trajectory in w space that starts from a random initial assignment w0 and leads to a specific w* that is in some sense optimal.
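To make the notion of a trajectory in w space concrete, the following minimal sketch iterates a plain Hebbian rule for a student perceptron learning from a noisy teacher perceptron. It is an illustration only and not the Bayesian online algorithm studied here: the function names, the output-flip noise model, the 1/sqrt(n) step size, and all parameter values are assumptions introduced for the example.

```python
import math
import random


def sign(x):
    """Perceptron output: +1 for non-negative fields, -1 otherwise."""
    return 1.0 if x >= 0 else -1.0


def overlap(w, teacher):
    """Normalized overlap (cosine of the angle) between student and teacher."""
    dot = sum(a * b for a, b in zip(w, teacher))
    norm_w = math.sqrt(sum(a * a for a in w))
    norm_t = math.sqrt(sum(b * b for b in teacher))
    return dot / (norm_w * norm_t)


def hebbian_trajectory(n=50, steps=2000, flip_prob=0.1, seed=0):
    """Run a plain Hebbian online rule and record the student-teacher overlap.

    Assumptions (not taken from the text): Gaussian inputs, a teacher output
    that is flipped with probability flip_prob, and a fixed 1/sqrt(n) step.
    """
    rng = random.Random(seed)
    teacher = [rng.gauss(0, 1) for _ in range(n)]
    w = [rng.gauss(0, 1) for _ in range(n)]  # random initial assignment w0
    overlaps = [overlap(w, teacher)]
    for _ in range(steps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = sign(sum(a * b for a, b in zip(teacher, x)))
        if rng.random() < flip_prob:
            y = -y  # output noise: the label is flipped
        # Dynamical prescription: each example moves w by a Hebbian step.
        w = [wi + y * xi / math.sqrt(n) for wi, xi in zip(w, x)]
        overlaps.append(overlap(w, teacher))
    return w, overlaps
```

Calling `hebbian_trajectory()` returns the final weight vector together with the overlap recorded after every example, so the recorded list traces how the student moves from its random starting point toward the teacher direction. Each example is used once and then discarded, which is the sense in which the prescription is online.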