In Chapters 2 to 5, we fixed the set of problem elements and were interested in finding a single information operator and algorithm that minimize an error or cost of approximation. Depending on the deterministic or stochastic assumptions on the problem elements and the information noise, we studied four different settings: the worst, average, worst-average, and average-worst case settings.
In this chapter, we study the asymptotic setting, in which a problem element f is fixed and we wish to analyze the asymptotic behavior of algorithms. The aim is to construct a sequence of information and algorithms such that the error of successive approximations vanishes as fast as possible as the number of observations increases to infinity.
The asymptotic setting is often studied in computational practice. We mention only the Romberg algorithm for computing integrals, and finite element methods (FEM) for solving partial differential equations with the mesh size tending to zero. When dealing with these and other numerical algorithms, we are interested in how fast they converge to the solution.
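To illustrate the kind of convergence at stake, here is a minimal sketch (not from the book; function and variable names are ours) of the Romberg algorithm: trapezoid estimates on successively halved meshes, refined by Richardson extrapolation.

```python
import math

def romberg(f, a, b, rows=6):
    """Romberg integration of f over [a, b].

    Builds a triangular table: column 0 holds trapezoid estimates
    with mesh size halved at each row; later columns are Richardson
    extrapolations that cancel successive error terms."""
    R = [[0.5 * (b - a) * (f(a) + f(b))]]       # one-interval trapezoid
    for i in range(1, rows):
        h = (b - a) / 2**i
        # trapezoid with 2**i intervals, reusing previously sampled points
        new_pts = sum(f(a + (2 * k - 1) * h) for k in range(1, 2**(i - 1) + 1))
        row = [0.5 * R[i - 1][0] + h * new_pts]
        for j in range(1, i + 1):
            # Richardson extrapolation step
            row.append(row[j - 1] + (row[j - 1] - R[i - 1][j - 1]) / (4**j - 1))
        R.append(row)
    return R[-1][-1]
```

For a smooth integrand such as sin on [0, π], a handful of rows already drives the error far below what the underlying trapezoid estimates alone achieve, which is precisely the fast asymptotic convergence discussed above.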
One might hope that it will be possible to construct a sequence φn(yn) of approximations such that for the element f the error ∥S(f) − φn(yn)∥ vanishes much faster than the error over the whole set of problem elements (or, equivalently, faster than the corresponding radius of information). It turns out, however, that in many cases any attempts to construct such algorithms would fail. We show this by establishing relations between the asymptotic and other settings.
The projection theorem allows construction of the orthogonal right-inverse of a linear surjective operator A, associating with any datum y the solution x to the equation Ax = y with minimal norm. In the same way, it allows construction of the orthogonal left-inverse of a linear injective operator A, associating with any datum y the solution x to the equation Ax = ȳ, where ȳ is the orthogonal projection of y onto the image of A. More generally, when A is any linear operator between finite-dimensional vector spaces, the pseudoinverse of A associates with any datum y the solution x (with minimal norm) to the equation Ax = ȳ, where ȳ is the orthogonal projection of y onto the image of A.
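The general case described above can be checked numerically. The following sketch (the matrix and data values are ours, chosen only for illustration) uses NumPy's Moore-Penrose pseudoinverse and verifies that it solves Ax = ȳ, where ȳ is the orthogonal projection of y onto the image of A.

```python
import numpy as np

# An injective map A from R^2 into R^3; y deliberately lies
# outside the image of A.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 2.0, 0.0])

# Pseudoinverse solution: x solves A x = ybar with minimal norm.
x = np.linalg.pinv(A) @ y

# ybar: orthogonal projection of y onto im(A), computed here
# via the least-squares solution.
ybar = A @ np.linalg.lstsq(A, y, rcond=None)[0]

assert np.allclose(A @ x, ybar)   # x solves the projected equation
```

Since A is injective here, the solution of Ax = ȳ is unique, so the pseudoinverse coincides with the left-inverse described above; for a surjective A it would instead pick the minimal-norm solution among many.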
These definitions show how useful the concept of the pseudoinverse is in many situations. It is used, explicitly or implicitly, in many domains of statistics and data analysis. It is then quite natural that the pseudoinverse plays an important role in the use of adaptive systems in pattern-learning algorithms.
This is what we do to construct the heavy algorithm for adaptive systems that are affine with respect to the controls. Because we are looking for synaptic matrices when we deal with neural networks, we pause briefly to study the structure of the space of linear operators, its dual, and the tensor product of linear operators.
A neural network is a network of subunits, called “formal neurons,” processing input signals to output signals, which are coupled through “synapses.” The synapses are the nodes of this particular kind of network, the “strength” of which, called the synaptic weight, codes the “knowledge” of the network and controls the processing of the signals.
Let us be clear at the outset that the resemblance of a formal neuron to an animal-brain neuron is not well established, but that is not essential at this stage of abstraction. However, this terminology can be justified to some extent, and it is by now widely accepted, as discussed later. Chapter 8 develops this issue.
Also, there is always a combination of two basic motivations for dealing with neural networks: one attempts to model actual biological nervous systems, while the other is content with implementing neural-like systems on computers. Every model lies between these two requirements, the first constraining the modeling and the second allowing more freedom in the choice of a particular representation.
There are so many different versions of neural networks that it is difficult to find a common framework to unify all of them at a rather concrete level. But one can regard neural networks as dynamical systems (discrete or continuous), the states of which are the signals, and the controls of which are the synaptic weights, which regulate the flux of transmitters from one neuron to another.
This book is devoted to some mathematical methods that arise in two domains of artificial intelligence: neural networks and qualitative physics (which here we shall call “qualitative analysis”). These two topics are treated independently. Rapid advances in these two areas have left unanswered many mathematical questions that should motivate and challenge a wide range of mathematicians. The mathematical techniques that I choose to present in this book are as follows:
control and viability theory in neural networks and cognitive systems, regarded as dynamical systems controlled by synaptic matrices.
set-valued analysis, which plays a natural and crucial role in qualitative analysis and simulation by emphasizing properties common to a class of problems, data, and solutions. Set-valued analysis also underlies mathematical morphology, which provides useful techniques for image recognition.
This allows us to present in a unified way many examples of neural networks and to use several results on the control of linear and nonlinear systems to obtain a learning algorithm for pattern-classification problems (including time series in forecasting), such as the back-propagation formula, in addition to learning algorithms for feedback-regulation laws for solutions to control systems subject to state constraints (inverse dynamics).
We investigate in this chapter the case of linear neural networks, named associative memories by T. Kohonen (Figure 3.1). We begin by specializing the heavy algorithm we have studied in the general case of adaptive systems to the case of neural networks, where the controls are matrices. This specialization shows how to modify the last synaptic matrix, which has learned a set of patterns, so as to learn a new pattern without forgetting the previous ones.
Because right-inverses of tensor products are tensor products of right-inverses, we observe that the heavy algorithm has a Hebbian character: it states that the correction of a synaptic matrix during learning is the product of the activities of the presynaptic and postsynaptic neurons. This feature, which plain vectors do not enjoy, justifies the special treatment of systems controlled by matrices instead of vectors.
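A minimal NumPy sketch of this Hebbian, rank-one correction (our own illustration in the spirit of the text, not the book's exact heavy algorithm): the update to the synaptic matrix is an outer product of a postsynaptic error term and a presynaptic vector, chosen so that previously stored patterns are not disturbed.

```python
import numpy as np

def learn_pattern(W, X_old, x_new, y_new):
    """Add the association x_new -> y_new to the linear memory W
    without disturbing recall of the stored inputs (columns of X_old).

    Assumes x_new does not lie in the span of the stored inputs.
    The correction is a rank-one outer product: a postsynaptic
    error times a (projected) presynaptic activity vector."""
    if X_old.shape[1] > 0:
        # component of x_new orthogonal to the span of stored inputs,
        # so the correction annihilates every previously stored input
        P = X_old @ np.linalg.pinv(X_old)
        x_tilde = x_new - P @ x_new
    else:
        x_tilde = x_new.copy()
    err = y_new - W @ x_new                    # postsynaptic error
    return W + np.outer(err, x_tilde) / (x_tilde @ x_tilde)
```

Storing two patterns in succession, one can check that the second update leaves the recall of the first pattern exactly intact, which is the "learning without forgetting" property described above.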
We then proceed to associative memories with postprocessing, and to multilayer and continuous-layer associative memories. We conclude this chapter with associative memories with gates, in which the synaptic matrices link conjuncts (i.e., subsets) of presynaptic neurons with each postsynaptic neuron; these allow computation of any Boolean function. This requires a short presentation of fuzzy sets.
We present in this appendix the tests of the external and internal algorithms conducted by Nicolas Seube at Thomson-SINTRA to control the tracking of an exosystem by an autonomous underwater vehicle (AUV). This system has three degrees of freedom (planar motion), six state variables (positions, heading, and their derivatives), and three controls (thruster forces). The dynamics of an AUV are highly nonlinear, coupled, and sometimes fully interacting, thus making it difficult to control by the usual methods. Moreover, the dynamics are poorly known, because only approximate hydrodynamic models are available for real-world vehicles. Finally, we need to take into account the marine currents, which can significantly perturb the dynamics of the AUV.
In addition, the problem of controlling an AUV cannot be linearized about a single velocity axis, because all vehicle velocities usually have the same range; conventional linear control techniques are therefore clearly unable to provide adequate performance.
We shall present three different learning rules that address the problems of uniform minimization and adaptive learning by a set-valued feedback control map. The three classes of algorithms presented here have been tested in the case of the Japanese Dolphin AUV.
In particular, it is shown that the gradient step size is critical for the external rule, but not for the uniform external algorithm. The latter could also be applied to pattern-classification problems, and may provide a plausible alternative to stochastic gradient algorithms.