This book is a detailed overview of the computational modeling of nervous systems, both at the molecular and cellular level and from the standpoint of human psychophysics and psychology. The authors divide their conception of modeling into descriptive, mechanistic, and interpretive models. My sole interest was in Part 3, which covers the mathematical modeling of adaptation and learning, so my review is confined to these chapters. The virtue of this book, and of others like it, is its insistence on empirical validation of the models rather than their justification by "thought experiments" and armchair reasoning, as is typically done in philosophy.
Part 3 begins with a discussion of synaptic plasticity and the degree to which it explains learning and memory. The goal here is to develop mathematical models for understanding how experience and training modify the neuronal synapses and how these changes affect the neuronal patterns and the eventual behavior. The Hebb rule is ubiquitous in this area of research, and the authors present it as the rule that synapses change in proportion to the correlation of the activities of pre- and postsynaptic neurons. Experimental data illustrating long-term potentiation (LTP) and long-term depression (LTD) are given immediately. The authors concentrate mostly on models based on unsupervised learning in this chapter. The rules for synaptic modification are given as differential equations that describe the rate of change of the synaptic weights as a function of the pre- and postsynaptic activity. The covariance and BCM rules are discussed, the first requiring postsynaptic and presynaptic activity separately, the second requiring both simultaneously. The authors consider ocular dominance in the context of unsupervised learning and study the effect of plasticity on multiple neurons. The last section of the chapter covers supervised learning, in which a set of inputs and the desired outputs are imposed during training.
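For readers who want to see what such differential-equation rules look like in practice, here is a minimal sketch of one Euler step for each of the three rules. It assumes a single linear neuron whose output v is the weighted sum of its inputs u, and it uses the standard textbook forms of the rules as I understand them (Hebb: dw/dt proportional to v u; covariance: (v - theta) u; BCM: v u (v - theta) with a sliding threshold). The function names, the step size `dt`, and the time constants are my own illustrative choices, not taken from the book.

```python
def output(w, u):
    """Linear neuron: postsynaptic activity v is the weighted sum of inputs."""
    return sum(wi * ui for wi, ui in zip(w, u))


def hebb_step(w, u, dt=0.1, tau_w=1.0):
    """Basic Hebb rule: tau_w * dw/dt = v * u (one Euler step)."""
    v = output(w, u)
    return [wi + dt / tau_w * v * ui for wi, ui in zip(w, u)]


def covariance_step(w, u, theta_v, dt=0.1, tau_w=1.0):
    """Covariance rule with a postsynaptic threshold:
    tau_w * dw/dt = (v - theta_v) * u.
    Weights decrease when v falls below theta_v."""
    v = output(w, u)
    return [wi + dt / tau_w * (v - theta_v) * ui for wi, ui in zip(w, u)]


def bcm_step(w, u, theta_v, dt=0.1, tau_w=1.0, tau_theta=0.5):
    """BCM rule: tau_w * dw/dt = v * u * (v - theta_v), with a sliding
    threshold that tracks v**2: tau_theta * dtheta/dt = v**2 - theta_v."""
    v = output(w, u)
    new_w = [wi + dt / tau_w * v * ui * (v - theta_v) for wi, ui in zip(w, u)]
    new_theta = theta_v + dt / tau_theta * (v * v - theta_v)
    return new_w, new_theta


# Illustrative inputs (assumed values): only the first synapse is active.
w0 = [0.5, 0.5]
u0 = [1.0, 0.0]
w_hebb = hebb_step(w0, u0)                      # first weight grows
w_cov = covariance_step(w0, u0, theta_v=1.0)    # first weight shrinks
w_bcm, theta_bcm = bcm_step(w0, u0, theta_v=1.0)
```

With these inputs the Hebb step can only strengthen the active synapse, while the two thresholded rules weaken it because the output lies below the threshold. That is the qualitative difference between a potentiation-only rule and rules that also allow depression.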
In the next chapter, the authors consider reinforcement learning, beginning with a discussion of the mathematical models for classical conditioning and introducing the temporal difference learning algorithm. The authors discuss the Rescorla-Wagner rule, which is a trial-by-trial learning rule for the weight adjustments, expressed in terms of the reward, the prediction, and the learning rate. They then discuss more realistic policies such as static action choice, where the reward or punishment immediately follows the action taken, and sequential action choice, where rewards may be delayed. The authors discuss the foraging behavior of bees as an example of static action choice, reducing it to a stochastic two-armed bandit problem. The maze task for rats is discussed as an example of sequential action choice, and the authors reduce it to the "actor-critic algorithm." A generalized reinforcement learning algorithm is then discussed, with the rat water maze problem given as an example.
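The Rescorla-Wagner rule is simple enough to sketch in a few lines. The version below assumes the usual form of the rule: the prediction v is the weighted sum of the stimuli present on a trial, the prediction error is the reward minus the prediction, and each weight moves by the learning rate times the error times its stimulus. The function name, the trial encoding, and the numbers are illustrative assumptions of mine, not the book's code.

```python
def rescorla_wagner(trials, n_stimuli, learning_rate=0.1, w=None):
    """Apply the Rescorla-Wagner rule over a sequence of trials.

    trials: list of (u, r) pairs, where u is a list of stimulus values
            (1.0 = present, 0.0 = absent) and r is the reward on that trial.
    Returns the final weight for each stimulus.
    """
    w = [0.0] * n_stimuli if w is None else list(w)
    for u, r in trials:
        v = sum(wi * ui for wi, ui in zip(w, u))   # predicted reward
        delta = r - v                              # prediction error
        w = [wi + learning_rate * delta * ui for wi, ui in zip(w, u)]
    return w


# Acquisition: one stimulus paired with a reward of 1 on every trial.
w_acq = rescorla_wagner([([1.0], 1.0)] * 100, n_stimuli=1)

# Blocking: stimulus A is trained alone, then A and B appear together.
w_pre = rescorla_wagner([([1.0, 0.0], 1.0)] * 100, n_stimuli=2)
w_blk = rescorla_wagner([([1.0, 1.0], 1.0)] * 100, n_stimuli=2, w=w_pre)
```

In the acquisition run the weight approaches the reward value. In the blocking run the weight for the second stimulus stays close to zero, because the first stimulus already predicts the reward and leaves almost no prediction error to learn from.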
Chapter 10 is an overview of what the authors call "representational learning", which, as they explain, is a study of neural representations from a computational point of view. The goal is to begin with sensory input and find out how representations are generated on the basis of these inputs. That such representations are necessary is argued from, for example, the visual system: according to the authors, what is presented at the retina is too crude for an accurate representation of the visual world. The main strategy in the chapter is to begin with a deterministic or probabilistic input and construct a recognition algorithm that gives an estimate of the input. The algorithms constructed are all based on unsupervised learning, and hence the existence and nature of the causes must be computed using heuristics and the statistics of the input data. These two requirements are met in the chapter by constructing first a generative model and then a recognition model. The familiar 'expectation maximization' is discussed as a method of optimization between real and synthetic data in generative models. A detailed overview of expectation maximization is given in the context of 'density estimation'. The authors then move on to discuss causal models for density estimation, such as Gaussian mixtures, the K-means algorithm, factor analysis, and principal components analysis. They then discuss sparse coding as a technique for dealing with the fact that cortical activity is not Gaussian. They present an experimental sample showing that the activity of a neuron in the inferotemporal area of the macaque brain follows an exponential distribution. The reader will recognize 'sparse' probability distributions as being 'heavy-tailed', i.e., having values close to zero usually, but ones far from zero sometimes. The authors emphasize the difficulty of computing the recognition distribution explicitly.
The Olshausen/Field model is used to give a deterministic approximate recognition model for this purpose. The authors then give a fairly detailed overview of a two-layer, nonlinear 'Helmholtz machine' with binary inputs. They show how to formulate expectation maximization in terms of the Kullback-Leibler divergence. The learning in this model takes place via stochastic sampling and occurs in two phases, the so-called "wake-sleep" algorithm. The last section of the chapter gives a general discussion of how recent interest in coding, transmitting, and decoding images has led to much more research into representational learning algorithms. The authors discuss multi-resolution decomposition and its relationship to the available coding algorithms.
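To make the expectation maximization idea from this chapter concrete, here is a minimal sketch of EM for a mixture of one-dimensional Gaussians, the simplest of the density estimation models mentioned above. It is my own illustration under stated assumptions, not the book's presentation: the data are a small hand-made sample, the function names are hypothetical, and a small variance floor is added to keep the arithmetic safe.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gaussian_mixture(data, means, variances, weights, n_iter=50, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture by expectation maximization.

    means, variances, weights: initial guesses, one entry per component.
    Returns the fitted (means, variances, weights).
    """
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(n_iter):
        # E-step: recognition distribution, i.e. the posterior probability
        # that each data point was generated by each component.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate the generative model from the weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)   # guard against collapse
            weights[j] = nj / len(data)
    return means, variances, weights


# Hand-made sample (assumed data): two clusters centred near -2 and +2.
sample = [-2.1, -1.9, -2.0, -1.8, -2.2, 1.9, 2.1, 2.0, 2.2, 1.8]
fit_means, fit_vars, fit_weights = em_gaussian_mixture(
    sample, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
```

The E-step plays the role of the recognition model (inferring which cause produced each input), and the M-step updates the generative model. This is the same division of labor the chapter builds on for the more elaborate models.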