We have recently been doing some experiments inspired by Horace Barlow's ideas about adaptation and spatial integration. But instead of looking for support for Horace's point of view, we have been hoping to displace one of his ideas by one of our own. This revisionist attitude can be justified by an argument from information theory. Barlow is almost always right, so further demonstrations of his correctness are largely redundant; to catch him out is more difficult but also in a quite objective and technical sense more informative.
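The sense in which an unexpected result is ‘more informative’ can be made precise with Shannon's measure; the formula below is a standard gloss added here for concreteness, not part of the original argument. The information conveyed by observing an outcome of probability $p$ is

$$I(p) = -\log_2 p \ \text{bits},$$

so confirming something that was nearly certain (p close to 1) yields almost no information, while the improbable event of catching Barlow out would carry many bits.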
It is obvious that at high light levels we see textures and details that we miss when the illumination is dim: more light means better sight. This improvement could be due to any of a number of factors, but here we wish to examine one suggestion in particular: that the improvement in vision occurs because light adaptation changes the spatial organization of the retina. That some such change occurs is well documented physiologically. Neurons in the vertebrate visual system typically receive antagonistic influences from the centre and surrounding regions of their receptive fields (Barlow, 1953). Barlow, Fitzhugh & Kuffler (1957) found in cat retinal ganglion cells that light adaptation increases the prominence of the antagonistic surround relative to the centre, thereby reducing the effective size of the central summing area (or, roughly speaking, of the spatial integration region) of each cell. As Barlow (1972) has noted, the effect is rather like reducing the grain size in a photographic film. Like the photographic analogue, it could provide an efficient way of regulating sensitivity, because the system would gain a useful improvement in resolution by sacrificing sensitivity that is no longer needed or even desirable.
In order to describe the efficiency of a physiological system it is important to know its transfer characteristics for complex stimuli. It is not necessary to measure the response of the system to stimuli of every possible time-course. If a system can be linearly approximated, one can restrict oneself to the transfer characteristics for sinusoidal stimuli. All other stimulus time-courses can be represented as a sum of sine-shaped components of varying amplitude and phase. Hearing can be tested with pure tones, and the spatial resolution of the eye is examined with spatial gratings that are modulated sinusoidally in brightness.
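As a sketch of how this is used in practice, the fragment below (illustrative throughout: the low-pass transfer function, stimulus and sampling rate are invented for the example, not taken from any measurement) predicts a linear system's response to an arbitrary stimulus from its sinusoidal transfer characteristics: decompose the stimulus into sinusoids, apply the measured gain and phase shift to each component, and resynthesize.

```python
import numpy as np

# Hypothetical transfer characteristics of a linear system: a gain and a
# phase shift for every temporal frequency (chosen arbitrarily here).
def transfer(freqs_hz):
    gain = 1.0 / (1.0 + (freqs_hz / 10.0) ** 2)   # low-pass gain
    phase = -0.1 * freqs_hz                        # frequency-dependent delay
    return gain * np.exp(1j * phase)

t = np.linspace(0.0, 1.0, 1000, endpoint=False)    # 1 s sampled at 1 kHz
stimulus = ((t % 0.2) < 0.1).astype(float)         # arbitrary square-wave stimulus

# Represent the stimulus as a sum of sinusoidal components, scale and
# phase-shift each one according to the transfer function, and sum again.
spectrum = np.fft.rfft(stimulus)
freqs = np.fft.rfftfreq(t.size, d=t[1] - t[0])
predicted_response = np.fft.irfft(spectrum * transfer(freqs), n=t.size)
```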
In principle, colour vision could be examined analogously with spectral lights that have sine-shaped spectral energy distributions. Newton used rectangular ‘comb spectra’, and Barlow (1982) explained how comb spectra with a sinusoidal modulation of energy with wavelength can be used for the analysis of colour vision.
In 1969 R. Gemperlein came across a paper about a Fourier interferometer working in the infrared region of the spectrum, and he had the idea of using an interferometer for the visible and ultraviolet spectral region as a spectral modulator for a light source for the examination of the visual system. After preliminary experiments and discussions with various commercial firms in 1974, the realization of this idea started in 1976, based on a doctoral thesis in the physics department of the Technische Universität in Munich. Here Heinz Parsche had constructed a Fourier spectrometer for the visible and UV spectral region. With this instrument Rüdiger Paul, while working on his diploma, proved the possibility of realizing this idea, with Parsche's support.
To function efficiently within reasonable information limits, any visual system, biological or artificial, must simplify the image and record it in some economical form (e.g. Barlow, 1957). Image features, such as lines and edges, are rich sources of information. In this chapter we review a simple, efficient and biologically plausible model of how the human visual system may detect, locate and identify edges and lines in any arbitrary image. The model predicts successfully the appearance of many stimuli (including visual illusions), makes accurate quantitative predictions about thresholds and apparent position, and has been applied successfully to artificial visual systems.
Mach bands
Our research started by considering the conspicuous but paradoxical features, the dark and light lines, that appear on waveforms where luminance ramps meet a plateau, or a ramp of different slope. They are usually referred to as ‘Mach bands’, after the Austrian physicist Ernst Mach, who first observed and studied them more than 100 years ago (Mach, 1865).
Figure 18.1A shows a clear example of Mach bands, on a trapezoidal waveform. The brightness of the pattern does not follow the luminance distribution, but produces sharp black and white bands, separated by a relatively homogeneous region. The stripes are even more apparent in two dimensions. Figure 18.1B is the product of a vertical and horizontal triangle-wave (from Morrone, Ross, Burr & Owens, 1986). Again brightness does not follow luminance, but clear black and white stars appear at the apexes of the waveform.
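For readers who want to reproduce the kind of two-dimensional pattern described, the short sketch below (image size and period are arbitrary choices, not those of the published figure) builds a luminance image as the product of a vertical and a horizontal triangle wave.

```python
import numpy as np

def triangle_wave(x, period=64.0):
    # Symmetric triangle wave in [-1, 1] with the given period (in pixels).
    phase = (x / period) % 1.0
    return 4.0 * np.abs(phase - 0.5) - 1.0

size = 256
profile = triangle_wave(np.arange(size))

# Product of a vertical and a horizontal triangle wave, rescaled to [0, 1]
# so it can be displayed as a luminance image.
pattern = np.outer(profile, profile)
luminance = (pattern + 1.0) / 2.0
```

Displaying `luminance` as a grey-level image should show the dark and light ‘stars’ at the apices of the waveform.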
We have considered the following three questions regarding the physiological organization of binocular pathways in the visual cortex of the cat. First, what are the rules by which signals from left and right eyes are combined in the visual cortex? Second, how are these rules affected when normal visual experience is prevented during an early stage of development? Third, how early in the development process is the physiological apparatus for binocular vision established?
These questions have been examined by use of a technique that differs from other physiological procedures that have been employed to study binocular vision. The major feature of our method is use of large, bright, sinusoidal gratings which are varied in relative phase between the two eyes so that retinal disparity is systematically changed. Aside from the analytical advantage of this stimulus, the large spatial extent of the gratings increases the likelihood that receptive fields are stimulated. We have found, for example, that 56% of cortical cells in normal cats exhibit disparity-dependent binocular interaction. In another study in which a thorough examination was made of binocular properties of cortical cells by use of single bars of light, only 37% displayed disparity selectivity (Ferster, 1981).
With respect to the questions we have addressed, our results are as follows. First, most simple cells and around half the sample of complex cells show phase-specific binocular interaction. This leads to the conclusion that most binocular interaction in striate cortex can be accounted for by linear summation of signals from each eye.
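A hedged way to see why linear summation implies phase-specific interaction (textbook trigonometry in our own notation, not the authors' analysis): if the two monocular inputs to a cell are gratings of equal contrast differing only by an interocular phase offset $\Delta\phi$ (the disparity), their linear sum is

$$\cos(\omega x) + \cos(\omega x + \Delta\phi) = 2\cos\!\left(\tfrac{\Delta\phi}{2}\right)\cos\!\left(\omega x + \tfrac{\Delta\phi}{2}\right),$$

a grating of the same frequency whose amplitude varies with $\Delta\phi$: maximal when the eyes are in phase and nulled when they are in antiphase, i.e. disparity-dependent binocular interaction.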
Though stereoscopic vision is often regarded as a means of gauging distance, human stereo depth judgements depart strikingly from the predictions of geometry. Not only are judgements of absolute distance highly inaccurate, as shown by Helmholtz's experiments using a single vertical thread against a featureless background (Helmholtz, 1909), but also the perceived distance of an object is strongly affected by other objects nearby (Gogel, 1963; Gogel & Mershon, 1977), with the result that relative distance is often incorrectly estimated.
In the case of horizontal rows of features, what stereo vision seems to deliver is a measure of local protrusion or curvature (Mitchison & Westheimer, 1984). For instance, a linear horizontal gradient of disparity generates no curvature and is therefore poorly perceived. However, the same is not true of a vertical gradient of (horizontal) disparity, and indeed there is known to be a marked horizontal/vertical anisotropy in stereo perception (Rogers & Graham, 1983). We suggest here a rationale for some of these phenomena. It turns out that oblique viewing introduces gradients of horizontal disparity which are largely eliminated by the curvature measure, thereby allowing a stable percept under changing viewing conditions. These disparity gradients are present only in the horizontal direction, and this may be the basis of the horizontal/vertical anisotropy.
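One way to make the ‘curvature measure’ idea concrete (our notation; the published treatment is more general): suppose what is perceived is something like the second spatial derivative of the horizontal-disparity field $d(x)$ rather than $d(x)$ itself. Then a linear gradient leaves nothing to perceive,

$$d(x) = a + bx \;\Rightarrow\; \frac{\partial^2 d}{\partial x^2} = 0,$$

which is consistent with linear horizontal disparity gradients being poorly perceived, while genuinely curved surfaces still produce a signal.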
Depth judgements in horizontal rows of features
We first review the experiments which show how stereo depth judgements are made in figures consisting of horizontal rows of lines or dotted lines (Mitchison & Westheimer, 1984).
This paper describes a class of computational techniques designed for the rapid detection and description of global features in a complex image – for example, detection of a long smooth curve on a background of shorter curves (Fig. 37.1).
Humans can perform such detection tasks in a fraction of a second; the curve ‘pops out’ of the display almost immediately. In fact, the time required for a human to detect the curve is only long enough for at most a few hundred neural firings – or, in computing terms, at most a few hundred ‘cycles’ of the neural ‘hardware’. If we regard the visual system as performing computations on the retinal image(s), with (sets of) neuron firings playing the role of basic operations, then human global feature detection performance implies that there must exist computational methods of global feature detection that take only a few hundred cycles.
Conventional computational techniques of image analysis fall far short of this level of performance. Parallel processing provides a possible approach to speeding up the computation; but some computations are not easy to speed up. For example, suppose we input the image into a two-dimensional array of processors, one pixel per processor, where each processor is connected to its neighbors in the array; this is a very natural type of ‘massive parallelism’ to use in processing images.
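A toy sketch of why such an array can be slow for global features (hypothetical code, not from the chapter): with only nearest-neighbour connections, information spreads by one pixel per cycle, so combining evidence from pixels far apart needs roughly as many cycles as the distance between them.

```python
import numpy as np

def shift(state, dy, dx):
    # Shift a 2D boolean array by (dy, dx) without wrap-around.
    out = np.zeros_like(state)
    h, w = state.shape
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        state[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

def one_cycle(state):
    # One 'cycle': each processor ORs in the flags held by its 4 neighbours.
    return (state | shift(state, 1, 0) | shift(state, -1, 0)
                  | shift(state, 0, 1) | shift(state, 0, -1))

# A single 'feature' flag at one corner of a 256 x 256 processor array:
# it reaches the opposite corner only after (256-1) + (256-1) = 510 cycles.
state = np.zeros((256, 256), dtype=bool)
state[0, 0] = True
cycles = 0
while not state[-1, -1]:
    state = one_cycle(state)
    cycles += 1
print(cycles)   # 510
```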
One of the most powerful ideas in vision research in the last two decades has been the notion that luminance patterns are detected by independent linear mechanisms, which are selective for spatial frequency and orientation. This idea is consistent with psychophysical results from masking, sub-threshold summation, and adaptation studies. Moreover, physiological work on the striate cortex shows that simple cells are both linear and selective for spatial frequency (Movshon et al., 1978) and orientation, making them a potential neural substrate for the detection mechanisms.
There are however some masking results which suggest that, even at modest contrasts, non-linearities may affect the mechanisms detecting luminance patterns.
The aim of this chapter is to present some physiological work which shows that non-linearities do indeed occur early in the visual pathway. A second aim is to show how it might be possible to remove the effects of these early non-linearities by cortical processing.
The chapter falls into four parts. The first part covers the background, describing some of the relevant psychophysical results. The second part describes the non-linear responses of X-cells in the lateral geniculate nucleus of the cat. The third part shows how the non-linear responses could be removed, to make linear receptive fields, like those of cortical simple cells. The final part discusses the implications and raises some unresolved issues.
Evidence for linear, spatial frequency-selective mechanisms
General
One key property of a linear mechanism is that it responds to a sinusoidal input with a sinusoidal output of the same frequency. Its response may differ in amplitude and phase from the input, but it will still be a sinusoid of the same frequency.
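Written out (in standard linear-systems notation, added here for concreteness): for a linear, shift-invariant mechanism, an input $s(x) = \cos(2\pi f x)$ produces an output

$$r(x) = A(f)\,\cos\bigl(2\pi f x + \phi(f)\bigr),$$

where the gain $A(f)$ and phase shift $\phi(f)$ may depend on the frequency $f$, but the output contains no other frequencies.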
Science is analogous to doing a jig-saw puzzle where the overall picture is unknown. One of two strategies can be adopted. The obvious one is first to see which individual pieces best fit together and hope to obtain a better idea of the overall picture as more of the puzzle is completed. However, if there are many pieces and the picture is very complicated, then this may not be successful. Another method is to hazard a guess at what the overall picture might be and then to segregate pieces on the basis of this, only then beginning to put individual pieces together. While the initial guess might not turn out to be totally correct, it can be further refined as time goes on and more pieces are segregated.
Research into amblyopia has so far followed the first of these two strategies. Over the past decade or so a number of workers have been busy seeing how individual pieces of the amblyopia puzzle fit, without much regard for what the overall completed picture might look like. Initially this might be a sensible approach, but we have now reached the point where it might be useful to hazard a guess at what the overall picture might look like so that more key pieces can be sought. In this paper we will consider three different types of pictures of amblyopia and assess into which of them the pieces already collected best fit. We stress that these are ‘pictures’ in the broadest sense, for not enough pieces have been collected to be able to assess them in the detail that we would like.
Attempts to understand the quantum efficiency of vision have resulted in three distinct measures of efficiency. This chapter shows how they fit together, and presents some new measurements. We will show that the idea of equivalent input noise and a simplifying assumption called ‘contrast invariance’ allow the observer's overall quantum efficiency (as defined by Barlow, 1962a) to be factored into two components: transduction efficiency (called ‘quantum efficiency of the eye’ by Rose, 1948) and calculation efficiency (called ‘central efficiency’ by Barlow, 1977).
When light is absorbed by matter, it is absorbed discontinuously, in discrete quanta. Furthermore, it is absorbed randomly; the light intensity determines only the probability of absorption of a quantum of light, a photon (Einstein, 1905). This poses a fundamental limit to vision; the photon statistics of the retinal image impose an upper limit to the reliability of any decision based on that retinal image. An observer's overall quantum efficiency F is the smallest fraction of the corneal quanta (i.e. quanta sent into the eye) consistent with the level of the observer's performance (Barlow, 1958b, 1962a). (This is closely analogous to Fisher's (1925) definition of the efficiency of a statistic.) Surprisingly, the overall quantum efficiency of vision is very variable, and much smaller than best estimates of the fraction of photons absorbed by the photoreceptors in the retina.
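In symbols (a schematic restatement in our own notation, not a quotation of either author's formula): if an ideal device could match the observer's level of performance using $N_{\text{ideal}}$ corneal quanta while the observer is actually given $N$, the overall quantum efficiency is

$$F = \frac{N_{\text{ideal}}}{N},$$

and the factorization described above can be written, schematically, as

$$F = F_{\text{transduction}} \times F_{\text{calculation}}.$$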
At all reasonable luminances the fraction of corneal photons that excite photoreceptors is almost certainly quite constant. Barlow (1977) concluded that for rods it must be in the range 11% to 33% (for 507 nm light). This is independent of the size and duration of the signal, and independent of the background luminance, up to extremely high luminances.
It seems only fitting to begin this chapter with a quote from Horace Barlow (1980).
Our perceptions of the world around us are stable and reliable. Is this because the mechanisms that yield them are crude and insensitive, and thus immune to false responses? Or is it because a statistical censor who blocks unreliable messages intervenes between the signals from our sense organs and our knowledge of them? This question can be answered by measuring the efficiency with which statistical information is utilized in perception.
In this chapter, I describe some experiments done in the spirit of Barlow's suggestion. The results agree with his findings (1978, 1980) of very high human efficiency and are consistent with the view that humans can take a Bayesian approach to perceptual decisions. In this approach one combines a priori information (expectations) about what might be in the image, and where, with the image data. One makes a detailed comparison (cross-correlation) of expectations with the new data and makes decisions based on a posteriori statistical probabilities (or likelihoods). This model gives a rather good explanation of human performance provided that one includes a number of limitations of the visual system. The efficiency method also allows one to investigate the sources of human inefficiency.
The experimental tasks all deal with high contrast signals in visual noise. The observer's task is therefore one of deciding whether one sees a signal in easily visible noise. The noise limits signal detection performance in a well known way and one can determine the best possible decision performance for the task.
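A minimal sketch of such an ideal cross-correlating observer (signal shape, contrast and noise level below are invented for illustration): on each trial the observer correlates the noisy image with the expected signal template and reports ‘signal present’ when the correlation exceeds a criterion, which is the optimal rule for a known signal in white Gaussian noise.

```python
import numpy as np

rng = np.random.default_rng(0)

size = 32
template = np.zeros((size, size))
template[12:20, 12:20] = 1.0          # hypothetical known signal: a bright square
template /= np.linalg.norm(template)  # unit-energy template

noise_sd = 1.0
signal_strength = 2.0

def observe(signal_present):
    noise = rng.normal(0.0, noise_sd, (size, size))
    return noise + (signal_strength * template if signal_present else 0.0)

def decide(image, criterion):
    # Matched filter: cross-correlate expectation with data, compare to criterion.
    return float(np.sum(image * template)) > criterion

criterion = signal_strength / 2.0     # midway between the two hypotheses
trials = 2000
correct = sum(
    decide(observe(bool(present)), criterion) == bool(present)
    for present in rng.integers(0, 2, trials)
)
print(correct / trials)               # proportion correct of the ideal observer
```

One common way to express human efficiency on such a task is as the ratio of the signal energy this ideal rule needs to the energy the human observer needs to reach the same level of performance.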
The sensory systems of man measured at absolute threshold are extraordinarily sensitive. Under favourable conditions, the behavioural performance is close to perfect and the most significant limitations on the human observer are the physical limitations imposed by the nature of the detection task, such as the random fluctuations in the number of quanta arriving at the cornea from a flash of light of very weak intensity (Hecht et al., 1942), or limitations imposed at the transduction stages, such as random, thermally induced decomposition of photopigment molecules (Barlow, 1956). These demonstrations force one to consider a mechanistic question: namely, how is psychophysical performance of this quality supported by the individual elements of the nervous system, and, in particular, if the essential limitations on performance, even under a restricted set of conditions, can be shown to be either external to the organism or within the primary sense organs, how can information transmission be so reliable throughout the remainder of the system? In this work, we have examined the performance of neurons in the first visual cortical area of Old World primates (V1 or striate cortex) on a number of tasks that have identifiable counterparts in perceptual behaviour.
There are a number of psychophysical tasks that are theoretically important in studying perceptual behaviour because they offer a ‘systems analysis’ of the visual system and its individual components. The study of these tasks has led to the steady evolution of a number of relatively sophisticated, quantitative models of early visual processing, which all incorporate the concept of a set of ‘receptive fields’ of different sizes or scales that are bandpass in the spatial frequency domain and orientation selective (Campbell & Robson, 1968).
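As one concrete illustration of such a mechanism (a common modelling choice in this literature, not necessarily the specific model used in the chapter), a receptive field that is bandpass in spatial frequency and orientation selective is often written as a Gabor function: a sinusoid at the preferred frequency and orientation, windowed by a Gaussian envelope that sets the scale.

```python
import numpy as np

def gabor_receptive_field(size=64, frequency=0.1, orientation_deg=45.0,
                          sigma=8.0, phase=0.0):
    """A Gaussian-windowed sinusoid: bandpass in spatial frequency and
    selective for orientation. All parameter values here are illustrative."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half].astype(float)
    theta = np.deg2rad(orientation_deg)
    x_rot = x * np.cos(theta) + y * np.sin(theta)      # axis of the preferred orientation
    envelope = np.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * frequency * x_rot + phase)
    return envelope * carrier

# A linear 'cell' of this kind responds with the inner product of an image
# patch and its receptive field; different sizes/scales come from varying
# sigma and frequency together.
rf = gabor_receptive_field()
patch = np.random.default_rng(1).normal(size=rf.shape)
response = float(np.sum(rf * patch))
```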