In Section 1.4.1 we saw that the dynamics of an ANN is a march on the vertices of a hypercube in a space of N dimensions, where N is the number of neurons in the network. Every one of the vertices of the cube, Figure 1.13, is a bona fide network state. Let us first consider the case of asynchronous dynamics, Section 2.2.3, in which a single neuron changes its neural state at every time step. In this case the network steps from one vertex to one of its N nearest neighbors. The question of ergodicity, and correspondingly of the ability of the system to perform as an associative memory, is related to the dependence of trajectories on their initial states. The initial states of the network are those states which are strongly influenced by an external stimulus. If the network were to enter a similar dynamical trajectory for every stimulus, no classification would be achieved. This is the sense in which we will employ the term ergodic behavior. In terms of our landscape pictures, Figures 2.10 and 2.11, this would be the case of a single valley to which all trajectories flow. Alternatively, if trajectories in the space of network states depend strongly on the initial states, and correspondingly on the incoming stimuli, then the network can recall selectively and hold a variety of items in memory.
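The march on the hypercube can be made concrete with a short numerical sketch. The following is not taken from the text: it assumes plus/minus one neurons and an arbitrary symmetric coupling matrix (here a random one, chosen only for illustration), and lets a single randomly chosen neuron align with its local field at each time step, so that the state moves between neighbouring vertices of the hypercube. Different initial vertices typically settle on different fixed points, which is the non-ergodic behavior needed for selective recall.

import numpy as np

rng = np.random.default_rng(1)

N = 40                                    # number of neurons = dimension of the hypercube
J = rng.normal(size=(N, N))
J = (J + J.T) / 2                         # symmetric couplings (an assumption for illustration)
np.fill_diagonal(J, 0.0)                  # no self-coupling

def asynchronous_step(s, J):
    """One randomly chosen neuron aligns with its local field, so the network moves
    to (at most) one of the N nearest-neighbour vertices of the hypercube."""
    i = rng.integers(len(s))
    s[i] = 1 if J[i] @ s >= 0 else -1
    return s

def march(s0, J, steps=5000):
    s = s0.copy()
    for _ in range(steps):
        asynchronous_step(s, J)
    return s

# Different initial vertices (different stimuli) typically flow to different fixed points,
# i.e. the dynamics is not ergodic and the final state carries information about the stimulus.
finals = {tuple(march(rng.choice([-1, 1], size=N), J)) for _ in range(5)}
print("distinct final vertices reached from 5 random initial states:", len(finals))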
One obvious attraction of artificial neural networks is their potential technological applications, for which they serve as early feasibility studies. This is an issue that is better left at this stage to popular journalism. See e.g., [2,3,4]. Inasmuch as an individual neuron is interpreted as a computing device, artificial neural networks may provide answers to some of the outstanding questions of parallel computing – the coherent coordination of a multitude of processors. This motivation will also not be discussed here. Instead, such networks will be described below for several other reasons.
To provide a physical environment in which any set of simplifying assumptions about neural networks can be literally implemented. This possibility was raised in Section 1.1.3 in the context of the methodological discussion about verifiability of the theoretical results. Some of it can, of course, be investigated by computer simulations.
In addition, there are various uncontrollable variables which are naturally present in a real system, such as random delays, inhomogeneities of components, etc. In this sense such real networks are one step removed from computer simulation.
In Section 1.2.1, we listed some of the simplifying assumptions involved in the construction of the models discussed so far. Many more assumptions may have been detected by the reader along the way. No amount of lifting of simplifications will closely approximate the full glory of an assembly of real live neurons. Yet, as the grossest assumptions are replaced by more realistic ones and as the model is modified to account for more complex types of behavior without a significant loss in its basic functional features and in its effectiveness, the model gains in plausibility. To recapitulate our general methodological point of view: The lifting of simplifications is not performed as an end in itself. If the more complicated system functions in a qualitative way that can be captured by the simplified system, then the complication is removed and analysis continues with the simplified system.
We shall recognize two types of robustness, related to two types of results:
Robustness of specific properties to perturbation of the underlying parameters.
Robustness of general features to modifications required by more complex functions.
Since this chapter will be primarily concerned with robustness of the first kind, we start by giving examples of situations of the second kind.
Reduction to physics and physics modeling analogues
When physics ventures to describe biological or cognitive phenomena it provokes a fair amount of suspicion. The attempt is sometimes interpreted as the expression of an epistemological dogma which asserts that all natural phenomena are reducible to physical laws; that there is an intrinsic unity of science; that there are no independent levels or languages of description, only more or less detailed ones. The intent of this section is to allay such concern with regard to the present monograph, which remains neutral on the issue of reductionism. Yet, before explaining the conceptual alternative, of analogies to physical concepts, which has informed the work of physicists in the field of neural networks, it is hard to resist a few comments on the general issue of reductionism, as well as an expression of our own commitment.
It should be pointed out that the misgivings about reductionism cast many shadows. Biologists often still harbor traces of vitalism and feel quite uncomfortable at the thought that life, evolution or selection could be described by laws of physics and chemistry. Cognitive scientists resent the reduction of cognitive phenomena both to neurobiology[1,2] and to computer language[3]. A physicist who reads Fodor's proof of the impossibility of reduction between different levels of description should be troubled by the connection that was so ingeniously erected by Boltzmann and Gibbs between the macroscopic phenomena of thermodynamics and the underlying microscopic dynamics of Newton, Maxwell and Planck.
The type of neural network described in the previous chapter is a first prototype in the sense that:
it stores a small number of patterns;
it recalls single patterns only;
once a pattern has been recalled, the system will linger on it until the coming of some unspecified dramatic event.
Such a system may find useful technical applications as a rapid, robust and reliable pattern recognizer. Such devices are discussed in Chapter 10. It seems rather unlikely that they can satisfy one's expectations of a cognitive system.
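For concreteness, a minimal sketch of such a prototype, in the spirit of the model of the previous chapter but with details that are our own assumptions: a few patterns are stored with a simple Hebb-like rule, a single noisy stimulus relaxes onto one stored pattern, and the network then lingers on it, since the recalled pattern is a fixed point of the dynamics.

import numpy as np

rng = np.random.default_rng(2)

N, P = 100, 3                                   # many neurons, only a few stored patterns
patterns = rng.choice([-1, 1], size=(P, N))
J = (patterns.T @ patterns) / N                 # Hebb-like storage of the P patterns
np.fill_diagonal(J, 0.0)

def relax(s, J, sweeps=20):
    """Asynchronous relaxation: repeatedly align single neurons with their local fields."""
    s = s.copy()
    for _ in range(sweeps * len(s)):
        i = rng.integers(len(s))
        s[i] = 1 if J[i] @ s >= 0 else -1
    return s

# A noisy version of pattern 0 acts as the stimulus ...
cue = patterns[0] * rng.choice([1, -1], size=N, p=[0.85, 0.15])
recalled = relax(cue, J)
print("overlap with pattern 0:", recalled @ patterns[0] / N)   # close to 1: a single pattern is recalled

# ... and once recalled the network lingers on it: the pattern is a fixed point of the dynamics.
print("unchanged by further relaxation:", np.array_equal(relax(recalled, J), recalled))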
Very rudimentary introspection gives rise to the impression that, with or without explicit instruction, a single stimulus (or a very short string of stimuli) usually gives rise to a retrieval (or recall) of a whole cascade of connected ‘patterns’. Most striking are effects such as the recall of a tune, which can be provoked by a very simple stimulus, not directly related to the tune itself. Similarly, rather simple stimuli bring about the recall of sequences of numbers, especially in children, or of the alphabet. Moreover, much of the input into the cognitive system seems to be in the form of temporal sequences, rather than single patterns. This appears to be accepted in the study of speech recognition (see e.g., ref. [1]), as well as in vision, where a strong paradigm has it that form is deciphered from motion (see e.g., ref. [2]).
The statistical study of spatial patterns and processes has during the last few years provided a series of challenging problems for theories of statistical inference. Those challenges are the subject of this essay. As befits an essay, the results presented here are not in definitive form; indeed, many of the contributions raise as many questions as they answer. The essay is intended both for specialists in spatial statistics, who will discover much that has been achieved since the author's book (Spatial Statistics, Wiley, 1981), and for theoretical statisticians with an eye for problems arising in statistical practice.
This essay arose from the Adams Prize competition of the University of Cambridge, whose subject for 1985/6 was ‘Spatial and Geometrical Aspects of Probability and Statistics’. (It differs only slightly from the version which was awarded that prize.) The introductory chapter answers the question ‘what's so special about spatial statistics?’ The next three chapters elaborate on this by example, illustrating new difficulties with likelihood inference in spatial Gaussian processes and the dominance of edge effects in the estimation of interaction in point processes. We show by example how Monte Carlo methods can make likelihood methods feasible in problems traditionally thought intractable.
The last two chapters deal with digital images. Here the problems are principally ones of scale, dealing with up to a quarter of a million data points. Chapter 5 takes a very general Bayesian viewpoint and shows the importance of spatial models in encapsulating prior information about images.
Images as data are occurring increasingly frequently in a wide range of scientific disciplines. The scale of the images varies widely, from meteorological satellites, which view scenes thousands of kilometres square, and optical astronomy, which looks at sections of space, down to electron microscopy working at scales of 10 µm or less. However, they all have in common a digital output of an image. With a few exceptions this is on a square grid, so each output measures the image within a small square known as a pixel. The measurement on each pixel can be a greylevel, typically one of 64 or 256 levels of luminance, or a series of greylevels representing luminance in different spectral bands. For example, earth resources satellites use luminance in the visual and infrared bands, typically four to seven numbers in total. One may of course use three bands to represent red, blue and green and so record an arbitrary colour on each pixel.
The resolution (the size of each pixel, hence the number of pixels per scene) is often limited by hardware considerations in the sensors. Optical astronomers now use 512 × 512 arrays of CCD (charge coupled device) sensors to replace photographic plates. The size of the pixels is limited by physical problems and also by the fact that these detectors count photons, so random events limit the practicable precision. In many other applications the limiting factor is digital communication speed. Digital images can be enormous in data-processing terms.
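To give a feel for the numbers, a small back-of-the-envelope calculation (our own arithmetic, not the essay's), assuming one byte per greylevel, i.e. up to 256 levels per band:

# Rough storage arithmetic for the pixel rasters described above
# (an illustration only, assuming one byte per greylevel).

def image_bytes(rows: int, cols: int, bands: int = 1, bytes_per_level: int = 1) -> int:
    """Storage for a rectangular pixel raster with one greylevel per spectral band."""
    return rows * cols * bands * bytes_per_level

# A single-band 512 x 512 CCD frame, as in optical astronomy:
print(image_bytes(512, 512))            # 262144 bytes, i.e. about a quarter of a million values

# An earth-resources scene recorded in, say, seven spectral bands on the same grid:
print(image_bytes(512, 512, bands=7))   # 1835008 bytes, close to 2 MB per scene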
This essay aims to bring out some of the distinctive features and special problems of statistical inference on spatial processes. Realistic spatial stochastic processes are so far removed from the classical domain of statistical theory (sequences of independent, identically distributed observations) that they can provide a rather severe test of classical methods. Although much of the literature has been very negative about the problem, a few methods have emerged in this field which have spread to many other complex statistical problems. There is a sense in which spatial problems are currently the test bed for ideas in inference on complex stochastic systems.
Our definition of ‘spatial process’ is wide. It certainly includes all the areas of the author's monograph (Ripley, 1981), as well as more recent problems in image processing and analysis. Digital images are recorded as a set of observations (black/white, greylevel, colour…) on a square or hexagonal lattice. As such, they differ only in scale from other spatial phenomena which are sampled on a regular grid. Now the difference in scale is important, but it has become clear that it is fruitful to regard imaging problems from the viewpoint of spatial statistics, and this has been done quite extensively within the last five years.
Much of our consideration depends only on geometrical aspects of spatial patterns and processes.