In the second part of this book (Chapters 6–9), we treat vision as a machine learning problem and disregard everything we know about the creation of the image. For example, we will not exploit our understanding of perspective projection or light transport. Instead, we treat vision as pattern recognition; we interpret new image data based on prior experience of images in which the contents were known. We divide this process into two parts: in learning, we model the relationship between the image data and the scene content; in inference, we exploit this relationship to predict the contents of new images.
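To make the learning/inference split concrete, here is a minimal sketch in Python. The classifier choice (logistic regression via scikit-learn) and the synthetic data are illustrative assumptions for this sketch only, not a method prescribed by the book:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-in for labeled training images: each row is a flattened
# "image" (random features here), each label is the known scene content
# (e.g., face = 1, non-face = 0). Data and names are illustrative only.
rng = np.random.default_rng(0)
train_images = rng.normal(size=(200, 64))             # 200 images, 64 pixels each
train_labels = (train_images[:, 0] > 0).astype(int)   # synthetic ground truth

# Learning: model the relationship between image data and scene content.
model = LogisticRegression()
model.fit(train_images, train_labels)

# Inference: exploit that relationship to predict the content of a new image.
new_image = rng.normal(size=(1, 64))
predicted_content = model.predict(new_image)
posterior = model.predict_proba(new_image)  # Pr(content | image data)
print(predicted_content, posterior)
```

Note that nothing in this pipeline knows how the image was formed; the prediction rests entirely on the statistical relationship extracted from the labeled examples.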
It may seem odd to abandon useful knowledge about image creation, but the logic of this approach is twofold. First, the same learning and inference techniques will underpin our algorithms when image formation is taken into account. Second, it is possible to achieve a great deal with a pure learning approach to vision. For many tasks, knowledge of the image formation process is genuinely unnecessary.
The structure of Part II is as follows. In Chapter 6 we present a taxonomy of models that relate the measured image data and the actual scene content. In particular, we distinguish between generative models and discriminative models. For generative models, we build a probability model of the data and parameterize it by the scene content. For discriminative models, we build a probability model of the scene content and parameterize it by the data.
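In probabilistic terms (writing $x$ for the measured image data and $w$ for the scene content; this symbol choice is our own shorthand here, not fixed by the passage), the distinction can be summarized as
\[
\text{generative:}\quad \Pr(x \mid w), \qquad\qquad \text{discriminative:}\quad \Pr(w \mid x),
\]
and Bayes' rule connects the two:
\[
\Pr(w \mid x) \;=\; \frac{\Pr(x \mid w)\,\Pr(w)}{\Pr(x)}.
\]
A generative model can thus still be used to infer the scene content, at the cost of also modeling the prior $\Pr(w)$ and evaluating $\Pr(x)$; a discriminative model targets $\Pr(w \mid x)$ directly.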