• Print publication year: 1990
• Online publication date: January 2013

### 10 - Investigating multiple regression by additive models

from PART III - Smoothing in high dimensions
Summary

While it is possible to encode several more dimensions into pictures by using time (motion), color, and various symbols (glyphs), the human perceptual system is not really prepared to deal with more than three continuous dimensions simultaneously. Huber, P.J. (1985, p. 437)


The basic idea of scatter plot smoothing can be extended to higher dimensions in a straightforward way. Theoretically, regression smoothing for a d-dimensional predictor can be performed just as in the case of a one-dimensional predictor, and the local averaging procedure will still give asymptotically consistent approximations to the regression surface. However, there are two major problems with this approach to multiple regression smoothing. First, the regression function m(x) is a high-dimensional surface, and since its form cannot be displayed for d > 2, it does not provide a geometrical description of the regression relationship between X and Y. Second, the basic element of nonparametric smoothing, averaging over neighborhoods, will often be applied to a relatively meager set of points, since even samples of size n ≥ 1000 are surprisingly sparsely distributed in higher-dimensional Euclidean space. The following two examples by Werner Stuetzle exhibit this “curse of dimensionality.”

A possible procedure for estimating two-dimensional surfaces could be to find the smallest rectangle with axis-parallel sides containing all the predictor vectors and to lay down a regular grid on this rectangle. This gives a total of one hundred cells if one cuts each side of a two-dimensional rectangle into ten pieces. Each inner cell will have eight neighboring cells. If one carried out this procedure in ten dimensions, there would be a total of $10^{10}$ = 10,000,000,000 cells, and each inner cell would have $3^{10} - 1$ = 59,048 neighboring cells. In other words, it will be hard to find neighboring observations in ten dimensions!
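The counting argument above can be checked with a few lines of Python. This is only a minimal sketch: the function names (`grid_cells`, `inner_neighbors`, `points_per_cell`) are hypothetical, and the average number of observations per cell assumes, purely for illustration, that the n observations are spread evenly over the grid, which the text does not state.

```python
def grid_cells(d: int, k: int = 10) -> int:
    """Number of cells when each of the d sides is cut into k pieces."""
    return k ** d


def inner_neighbors(d: int) -> int:
    """Number of cells touching an inner cell (sharing a face, edge, or corner).

    Each coordinate of a neighbor differs by -1, 0, or +1 from the cell itself,
    which gives 3**d combinations; subtracting 1 excludes the cell itself.
    """
    return 3 ** d - 1


def points_per_cell(n: int, d: int, k: int = 10) -> float:
    """Average observations per cell, ASSUMING an even spread (illustrative only)."""
    return n / grid_cells(d, k)


if __name__ == "__main__":
    n = 1000  # sample size of the order mentioned in the text
    for d in (1, 2, 3, 10):
        print(
            f"d={d:2d}  cells={grid_cells(d):>14,d}  "
            f"neighbors={inner_neighbors(d):>6,d}  "
            f"avg points per cell={points_per_cell(n, d):.1e}"
        )
```

Whatever the actual distribution of the predictors, n = 1000 observations can occupy at most 1000 of the $10^{10}$ cells in ten dimensions, so almost every cell, and almost every neighborhood built from adjacent cells, is empty.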
