Investigating multiple regression by additive models

doi:10.1017/CCOL0521382483.010

10 - Investigating multiple regression by additive models

from PART III - Smoothing in high dimensions

Published online by Cambridge University Press: 05 January 2013

Wolfgang Härdle


Wolfgang Härdle: Affiliation:
Rheinische Friedrich-Wilhelms-Universität Bonn


Summary

While it is possible to encode several more dimensions into pictures by using time (motion), color, and various symbols (glyphs), the human perceptual system is not really prepared to deal with more than three continuous dimensions simultaneously.

Huber, P.J. (1985, p. 437)

The basic idea of scatter plot smoothing can be extended to higher dimensions in a straightforward way. Theoretically, regression smoothing for a d-dimensional predictor can be performed as in the case of a one-dimensional predictor. The local averaging procedure will still give asymptotically consistent approximations to the regression surface. However, there are two major problems with this approach to multiple regression smoothing. First, the regression function m(x) is a high-dimensional surface, and since its form cannot be displayed for d > 2, it does not provide a geometrical description of the regression relationship between X and Y. Second, the basic element of nonparametric smoothing, averaging over neighborhoods, will often be applied to a relatively meager set of points, since even samples of size n ≥ 1000 are surprisingly sparsely distributed in higher-dimensional Euclidean space. The following two examples by Werner Stuetzle exhibit this “curse of dimensionality.”

A possible procedure for estimating two-dimensional surfaces could be to find the smallest rectangle with axis-parallel sides containing all the predictor vectors and to lay down a regular grid on this rectangle. This gives a total of one hundred cells if one cuts each side of a two-dimensional rectangle into ten pieces. Each inner cell will have eight neighboring cells. If one carried out this procedure in ten dimensions, there would be a total of 10^10 = 10,000,000,000 cells and each inner cell would have 3^10 - 1 = 59,048 neighboring cells. In other words, it will be hard to find neighboring observations in ten dimensions!
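The cell counts in this example can be checked with a short Python sketch. The function names are hypothetical, and the occupancy estimate rests on an added assumption that is not part of the example above: that the n observations are spread uniformly over the grid.

```python
def grid_counts(d, pieces=10):
    """Return (total cells, neighbors of an inner cell) for a regular grid
    that cuts each of the d sides into `pieces` pieces."""
    total_cells = pieces ** d
    # An inner cell sits at the center of a 3 x ... x 3 block of cells;
    # every cell of that block except the center is a neighbor.
    inner_neighbors = 3 ** d - 1
    return total_cells, inner_neighbors


def expected_points_near_inner_cell(d, n=1000, pieces=10):
    """Expected number of observations in an inner cell plus its neighbors.

    Assumption (illustrative only): the n observations are uniformly
    distributed over the grid, so each cell receives n / pieces**d
    observations on average.
    """
    return n * (3 ** d) / (pieces ** d)


if __name__ == "__main__":
    for d in (2, 10):
        total, neighbors = grid_counts(d)
        expected = expected_points_near_inner_cell(d)
        print(f"d={d}: cells={total}, neighbors={neighbors}, "
              f"expected points nearby={expected}")
```

Under the uniformity assumption, a sample of n = 1000 puts about 90 observations in the neighborhood of an inner cell when d = 2, but far fewer than one when d = 10, which is the sparsity the example describes.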

Information

Type: Chapter
Information: Applied Nonparametric Regression, pp. 257-288

DOI: https://doi.org/10.1017/CCOL0521382483.010

Publisher: Cambridge University Press

Print publication year: 1990
