Parallelisation of sparse grids for large scale data analysis

Jochen Garcke; Markus Hegland; Ole Nielsen

doi:10.1017/S1446181100003382

Published online by Cambridge University Press: 17 February 2009

Jochen Garcke: Institut für Numerische Simulation, Rheinische Friedrich-Wilhelms-Universität Bonn, Wegelerstr. 6, 53115 Bonn, Germany; e-mail: garcke@ins.uni-bonn.de. Centre for Mathematics and its Applications, Mathematical Sciences Institute, Australian National University, Canberra ACT 0200, Australia; e-mail: jochen.garcke@anu.edu.au
Markus Hegland: Centre for Mathematics and its Applications, Mathematical Sciences Institute, Australian National University, Canberra ACT 0200, Australia; e-mail: jochen.garcke@anu.edu.au
Ole Nielsen: Centre for Mathematics and its Applications, Mathematical Sciences Institute, Australian National University, Canberra ACT 0200, Australia; e-mail: jochen.garcke@anu.edu.au

Abstract

Sparse grids are the basis for efficient high dimensional approximation and have recently been applied successfully to predictive modelling. They are spanned by a collection of simpler function spaces represented by regular grids. The sparse grid combination technique prescribes how approximations on a collection of anisotropic grids can be combined to approximate high dimensional functions.
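As an illustration of the combination technique mentioned above, the following minimal sketch enumerates the anisotropic grids and their combination coefficients. It assumes the classical form of the technique, in which grids with level multi-index l (each l_i ≥ 1) and |l|_1 = n + (d − 1) − q receive the coefficient (−1)^q · C(d − 1, q) for q = 0, …, d − 1. The function name `combination_grids` and the level convention are assumptions made for this example, not taken from the paper.

```python
from itertools import product
from math import comb


def combination_grids(d, n):
    """Map each level multi-index to its combination coefficient.

    Assumes the classical combination technique with levels l_i >= 1:
    grids with |l|_1 = n + (d - 1) - q get coefficient (-1)^q * C(d-1, q).
    """
    grids = {}
    for q in range(d):
        target = n + (d - 1) - q              # required sum of the levels
        coeff = (-1) ** q * comb(d - 1, q)    # alternating binomial weight
        # Brute-force enumeration of all multi-indices with the right sum;
        # adequate for a small illustration, not for large d.
        for level in product(range(1, target + 1), repeat=d):
            if sum(level) == target:
                grids[level] = coeff
    return grids


if __name__ == "__main__":
    for level, coeff in combination_grids(2, 3).items():
        print(level, coeff)
```

For d = 2 and n = 3 this gives the three grids (1, 3), (2, 2), (3, 1) with coefficient +1 and the two grids (1, 2), (2, 1) with coefficient −1. The coefficients sum to 1, so a function that every partial grid reproduces exactly is also reproduced by the combination.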

In this paper we study the parallelisation of fitting data onto a sparse grid. The computation can be done entirely by fitting partial models on a collection of regular grids, which allows parallelism over the collection of grids. In addition, each of the partial grid fits can itself be parallelised, both in the assembly phase, where the work is distributed over the data, and in the solution stage, using traditional parallel solvers for the resulting PDEs. Using a simple timing model, we confirm that the most effective methods are obtained when both types of parallelism are used.
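The parallelism over the collection of grids can be sketched as follows. This is a sketch of the task structure only, under stated assumptions: the partial data fit is replaced by multilinear interpolation of a known function on each grid (a stand-in, not the paper's fitting procedure), the grid collection is passed in as a dictionary from level multi-index to combination coefficient, and the names `interpolate_on_grid` and `combined_value` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def interpolate_on_grid(f, level, x):
    """Multilinear interpolation of f at x in [0, 1]^d on the regular grid
    with mesh width 2**-level[i] in dimension i (stand-in for a partial fit)."""
    cells, weights = [], []
    for l_i, x_i in zip(level, x):
        m = 2 ** l_i                       # number of cells in this dimension
        k = min(int(x_i * m), m - 1)       # index of the cell containing x_i
        t = x_i * m - k                    # local coordinate within the cell
        cells.append((k / m, (k + 1) / m))
        weights.append((1.0 - t, t))
    value = 0.0
    # Sum over the 2^d corners of the cell, weighted by the hat functions.
    for corner in product((0, 1), repeat=len(level)):
        w = 1.0
        point = []
        for dim, c in enumerate(corner):
            w *= weights[dim][c]
            point.append(cells[dim][c])
        value += w * f(point)
    return value


def combined_value(f, grids, x, workers=4):
    """Evaluate the partial approximations concurrently, one task per grid,
    then combine them with the given coefficients."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda level: interpolate_on_grid(f, level, x), grids)
        return sum(c * v for c, v in zip(grids.values(), partial))


if __name__ == "__main__":
    # Grids and coefficients of the classical combination for d = 2, n = 3.
    grids = {(1, 3): 1, (2, 2): 1, (3, 1): 1, (1, 2): -1, (2, 1): -1}
    print(combined_value(lambda p: p[0] * p[1] + 2 * p[0] + 1, grids, (0.3, 0.7)))
```

The partial problems are independent, so one task per grid needs no communication until the final weighted sum. A thread pool is used here only to keep the example self-contained; CPU-bound work in CPython would normally be distributed over processes or a message-passing layer instead.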

Keywords

predictive modelling; sparse grids; parallelism; numerical linear algebra

Information

Type: Research Article
Information: The ANZIAM Journal, Volume 48, Issue 1, July 2006, pp. 11–22

DOI: https://doi.org/10.1017/S1446181100003382
Copyright: Copyright © Australian Mathematical Society 2006
