GPU-Accelerated LOBPCG Method with Inexact Null-Space Filtering for Solving Generalized Eigenvalue Problems in Computational Electromagnetics Analysis with Higher-Order FEM

A. Dziekonski; M. Rewienski; P. Sypek; A. Lamecki; M. Mrozowski

doi:10.4208/cicp.OA-2016-0168

Part of: Partial differential equations, boundary value problems; Numerical methods; Numerical linear algebra; Computer aspects of numerical algorithms; Algorithms - Computer Science

Published online by Cambridge University Press: 28 July 2017

A. Dziekonski*, M. Rewienski*, P. Sypek*, A. Lamecki*, M. Mrozowski*: Affiliation:
Department of Microwave and Antenna Engineering, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Gdansk 80-23, Poland; CUDA Research Center for Computational Electromagnetics at Gdansk University of Technology (2012-2016)
*Corresponding author. Email addresses: adziek@eti.pg.gda.pl (A. Dziekonski), mrewiens@eti.pg.gda.pl (M. Rewienski), psypek@eti.pg.gda.pl (P. Sypek), adam.lamecki@ieee.org (A. Lamecki), m.mrozowski@ieee.org (M. Mrozowski)

Abstract

This paper presents a GPU-accelerated implementation of the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) method with an inexact nullspace filtering approach to find eigenvalues in electromagnetics analysis with higher-order FEM. The performance of the proposed approach is verified using the Kepler (Tesla K40c) graphics accelerator, and is compared to the performance of the implementation based on functions from the Intel MKL on the Intel Xeon (E5-2680 v3, 12 threads) central processing unit (CPU) executed in parallel mode. Compared to the CPU reference implementation based on the Intel MKL functions, the proposed GPU-based LOBPCG method with inexact nullspace filtering allowed us to achieve up to 2.9-fold acceleration.
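For orientation, the kind of computation the abstract refers to can be sketched on the CPU with SciPy's `lobpcg` routine. This is only a minimal illustration, not the authors' GPU implementation: the diagonal matrices `A` and `B`, the Jacobi-style preconditioner `M`, and the function name `smallest_generalized_eigs` are all assumptions made for the example, and the inexact nullspace filtering and multilevel preconditioning described in the paper are not reproduced here.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import lobpcg


def smallest_generalized_eigs(n=100, k=3, seed=0):
    """Find the k smallest eigenvalues of A x = lambda B x with LOBPCG.

    A and B are toy diagonal SPD matrices (hypothetical stand-ins for the
    FEM stiffness and mass matrices, which are not available here).
    """
    a_diag = np.arange(1, n + 1, dtype=float)
    A = diags(a_diag)              # stiffness-like matrix, SPD
    B = diags(np.full(n, 2.0))     # mass-like matrix, SPD
    M = diags(1.0 / a_diag)        # simple preconditioner approximating inv(A)

    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, k))  # random initial block of k vectors

    # largest=False requests the smallest eigenvalues of the pencil (A, B).
    vals, _vecs = lobpcg(A, X, B=B, M=M, largest=False, tol=1e-8, maxiter=200)
    return np.sort(vals)


if __name__ == "__main__":
    print(smallest_generalized_eigs())
```

For this toy pencil the exact eigenvalues are i/2 for i = 1, ..., n, so the three smallest are 0.5, 1.0 and 1.5, which gives a simple check on the iteration.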

Keywords

LOBPCG; inexact nullspace filtering; multilevel preconditioning; FEM; GPU; parallel computing

MSC classification

Secondary: 65N25: Eigenvalue problems; 68W10: Parallel algorithms; 65Y05: Parallel computation; 65F08: Preconditioners for iterative methods; 65F10: Iterative methods for linear systems; 74S05: Finite element methods

Type: Research Article
Information: Communications in Computational Physics, Volume 22, Issue 4, October 2017, pp. 997-1014

DOI: https://doi.org/10.4208/cicp.OA-2016-0168
Copyright: Copyright © Global-Science Press 2017
