Search results for Scientific Computing, Scientific Software

1 - Introduction to GPU Kernels and Hardware
Richard Ansorge, University of Cambridge
Book:

Programming in Parallel with CUDA

Published online:

04 May 2022

Print publication:

02 June 2022, pp 1-21
- Chapter
Summary

The key to parallel programming is sharing a task between many cooperating threads running in parallel. A chart is presented showing how, since 2003, the Moore’s law growth in computing performance has depended on parallel computing. This chapter includes a simple introductory CUDA example which performs numerical integration using 1 000 000 000 threads. Using CUDA gives a speed-up of about 1000 compared to a single CPU thread. Key CUDA concepts including thread blocks, thread grids and warps are introduced. The hardware differences between conventional CPU architectures and GPUs are then discussed. Optimisations in memory caching on GPUs are also explained, as memory access time is often a key performance constraint. The use of OpenMP to share a single task across all cores of a multicore CPU is also discussed.
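
As a rough illustration of sharing one task between many workers, the sketch below is a plain Python stand-in and not the chapter's CUDA kernel. It assumes the integrand is sin(x) on [0, π] and uses the midpoint rule; the names `term` and `integrate`, and the strided split across `workers`, are hypothetical choices made for this example.

```python
import math

def term(i, steps, a, b, f):
    """Work done by one logical thread: evaluate f at the midpoint of slice i."""
    width = (b - a) / steps
    return f(a + (i + 0.5) * width) * width

def integrate(f, a, b, steps, workers=4):
    """Midpoint-rule integral; each worker sums a strided share of the slices."""
    partial = [0.0] * workers
    for w in range(workers):            # on a GPU these shares would run concurrently
        for i in range(w, steps, workers):
            partial[w] += term(i, steps, a, b, f)
    return sum(partial)                 # final reduction over the partial sums

estimate = integrate(math.sin, 0.0, math.pi, steps=100_000)
```

Here the loop over `w` runs serially; in a real kernel each slice would map to its own thread and the partial sums would be combined by a parallel reduction.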

7 - Concurrency Using CUDA Streams and Events
Richard Ansorge, University of Cambridge
Book:

Programming in Parallel with CUDA

Published online:

04 May 2022

Print publication:

02 June 2022, pp 209-238
- Chapter
Summary

Chapter 7 explores the ability of GPUs to perform multiple tasks simultaneously, including overlapping IO with computation and the simultaneous running of multiple kernels. CUDA streams and events are advanced features that allow users to manage multiple asynchronous tasks running on the GPU. Examples are given and the NVIDIA visual profiler (NVVP) is used to visualise the timeline for tasks in multiple CUDA streams. Asynchronous disk IO on the host PC can also be performed and examples using the C++ <thread> library are given. Finally, the new CUDA graphs feature is introduced. This provides a wrapper for efficiently launching large numbers of kernel calls for complex workloads.

Appendix B - Atomic Operations
Richard Ansorge, University of Cambridge
Book:

Programming in Parallel with CUDA

Published online:

04 May 2022

Print publication:

02 June 2022, pp 382-386
- Chapter
Summary

Appendix B discusses the role of atomic operations in parallel computing and the available functions in CUDA. An example is provided showing the use of atomicCAS to implement another atomic operation.
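
The compare-and-swap retry loop behind this idea can be sketched in plain Python. This is a simulation, not CUDA code: the `AtomicCell` class is hypothetical and uses a lock to imitate hardware atomicity, and atomic maximum is an assumed example, since the summary does not say which operation the appendix builds. Like atomicCAS, `compare_and_swap` returns the value it found.

```python
import threading

class AtomicCell:
    """Toy stand-in for one word of device memory (a lock simulates hardware atomicity)."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the cell still holds `expected`; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old

def atomic_max(cell, candidate):
    """Atomic maximum built from compare-and-swap alone."""
    old = cell.load()
    while candidate > old:
        seen = cell.compare_and_swap(old, candidate)
        if seen == old:      # the swap succeeded
            break
        old = seen           # another thread got in first; retry against its value
    return old

cell = AtomicCell(0)
threads = [threading.Thread(target=atomic_max, args=(cell, v)) for v in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The loop retries only while the candidate would still change the stored value, so threads carrying smaller values exit without writing.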

11 - Tensor Cores
Richard Ansorge, University of Cambridge
Book:

Programming in Parallel with CUDA

Published online:

04 May 2022

Print publication:

02 June 2022, pp 358-372
- Chapter
Summary

This chapter discusses the tensor core hardware available on newer GPUs. This hardware is designed to perform fast mixed precision matrix multiplications and is intended for applications in AI. However, CUDA exposes their use to programmers with the warp matrix function library. These functions support tiled matrix multiplication using 16 × 16 tiles. We provide examples of their use to improve on the early matrix multiplication example in Chapter 2. We also show how reduction operations can be performed using tensor cores as a potential non-AI application.

6 - Monte Carlo Applications
Richard Ansorge, University of Cambridge
Book:

Programming in Parallel with CUDA

Published online:

04 May 2022

Print publication:

02 June 2022, pp 178-208
- Chapter
Summary

Chapter 6 explains the CUDA random number generators provided by the cuRAND library. The CUDA XORWOW generator was found to be the fastest generator in the cuRAND library. The classic calculation of pi by generating random numbers inside a square is used as a test case for the various possibilities on both host CPU and the GPU. A kernel using separate generators for each thread is able to generate about 10^12 random numbers per second and is about 20 000 times faster than the simplest host CPU version running on a single core. The inverse transform method for generating random numbers from any distribution is explained. A 3D Ising model calculation is presented as a more interesting application of random numbers. The Ising example has a simple interactive GUI based on OpenCV.
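
Both techniques named in this summary can be sketched on the host in a few lines of Python. This is a minimal illustration using the standard `random` module rather than cuRAND; the function names, the seeds, and the choice of an exponential distribution for the inverse transform are assumptions made for the example.

```python
import math
import random

def estimate_pi(samples, seed=12345):
    """Fraction of uniform points in the unit square that fall inside the quarter circle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1.0:
            hits += 1
    return 4.0 * hits / samples

def sample_exponential(rate, rng):
    """Inverse transform: solve u = 1 - exp(-rate * x) for x, with u uniform in [0, 1)."""
    u = rng.random()
    return -math.log(1.0 - u) / rate

pi_estimate = estimate_pi(200_000)
rng = random.Random(2022)
draws = [sample_exponential(2.0, rng) for _ in range(100_000)]
mean_draw = sum(draws) / len(draws)   # should be close to 1 / rate = 0.5
```

On a GPU each thread would own its own generator state and accumulate its own hit count, with the counts summed at the end.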

4 - Parallel Stencils
Richard Ansorge, University of Cambridge
Book:

Programming in Parallel with CUDA

Published online:

04 May 2022

Print publication:

02 June 2022, pp 106-141
- Chapter
Summary

The solution of partial differential equations in two and three dimensions using stencil iteration (Jacobi’s method) is discussed and illustrated for Laplace’s equation. A very simple kernel gives about a factor of 100 speed-up compared to the host CPU. The very slow convergence of the Jacobi method can be addressed by using solutions on lower resolution grids to initialise higher resolution grids. A convergence check using the maximum change per iteration is also illustrated. Digital image processing is another example of stencil use and a number of digital image filters are shown including the Sobel filter for edge finding and the median filter for noise reduction. The fast GPU-based median filter uses one thread per image pixel and is implemented using an optimal Batcher network.
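
The stencil iteration and the maximum-change convergence check can be sketched in plain Python as follows. This is a serial illustration, not the chapter's kernel; the grid size, the boundary values (top edge held at 1, the other edges at 0), the tolerance, and the function name `jacobi_laplace` are all assumptions chosen for the example.

```python
def jacobi_laplace(n=17, tol=1e-6, max_iters=10_000):
    """Solve Laplace's equation on an n x n grid; top edge held at 1, other edges at 0."""
    grid = [[0.0] * n for _ in range(n)]
    for j in range(1, n - 1):
        grid[0][j] = 1.0                      # fixed boundary values
    for iteration in range(1, max_iters + 1):
        new = [row[:] for row in grid]        # Jacobi reads old values, writes new ones
        max_change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):         # on a GPU: one thread per interior point
                new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                    + grid[i][j - 1] + grid[i][j + 1])
                max_change = max(max_change, abs(new[i][j] - grid[i][j]))
        grid = new
        if max_change < tol:                  # convergence check on the largest update
            return grid, iteration
    return grid, max_iters

solution, iterations = jacobi_laplace()
centre = solution[8][8]                       # by symmetry this approaches 0.25
```

Even this small grid needs several hundred sweeps to converge, which illustrates the slow convergence that the coarse-to-fine initialisation mentioned above is meant to address.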

Examples
Richard Ansorge, University of Cambridge
Book:

Programming in Parallel with CUDA

Published online:

04 May 2022

Print publication:

02 June 2022, pp xv-xviii
- Chapter

Figures
Richard Ansorge, University of Cambridge
Book:

Programming in Parallel with CUDA

Published online:

04 May 2022

Print publication:

02 June 2022, pp x-xii
- Chapter

Index
Richard Ansorge, University of Cambridge
Book:

Programming in Parallel with CUDA

Published online:

04 May 2022

Print publication:

02 June 2022, pp 448-454
- Chapter

Programming in Parallel with CUDA

A Practical Guide
Richard Ansorge
Published online:

04 May 2022

Print publication:

02 June 2022
- Book
CUDA is now the dominant language used for programming GPUs, one of the most exciting hardware developments of recent decades. With CUDA, you can use a desktop PC for work that would have previously required a large cluster of PCs or access to an HPC facility. As a result, CUDA is increasingly important in scientific and technical computing across the whole STEM community, from medical physics and financial modelling to big data applications and beyond. This unique book on CUDA draws on the author's passion for and long experience of developing and using computers to acquire and analyse scientific data. The result is an innovative text featuring a much richer set of examples than found in any other comparable book on GPU computing. Much attention has been paid to the C++ coding style, which is compact, elegant and efficient. A code base of examples and supporting material is available online, which readers can build on for their own projects.

1 - Getting Started
Bruce F. Torrence, Randolph-Macon College, Virginia, Eve A. Torrence, Randolph-Macon College, Virginia
Book:

The Student's Introduction to Mathematica and the Wolfram Language

Published online:

01 April 2019

Print publication:

16 May 2019, pp 1-26
- Chapter
Summary

An introduction to the syntax and conventions of Mathematica and the Wolfram Language, with tips to get new users up and running. The Basic Math Assistant palette is discussed in some depth.

4 - Algebra
Bruce F. Torrence, Randolph-Macon College, Virginia, Eve A. Torrence, Randolph-Macon College, Virginia
Book:

The Student's Introduction to Mathematica and the Wolfram Language

Published online:

01 April 2019

Print publication:

16 May 2019, pp 145-190
- Chapter
Summary

Using Mathematica and the Wolfram Language to engage with the algebra encountered in a precalculus or college algebra setting. Includes solving equations and simplifying expressions.

9 - 3D Printing
Bruce F. Torrence, Randolph-Macon College, Virginia, Eve A. Torrence, Randolph-Macon College, Virginia
Book:

The Student's Introduction to Mathematica and the Wolfram Language

Published online:

01 April 2019

Print publication:

16 May 2019, pp 457-522
- Chapter
Summary

An introduction to the computational geometry commands in the Wolfram Language with an eye toward creating high quality, watertight, 3D printable meshes. Numerous examples illustrate the ideas.

3 - Functions and Their Graphs
Bruce F. Torrence, Randolph-Macon College, Virginia, Eve A. Torrence, Randolph-Macon College, Virginia
Book:

The Student's Introduction to Mathematica and the Wolfram Language

Published online:

01 April 2019

Print publication:

16 May 2019, pp 51-144
- Chapter
Summary

Using Mathematica and the Wolfram Language to investigate mathematical functions and their graphs, create tables of values, and work with real-world data.

Index
Bruce F. Torrence, Randolph-Macon College, Virginia, Eve A. Torrence, Randolph-Macon College, Virginia
Book:

The Student's Introduction to Mathematica and the Wolfram Language

Published online:

01 April 2019

Print publication:

16 May 2019, pp 523-534
- Chapter

Frontmatter
Bruce F. Torrence, Randolph-Macon College, Virginia, Eve A. Torrence, Randolph-Macon College, Virginia
Book:

The Student's Introduction to Mathematica and the Wolfram Language

Published online:

01 April 2019

Print publication:

16 May 2019, pp i-iv
- Chapter

2 - Working with Mathematica
Bruce F. Torrence, Randolph-Macon College, Virginia, Eve A. Torrence, Randolph-Macon College, Virginia
Book:

The Student's Introduction to Mathematica and the Wolfram Language

Published online:

01 April 2019

Print publication:

16 May 2019, pp 27-50
- Chapter
Summary

Practical information and tips for using Mathematica and the Wolfram Language. Document creation, slideshow presentations, keyboard shortcuts, documentation, and troubleshooting are discussed.

Contents
Bruce F. Torrence, Randolph-Macon College, Virginia, Eve A. Torrence, Randolph-Macon College, Virginia
Book:

The Student's Introduction to Mathematica and the Wolfram Language

Published online:

01 April 2019

Print publication:

16 May 2019, pp vii-x
- Chapter

5 - Calculus
Bruce F. Torrence, Randolph-Macon College, Virginia, Eve A. Torrence, Randolph-Macon College, Virginia
Book:

The Student's Introduction to Mathematica and the Wolfram Language

Published online:

01 April 2019

Print publication:

16 May 2019, pp 191-248
- Chapter
Summary

Using Mathematica and the Wolfram Language to engage with the calculus of functions of a single variable. Includes limits, continuity, differentiation, integration, sequences, and series.

Preface
Bruce F. Torrence, Randolph-Macon College, Virginia, Eve A. Torrence, Randolph-Macon College, Virginia
Book:

The Student's Introduction to Mathematica and the Wolfram Language

Published online:

01 April 2019

Print publication:

16 May 2019, pp xi-xiv
- Chapter

849 results in Scientific Computing, Scientific Software