Chapter 7: Parallel Algorithms and Techniques

pp. 211-268

Authors

Indian Institute of Technology, Delhi

Extract

This chapter introduces some general principles of parallel algorithm design. We will consider a few case studies to illustrate broad approaches to parallel algorithms. As already discussed in Chapter 5, the underlying goal for these algorithms is to decompose the solution into parcels of relatively independent computation, with occasional interaction. In order to abstract the details of synchronization, we will assume the parallel RAM (PRAM) or the bulk-synchronous parallel (BSP) model to describe and analyze these algorithms. This is a good time to recall that going from, say, a PRAM algorithm to one that is efficient on a particular architecture requires refinement and careful design for that platform. This is particularly true when “constant time” concurrent read and write operations are assumed. Concurrent reads and writes are particularly inefficient on distributed-memory platforms, and are inefficient on shared-memory platforms as well: they require synchronization of the processors’ views of the shared memory, which can be expensive.
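For concreteness, the following is a minimal sketch (not taken from the book) of how a CRCW-style concurrent write has to be emulated on an actual shared-memory machine: several threads attempt to write the maximum of their values into one shared cell, and the hardware serializes the updates through an atomic compare-and-swap, so the model's "constant time" step is not free in practice. The thread count, the sample values, and the write_max helper are illustrative choices, not anything specified in the chapter.

```cpp
// Minimal sketch (assumption, not the book's code): emulating a CRCW-style
// concurrent "write the maximum" into one shared cell on a real
// shared-memory machine. The atomic compare-and-swap serializes the
// conflicting writes, which is exactly the synchronization cost the
// PRAM model abstracts away.
#include <atomic>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    std::vector<int> values = {7, 42, 3, 19, 28, 5, 33, 11};  // illustrative
    std::atomic<int> cell{0};   // the single shared memory cell

    auto write_max = [&](int v) {
        int cur = cell.load();
        // Retry until our value is no longer larger, or until our CAS wins.
        while (v > cur && !cell.compare_exchange_weak(cur, v)) {
            // cur is refreshed with the current cell value on failure
        }
    };

    std::vector<std::thread> threads;
    for (int v : values) threads.emplace_back(write_max, v);
    for (auto& t : threads) t.join();

    std::cout << "max written concurrently = " << cell.load() << "\n";  // 42
}
```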

Question: How do parallel algorithms differ from sequential algorithms?

Recall that PRAM models focus mainly on the computational aspect of an algorithm, whereas practical algorithms also require close attention to memory, communication, and synchronization overheads. PRAM algorithms may not always be practical, but they are easier to design than those for more general models. In reality, PRAM algorithms are only the first step toward more practical algorithms, particularly on distributed-memory systems.

Parallel algorithm design often seeks to maximize parallelism and minimize the time complexity. Even if the number of processors actually available is limited, higher parallelism translates to higher scalability in practice. Nonetheless, the work-time scheduling principle (Section 3.5) indicates that low work complexity is paramount for fast execution in practice. In general, if the best sequential complexity of solving the given problem is, say, To(n), we would like the parallel work complexity to be O(To(n)). It is a common algorithm design pattern to assume up to To(n) processors and then try to minimize the time complexity. With maximal parallelism, the target time complexity using To(n) processors is O(1). This is not always achievable, and there is often a trade-off between time and work complexity. We then try to reduce the work complexity to O(To(n)) without significantly increasing the time complexity.
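As an illustration of this trade-off, here is a minimal sketch (not the book's algorithm) of a work-optimal parallel sum: rather than using one processor per element, the n elements are scheduled onto p threads, each doing O(n/p) sequential work on its own block, followed by an O(p) combining step (O(log p) with a tree). The total work stays O(n), matching the sequential complexity, while the time is O(n/p + p). The function name parallel_sum, the choice of p = 8, and the use of std::thread are illustrative assumptions.

```cpp
// Minimal sketch (assumption, not the book's code) of a work-optimal
// parallel sum: p threads each handle a contiguous block of O(n/p)
// elements, so the total work stays O(n) while the time is O(n/p + p).
#include <algorithm>
#include <iostream>
#include <numeric>
#include <thread>
#include <vector>

long long parallel_sum(const std::vector<long long>& a, unsigned p) {
    std::vector<long long> partial(p, 0);
    std::vector<std::thread> threads;
    const std::size_t n = a.size();
    const std::size_t chunk = (n + p - 1) / p;

    for (unsigned t = 0; t < p; ++t) {
        threads.emplace_back([&, t] {
            // Each thread does O(n/p) sequential work on its own block.
            std::size_t lo = t * chunk;
            std::size_t hi = std::min(n, lo + chunk);
            for (std::size_t i = lo; i < hi; ++i) partial[t] += a[i];
        });
    }
    for (auto& th : threads) th.join();

    // Combining p partial sums costs O(p) here; a tree reduction gives O(log p).
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}

int main() {
    std::vector<long long> a(1'000'000, 1);
    std::cout << parallel_sum(a, 8) << "\n";   // prints 1000000
}
```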
