A language for hierarchical data parallel design-space exploration on GPUs


Graphics Processing Units (GPUs) offer potential for very high performance; they are also rapidly evolving. Obsidian is an embedded language (in Haskell) for implementing high-performance kernels to be run on GPUs. We would like to have our cake and eat it too: we want to raise the level of abstraction beyond CUDA code and still give the programmer control over the details relevant to kernel performance. To that end, Obsidian provides array representations that guarantee elimination of intermediate arrays while also using the type system to model the hierarchy of the GPU. Operations are compiled very differently depending on what level of the GPU they target, and as a result, the user is gently constrained to write code that matches the capabilities of the GPU. Thus, we implement not Nested Data Parallelism, but a more limited form that we call Hierarchical Data Parallelism. We walk through case studies that demonstrate how to use Obsidian for rapid design exploration or auto-tuning, resulting in performance that compares well to the hand-tuned kernels used in Accelerate and NVIDIA Thrust.
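
The two ideas in the abstract, arrays that never materialise intermediates and a type-level model of the GPU hierarchy, can be illustrated with a small Haskell sketch. This is not Obsidian's API: every name below (`Pull`, `Program`, `Thread`, `Block`, `Grid`, `perThread`, and so on) is a hypothetical stand-in chosen for illustration, and the code runs on ordinary lists rather than generating CUDA. It assumes that an array can be represented as a length plus an index function, so that composing operations composes functions instead of allocating storage, and that hierarchy levels can be tracked with phantom type tags.

```haskell
-- Illustrative sketch only: all names here are hypothetical, not Obsidian's API.

-- A "pull array" is a length plus an index function. Nothing is stored
-- until the array is explicitly forced with toList.
data Pull a = Pull { len :: Int, at :: Int -> a }

-- Mapping composes index functions, so no intermediate array is built.
instance Functor Pull where
  fmap f (Pull n ix) = Pull n (f . ix)

fromList :: [a] -> Pull a
fromList xs = Pull (length xs) (xs !!)

-- Element-wise combination, again without intermediate storage.
zipWithP :: (a -> b -> c) -> Pull a -> Pull b -> Pull c
zipWithP f (Pull n ixA) (Pull m ixB) =
  Pull (min n m) (\i -> f (ixA i) (ixB i))

-- Forcing: the only place where elements are actually computed.
toList :: Pull a -> [a]
toList (Pull n ix) = map ix [0 .. n - 1]

-- Hierarchy levels as empty types, used purely as phantom tags.
data Thread
data Block
data Grid

-- A computation tagged with the level of the hierarchy it runs at.
newtype Program level a = Program { runProgram :: a }

-- A sequential reduction, tagged as per-thread work.
sumThread :: Num a => Pull a -> Program Thread a
sumThread = Program . sum . toList

-- Split the input into chunks and give each chunk to one thread-level
-- computation; the result is tagged as block-level work. The signature
-- only accepts Thread-level work as its argument.
perThread :: Int -> (Pull a -> Program Thread b) -> Pull a
          -> Program Block (Pull b)
perThread chunk f (Pull n ix) =
  Program (Pull (n `div` chunk) worker)
  where
    worker t = runProgram (f (Pull chunk (\i -> ix (t * chunk + i))))

-- Double every element, then sum each pair. The doubled array is never built.
example :: Program Block (Pull Int)
example = perThread 2 sumThread (fmap (* 2) (fromList [1, 2, 3, 4, 5, 6]))
```

Loading this in GHCi and evaluating `toList (runProgram example)` forces the result. Because `perThread` only accepts a `Program Thread` argument, passing it block-level work is rejected by the type checker, which is one simple way a type system can constrain code to match a hardware hierarchy.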

Journal of Functional Programming
  • ISSN: 0956-7968
  • EISSN: 1469-7653
  • URL: /core/journals/journal-of-functional-programming