Breaking the GPU programming barrier with the auto-parallelising SAC compiler

Guo, J., Thiyagalingam, J. and Scholz, S. (2011) Breaking the GPU programming barrier with the auto-parallelising SAC compiler. ACM Press.

Over recent years, the use of Graphics Processing Units (GPUs) for general-purpose computing has become increasingly popular. The main reasons for this development are the attractive performance/price and performance/power ratios of these architectures. However, substantial performance gains from GPUs come at a price: they require extensive programming expertise and, typically, a substantial re-coding effort. Although existing frameworks such as CUDA and OpenCL have significantly improved the programming experience, utilising these devices effectively remains a challenge. Directive-based approaches such as hiCUDA or OpenMP variants offer further improvements but have not eliminated the need for expertise in these complex architectures. Similarly, special-purpose programming languages such as Microsoft's Accelerator try to lower the barrier further. They provide the programmer with a special form of GPU data structures and operations on them, which are then compiled into GPU code. In this paper, we take this trend towards a completely implicit, high-level approach one step further. We generate CUDA code from Single Assignment C (SAC), a MATLAB-like high-level functional array programming language. To do so, we identify which data structures and operations can be successfully mapped onto GPUs and transform existing programs accordingly. This paper presents the first runtime results from our GPU backend, together with the basic set of GPU-specific program optimisations that turned out to be essential. Despite our high-level program specifications, we show that, for a number of benchmarks, our parallelising compiler achieves speedups between a factor of 5 and 50.

Item Type	Other
Keywords	code; compiler; CUDA; generation; GPU; optimization
Date Deposited	29 May 2025 08:59
Last Modified	29 May 2025 08:59

Procs of the 6th ACM Workshop on Declarative Aspects of Multicore Programming, DAMP'11, 83988

Full text not available from this repository.
