Course on Programming GPU Devices using OpenACC Directives on the Cray XK6

CSCS and HP2C announce the following course:

Programming GPU Devices using OpenACC Directives on the Cray XK6
March 6-7, 2012
CSCS in Manno

The registration fee is CHF 150, including coffee breaks.


Contents

Attendees of this HP2C training event will learn about the Cray XK6 hybrid multi-core and GPU architecture and its programming environment.

They will learn about the OpenACC directives, which were designed to help users develop and port applications to run on heterogeneous systems. They will gain an understanding of how to use the Cray Performance Tools to identify “hot areas” in the code on which to focus the use of OpenACC directives. They will have the opportunity to experiment with the OpenACC directives using the Cray Compilation Environment (CCE). In addition, they will learn about the Cray scientific libraries for accelerators and will experiment with Allinea’s DDT and Cray’s Performance Tools for debugging and performance tuning of heterogeneous applications on Cray XK6 systems.

Attendees are encouraged to bring their own applications and codes for the hands-on sessions. Experts in the Cray programming environment, OpenACC and libsci development, the Cray performance tools, and the Allinea DDT debugger will be present at the meeting for discussions and feedback. We also invite current users whose applications already run successfully on the Cray XK6 system to present brief user experience talks.

Agenda

– Welcome
– Overview of the Cray XK6 system
– Introduction to Cray XK6 Programming Environment
– Support for GPU application development and execution

  • GPU development environments (CUDA C & Fortran, OpenCL & OpenACC from Cray & PGI)
  • GPU accelerated libraries
  • Message passing communication (MPI)

– Introduction to OpenACC
– Development cycle of application porting

  • Static analysis of the application
  • Find hot loops
  • Scoping analysis
  • Add OpenMP
  • Create OpenACC regions from OpenMP regions
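
As a rough illustration of the last two steps of this cycle, the sketch below shows a hot loop first parallelised with OpenMP and then expressed as an OpenACC region. It is not taken from the course material: the function names `saxpy_omp` and `saxpy_acc` and the choice of loop are assumptions made for the example, and the data clauses are only one plausible way to describe the transfers. A compiler without OpenMP or OpenACC support ignores the pragmas and runs both loops serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hot loop after the "Add OpenMP" step:
 * the iterations are independent, so they can be shared among CPU threads. */
void saxpy_omp(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* The same loop after the "Create OpenACC regions" step:
 * the OpenMP region becomes an OpenACC parallel loop, and the data clauses
 * state that x is only read on the accelerator (copyin) while y is read
 * and written back (copy). Array sections are given as [start:length]. */
void saxpy_acc(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Both versions compute the same result, so the OpenMP version can serve as a reference when checking the accelerated one.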

– Using libsci_acc
– Debugging
– Performance tuning

  • Profile application
  • Using the accelerator hardware counters
  • Analysis of data transfers
  • Add data regions
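
The "Add data regions" step can be sketched as follows. This is a minimal example under assumptions, not code from the course: the function name `scale_and_sum` and the two loops are hypothetical. The idea is that, without the enclosing data region, each accelerated loop could transfer the array to and from the device on its own; with it, the array is copied in once, stays on the device for both loops, and is copied back once at the end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical routine with two consecutive accelerated loops over v. */
double scale_and_sum(int n, double a, double *v)
{
    double sum = 0.0;

    assert(n >= 0 && v != NULL);

    /* Data region: v is copied to the device on entry and back on exit,
     * so the two loops below reuse the device copy. */
    #pragma acc data copy(v[0:n])
    {
        /* First kernel: scale every element in place. */
        #pragma acc parallel loop
        for (int i = 0; i < n; ++i)
            v[i] *= a;

        /* Second kernel: sum the scaled elements with a reduction. */
        #pragma acc parallel loop reduction(+:sum)
        for (int i = 0; i < n; ++i)
            sum += v[i];
    }

    return sum;
}
```

Whether the data region pays off in a real application is what the profiling and data-transfer analysis steps above are meant to establish.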