Kernel-based GPU programming models

The course focuses on GPU kernel-based programming, with a strong emphasis on understanding how to write highly efficient parallel code that runs directly on GPU hardware. Rather than treating the GPU as a black box, the course dives into how computations are actually executed at the kernel level, and how programmers can control and optimize that execution.

A key theme throughout the course is the relationship between hardware architecture and software performance. Students learn that achieving high performance on GPUs is not just about parallelizing code, but about structuring computations in ways that align with how GPUs schedule threads, access memory, and execute instructions.
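To make the kernel model concrete, here is a minimal sketch in plain C++ that emulates on the CPU how a kernel body is executed once per logical thread. It is not real GPU code and is not taken from the course material: the names `ThreadContext`, `saxpy_kernel`, and `launch_saxpy` are hypothetical, and the two loops run serially where a GPU would run the threads in parallel. The sketch assumes `y` holds at least as many elements as `x`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the indices a GPU runtime provides to each thread.
struct ThreadContext {
    std::size_t block_idx;   // index of the block this thread belongs to
    std::size_t block_dim;   // number of threads per block
    std::size_t thread_idx;  // index of the thread inside its block
};

// The "kernel": the body that every logical thread executes exactly once.
// Each thread updates a single element, y[i] = a * x[i] + y[i].
void saxpy_kernel(const ThreadContext& ctx, float a,
                  const std::vector<float>& x, std::vector<float>& y) {
    // Global index: neighbouring threads map to neighbouring elements.
    const std::size_t i = ctx.block_idx * ctx.block_dim + ctx.thread_idx;
    // Bounds guard: the last block may contain more threads than elements.
    if (i < x.size() && i < y.size()) {
        y[i] = a * x[i] + y[i];
    }
}

// Serial emulation of a kernel launch over a one-dimensional grid.
void launch_saxpy(std::size_t block_dim, float a,
                  const std::vector<float>& x, std::vector<float>& y) {
    assert(block_dim > 0);
    // Round up so that every element is covered by some thread.
    const std::size_t grid_dim = (x.size() + block_dim - 1) / block_dim;
    for (std::size_t b = 0; b < grid_dim; ++b) {
        for (std::size_t t = 0; t < block_dim; ++t) {
            saxpy_kernel(ThreadContext{b, block_dim, t}, a, x, y);
        }
    }
}
```

Calling `launch_saxpy(4, 2.0f, x, y)` on five-element vectors emulates two blocks of four threads; the three surplus threads in the second block are stopped by the bounds guard. On real hardware, letting neighbouring threads touch neighbouring elements, as the index calculation does here, is generally the access pattern that the memory system serves most efficiently.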

Prerequisites

  • Familiarity with at least one programming language such as C/C++ or Fortran (recommended)

  • Basic understanding of computer architecture

  • Familiarity with parallel or concurrent programming concepts

Learning outcomes

This material is for all researchers and engineers who work with large or small datasets and who want to learn powerful tools and best practices for writing more performant, parallelised, robust and reproducible data analysis pipelines.

By the end of this module, learners should be able to:

  • Understand how GPU kernels work at a low level

  • Map computations onto GPU hardware

  • Write programs that exploit massive parallelism

Credit

Don’t forget to check out additional course materials from XXX. Please contact us if you want to reuse these course materials in your teaching. You can also join the XXX channel to share your experience and get more help from the community.

License

Note

To module authors: for code, you may use any OSI-approved license listed at https://spdx.org/licenses/, such as Apache License 2.0, GNU GPLv3, or MIT. Please make sure to update the deed above and the LICENSE.code file accordingly.