Best CUDA Courses

Find the best online CUDA courses for you. Courses are sorted by popularity and user ratings. We do not allow paid placements in any of our rankings. We also have a separate page listing only the free CUDA courses.

CUDA programming Masterclass with C++

Learn parallel programming on GPUs with CUDA, from basic concepts to advanced algorithm implementations.

Created by Kasun Liyanage - Software engineer & founder of intellect, co-founder at cpphive


Students: 5548, Price: $119.99


This course is all about CUDA programming. We start with the basic concepts, including the CUDA programming model, execution model, and memory model. Then we show you how to implement advanced algorithms using CUDA. CUDA programming is all about performance, so throughout this course you will learn multiple optimization techniques and how to use them when implementing algorithms. We also discuss profiling techniques extensively, along with tools from the CUDA toolkit including nvprof, nvvp, CUDA-MEMCHECK, and CUDA-GDB. This course contains the following sections:

  • Introduction to CUDA programming and the CUDA programming model

  • CUDA execution model

  • CUDA memory model: global memory

  • CUDA memory model: shared and constant memory

  • CUDA streams

  • Tuning CUDA instruction-level primitives

  • Algorithm implementation with CUDA

  • CUDA tools

This course also includes many programming exercises and quizzes. Working through them will help you digest the concepts we discuss here.

This course is the first in the CUDA master class series we are currently working on, so the knowledge you gain here is essential for following the later courses as well.
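As a rough illustration of the programming model this course starts from: a CUDA kernel usually derives the element it works on from its block and thread coordinates, and guards against indices past the end of the data. The sketch below is not taken from the course. It emulates that index arithmetic in plain C++ on the CPU so that it runs without a GPU, and the names `vector_add_emulated`, `block_dim`, and `grid_dim` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// CPU emulation of a 1D CUDA launch: each (block, thread) pair handles one element.
// In a real CUDA kernel the two loops disappear; the hardware supplies the
// block and thread coordinates (blockIdx.x, threadIdx.x) to every thread.
std::vector<float> vector_add_emulated(const std::vector<float>& a,
                                       const std::vector<float>& b,
                                       std::size_t block_dim) {
    // Only the common prefix is processed if the inputs differ in length.
    const std::size_t n = std::min(a.size(), b.size());
    std::vector<float> c(n, 0.0f);
    if (n == 0 || block_dim == 0) {
        return c;
    }

    // Round up so the last, partially filled block is still launched.
    const std::size_t grid_dim = (n + block_dim - 1) / block_dim;

    for (std::size_t block_idx = 0; block_idx < grid_dim; ++block_idx) {
        for (std::size_t thread_idx = 0; thread_idx < block_dim; ++thread_idx) {
            // Same formula a kernel would use for its global index.
            const std::size_t i = block_idx * block_dim + thread_idx;
            // Bounds guard: threads in the last block may fall past the end.
            if (i < n) {
                c[i] = a[i] + b[i];
            }
        }
    }
    return c;
}
```

With five elements and a block size of two, the emulated grid has three blocks, and the guard skips the one out-of-range index in the last block.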

Scientific Computing Masterclass: Parallel and Distributed

Parallel & Distributed Programming: OpenMP, CUDA, MPI & HPC cluster systems with Slurm and PBS, AWS HPC Parallel Cluster

Created by Scientific Programmer™ Team - Instructor Team


Students: 1440, Price: $19.99


Welcome to the first-ever High Performance Computing (HPC) systems course on the Udemy platform. The main goal of this course is to introduce you to HPC systems and their software stack. The course has been specially designed to enable you to use parallel and distributed programming and computing resources to accelerate the solution of a complex problem with the help of HPC systems and supercomputers. You can then apply this knowledge in machine learning, deep learning, data science, big data, and so on.

HPC clusters typically have a large number of computers (often called ‘nodes’) and, in general, most of these nodes are configured identically. Though from the outside the cluster may look like a single system, the internal workings that make this happen can be quite complex. This idea should not be confused with the more general client-server model of computing, as the idea behind clusters is quite different. Cluster computing utilizes multiple machines to provide a more powerful computing environment, perhaps through a single operating system.


  • A little bit of supercomputing history, supercomputing examples, supercomputers vs. HPC clusters, HPC cluster computers, benefits of using cluster computing.

  • Components of a High Performance Computing (HPC) cluster, properties of login node(s), compute node(s), master node(s), storage node(s), HPC networks, and so on.

  • Introduction to PBS, PBS basic commands, PBS `qsub`, PBS `qstat`, PBS `qdel`, PBS `qalter`, PBS job states, PBS variables, PBS interactive jobs, PBS arrays, PBS MATLAB example

  • Introduction to Slurm, Slurm commands, a simple Slurm job, Slurm distributed MPI and GPU jobs, Slurm multi-threaded OpenMP jobs, Slurm interactive jobs, Slurm array jobs, Slurm job dependencies

  • OpenMP basics, OpenMP clauses, worksharing constructs, OpenMP hello world, reduction and parallel `for-loop`, section parallelization, vector addition

  • MPI hello world, send/receive, and `ping-pong`

  • Parallel programming - GPU and CUDA: Finally, the course gives you a concise, beginner-friendly guide to GPUs (graphics processing units), GPU programming with CUDA, a CUDA hello world, and so on. We understand that CUDA is a difficult API, particularly its memory model. We have added some easy-to-understand CUDA lessons with examples so that you can grasp the basics quickly!

  • AWS HPC: With recent advances in cloud technologies, AWS provides the most elastic and scalable cloud infrastructure to run your HPC applications. With virtually unlimited capacity, engineers, researchers, and HPC system owners can innovate beyond the limitations of on-premises HPC infrastructure. We have added lectures that show you how to build an AWS HPC cluster and how to run code on it easily!
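To give a feel for the OpenMP topics in the list above (the parallel `for-loop`, reduction, and vector addition), here is a minimal sketch in C++. It is not taken from the course, and the function names `vector_add` and `parallel_sum` are hypothetical. It assumes a compiler with OpenMP enabled (for example the `-fopenmp` flag in GCC or Clang); without that flag the pragmas are ignored and the loops simply run serially with the same result.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Element-wise vector addition. The worksharing construct splits the loop
// iterations across the threads of the team; each iteration is independent.
std::vector<double> vector_add(const std::vector<double>& a,
                               const std::vector<double>& b) {
    // Only the common prefix is processed if the inputs differ in length.
    const long n = static_cast<long>(std::min(a.size(), b.size()));
    std::vector<double> c(static_cast<std::size_t>(n));
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
    return c;
}

// Sum of all elements. The reduction clause gives every thread a private
// copy of `total` and combines the copies with + when the loop ends,
// which avoids a data race on the shared accumulator.
double parallel_sum(const std::vector<double>& v) {
    const long n = static_cast<long>(v.size());
    double total = 0.0;
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}
```

A signed loop index is used here because it is accepted by every OpenMP version. Note that a floating-point reduction may combine partial sums in a different order from run to run, so results can differ in the last bits for non-integer data.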

Based on your earlier feedback, we are introducing a Zoom live class lecture series for this course, in which we will explain different aspects of parallel and distributed computing and the High Performance Computing (HPC) systems software stack: Slurm, PBS Pro, OpenMP, MPI and CUDA! Live classes will be delivered through the Scientific Programming School, an interactive and advanced e-learning platform for learning scientific coding. Students purchasing this course will receive free access to the interactive version of this course (with scientific code playgrounds) from the Scientific Programming School (SCIENTIFIC PROGRAMMING IO). Instructions for joining are given in the additional contents section.


We have condensed about one university semester's worth of knowledge (valued at USD $2500-6000) into a single video course, and hence it is a high-level overview. Don't forget to join our live Q&A community, where you can get free help anytime from other students and the instructor. This course is a component of the Learn Scientific Computing master course.

Cuda Basics

A comprehensive course on CUDA C programming principles

Created by HPC Specialist - High performance computing specialist.


Students: 75, Price: $19.99


This course is aimed at programmers with a basic knowledge of C or C++ who are looking for a series of tutorials covering the fundamentals of the CUDA C programming language. It combines lectures and example programs to give you the knowledge to design your own algorithms and leverage the full performance benefits of GPGPU programming.