CISC372: CUDA, part 3: Synchronization & Shared Variables

Synchronization between host and device, for kernel calls, and for threads in a block. Warps and a restriction on __syncthreads. Shared variables: why they are needed and how to create them. Reduction over threads in a block. Example: dot product.

Slides: 25_cuda3.pdf
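
The topics listed above come together in the dot-product example. What follows is a minimal sketch, not the code from the slides: the names dot_kernel, dot_on_device, and THREADS_PER_BLOCK, the block count of 32, and the grid-stride loop are all assumptions made for illustration, and error checking on the CUDA calls is omitted for brevity. It shows a __shared__ array used as per-block scratch space, a tree reduction over the threads in a block, and __syncthreads() placed where every thread in the block reaches it.

```cuda
#include <cuda_runtime.h>

// Assumed block size; the reduction below needs a power of two.
#define THREADS_PER_BLOCK 256

// Each block writes one partial sum of a[i]*b[i] to partial[blockIdx.x].
__global__ void dot_kernel(const float *a, const float *b,
                           float *partial, int n) {
    // One slot per thread, shared by all threads in this block.
    __shared__ float cache[THREADS_PER_BLOCK];

    int tid = threadIdx.x;
    float sum = 0.0f;
    // Grid-stride loop: each thread accumulates its own elements.
    for (int i = blockIdx.x * blockDim.x + tid; i < n;
         i += blockDim.x * gridDim.x) {
        sum += a[i] * b[i];
    }
    cache[tid] = sum;
    // All slots must be written before any thread reads a neighbor's slot.
    __syncthreads();

    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            cache[tid] += cache[tid + stride];
        }
        // Kept outside the if so that every thread in the block reaches it.
        __syncthreads();
    }

    if (tid == 0) {
        partial[blockIdx.x] = cache[0];
    }
}

// Hypothetical host helper: copies inputs to the device, launches the
// kernel, and adds up the per-block partial sums on the host.
float dot_on_device(const float *h_a, const float *h_b, int n) {
    const int blocks = 32;  // assumed grid size
    float h_partial[blocks];
    float *d_a, *d_b, *d_partial;
    size_t bytes = (size_t)n * sizeof(float);

    cudaMalloc((void **)&d_a, bytes);
    cudaMalloc((void **)&d_b, bytes);
    cudaMalloc((void **)&d_partial, blocks * sizeof(float));
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // The launch returns immediately; the host does not wait here.
    dot_kernel<<<blocks, THREADS_PER_BLOCK>>>(d_a, d_b, d_partial, n);

    // This blocking copy on the default stream waits for the kernel to
    // finish, so it also serves as the host-device synchronization point.
    cudaMemcpy(h_partial, d_partial, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);

    float result = 0.0f;
    for (int k = 0; k < blocks; k++) {
        result += h_partial[k];
    }

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_partial);
    return result;
}
```

Summing the per-block results on the host is one design choice among several; it avoids any need for synchronization across blocks, which __syncthreads() does not provide.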
