GPU programming
Special course for master students
Fall term 2021-2022
Course syllabus
Lecture notes
  • Introduction. Part 1 (pdf)
  • Introduction. Part 2 (pdf)
  • CUDA. API (pdf)
  • Textures and Surfaces (pdf)
  • Program optimization (pdf)
  • CUDA Driver API (pdf)
  • Multiple GPU (pdf)
  • GPU-optimized libraries (pdf)
  • Adaptation to GPU (pdf)
  • GPU programming in FORTRAN, Java, C# (pdf)
  • Introduction to OpenCL (pdf)
  • Introduction to OpenACC (pdf, pdf, Textbook)


ssh: cuda.ccfit.nsu.ru
Procedure
Send all questions and answers to arom[ at ]nsu.ru. Use the subject prefix "CUDA_2021".
What to submit:
  1. The source code and a Makefile, or the compilation command line used on the cuda.ccfit.nsu.ru server
  2. A half-page report with the results and their analysis
Assignments
  1. Allocate an array of doubles. Initialize it with `a[i] = sin(2*Pi*i/N)` and calculate `sum(a[i])` on the CPU and on the GPU. Copy the array from the GPU to the CPU and check the error `sum(abs(a_gpu[i] - sin(2*Pi*i/N)))/N`. Explain the difference. Use OpenACC. Check the effect of the `-gpu=fastmath` option.
  2. Implement a code that solves the Poisson equation on a square mesh (NxN) with an iterative scheme. Use OpenACC directives. Compare CPU and GPU performance of the main loop as a function of N (N=128, 256, 512, 1024).
    1. Profile the code.
    2. Optimize the code.
  3. Implement the code from item 2 using a cuBLAS function to calculate the error. The same code should compile for both CPU and GPU; use preprocessor directives. Compare its performance with the code from the previous task.
  4. Adapt the signal correlation code to the GPU. Profile the code, optimize it, and explain the result. Use OpenACC.
  5. Implement the task from item 2 using CUDA. Use the CUB library for the reduction operation.
  6. Implement the signal correlation algorithm from item 4 as a module and call it from a Python script.


Useful links