GPU programming
Special course for master's students
Spring term 2015-2015
Lecture notes
  • Introduction. Part 1 (pdf)
  • Introduction. Part 2 (pdf)
  • CUDA. API (pdf)
  • Texture and Surfaces (pdf)
  • Program optimization (pdf)
  • CUDA Driver API (pdf)
  • Multiple GPU (pdf)
  • GPU Optimized libraries (pdf)
  • Adaptation to GPU (pdf)
  • GPU programming in FORTRAN, Java, C# (pdf)
  • Introduction to OpenCL (pdf)
  • Introduction to OpenACC (pdf, pdf, Textbook)


ssh: 84.237.52.17:10022 or ssh: 10.4.0.65 (NSU campus)
Procedure
Send all questions and answers to arom[ at ]ccfit.nsu.ru with the subject prefix "CUDA_2015".
Each report must include:
  1. The source code and a Makefile, or the compilation command line used on the cuda.ccfit.nsu.ru server
  2. A half-page report with the results and their explanation
Tasks
  1. Allocate a GPU array arr of 10^8 float elements and initialize it in a kernel as follows: arr[i] = sin((i % 360) * Pi/180). Copy the array to CPU memory and compute the error as err = sum_i(abs(sin((i % 360) * Pi/180) - arr[i])) / 10^8. Investigate how the error depends on the function used: sin, sinf, __sinf. Explain the result. Repeat the check for an array of type double.

  2. Implement a program that applies filters to images of your choice. Possible filters: blur, edge detection, denoising. Implement three versions of the program: using global memory, shared memory, and texture memory. Compare the execution times.
  3. To work with image files, libpng is recommended (man libpng). Examples are in /usr/share/doc/libpng12-dev/examples/.

  4. Modify the previous program so that it uses all GPUs available to it. The program should determine the number of available GPUs and distribute the work among them.
  5. Use the method of least squares to find a circle in an image. Process each random sample of points on the GPU. Generate the random samples with the CURAND library.

    Input: a 640x480 image (for example, a speed limit sign), the number of samples N, and the number of points K in each sample.

    Output: the image with the fitted circle drawn on it.

    Recommendation: before processing, apply a Sobel filter for edge detection and keep only the points whose normalized color value is > 0.5.

    RANSAC Least-Squares Circle Fit

  6. Ray tracing.
    Render a scene consisting of at least two spheres and at least one plane. Implement either refraction or reflection of rays by the spheres, at your choice. The plane must be textured. The resulting image must be at least 640x480 pixels.

    The plane can be represented as two triangles.

  7. Adapt a given FORTRAN program to the GPU. There are two options: use the PGI Accelerated Fortran compiler, or call an external function containing a kernel written in C/C++.
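
Note on task 1: the error formula is evaluated on the CPU after the array has been copied back, so it can be written and checked independently of the kernel. Below is a minimal host-side sketch of that step, assuming a double-precision std::sin as the reference. The names mean_abs_sine_error and fill_with_sinf are hypothetical, and fill_with_sinf is only a CPU stand-in for the kernel output so the metric can be tried without a GPU.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean absolute error between arr[i] and the reference
// sin((i % 360) * pi / 180), following the err formula of task 1.
// (Hypothetical helper name; the reference is computed in double.)
double mean_abs_sine_error(const std::vector<float>& arr) {
    assert(!arr.empty());
    const double pi = 3.14159265358979323846;
    // Accumulate in double: summing 10^8 small floats in a float
    // accumulator would add rounding error of its own.
    double sum = 0.0;
    for (std::size_t i = 0; i < arr.size(); ++i) {
        const double ref = std::sin(static_cast<double>(i % 360) * pi / 180.0);
        sum += std::fabs(ref - static_cast<double>(arr[i]));
    }
    return sum / static_cast<double>(arr.size());
}

// CPU stand-in for the kernel output (assumption: single-precision sine,
// comparable to calling sinf in the kernel).
std::vector<float> fill_with_sinf(std::size_t n) {
    std::vector<float> arr(n);
    const float pi = 3.14159265f;
    for (std::size_t i = 0; i < n; ++i) {
        arr[i] = std::sin(static_cast<float>(i % 360) * pi / 180.0f);
    }
    return arr;
}
```

In the actual assignment the vector would hold the data copied back from the device, and the same function can be called once per variant (sin, sinf, __sinf) to compare the resulting errors.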

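Note on task 4: one simple way to distribute an image filter over several GPUs is to give each device a contiguous band of rows. The sketch below shows only the partitioning arithmetic, under the assumption that the work is split by rows. RowRange and split_rows are hypothetical names. In a CUDA program the device count would typically come from cudaGetDeviceCount, and each band would be processed after cudaSetDevice selects the corresponding device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open row interval [begin, end) assigned to one device.
struct RowRange {
    std::size_t begin;
    std::size_t end;
};

// Split `rows` image rows into `num_devices` contiguous bands whose
// sizes differ by at most one row. (Hypothetical helper.)
std::vector<RowRange> split_rows(std::size_t rows, int num_devices) {
    assert(num_devices > 0);
    const std::size_t n = static_cast<std::size_t>(num_devices);
    const std::size_t base = rows / n;   // rows every device gets
    const std::size_t extra = rows % n;  // first `extra` devices get one more
    std::vector<RowRange> ranges;
    ranges.reserve(n);
    std::size_t begin = 0;
    for (std::size_t d = 0; d < n; ++d) {
        const std::size_t count = base + (d < extra ? 1 : 0);
        ranges.push_back({begin, begin + count});
        begin += count;
    }
    return ranges;
}
```

For a filter with a neighbourhood of radius r, each device also needs r extra rows above and below its band as read-only input, so the copied region is slightly larger than the band it writes.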

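Note on task 5: for one sample of K points, a circle can be fitted in closed form. The sketch below uses the algebraic least-squares fit (often attributed to Kasa) on centroid-centered coordinates, which reduces to a 2x2 linear system. This is one possible choice and not necessarily the method the course intends. Point, Circle and fit_circle are hypothetical names, and the collinearity threshold is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };
struct Circle { double cx, cy, r; bool ok; };  // ok == false: degenerate sample

// Algebraic least-squares circle fit: minimizes the sum of
// ((x - cx)^2 + (y - cy)^2 - r^2)^2 over all points.
Circle fit_circle(const std::vector<Point>& pts) {
    assert(pts.size() >= 3);
    const double n = static_cast<double>(pts.size());

    // Shift to the centroid so that sum(u) = sum(v) = 0.
    double mx = 0.0, my = 0.0;
    for (const Point& p : pts) { mx += p.x; my += p.y; }
    mx /= n;
    my /= n;

    double suu = 0, suv = 0, svv = 0, suuu = 0, svvv = 0, suvv = 0, svuu = 0;
    for (const Point& p : pts) {
        const double u = p.x - mx, v = p.y - my;
        suu += u * u;      suv += u * v;      svv += v * v;
        suuu += u * u * u; svvv += v * v * v;
        suvv += u * v * v; svuu += v * u * u;
    }

    // Normal equations: [suu suv; suv svv] * [uc; vc] = [bu; bv].
    const double det = suu * svv - suv * suv;
    if (std::fabs(det) <= 1e-12 * suu * svv) {
        return {0.0, 0.0, 0.0, false};  // (nearly) collinear points
    }
    const double bu = 0.5 * (suuu + suvv);
    const double bv = 0.5 * (svvv + svuu);
    const double uc = (bu * svv - bv * suv) / det;  // Cramer's rule
    const double vc = (bv * suu - bu * suv) / det;
    const double r = std::sqrt(uc * uc + vc * vc + (suu + svv) / n);
    return {uc + mx, vc + my, r, true};
}
```

In a RANSAC-style loop each of the N samples would be fitted this way and scored by the number of edge points lying close to the fitted circle. To run the fit inside a kernel, the std::vector argument would have to be replaced by a raw pointer and a length.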
Useful links