Brief Record of SCT_Summer Course

Forgettable Knowledge Records (Recently Updated)

Finished simple linear matrix operations with MPI/CUDA/OpenMP in C (C++ is too hard!!!)
Source Code

  • Seriously considering whether each subdomain needs to be resolved one by one… Also, it seems the Aliyun domain is about to expire /thinking
  • Has the fish been slacking off a bit lately?

SCT Summer Tasks And Notes

Makefile

A file used to compile the code.

A not-so-good makefile for MPI =>

matrix.o : matrix.h
	mpicc matrix.c -c -lm
	make main
main : main.c matrix.o
	mpicc main.c matrix.o -o main -lm -O2 -march=native
	echo -e "\nExecute main to Start\n"
	rm matrix.o
clean:
	rm main
  • A makefile usually contains phony targets (like clean or install)
  • the build order can change depending on the compile dependencies
  • gcc compile optimization (to dig into further…)
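A slightly cleaner shape for the same build, with the pseudo-targets declared via .PHONY so make never confuses them with real files (the file names follow the example above; the variable layout is just a sketch):

```makefile
CC = mpicc
CFLAGS = -O2 -march=native

main : main.c matrix.o
	$(CC) main.c matrix.o -o main -lm $(CFLAGS)
	@echo -e "\nExecute main to Start\n"

matrix.o : matrix.c matrix.h
	$(CC) matrix.c -c -lm

# clean is not a file, so mark it phony
.PHONY : clean
clean :
	rm -f main matrix.o
```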

Basic Linear Matrix Operation in C

Use a new struct for storing a Matrix

typedef struct matrix {
    double *data;
    int row;
    int col;
} Matrix;

Then we just need to store all the data through the one-dimensional data pointer.

  • Segfault warning: please check that data has been properly malloced!
void initMatrix(Matrix * M);

// Scan the numbers and store them into the Matrix (the first two are the column and row integers)

void ReadMatrix(char* Filename, Matrix * M); // Read the Matrix from a file whose first two integers are the column and row

void AddMatrix(Matrix * M1, Matrix * M2, Matrix * M3);

void PrintMatrix(Matrix * M);

// Print the Matrix out

void MiltiplyMatrix(Matrix * M1, Matrix * M2, Matrix * M3);

void FunctionMatrix(Matrix * M1, Matrix * M3);

// Apply function(x) element-wise to every element of the Matrix

void TransposeMatrix(Matrix * M);
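A minimal sketch of how two of these declarations might be implemented, assuming row-major storage in the one-dimensional data pointer, i.e. element (i, j) at data[i * col + j] (the function names match the declarations above; the bodies are my assumption, not the course's actual code):

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct matrix {
    double *data;
    int row;
    int col;
} Matrix;

// Allocate the flat data buffer for a matrix whose row/col are already set.
void initMatrix(Matrix *M) {
    M->data = malloc(sizeof(double) * M->row * M->col);
    if (M->data == NULL) {  // avoid the segfault warned about above
        fprintf(stderr, "malloc failed\n");
        exit(1);
    }
}

// Element-wise sum: M3 = M1 + M2 (assumes all three share the same shape).
void AddMatrix(Matrix *M1, Matrix *M2, Matrix *M3) {
    for (int i = 0; i < M1->row * M1->col; i++)
        M3->data[i] = M1->data[i] + M2->data[i];
}
```

Because the data is flat, element-wise operations like AddMatrix can loop over row * col entries directly without nested index arithmetic.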

MPI for Multi-process Tasks

  • #include <mpi.h>

Open MPI is a high-performance message-passing library for multi-process tasks.

  • Some Simple Commands
$ lscpu              # check the CPU details and the number of cores
$ mpicc xxx.c -o xxx # compile with the MPI compiler wrapper
$ mpirun --hostfile xxx -np x ./xxx # -np is the number of processes to use; the hostfile describes your machines
  • Conclusion =>
  • MPI IS Just A Piece Of %*&#$?
  • For every process, cut the Matrix and send the data out for calculation (MPI_Scatter)
  • Gather the data back with MPI_Gather
  • int MPI_Barrier(MPI_COMM_WORLD) is useful and tricky
  • Use the functions provided by MPI as little as possible for better performance!!!
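The "cut the Matrix" step mostly boils down to deciding how many rows each rank receives. A plain-C sketch of the element counts and displacements you would hand to MPI_Scatterv / MPI_Gatherv (no MPI calls here, so the arithmetic is easy to check; distributing the leftover rows to the first ranks is my assumption):

```c
// Fill counts[r] / displs[r] with the number of elements and starting offset
// for rank r, splitting `row` rows of `col` doubles across `nprocs` ranks.
void block_partition(int row, int col, int nprocs, int *counts, int *displs) {
    int base  = row / nprocs;   // every rank gets at least this many rows
    int extra = row % nprocs;   // the first `extra` ranks get one more row
    int offset = 0;
    for (int r = 0; r < nprocs; r++) {
        int rows = base + (r < extra ? 1 : 0);
        counts[r] = rows * col;  // counts are in elements, not rows
        displs[r] = offset;
        offset += counts[r];
    }
}
```

When row divides evenly by nprocs, all counts are equal and plain MPI_Scatter suffices; the counts/displs version covers the uneven case.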

CUDA && Nvidia

Well, GPUs are much faster than CPUs for this kind of work, and the multi-threaded programming model is also much more comfortable than MPI.

Some Simple Commands

$ nvidia-smi         # check the GPU status (not always accurate, but useful)
$ nvcc xxx.cu -o xxx

Then simply run the binary and the code executes on the GPU.

  • A lot of things come with CUDA
  • the blocks, threads and grids!!!
  • __global__ and __host__ functions are called in different situations
  • a kernel call looks like Function<<<blocks, threads>>>(a, b, c) for GPU calculation
  • cudaMalloc((void**)&GPU_data, sizeof(double) * area)
  • cudaMemcpy(GPU_data, CPU->data, sizeof(double) * area, cudaMemcpyHostToDevice)
  • cudaDeviceReset() is really important, and do not forget to free the memory on the GPU (cudaFree)!
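The blocks/threads bullet is mostly index arithmetic: inside a 1-D kernel each thread computes its global index from the built-ins blockIdx, blockDim and threadIdx. That mapping, and the usual round-up when choosing the launch configuration, can be checked in plain C (the names mirror CUDA's built-ins; this is an illustration, not a kernel):

```c
// Global index of a thread, as computed inside a typical 1-D CUDA kernel:
//   int i = blockIdx.x * blockDim.x + threadIdx.x;
int global_index(int blockIdx_x, int blockDim_x, int threadIdx_x) {
    return blockIdx_x * blockDim_x + threadIdx_x;
}

// Number of blocks needed to cover n elements with `threads` threads per block,
// the rounding-up trick used when launching Function<<<blocks, threads>>>.
int blocks_for(int n, int threads) {
    return (n + threads - 1) / threads;
}
```

Because blocks_for rounds up, the last block may run threads past the end of the data, which is why kernels normally guard with `if (i < n)`.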

OpenMP Optimization

  • #include <omp.h>

A well-designed multi-threading optimization tool (comfortable to write).

  • #pragma omp parallel for num_threads(MAX_T) handles for loops and distributes the iterations across threads
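A minimal sketch of that pragma on an element-wise loop (compile with -fopenmp; without it the pragma is simply ignored and the loop runs serially, producing the same result; MAX_T is a placeholder thread count):

```c
#define MAX_T 4

// Scale every element of the array; the iterations are independent,
// so OpenMP can safely split them across MAX_T threads.
void scale_array(double *a, int n, double factor) {
    #pragma omp parallel for num_threads(MAX_T)
    for (int i = 0; i < n; i++)
        a[i] *= factor;
}
```

This pattern only works because no iteration reads or writes another iteration's element; loops with cross-iteration dependencies need reductions or restructuring first.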

Warnings about the above

  • gcc versions over 8.3.1 will not link the math library automatically when compiling; add -lm to link it.

  • gcc compile parameters: optimization -O[1-3], -march, and other options

  • time.h in the C standard library => clock() returns clock_t, an integer type, so the difference between start and final should be cast to double first and divided by CLOCKS_PER_SEC

#include <time.h>
clock_t start, final;
double elapsed;

start = clock();

...functions...

final = clock();
// get the time in seconds
elapsed = (double)(final - start) / CLOCKS_PER_SEC;

Notes from Mr.gg

RISC
CISC
SIMD
SIMT

rax eax ax ah al
rbx
rcx
rdx

rdi string
rsi

cs 16*
ds
es
fs
gs
ss
cr0~cr3
r8~r15

MMX
SSE
AVX

rbp
rsp
ring 0 kernel mode