## Sunday, August 15, 2010

### Parallelizing Matrix Multiplication using Cilk++ in Two Lines

Following the parallelization of matrix multiplication using OpenMP in Parallelizing Matrix Multiplication using OpenMP in One Line, can we do the same using Cilk++?

## Parallel Matrix Multiplication using Cilk++

`/* matrix.cilk */const int size = 1000;float a[size][size];float b[size][size];float c[size][size];int cilk_main(){    // Initialize buffers.    for (int i = 0; i < size; ++i) {        for (int j = 0; j < size; ++j) {            a[i][j] = (float)i + j;            b[i][j] = (float)i - j;            c[i][j] = 0.0f;        }    }    // Compute matrix multiplication.    // C <- C + A x B    cilk_for (int i = 0; i < size; ++i) {        for (int j = 0; j < size; ++j) {            for (int k = 0; k < size; ++k) {                c[i][j] += a[i][k] * b[k][j];            }        }    }    return 0;}`
The cilk_main() is the entry function synonymous to main() in a typical C++ program, but cilk_main() function uses Cilk++ linkage such that you can spawn and sync in the function. This update is trivial.Effectively, there is only one line of non-trivial update to parallelize the serial matrix multiplication.
`\$ g++ -O2 matrix.cpp -o matrix\$ cilk++ -O2 matrix.cilk -o matrix-cilk\$ time ./matrixreal 0m12.613suser 0m12.392ssys 0m0.030s\$ time ./matrix-cilkreal 0m6.775suser 0m12.591ssys 0m0.134s`
The performance improvement is encouraging. You can try out with quad-core or hex-core if you have the machine.