Following the parallelization of matrix multiplication using OpenMP in 
Parallelizing Matrix Multiplication using OpenMP in One Line, can we do the same using Cilk++?
Parallel Matrix Multiplication using Cilk++
/* matrix.cilk */
const int size = 1000;
float a[size][size];
float b[size][size];
float c[size][size];
int cilk_main()
{
    // Initialize buffers.
    for (int i = 0; i < size; ++i) {
        for (int j = 0; j < size; ++j) {
            a[i][j] = (float)i + j;
            b[i][j] = (float)i - j;
            c[i][j] = 0.0f;
        }
    }
    // Compute matrix multiplication.
    // C <- C + A x B
    cilk_for (int i = 0; i < size; ++i) {
        for (int j = 0; j < size; ++j) {
            for (int k = 0; k < size; ++k) {
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    return 0;
}
The 
cilk_main() is the entry function synonymous to 
main() in a typical C++ program, but cilk_main() function uses Cilk++ linkage such that you can spawn and sync in the function. This update is trivial.Effectively, there is only one line of non-trivial update to parallelize the serial matrix multiplication.
$ g++ -O2 matrix.cpp -o matrix
$ cilk++ -O2 matrix.cilk -o matrix-cilk
$ time ./matrix
real 0m12.613s
user 0m12.392s
sys 0m0.030s
$ time ./matrix-cilk
real 0m6.775s
user 0m12.591s
sys 0m0.134s
The performance improvement is encouraging. You can try out with quad-core or hex-core if you have the machine.
 
No comments:
Post a Comment