SpeedGo Computing: Parallelizing Matrix Multiplication using Cilk++ in Two Lines

Sunday, August 15, 2010

Parallelizing Matrix Multiplication using Cilk++ in Two Lines

Following the parallelization of matrix multiplication using OpenMP in Parallelizing Matrix Multiplication using OpenMP in One Line, can we do the same using Cilk++?

Parallel Matrix Multiplication using Cilk++

/* matrix.cilk */
const int size = 1000;

float a[size][size];
float b[size][size];
float c[size][size];

int cilk_main()
{
    // Initialize buffers.
    for (int i = 0; i < size; ++i) {
        for (int j = 0; j < size; ++j) {
            a[i][j] = (float)i + j;
            b[i][j] = (float)i - j;
            c[i][j] = 0.0f;
        }
    }

    // Compute matrix multiplication.
    // C <- C + A x B
    cilk_for (int i = 0; i < size; ++i) {
        for (int j = 0; j < size; ++j) {
            for (int k = 0; k < size; ++k) {
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }

    return 0;
}

The cilk_main() is the entry function synonymous to main() in a typical C++ program, but cilk_main() function uses Cilk++ linkage such that you can spawn and sync in the function. This update is trivial.Effectively, there is only one line of non-trivial update to parallelize the serial matrix multiplication.

$ g++ -O2 matrix.cpp -o matrix
$ cilk++ -O2 matrix.cilk -o matrix-cilk
$ time ./matrix
real 0m12.613s
user 0m12.392s
sys 0m0.030s
$ time ./matrix-cilk
real 0m6.775s
user 0m12.591s
sys 0m0.134s

The performance improvement is encouraging. You can try out with quad-core or hex-core if you have the machine.

SpeedGo Computing

Sunday, August 15, 2010

Parallelizing Matrix Multiplication using Cilk++ in Two Lines

Parallel Matrix Multiplication using Cilk++

No comments:

Post a Comment