SpeedGo Computing
Speed breaking computational problems with multi-core CPU and GPU

[Resolved] Skype crashed on Fedora 19 (2013-07-04)
Running Skype on Fedora 19 core dumps immediately. By luck, I found a workaround that uses the mesa-libGL library from Fedora 17.
Steps:

1. Download mesa-libGL-8.0.4-1.fc17.i686.rpm from the Fedora 17 package archive.

2. Extract the rpm file:

   $ rpm2cpio mesa-libGL-8.0.4-1.fc17.i686.rpm | cpio -idv

3. Run Skype with the extracted library:

   $ LD_LIBRARY_PATH=usr/lib /usr/bin/skype
That seems to work.

Personal Supercomputing System with Quad GPUs (2012-06-18)
The secrets of the new Kepler GPUs have been revealed. The Kepler-based graphics cards have been studied extensively for gaming performance, and most reviews suggest you don't need the upgrade. Furthermore, the supported PCI-E 3.0 is of little to no use.
Well, it's probably a different story for CUDA programs. Here's the setup I'm going to use for testing CUDA programs extensively.
Asus P8Z99-V ...

Being Nvidia CUDA Certified Programmer! (2011-06-26)
It takes some courage and effort to take the Nvidia CUDA certification exam. You'll have to pay S$350 for it, yet there is no guarantee of real use in business or career. The exam questions are perfect for squeezing out all your brain juice.
After much feedback, a long wait, and delayed plans, I finally received an email confirming that I am now an Nvidia CUDA certified programmer. Better late than never.

The Choice is Yours: CUDA in C++ or Ruby (2011-05-09)
See the output here: Ruby Query Output
See the output here: C++ Query Output

Web Seminar: Programming GPUs Beyond CUDA (2011-05-03)
GPU/CUDA programming is easy if we ignore the performance, or even the correctness, of the program. It becomes tough when performance is critical: one has to optimize very hard for the specific hardware. Fortunately, GPU hardware performance improves drastically every two years. Unfortunately, that performance is not portable across different generations of GPUs.
Prof Chen from Tsing Hua ...

First Release of SGC Ruby CUDA - Beginning of a Long Path (2011-04-30)
Today we put up the first release of SGC Ruby CUDA v0.1.0, as a means to attract Rubyists to try out GPU programming in their new toy projects, and to encourage HPC developers to evaluate whether Ruby is a good fit for their HPC applications.
When important software libraries are not available in Ruby, we certainly cannot expect much Ruby usage in the area. As time is running short ...

GPU Computing with Ruby (2011-04-24)
Presented at RedDotRubyConf 2011 - PechaKucha Night Singapore.
GPU Computing with Ruby - view more presentations from myxman.

Using SGC-Ruby-CUDA on the Newly Launched Amazon EC2 Cluster GPU (2010-11-19)
Wondering if GPU computing works for you? No budget for a system with a decent GPU? Installation and configuration too much trouble? You can now try out SGC-Ruby-CUDA on Amazon EC2 with a prepared system image, located in the US East (Virginia) zone, called SGCRubyCUDA.1, which is available as a community AMI.
Compile the rubycu shared library and run the tests:
[root@ip-10-17-130-174 sgc-ruby-cuda.git]# rake
(in ...

GPU Anywhere with Cloud Computing (2010-11-16)
Simulation taking months to run? Buying and maintaining new systems causing too much hassle? Perhaps a Cluster GPU instance would be a good candidate to save time and trouble. A cloud solution is an excellent platform for proof of concept before committing to a large in-house system.
Paying $2.10 per hour (Amazon pricing as of 16 Nov 2010) gets you the following spec:
22 GB of memory
33.5 EC2 Compute Units (2 x Intel ...

Parallel Programming Knowledge Is a Must-Have Skill for Wall Street (2010-09-26)
Parallel programming knowledge is a must-have skill for Wall Street.

Unigine crew: CUDA vs OpenCL vs SPU Part IV (2010-09-17)
Which language or library you choose for your software development has a great and prolonged impact on the software. I just came across a simple yet interesting benchmark: Unigine crew: CUDA vs OpenCL vs SPU Part IV. More details on why such numbers are obtained would be even more enlightening.

CUDA Programming with Ruby (2010-09-17)
Need GPU computing power in your Ruby program? Great! SpeedGo Computing is developing Ruby bindings for CUDA, called sgc-ruby-cuda. Take advantage of your Nvidia CUDA-enabled graphics cards with Ruby now. Currently, only part of the CUDA Driver API is included. More components, such as the CUDA Runtime API, will be added to make it as complete as possible.

CUDA Programming with Ruby

require '...

High Performance for All (2010-09-07)
Parallel programming is much more affordable now as multi-core CPUs and programmable GPUs become commodity products.
Unlike a decade ago, when even a minimal dual-socket system equipped with lower-clocked CPUs and RAM would cost a relative fortune for a typical desktop user, dual-core systems are basically everywhere nowadays. The use of dual-core systems is not really because they are affordable, but ...

AMD's Bulldozer vs Intel's Hyper-Threading? (2010-08-25)
Is AMD's so-called Strong Thread approach in the Bulldozer module really that compelling? Extra cores are added when a processor can't operate at a faster clock speed. That's a good and easy way to expand a product line with effectively faster products, even though the products may NOT be any faster, depending on whether the applications take advantage of the multiple cores. But fully duplicating x86 ...

Parallelizing Matrix Multiplication using MPI (2010-08-17)
MPI is a popular mechanism in high performance computing. It works for both cluster and shared-memory environments. Why don't we simply use MPI when it works for both environments? Why do we care about OpenMP, Cilk++, etc.? Perhaps that depends on the complexity of the applications you are dealing with.

Parallel Matrix Multiplication using MPI

/* matrix-mpi.cpp */
#include <mpi.h>

const int size ...

Parallelizing Matrix Multiplication using TBB (2010-08-15)
Parallelizing matrix multiplication using TBB isn't too difficult.
It's just a little more work than OpenMP or Cilk++.

Parallel Matrix Multiplication using TBB

/* matrix-tbb.cpp */
#include <tbb/parallel_for.h>
#include <tbb/blocked_range.h>
using namespace tbb;

const int size = 1000;
float a[size][size];
float b[size][size];
float c[size][size];

class Multiply {
public:
    void operator()( ...

Parallelizing Matrix Multiplication using Cilk++ in Two Lines (2010-08-15)
Following the parallelization of matrix multiplication using OpenMP in "Parallelizing Matrix Multiplication using OpenMP in One Line", can we do the same using Cilk++?

Parallel Matrix Multiplication using Cilk++

/* matrix.cilk */
const int size = 1000;
float a[size][size];
float b[size][size];
float c[size][size];

int cilk_main()
{
    // Initialize buffers.
    for (int i = 0; i < size; ++i) {
        for ( ...

Parallelizing Matrix Multiplication using OpenMP in One Line (2010-08-14)
Matrix multiplication is often used for academic study. It is well suited for parallelization due to its intensive O(N^3) computation and the independence of each output element. Parallel programming is hard. Does it surprise you that we can parallelize matrix multiplication with merely one line of OpenMP directive?

Serial Matrix Multiplication

/* matrix.cpp */
const int size = 1000;
float a[size][size];
float b[size][size];
...

Parallel Programming - Hello World (2010-08-11)
Many computer science/engineering students write a Hello World program in their first programming lecture.
What's your first parallel program? What about a Hello World program in OpenMP, MPI, Cilk++, TBB, Ruby threads, or Pthreads?

Hello World in C

/* hello.c */
#include <stdio.h>

int main()
{
    printf("hello world\n");
    return 0;
}

$ gcc hello.c -o hello
$ ./hello
hello world

Hello World in ...

Parallel Programming - What Are The Options? (2010-07-31)
There are simply too many parallel programming languages and libraries to keep track of. Many of them are no longer in active development, or are difficult to get working on decent operating systems. What are the practical options currently available for multi-core CPUs or GPUs?

OpenMP
Hardware: shared-memory multi-core CPU systems.
Parallelization: use directives, e.g. #pragma omp parallel {} in C ...

Who Is Responsible For The Programming Of Multi-Core CPU And GPU? (2010-07-29)
Multi-core CPUs and GPUs are now commodity products. But where is the software that could take advantage of their parallel architecture? Who should be developing such software? The domain expert? The HPC (high performance computing) software engineer? Or parallel programming tools such as auto-parallelizing compilers? Domain experts typically do not wish to spend too much time on computing problems. ...

Why Can't Compilers Auto-Parallelize Serial Code Effectively? (2010-07-28)
An auto-parallelizing tool takes in a serial code base in C/C++/Fortran etc. and produces a parallel version of the code.
For instance, specifying the -parallel option at compile time with the Intel compiler produces a parallelized binary with the OpenMP runtime. The MIPSpro compiler provides a similar auto-parallelizing function with the -apo option, where you can view the code transformation, which consists of SGI OpenMP ...

Where Are All The Practical Parallel Algorithms and Libraries? (2010-07-22)
Multi-core CPUs and GPUs are everywhere nowadays, from laptops to desktops to high-end computing clusters. Is your particular application running any faster? Nope. Generally, you need parallel algorithms for an application to make full use of the multiple cores. Perhaps you'd expect that some searches on the web, research publications, and academic books would provide you all the state-of-the-art ...

Why Is Parallel Programming Difficult? (2010-07-21)
Parallel programming is generally perceived as an activity only for people pursuing high-tech, bleeding-edge research. It is difficult and alien enough to drive most software engineers away, whether that is really the case or merely a misconception. The fact is, software engineers run away from parallel programming while modern general-purpose processors consist of more and more cores ...