GPGPU: General Purpose Computing on Graphics Processing Units

By James Wang
10:23 Oct 17, 2006
Tags: GPGPU | Havok | PhysX
GPGPU programming
Let’s suppose we want to write a GPGPU program to find the sum of 1024 numbers stored in memory. For the GPU to get to work, we first have to copy these numbers into graphics memory. Tempting as it may be, you can’t just dump an Excel spreadsheet in there – the data has to be in a format the graphics processors understand, in this case, a texture.

To find the sum we need to devise an algorithm that will run in parallel. If we simply write a program to add the numbers one at a time, the calculations will end up on one pipeline. In order to use all the pipes, we need to break up the problem into independent parts. One way is to break up the 1024 numbers into 512 pairs. We take each pair and add them, storing the 512 results. We then pair up those 512 numbers and add them again, storing the 256 results. In this way, we’ll eventually converge to one pair whose sum will be our result. Breaking up the workload into independent parts and finding a parallel algorithm is the key to exploiting the GPU.
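The pairing scheme described above can be sketched in a few lines of Python. This is only an illustration of the idea running on a CPU, not GPU code; the function name `pairwise_sum` and the zero-padding of odd-length rounds are assumptions made for this example.

```python
def pairwise_sum(values):
    """Sum a sequence by repeatedly adding independent pairs.

    Returns (total, rounds). Every addition inside one round is
    independent of the others, which is what lets a GPU spread
    the work across many pipelines.
    """
    level = list(values)
    if not level:
        return 0, 0
    rounds = 0
    while len(level) > 1:
        if len(level) % 2:
            # Assumption for this sketch: pad odd-length rounds with 0
            # so every element has a partner.
            level.append(0)
        # One round: element 2i is added to element 2i + 1.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        rounds += 1
    return level[0], rounds


total, rounds = pairwise_sum(range(1, 1025))
print(total, rounds)
```

With 1024 inputs the loop runs for ten rounds (1024, 512, 256 and so on down to 1), so the chain of dependent steps is ten long rather than 1023.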

For the highest performance, such a program should be written to use the pixel processors, as there are more pixel processors than vertex processors. The execution involves every stage of the graphics pipeline. The vertex processor maps our texture onto a rectangle. The rasteriser takes the rectangle and converts it to fragments. The pixel processor then adds each pair, storing the results as a new texture half the size of the original. The process repeats itself using the smaller texture until the texture is reduced to one element. That is our sum.
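To make the render-pass idea more concrete, here is a rough CPU-side simulation of the process just described. It is a sketch under stated assumptions, not real shader code: the names `render_pass` and `gpu_style_sum` are invented for this example, the texture is a plain list of rows with power-of-two dimensions, and each pass halves whichever dimension is currently larger (the text only says each new texture is half the size). The nested loops stand in for the rasteriser generating one fragment per output texel.

```python
def render_pass(texture):
    """Simulate one pass: each output texel is the sum of two input texels."""
    height, width = len(texture), len(texture[0])
    if width >= height and width > 1:
        # Halve the width: output (x, y) reads input (2x, y) and (2x + 1, y).
        return [[row[2 * x] + row[2 * x + 1] for x in range(width // 2)]
                for row in texture]
    # Otherwise halve the height: output (x, y) reads (x, 2y) and (x, 2y + 1).
    return [[texture[2 * y][x] + texture[2 * y + 1][x] for x in range(width)]
            for y in range(height // 2)]


def gpu_style_sum(texture):
    """Run passes until a 1x1 texture remains; return (sum, pass count)."""
    passes = 0
    while len(texture) > 1 or len(texture[0]) > 1:
        texture = render_pass(texture)  # the output becomes the next input
        passes += 1
    return texture[0][0], passes


# 1024 numbers laid out as a 32x32 texture.
texture = [[y * 32 + x for x in range(32)] for y in range(32)]
print(gpu_style_sum(texture))
```

Each output texel depends only on the previous texture, never on its neighbours in the same pass, which is why the pixel processors can all work at once.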

How about performance? A GPU like the RADEON X1900 XTX can execute 384 add operations per cycle using its pixel processors alone. A Core 2 Extreme, even when using all its SIMD units, can only do 16. That is a 24-fold advantage per cycle. Even when we factor in the CPU’s 4.5x clock speed advantage, the RADEON is still roughly 5.3 times faster.
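The comparison is simple enough to check by hand. The snippet below uses only the per-cycle and clock figures quoted in the text; the function name `effective_speedup` is made up for this sketch, and the result is a peak-throughput estimate, not a benchmark.

```python
def effective_speedup(gpu_ops_per_cycle, cpu_ops_per_cycle, cpu_clock_advantage):
    """Peak add throughput of the GPU relative to the CPU."""
    per_cycle_ratio = gpu_ops_per_cycle / cpu_ops_per_cycle
    # The CPU completes more cycles per second, so divide its advantage out.
    return per_cycle_ratio / cpu_clock_advantage


speedup = effective_speedup(384, 16, 4.5)
print(round(speedup, 2))
```

The per-cycle ratio is 24, and dividing by the 4.5x clock difference gives a little over 5.3, in the neighbourhood of the figure quoted in the text.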

The above example is essentially how GPGPU programs work. Program data is expressed as textures. The calculations are done by the vertex and pixel processors. In turn, the results are written back as textures.

GPU physics
One of the most exciting GPGPU applications is game physics acceleration. Early this year, following AGEIA’s PhysX launch, Havok and NVIDIA announced GPU-based physics acceleration using the new Havok FX API. The new API allows what is known as ‘effects physics’ to be run on the GPU.

Effects physics is different from the more general gameplay physics supported by AGEIA’s PhysX board. Effects physics makes the world look more interesting through particles, fog and fluid simulation, but it won’t affect the outcome of the game. It is more akin to physics-based graphical effects. Gameplay physics, on the other hand, is about applying physics to elements of the gameplay. So while effects physics can make the ocean look alive by adding waves and particles, gameplay physics will calculate the force of the waves and apply it to elements of the game.

Although gameplay physics allows for greater interaction, effects physics is also important for realising greater visual richness. Its greatest advantage is that it can simulate much larger scales. According to Havok’s Vice President of Product Management Jeff Yates, effects physics can simulate 10 times as many non-critical objects and particles as a gameplay physics system.

Depending on who does the benchmarks, effects physics on the GPU is said to be more than 10 times faster than a CPU implementation. ATI has even gone as far as to claim that its RADEON X1900 XTX runs physics nine times faster than AGEIA’s PhysX board. We don’t know how ATI arrived at this number as the RADEON and PhysX share no common API.

Havok has announced physics support for both ATI and NVIDIA graphics processors. Games developed using Havok FX are expected to arrive at the end of the year. By default, the system will share the physics and graphics processing on one GPU. If you’re fortunate enough to own a dual-GPU system, the API can offload the physics to the second GPU, effectively using it as a dedicated physics accelerator. ATI has even demonstrated a three-GPU system, rendering graphics in CrossFire mode with two GPUs while using the third for physics.

 
This article appeared in the October, 2006 issue of Atomic.
