GPGPU: General Purpose Computing on Graphics Processing Units

By James Wang
Oct 17, 2006
Tags: GPGPU | Havok | PhysX
GPGPU programming
Let’s suppose we want to write a GPGPU program to find the sum of 1024 numbers stored in memory. For the GPU to get to work, we first have to copy these numbers into graphics memory. Tempting as it may be, you can’t just dump an Excel spreadsheet in there – the data has to be in a format the graphics processors understand, in this case, a texture.

To find the sum we need to devise an algorithm that will run in a parallel fashion. If we simply write a program to add the numbers one at a time, the calculations will end up on one pipeline. In order to use all the pipes, we need to break up the problem into independent parts. One way is to break up the 1024 numbers into 512 pairs. We take each pair and add them, storing the 512 results. We then take the 512 numbers and add them in pairs again, storing the 256 results. In this way, we’ll eventually converge to one pair whose sum will be our result. Breaking up the workload into independent parts and finding a parallel algorithm is the key to exploiting the GPU.
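
The pairwise scheme described above can be sketched on the CPU in a few lines of Python. This is only an illustration of the algorithm, not GPU code: the function name `pairwise_sum` and the zero-padding of odd-length input are assumptions made for the example.

```python
def pairwise_sum(values):
    """Sum a sequence by repeatedly adding independent pairs."""
    data = list(values)
    passes = 0
    while len(data) > 1:
        if len(data) % 2:
            # Assumption for this sketch: pad odd-length input with a zero
            # so that every element has a partner.
            data.append(0)
        # Every addition in this pass is independent of the others,
        # so a GPU could perform all of them at the same time.
        data = [data[i] + data[i + 1] for i in range(0, len(data), 2)]
        passes += 1
    return (data[0] if data else 0), passes


total, passes = pairwise_sum(range(1024))
print(total, passes)  # 523776 10
```

For 1024 inputs the loop runs ten times (1024, 512, 256 and so on down to 1), compared with 1023 dependent additions for the one-at-a-time approach.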

For the highest performance, such a program should be written to use the pixel processors, as there are more pixel processors than vertex processors. The execution involves every stage of the graphics pipeline. The vertex processor maps our texture onto a rectangle. The rasteriser takes the rectangle and converts it to fragments. The pixel processor then computes the sums, storing the results as a new texture half the size of the original. The process repeats itself using the smaller texture until the texture is reduced to one element. That is our sum.
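
The repeated render passes can be mimicked in plain Python by treating a nested list as the texture. This is a rough simulation under stated assumptions rather than real shader code: the 32 x 32 layout, the power-of-two dimensions, and the names `render_pass` and `gpu_style_sum` are all hypothetical choices for the example.

```python
def render_pass(texture):
    """One simulated pass: each output texel is the sum of two input texels."""
    height, width = len(texture), len(texture[0])
    if width > 1:
        # Halve the width: output (x, y) reads inputs (2x, y) and (2x + 1, y).
        return [[row[2 * x] + row[2 * x + 1] for x in range(width // 2)]
                for row in texture]
    # The width is already 1, so halve the height instead.
    return [[texture[2 * y][0] + texture[2 * y + 1][0]]
            for y in range(height // 2)]


def gpu_style_sum(texture):
    """Run passes until the texture shrinks to a single element."""
    passes = 0
    while len(texture) > 1 or len(texture[0]) > 1:
        # The output of one pass becomes the input texture of the next.
        texture = render_pass(texture)
        passes += 1
    return texture[0][0], passes


# Assumed layout: the 1024 numbers 0..1023 stored as a 32 x 32 texture.
texture = [[y * 32 + x for x in range(32)] for y in range(32)]
total, passes = gpu_style_sum(texture)
print(total, passes)  # 523776 10
```

Each call to `render_pass` stands in for one trip through the pipeline, with the list comprehension playing the role of the pixel processors working on every output element independently.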

How about performance? A GPU like the RADEON X1900 XTX can execute 384 add operations per cycle using its pixel processors alone. A Core 2 Duo Extreme, even when using all its SIMD units, can only do 16. That is a 24-fold advantage per cycle. Even when we factor in the CPU’s 4.5x clock speed advantage, the RADEON is still roughly 5.3 times faster (24 ÷ 4.5).
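
As a rough check of this comparison, the calculation can be written out using only the figures quoted in the text. The variable names are illustrative, and the result is a theoretical peak for add operations, not a measured benchmark.

```python
gpu_adds_per_cycle = 384      # RADEON X1900 XTX pixel processors (from the text)
cpu_adds_per_cycle = 16       # CPU using all its SIMD units (from the text)
cpu_clock_advantage = 4.5     # CPU clock speed relative to the GPU (from the text)

# Per-cycle throughput advantage of the GPU.
per_cycle_ratio = gpu_adds_per_cycle / cpu_adds_per_cycle

# The CPU completes 4.5 cycles for every GPU cycle, so divide that out.
net_speedup = per_cycle_ratio / cpu_clock_advantage

print(f"{per_cycle_ratio:.0f}x per cycle, {net_speedup:.2f}x after clock speed")
# 24x per cycle, 5.33x after clock speed
```

With the quoted inputs, the net advantage works out to a little over five times.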

The above example is essentially how GPGPU programs work. Program data is expressed as textures. The calculations are done by the vertex and pixel processors. In turn, the results are written back as textures.

GPU physics
One of the most exciting GPGPU applications is game physics acceleration. Early this year, following AGEIA’s PhysX launch, Havok and NVIDIA announced GPU-based physics acceleration using the new Havok FX API. The new API allows what is known as ‘effects physics’ to be run on the GPU.

Effects physics is different from the more general gameplay physics supported by AGEIA’s PhysX board. Effects physics makes the world look more interesting through particles, fog and fluid simulation, but it won’t affect the outcome of the game. It is more akin to physics-based graphical effects. Gameplay physics, on the other hand, is about applying physics to elements of the gameplay. So while effects physics can make the ocean look alive by adding waves and particles, gameplay physics will calculate the force of the waves and apply it to elements of the game.

Although gameplay physics allows for greater interaction, effects physics is also important for realising greater visual richness. Its greatest advantage is that it can simulate at much larger scales. According to Havok’s Vice President of Product Management Jeff Yates, effects physics can simulate 10 times as many non-critical objects and particles as a gameplay physics system.

Depending on who does the benchmarks, effects physics on the GPU is said to be more than 10 times faster than a CPU implementation. ATI has even gone as far as to claim that its RADEON X1900 XTX runs physics nine times faster than AGEIA’s PhysX board. We don’t know how ATI arrived at this number as the RADEON and PhysX share no common API.

Havok has announced physics support for both ATI and NVIDIA graphics processors. Games developed using Havok FX are expected to arrive at the end of the year. By default, the system will share the physics and graphics processing on one GPU. If you’re fortunate enough to own a dual-GPU system, the API can offload the physics to the second GPU, effectively using it as a dedicated physics accelerator. ATI has even demonstrated a three-GPU system, rendering graphics in CrossFire mode with two GPUs while using the third for physics.

 
This article appeared in the October, 2006 issue of Atomic.

Want to check out the first Australian review of Final Fantasy XIII? We got in this month's Atomic!

Plus HD projectors, Napoleon: Total War, Intel's new six-core processor, PC upgrading guide, and a whole lot more.

ON SALE NOW!
Comments

Be the first to comment on this article.
Thoughts on this article? Add a comment below.
Login or register to submit a comment.
 
 
Atomic Magazine

Issue: 111 | April, 2010

Atomic is a magazine aimed squarely at computer enthusiasts, gamers, and serious PC upgraders.

Every month we bring you the latest reviews of new technology and PC components, in depth features on everything from overclocking to console hacking, and gaming previews and interviews.
 
Latest Comments
"Send your good taste to celebration by delivering our mouthwatering cakes to Dehradun and exotic ..."
by rony24 | Mar 20, 2010 4:56 PM
 
"So. Much. Awesome."
by The Manta | Mar 20, 2010 4:23 PM
 
"@sirtrancealot, BF started on the PC and BC1 only on Consoles was a kick to the PC gamers ..."
by NRUFrost | Mar 20, 2010 8:14 AM
 
"RAGE!!!"
by Hawkeye | Mar 20, 2010 1:24 AM
 
"alex - bugger all. 78mg of caffeine. About the same as a cup of instant coffee. Taurine, Gurana ..."
by tantryl | Mar 20, 2010 12:51 AM
 
1) Nokia E7147 plans 33%
2) Apple iPhone 3GS 32GB36 plans 33%
3) Apple iPhone 8GB43 plans 22%
4) HTC Magic5 plans 33%
5) Nokia N9740 plans 33%
1) iiNet32 plans 100%
2) Optus41 plans 14%
3) Vodafone7 plans 5%
4) Telstra BigPond30 plans 1%
5) Dodo34 plans 6%

Mobiles | Broadband | Credit Cards

Haymarket - Atomic MPC
Latest User Reviews
Logitech MX518 Gaming-Grade Optical Mouse
90%
Good shape, design and Ergonomics
 
Coolermaster HAF 922
100%
A case to make a statment and give your pc the Heavy Hardcore Grunt it needs.
 
Coolermaster Excalibur
50%
Atomic is under attack
 
XFX 9300 Motherboard
40%
HUGE letdown
 
CM Storm Sentinel gaming mouse
90%
Sexy and instant geek respect.