
CPU and GPU now, the convergence goes on

By The Inquirer
Nov 2, 2009 | 6 Comments
Tags: CPU | GPU | CPGPU | graphics | processor | news

GPU advances
On the other hand, GPU priorities are a little different. Multiplying thread counts and processing unit numbers matters more here than the power of each individual processing unit, as the typical graphics pipeline is far more predictable and more parallel than most tasks run on general purpose CPUs. So, whether it is an AMD/ATI HD5870 GPU with 1,600 simple shaders in parallel, or an Nvidia GT300 with 512 more complex and more CPU-like shader cores, the GPU looks very different from the four to eight cores on a CPU die.

Despite core clocks that are on average about four times slower than those of standard CPUs, or shader clocks about three times slower, the vast parallelism of GPUs allows far higher theoretical computational power. When it comes to the double-precision floating-point throughput we discussed before, let's look at what AMD/ATI and Nvidia might have in a few months, in the same timeframe as Intel's Gulftown and AMD's Magny Cours.

On the ATI side, a speed update for the HD5870, probably something called HD58X0, should arrive with the refinements and stepping updates of the R800 family dies. Running at a default 950MHz GPU clock with proportionally faster shaders, the new device should reach 3 TFLOPS in single-precision floating-point and, more importantly, 600 GFLOPS in double-precision floating-point, both IEEE compliant. In fact, some of the overclockable HD5870 entries, like those from Asus, already provide such speeds.
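
As a rough check on those numbers, here is a minimal Python sketch of the peak-throughput arithmetic. It assumes each shader retires one multiply-add, counted as two floating-point operations, per clock, and that double precision runs at one fifth of the single-precision rate. Both assumptions are inferred from the quoted figures rather than taken from a spec sheet, and the function name is purely illustrative.

```python
# Back-of-envelope peak throughput for a hypothetical 950MHz HD58X0.
# Assumptions (not stated in the text): each shader retires one
# multiply-add (2 flops) per clock, and double precision runs at one
# fifth of the single-precision rate.

def peak_gflops(shaders, clock_ghz, flops_per_clock=2, rate=1.0):
    """Theoretical peak in GFLOPS: units x flops per clock x GHz x rate."""
    return shaders * flops_per_clock * clock_ghz * rate

single = peak_gflops(1600, 0.95)
double = peak_gflops(1600, 0.95, rate=1 / 5)

print(f"single precision: {single:.0f} GFLOPS")
print(f"double precision: {double:.0f} GFLOPS")
```

Under those assumptions the result lands at about 3,040 GFLOPS single-precision and 608 GFLOPS double-precision, which is where the rounded 3 TFLOPS and 600 GFLOPS come from.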

So, if your code can run efficiently with AMD Stream libraries and such, a hypothetical dual-GPU HD58X0 card will likely give you 1.2 TFLOPS of double-precision floating-point power for precision runs, and 6 TFLOPS of single-precision floating-point for parametrisation and estimation runs. Just make sure there is enough memory on the card to hold the data sets of multiple threads without running over the PCIe bus to the main memory, as, despite the limited GPU caching, the slow link can cut the performance by as much as an order of magnitude. Therefore, 2GB of GDDR5 memory per GPU is strongly recommended if you are doing GPU computation.
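
To see why the PCIe link hurts so much, consider a simple bandwidth-bound estimate, sketched here in Python. The numbers are assumptions for illustration only: 153.6 GB/s of local GDDR5 bandwidth (the stock HD5870 figure), 8 GB/s for a PCIe 2.0 x16 link in one direction, and a hypothetical kernel that performs one floating-point operation per byte it moves.

```python
# Rough bandwidth-bound estimate of why spilling over PCIe hurts.
# Assumed figures, not from the text: 153.6 GB/s local GDDR5 bandwidth,
# 8 GB/s for a PCIe 2.0 x16 link in one direction, and a kernel doing
# 1 flop per byte moved.

def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """A kernel cannot run faster than its data arrives."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)

PEAK_DP = 600.0  # GFLOPS, the double-precision figure quoted above

local = attainable_gflops(PEAK_DP, 153.6, 1.0)
over_pcie = attainable_gflops(PEAK_DP, 8.0, 1.0)

print(f"local memory: {local:.1f} GFLOPS")
print(f"over PCIe:    {over_pcie:.1f} GFLOPS")
print(f"slowdown:     {local / over_pcie:.1f}x")
```

With these assumed inputs the kernel manages about 154 GFLOPS from local memory against 8 GFLOPS over the bus, a gap of roughly 19 times. Real codes with more arithmetic per byte will suffer less, but the order-of-magnitude penalty is easy to reach.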

By early next year, we all hope that Nvidia's GT300 will already be launched and shipping, because if it isn't, that will be big trouble for the green graphics gang. Let's assume it is. With 512 shader processors that can do either 512 single-precision or 256 double-precision fused multiply-adds per clock, that would, at a shader clock of, say, 1.8GHz, give you 1.8 TFLOPS in single-precision mode or 900 GFLOPS in double-precision mode. Not bad at all.
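
The arithmetic behind those GT300 figures is short enough to write out as a Python sketch. The only assumption is the usual convention that one fused multiply-add counts as two floating-point operations; the function and constant names are illustrative.

```python
# Peak throughput from fused multiply-adds (FMAs) per clock.
# Assumption: one FMA counts as 2 floating-point operations.

FLOPS_PER_FMA = 2

def fma_peak_gflops(fmas_per_clock, shader_clock_ghz):
    """Theoretical peak in GFLOPS for a given FMA issue rate and clock."""
    return fmas_per_clock * FLOPS_PER_FMA * shader_clock_ghz

sp = fma_peak_gflops(512, 1.8)  # single precision
dp = fma_peak_gflops(256, 1.8)  # double precision

print(f"single: {sp / 1000:.2f} TFLOPS, double: {dp:.0f} GFLOPS")
```

That works out to roughly 1,843 GFLOPS single-precision and 922 GFLOPS double-precision, in line with the rounded figures above.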

But what's far more interesting is that the GT300 promises to enable a far greater range of codes to make use of all that power. With an overall architecture far closer to a CPU this time, many normal C, C++ and Fortran codes should be able to run on it out of local GPU memory. With up to 6GB of onboard memory in the first iteration, and 8GB in the subsequent one, the latter with a 512-bit memory bus, the GT300 should be quite a bundle.

What the GT300 lacks to be a true CPU and run all the usual stuff, including booting an OS, is a full-fledged memory management unit (MMU) for virtual-to-physical memory translation and a front-end general purpose CPU instruction set. That's why I have said many times that Nvidia should have had a real CPU, like, say, the Alpha, which would both provide ultrahigh performance, better than the X86, to fill that niche, and offer the built-in capability to run X86 code very fast via a real-time translator like the famed FX!32, without having to pay for an X86 license.

Don't forget that the last planned Alpha incarnation, the EV9 21564, was supposed to have a kilobyte-wide (yes, 8,192 bits) vector unit able to put out over 100 GFLOPS in double-precision floating-point, some nine years ago. Imagine what it would be able to achieve today.

The Tegra and other ARM-based stuff is simply too weak to be a front end for a gigantic TFLOPS-class GPU. For a proper "fusion" at the system level, you need very fast and wide main system memory, a multi-channel, multi-gigabyte setup at least, to feed it from the CPU side. You also need very fast multiple HyperTransport, QuickPath or Alpha EV links to connect multiple GPUs with the main CPUs for efficient coherent shared memory access between GPU and CPU memory banks. In the absence of a general purpose CPU that's able to do this, Nvidia might have to negotiate a QPI license with Intel to directly link its GPUs to Westmere and future CPUs, in order to enable more of the coprocessor model here. But wait, wouldn't the long-delayed Larrabee be gunning for the same role?

I'll have more on this, and the 'ideal' CPU-GPU system configuration, in Part 2.

 
6 Comments
pkroeze
Nov 2, 2009 10:20 AM
I think it will be a long time before we get an integrated CPU/GPU, because that would mean less money for all companies involved. Take AMD/ATI: if they do not merge these technologies they sell 2 cores for every PC, but once they merge CPU and GPU they only sell 1. On the other side, Nvidia and Intel probably won't try it either, first of all because of the legal bindings between the 2, but also because if Intel tried to make a CPU/GPU they would have a hard time selling them until these technologies match separate CPUs and GPUs.
Jeruselem
Nov 2, 2009 10:42 AM
I'm not a fan of absolute single point of failure given how hot CPUs and GPUs get.
tunksy
Nov 2, 2009 10:46 AM
Thanks for a very interesting read.
thesorehead
Nov 2, 2009 1:01 PM
As Jeruselem said: top-of-the-line GPU and CPU parts are going to need to be separate for some time simply for the purposes of heat dissipation.

However, GPUs integrated into the MOBO have been fine for "home/office" use where all you need is the Windows desktop and a bit of Java/Flash/whatever shininess. I can see a scenario where Intel/AMD compete in this budget-conscious arena with a fully-integrated system.
CK
Nov 2, 2009 11:13 PM
How about a motherboard with 2 separate sockets on it? One for the CPU and the other for the GPU, both spaced far enough apart to get aftermarket heat sinks on them for cooling/overclocking, and both able to access system memory when they need it (hopefully DDR5 or something for the GPU's sake). Didn't P55 just do away with an extra chip on the motherboard? Just coincidence maybe???
omega
Nov 3, 2009 1:23 PM
CK, then you'd have things like AMD/Intel, AMD/Nvidia (would they even...), AMD/Intel(?), Intel/ATI, Intel/Nvidia, Intel/Intel(?) motherboard options, as I don't see ATI, Nvidia and Intel all sharing the same socket type for their GPUs.

It would make buying a mobo a lot harder, as you don't know which company (blue, green or red) will have the best GPU in 2 years' time when you upgrade.