NVIDIA Tesla is reborn as ATI Streams Fire

By Nebojsa Novakovic
10:42 Jun 30, 2008
Now, the NVIDIA GTX280 (and the equivalent Tesla card) offers double-precision (DP) throughput at 1/8 of the single-precision (SP) peak; i.e. just over 125GFLOPs for the new Tesla 10 series or an overclocked GTX280. The ATI 4800 series offers DP at 1/4 of the SP peak; i.e. 300GFLOPs on the HD4870. In either case, a job that depends solely on the card's memory throughput won't easily get anywhere near that peak. But of course, the hundreds of stream processors in these GPUs can hold some data in their local registers and shared memory, to be processed at full speed.
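
To make the arithmetic above concrete, here is a minimal Python sketch of how a DP peak follows from an SP peak and a DP:SP ratio. The SP figures (1000 and 1200 GFLOPs) are assumptions, back-calculated from the DP numbers and ratios quoted above rather than taken from a spec sheet, and the function name is purely illustrative.

```python
from fractions import Fraction


def dp_peak_gflops(sp_peak_gflops, dp_ratio):
    """Peak double-precision GFLOPs, given the SP peak and the DP:SP ratio."""
    return float(sp_peak_gflops * dp_ratio)


# ASSUMPTION: SP peaks are back-calculated from the article's DP figures.
tesla_dp = dp_peak_gflops(1000, Fraction(1, 8))   # NVIDIA: DP at 1/8 of SP
hd4870_dp = dp_peak_gflops(1200, Fraction(1, 4))  # ATI: DP at 1/4 of SP

print(f"Tesla 10 series DP peak: {tesla_dp:.0f} GFLOPs")
print(f"HD4870 DP peak:          {hd4870_dp:.0f} GFLOPs")
```

With those assumed inputs, the sketch reproduces the roughly 125GFLOPs and 300GFLOPs figures quoted above.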

At the recent Tesla briefing, NVIDIA suggested that its slim rackmount box, with four cards, 16GB of memory and four single-precision TFLOPs, should be able to reach somewhere around 350GFLOPs Linpack Rmax (the measurable maximum) in double precision. For a box costing somewhere around $US9,000, that is a great number - as long as your app can get anywhere near it. It is quadruple the speed of a dual-CPU Skulltrail overclocked to 4GHz in that same Linpack DP - at about the same cost. The problem? Unless your app is coded in CUDA for GPU support, it can't use the NVIDIA box at all, and far more software can make use of 80GFLOPs on that Skulltrail than 350GFLOPs on the custom NVIDIA box.
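
The comparison above boils down to a few divisions, sketched here in Python. The 500GFLOPs DP peak for the box is an assumption (four cards at roughly 125GFLOPs each, per the figure quoted earlier), and both machines are priced at $US9,000 only because the text says they cost about the same.

```python
# Figures quoted in the text.
tesla_box_rmax = 350.0   # GFLOPs, Linpack Rmax in DP (NVIDIA's estimate)
skulltrail_rmax = 80.0   # GFLOPs, overclocked dual-CPU Skulltrail
box_cost_usd = 9000.0    # "about the same cost" for both machines

# ASSUMPTION: four cards at ~125 GFLOPs DP peak each.
tesla_box_dp_peak = 4 * 125.0

speedup = tesla_box_rmax / skulltrail_rmax
linpack_efficiency = tesla_box_rmax / tesla_box_dp_peak
tesla_gflops_per_dollar = tesla_box_rmax / box_cost_usd
skulltrail_gflops_per_dollar = skulltrail_rmax / box_cost_usd

print(f"Speedup over Skulltrail: {speedup:.2f}x")
print(f"Rmax as share of DP peak: {linpack_efficiency:.0%}")
print(f"GFLOPs per dollar: {tesla_gflops_per_dollar:.4f} vs "
      f"{skulltrail_gflops_per_dollar:.4f}")
```

On these numbers the speedup works out to a little over four times, and the suggested Rmax would be about 70 per cent of the assumed DP peak.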

The GTX280 chip's wide 512-bit memory bus, considered a burden among gaming GPUs as it complicated both the die and board design, is a huge plus in technical computing use. Simply put, for a given memory technology, when you need to max out the capacity, you'll always have double the possible capacity - and bandwidth - with the green goblin's cards. The new Tesla cards have 4GB of GDDR3 RAM per card - even though it is clocked more conservatively because of the dual-rank mounting and higher loads, it still lets you pack that much more data into the fast local memory rather than losing 10x performance by going over PCI-E to the system memory.

The ATI side compensates for the narrower 256-bit bus with faster GDDR5 memory. However, the narrower bus not only cuts the maximum capacity by half, but also requires higher-capacity GDDR5 chips - still very rare - to reach 2GB or more.
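
The bus-width trade-off described above can be sketched in a few lines of Python. Everything numeric here is an assumption for illustration: a 32-bit interface per memory chip, 1Gbit chips, dual-rank mounting on both cards, and round per-pin data rates of 2 and 4 Gbit/s chosen only to show GDDR5 running twice as fast per pin. None of these are vendor-confirmed figures for the actual cards.

```python
def max_capacity_gb(bus_width_bits, chip_density_gbit, ranks=1, chip_io_bits=32):
    """Largest on-board memory for a bus: one chip per 32-bit slice, per rank."""
    chips = (bus_width_bits // chip_io_bits) * ranks
    return chips * chip_density_gbit / 8  # Gbit -> GB


def bandwidth_gb_s(bus_width_bits, data_rate_gbit_per_pin):
    """Peak memory bandwidth: bus width times per-pin data rate, in GB/s."""
    return bus_width_bits * data_rate_gbit_per_pin / 8


# ASSUMPTION: 1Gbit chips, dual-rank mounting, illustrative per-pin rates.
nvidia_capacity = max_capacity_gb(512, chip_density_gbit=1, ranks=2)
ati_capacity = max_capacity_gb(256, chip_density_gbit=1, ranks=2)
nvidia_bandwidth = bandwidth_gb_s(512, data_rate_gbit_per_pin=2.0)  # GDDR3
ati_bandwidth = bandwidth_gb_s(256, data_rate_gbit_per_pin=4.0)     # GDDR5

print(f"512-bit bus: {nvidia_capacity:.0f}GB, {nvidia_bandwidth:.0f}GB/s")
print(f"256-bit bus: {ati_capacity:.0f}GB, {ati_bandwidth:.0f}GB/s")
```

Under these assumptions, doubling the per-pin rate lets the 256-bit bus match the 512-bit bus on bandwidth, but with the same chip density it still tops out at half the capacity, which is why denser chips are needed to go further.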

So, from the raw hardware point of view, ATI offers higher peak SP and much higher peak DP flops, but its narrower memory bus could turn it into a bit of a capacity expansion 'flop' for those computing apps in need of more on-board memory. How do the two, Tesla and FireStream, compare internally at the chip level? What about the software? You'll be able to read all about that soon.

 

theinquirer.net (c) 2010 Incisive Media

 