One of the most talked about unannounced products of the last year finally has a name and a pile of tasty features. Join John Gillooly for a first look inside GeForce FX.
After months of speculation with no sign of an actual product, NVIDIA swung itself back onto the horse at Comdex this year and launched, on paper, the GPU formerly known as NV30, and now known as GeForce FX. With ATI now snapping at its heels, this GPU is one of the most critical releases from NVIDIA in years.
GeForce FX's release has been a long time coming. NVIDIA has spent the best part of six months trickling information to the public, ever since CEO Jen-Hsun Huang admitted that the tape-out of the chip occurred much later than anyone expected. Following this was a strong NVIDIA showing at Siggraph, where it released the first solid information about its CineFX architecture and began to woo the pro 3D community with promises of cinematic quality 3D achievable with NV30. This is a very interesting turn of events; ATI has spent the past year wooing gamers with pre-launch hype for the RADEON 9700 but NVIDIA chose to push past its traditional stronghold into the pro 3D market.
During the time before the launch at Comdex, speculation was rife about what the new chip was to be called. Huang had announced soon after the launch of the GeForce4 that there would be no GeForce5. Hot contenders for the name were Omen, Nitro and Eclipse. But in the end it was a minor stuff up on the NVIDIA Website that let the cat out of the bag. GeForce would stay, but rather than a number it would be followed by the letters 'F' and 'X'.
Reasoning behind the naming shows an uncharacteristic sentimentality (or at least a willingness to sell cards to sentimentalists). There is a double meaning to FX. There is the obvious visual effects angle, but it also references the intellectual and human resources from 3dfx that have gone into creating this beast.
Big and small
The GeForce FX GPU is a work of semiconductor art. It has around 125 million transistors -- over 10 million more than the RADEON 9700. It is one of the first 0.13-micron graphics chips, though not the very first: SiS recently launched the 0.13-micron Xabre600. Trident has yet to release working XP4 samples, and the transition to this process in TSMC's troubled foundries has been a big reason for the delays. Much as NVIDIA pioneered the use of DDR-RAM on the original GeForce256, the GeForce FX will be the first widespread implementation of the DDR II standard, with the memory running at a double-pumped 500MHz -- an effective 1GHz RAM speed.
Using DDR II has been NVIDIA's way of avoiding the move to a complex and expensive 256-bit memory bus. ATI and Matrox are both using such architectures, but NVIDIA strongly believes that the combination of the GeForce FX GPU running at 500MHz with 1GHz DDR II will postpone issues with memory bandwidth for another generation. Details about the makeup of the memory controller architecture, to be dubbed Lightspeed Memory Architecture 3, are still scant.
By contrast, the RADEON 9700 PRO runs with a 325MHz core speed and an effective 600MHz RAM speed over a 256-bit memory bus. ATI has already demonstrated DDR II running with the R300 core and has confirmed that the memory controller was designed to accommodate DDR II; however, the performance delivered by the current configuration is still more than ample for today's games.
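The trade-off between a narrow, fast bus and a wide, slower one is easier to judge with the arithmetic written out. The following is a minimal Python sketch of peak theoretical bandwidth only. The clock figures are the ones quoted above; the 128-bit bus width for the GeForce FX is an assumption (the article says only that NVIDIA avoided a 256-bit bus), and the function name is purely illustrative.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Peak theoretical memory bandwidth in GB/s (1 GB = 10**9 bytes).

    bandwidth = bytes moved per transfer * transfers per second
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = effective_clock_mhz * 1e6
    return bytes_per_transfer * transfers_per_second / 1e9


# ASSUMPTION: 128-bit bus for the GeForce FX (not stated in the article).
geforce_fx = peak_bandwidth_gb_s(bus_width_bits=128, effective_clock_mhz=1000)

# Figures as quoted in the article: 256-bit bus, effective 600MHz memory.
radeon_9700_pro = peak_bandwidth_gb_s(bus_width_bits=256, effective_clock_mhz=600)

print(f"GeForce FX (assumed 128-bit): {geforce_fx:.1f} GB/s")
print(f"RADEON 9700 PRO (256-bit):    {radeon_9700_pro:.1f} GB/s")
```

These are raw peak numbers and say nothing about how efficiently each memory controller uses them, or about any savings from colour and Z compression, which is presumably where NVIDIA expects to make up the difference.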
Cards bearing the GeForce FX moniker are as yet unannounced; however, the hot tip is that there will be multiple variants, probably including a high-end ultra model and a mid-range card. NVIDIA's demonstration cards at Comdex included some interesting features, namely a Molex power connector, and one model with a cooling system reminiscent of the copper heat pipe solution that ABIT calls OTES and uses on its Ti4200 cards.
It is still unknown whether the cooler will make it to the mass market, and whether or not the power connector will be a feature of the final cards. ATI uses a floppy power plug on its RADEON 9700, but pre-launch word was that the lower power consumption achieved by going with a 0.13-micron process would spare the GeForce FX the need for extra power.
Shading overkill
Information is still scarce on some parts of the GeForce FX, with the vast majority of launch white papers devoted to CineFX, the name given to GeForce FX's shader array.
In DirectX 8, pixel shaders were at a much more primitive stage than they are now, lacking the flexibility developers needed to make full use of them.
The DirectX 9 specification includes several major new features, such as displacement mapping support; however, the most important are the updated shader engines -- Pixel Shader 2.0 and Vertex Shader 2.0. The RADEON 9700 and 9500 cards from ATI support these already, but NVIDIA has gone one step further, creating what it dubs Pixel Shader 2.0+ and Vertex Shader 2.0+.
This is because CineFX goes beyond the DX 9 specifications. In the case of vertex shaders this involves adding support for an extra four registers for use in shading programs.
However the major changes come in the pixel shader, which NVIDIA is basing a lot of its cinematic quality computing philosophy upon. While the vertex shader is still an integer-based unit, albeit with some major updates, the pixel shaders on the GeForce FX are full floating point units.
Vertex shaders now support looping, branching and flow control, which allows for incredibly long operations to be undertaken by passing results back through the pipeline.
The extra four registers help the GeForce FX to store more of these operations temporarily. At the most fundamental level these longer instructions will help when an object needs multiple vertex shader operations. Whereas this would involve multiple shaders in the past, the GeForce FX can happily do it with one complex shader.
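To make the "one complex shader" idea concrete, here is a toy Python sketch. It is not shader code and not NVIDIA's implementation; the function names and the light data layout are hypothetical. It contrasts the older approach, where each light needs its own straight-line shader pass and the application decides which passes to run, with a single routine that loops over the lights and branches internally.

```python
def diffuse_pass(normal, light):
    """One fixed, straight-line 'shader': diffuse term for a single light."""
    n_dot_l = sum(n * l for n, l in zip(normal, light["direction"]))
    return max(n_dot_l, 0.0) * light["intensity"]


def shade_multi_pass(normal, lights):
    """Older style: the host application picks which simple passes to run."""
    total = 0.0
    for light in lights:
        if light["enabled"]:  # decision made outside the shader
            total += diffuse_pass(normal, light)
    return min(total, 1.0)


def shade_single_pass(normal, lights):
    """Loop-and-branch style: one longer routine handles every light itself."""
    total = 0.0
    for light in lights:          # looping inside the shader
        if not light["enabled"]:  # branching inside the shader
            continue
        n_dot_l = sum(n * l for n, l in zip(normal, light["direction"]))
        total += max(n_dot_l, 0.0) * light["intensity"]
    return min(total, 1.0)


# Hypothetical test data: a vertex normal facing +Z and three lights.
normal = (0.0, 0.0, 1.0)
lights = [
    {"direction": (0.0, 0.0, 1.0), "intensity": 0.5, "enabled": True},
    {"direction": (0.0, 1.0, 0.0), "intensity": 0.8, "enabled": True},
    {"direction": (0.0, 0.0, 1.0), "intensity": 0.3, "enabled": False},
]

multi = shade_multi_pass(normal, lights)
single = shade_single_pass(normal, lights)
print(multi, single)
```

Both routes produce the same lighting value; the difference is that the second keeps all of the work, including the decision about which lights to skip, inside one program.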
Float my boat
By adopting a floating point approach NVIDIA opens up the potential of the pixel shaders. Take a look at the numbers touted by NVIDIA in the table ****. Floating point allows for thousands of constants and registers, blowing the limits of the specs way out of the water.
GeForce FX's pixel shaders can be seen as the heart and soul of the card. Despite the fact that vertex shaders are much more commonly used, NVIDIA's push towards cinematic computing revolves largely around the 128-bit precision of the pixel shaders and colours.
Pixel shaders are now amazingly flexible beasts, supporting ridiculous numbers of operations on every pixel. In fact, the pipeline is so long and flexible that other companies argue that it will impact performance on large shader operations. For example, the shaders can now perform up to 1,024 texture related instructions for every pixel, which will be a boon for anything that requires multiple texture references, like some lighting effects.
Colourful language
One feature common to NVIDIA and ATI's DirectX 9 cards is support for 128-bit colour operations. NVIDIA has gone one better and introduced a lossless 4:1 colour compression algorithm (much like the one used in NVIDIA's Z-compression function).
While it is doubtful that we will see 128-bit colour on our desktops until at least the launch of Microsoft's Windows XP successor, Longhorn, it is internal colour precision that is the major issue du jour. NVIDIA has taken an interesting approach to this, including support for two different floating point formats throughout the pipeline while allowing developers to switch between the two mid-operation.
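Why internal precision matters can be shown with a few lines of arithmetic. The sketch below is a rough Python illustration, not a model of the hardware. It assumes the two floating point formats are 16-bit and 32-bit floats per colour channel (the article does not name them) and compares how closely each, along with a traditional 8-bit integer channel, can store the same arbitrary colour value. The helper names are hypothetical.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store a value in the given binary format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def quantize_8bit(value: float) -> float:
    """Store a 0..1 colour value in one of 256 integer levels."""
    return round(value * 255) / 255


value = 0.1234567  # an arbitrary colour channel intensity between 0 and 1

half = round_trip(value, "<e")    # ASSUMPTION: 16-bit float per channel
single = round_trip(value, "<f")  # ASSUMPTION: 32-bit float per channel
fixed8 = quantize_8bit(value)     # 8-bit integer per channel

err_half = abs(half - value)
err_single = abs(single - value)
err_fixed8 = abs(fixed8 - value)

# Four channels (red, green, blue, alpha) at 32 bits each gives 128-bit colour.
bits_full = 4 * struct.calcsize("<f") * 8
bits_half = 4 * struct.calcsize("<e") * 8

print(f"8-bit integer error: {err_fixed8:.2e}")
print(f"16-bit float error:  {err_half:.2e}")
print(f"32-bit float error:  {err_single:.2e}")
print(f"bits per pixel: {bits_half} (half) vs {bits_full} (full)")
```

The smaller format halves the storage per pixel at the cost of a larger rounding error, which is the sort of trade-off a developer would weigh when switching formats mid-operation. Errors like these compound when a long shader feeds one result into the next.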
DirectX 8 allowed for 8-bit colour precision, while DirectX 9 has two levels
Issue: 133 | February, 2012