GeForce FX

[Images: NVIDIA GeForce FX logo; NVIDIA "Dawn" demo]

The GeForce FX (codenamed NV30) is a series of graphics processing units in the GeForce line from the manufacturer NVIDIA.



Overview

NVIDIA's GeForce FX series is the fifth generation in the GeForce line. With the GeForce 3, NVIDIA introduced programmable shader units into its 3D rendering capabilities, in line with the release of Microsoft's DirectX 8.0. With real-time 3D graphics technology continually advancing, the release of DirectX 9.0 ushered in a further refinement of programmable pipeline technology with the arrival of Shader Model 2.0. The GeForce FX series is NVIDIA's first generation of hardware to support Shader Model 2. The architecture was a major departure from the GeForce 4 series.

While it is the fifth major revision in the series of GeForce graphics cards, it was not marketed as a GeForce 5. The FX ("effects") in the name was chosen to emphasize the power of the design's major improvements and new features, and to distinguish the FX series as something greater than a revision of earlier designs. The FX branding was also used to market the fact that the GeForce FX was the first GPU to be a combined effort of NVIDIA's own engineers and those from the previously acquired 3dfx. NVIDIA's intention was to underline the extended capability for cinema-like effects using the card's numerous new shader units.

The FX features DDR, DDR2, or GDDR-3 memory, a 130 nm fabrication process, and Shader Model 2.0/2.0A compliant vertex and pixel shaders. The FX series is fully compliant and compatible with DirectX 9.0b. The GeForce FX also included an improved VPE (Video Processing Engine), first deployed in the GeForce4 MX. Its main upgrade was per-pixel video deinterlacing, a feature first offered in ATI's Radeon but seeing little use until the maturation of Microsoft's DirectX-VA and VMR (video mixing renderer) APIs. Among other features was an improved anisotropic filtering algorithm which was not angle-dependent (unlike its competitor, the Radeon 9700/9800 series) and offered better quality, but affected performance somewhat. Though NVIDIA reduced the filtering quality in its drivers for a while, the company eventually restored it, and this feature remains one of the high points of the GeForce FX family to date. (This method of anisotropic filtering was dropped by NVIDIA with the GeForce 6 series for performance reasons.)

The last model (as of late 2005), the GeForce FX 5950 Ultra, is comparable to the Radeon 9800 XT from competitor ATI Technologies.

The advertising campaign for the GeForce FX featured the Dawn fairy demo, the work of several veterans of the computer-animated film Final Fantasy: The Spirits Within. NVIDIA touted it as "The Dawn of Cinematic Computing", while critics noted it as the most blatant use yet of sex appeal to sell graphics cards. It is still probably the best-known of the NVIDIA demos.

Delays

The NV30 project was delayed for three key reasons. First, NVIDIA decided to produce an optimized version of the GeForce 3 (NV20), which resulted in the GeForce 4 Ti (NV25), while ATI cancelled its competing optimized chip (the R250) and opted instead to focus on the Radeon 9700. Second, NVIDIA was committed to delivering the Xbox console's graphics processor (NV2A) for Microsoft; the Xbox venture diverted most of NVIDIA's engineers not only through the NV2A's initial design cycle but also during the mid-life product revisions needed to discourage hackers. Finally, NVIDIA's transition to a 130 nm manufacturing process encountered unexpected difficulties. NVIDIA had ambitiously selected TSMC's then state-of-the-art (but unproven) Low-K dielectric 130 nm process node. After sample silicon wafers exhibited abnormally high defect rates and poor circuit performance, NVIDIA was forced to re-tool the NV30 for a conventional (FSG) 130 nm process node. (NVIDIA's manufacturing difficulties with TSMC spurred the company to search for a second foundry. NVIDIA selected IBM to fabricate several future GeForce chips, citing IBM's process technology leadership, yet curiously avoided IBM's Low-K process.)

Disappointment

Analysis of the Hardware

[Image: GeForce FX 5800] Hardware enthusiasts saw the GeForce FX series as a disappointment, as it did not live up to expectations. NVIDIA had aggressively hyped the card throughout the summer and fall of 2002 to combat ATI Technologies' fall release of the powerful Radeon 9700. ATI's very successful Shader Model 2 card had arrived several months earlier than NVIDIA's first NV30 board, the GeForce FX 5800.

When the FX 5800 launched, hardware review sites discovered after extensive testing that it was no match for the Radeon 9700, especially when pixel shading was involved. The 5800 had roughly a 30% memory bandwidth deficit, caused by its narrower 128-bit memory bus (compared to ATI's 256-bit bus). The card used expensive and hot GDDR-2 RAM, while ATI was able to use cheaper, lower-clocked DDR SDRAM with its wider bus. And while the R300 core used on the 9700 was capable of 8 pixels per clock with its 8 pipelines, the NV30 turned out to be a 4-pipeline chip. However, because of both the expensive RAM and the 130 nm process used for the GPU, NVIDIA was able to clock both components significantly higher than ATI, closing these gaps somewhat (see the rough comparison below). Still, ATI's architecturally more robust solution meant the FX 5800 failed to defeat the older Radeon 9700. The initial version of the GeForce FX (the 5800) was so large that it required two slots, with a massive heat sink and blower arrangement called "Flow FX" that produced a great deal of noise. To make matters worse, ATI's refresh of the Radeon 9700, the Radeon 9800, arrived shortly after NVIDIA's boisterous launch of the disappointing FX 5800, and brought a significant performance boost over the already superior Radeon 9700, further separating the FX 5800 from its competition.
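The arithmetic behind that trade-off is simple. The sketch below (Python, purely illustrative) takes the NV30 clock from the FX 5800 Ultra entry in the model table later in this article; the Radeon 9700 Pro clock of 325 MHz is an assumed figure added here for comparison and does not come from this article.

    # Theoretical peak pixel fill rate = pixel pipelines x core clock.
    # 500 MHz is the FX 5800 Ultra core clock from the model table below;
    # 325 MHz is an assumed Radeon 9700 Pro clock, used only for illustration.
    def fill_rate_mpixels(pixel_pipelines, core_clock_mhz):
        """Peak pixels written per second, in megapixels."""
        return pixel_pipelines * core_clock_mhz

    nv30 = fill_rate_mpixels(4, 500)   # GeForce FX 5800 Ultra
    r300 = fill_rate_mpixels(8, 325)   # Radeon 9700 Pro (assumed clock)
    print(nv30, r300)                  # 2000 vs 2600 Mpixel/s

Even with a roughly 50% higher core clock, the four-pipeline NV30 still trails the eight-pipeline R300 in raw pixel throughput, which is why the clock advantage only narrowed the gap rather than closing it.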

With regard to the much-vaunted Shader Model 2 capabilities of the NV3x series, performance was shockingly poor. The chips were designed for a mixed-precision programming methodology, using 64-bit FP16 (16 bits per color component) where high-precision math was unnecessary to maintain image quality, and the 128-bit FP32 mode only when absolutely necessary. The GeForce FX architecture was also extremely sensitive to instruction ordering in the pixel shaders. This required more complicated programming from developers, who had to concern themselves not only with the shader mathematics and instruction order, but also with testing whether they could get by with lower precision. Additionally, the R300-based cards from ATI did not benefit from partial precision in any way, because those chips were designed purely around DirectX 9's required minimum of 96-bit FP24 as full precision. The NV30, NV31, and NV34 were further handicapped because they contained a mixture of DirectX 7 fixed-function T&L units, DirectX 8 integer pixel shaders, and DirectX 9 floating-point pixel shaders. The R300 chips instead emulated these older functions on their pure Shader Model 2 hardware, allowing ATI to devote far more of the same transistor budget to SM2 performance. For NVIDIA, with its mixture of hardware, pure SM2 code ran sub-optimally, because only a portion of the chip could execute it, and because programmers largely neglected partial-precision optimizations, seeing as ATI's chips performed far better without the extra effort. NVIDIA released several guidelines for creating GeForce FX-optimized code over the lifetime of the product, and worked with Microsoft to create a special shader model called "Shader Model 2.0A", which generated more optimal code for the GeForce FX and improved performance noticeably. Even with the use of partial precision and Shader Model 2.0A, however, the GeForce FX's performance in shader-heavy applications trailed the competition. The GeForce FX nonetheless remained competitive in OpenGL applications, largely because most OpenGL applications use manufacturer-specific extensions to support advanced features and obtain the best possible performance, and those extensions are optimized for the target hardware.
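The precision trade-off itself is easy to demonstrate outside of shader code. The snippet below is a generic Python/NumPy illustration of FP16's limits, not NV3x shader code: 16-bit floats are ample for color values in the 0-1 range, but become too coarse once intermediate values grow large, which is why the choice between FP16 and FP32 had to be made case by case.

    import numpy as np

    # FP16 has a 10-bit mantissa (about 3 decimal digits); FP32 has 23 bits (about 7).
    # For color math in the 0..1 range, partial precision loses nothing visible:
    print(np.float16(0.2) + np.float16(0.3))        # 0.5
    # But for larger intermediates (e.g. texture coordinates in texel space),
    # FP16 spacing grows to whole units and neighboring values collapse together:
    print(np.float16(2049.0) - np.float16(2048.0))  # 0.0 (2049 is not representable in FP16)
    print(np.float32(2049.0) - np.float32(2048.0))  # 1.0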

To industry analysts, the GeForce FX's poor Shader Model 2.0 performance was evidence of bad architectural decisions. A contractual dispute over the pricing of the Xbox's NV2A graphics processor led to Microsoft withholding the specifications for Shader Model 2.0 (in DirectX 9.0). The FX's misdirected feature set and pipeline architecture are directly attributed to NVIDIA's designers misjudging the direction of the Direct3D API. NVIDIA felt confident that Microsoft would base DirectX 9's shader model on NVIDIA's own Cg programming language; instead, Microsoft chose the High Level Shader Language (HLSL), which misled NVIDIA's design teams somewhat. Another aspect was the focus of the shader design: while ATI designed the R300 series to support the minimum DirectX 9 requirements and optimized it for speed, NVIDIA designed the GeForce FX's shaders to offer far more capability than the DirectX 9 specification required. NVIDIA heavily promoted this fact during the initial launch of the series, but it backfired massively when it was discovered that actually using the shaders in the way NVIDIA was promoting resulted in terrible, effectively unusable performance. The succeeding generation of GeForces would discard the FX designation, reverting to the labels GeForce 6 and GeForce 7.

The FX series was a moderate success, but because of its delayed introduction and its flaws, NVIDIA ceded market leadership to ATI's Radeon 9700. Due to market demand and the FX's deficiencies as a worthy successor, NVIDIA extended the production life of the aging GeForce 4, keeping both the FX and the 4 series in production for some time, at great expense.

Valve's Presentation

In late 2003, the GeForce FX series became known for poor performance with DirectX 9 vertex and pixel shaders because of a very vocal presentation by the popular game developer Valve Software. Early indicators of potentially poor Pixel Shader 2.0 performance had come from synthetic benchmarks (such as 3DMark 2003), but outside of the developer community and tech-savvy computer gamers, few mainstream users were aware of such issues. Then Valve dropped a bombshell on the gaming public. Using a pre-release build of the highly anticipated Half-Life 2, based on its "Source" engine, Valve published benchmarks revealing a complete generational gap (80-120% or more) between the GeForce FX 5900 Ultra and the ATI Radeon 9800. In Shader 2.0-enabled game levels, NVIDIA's top-of-the-line FX 5900 Ultra performed about as fast as ATI's mainstream Radeon 9600, which cost approximately a third as much as the NVIDIA card. Valve had initially planned to support partial floating-point precision (FP16) to optimize for NV3x, but eventually concluded that this would take far too long to accomplish. As noted earlier, ATI's cards did not benefit from FP16 mode, so all of the work would have benefited only NVIDIA's NV3x cards, a niche too small to be worth the time and effort, especially at a time when DirectX 8 cards such as the GeForce4 were still far more prevalent than DirectX 9 cards. When Half-Life 2 was released a year later, Valve forced all GeForce FX hardware to use the game's DirectX 8 shaders in order to avoid the FX series' poor Shader 2.0 performance.

Note that it is possible to force Half-Life 2 to run in DirectX 9 mode on all cards with a simple tweak to a configuration file. When this was tried, users and reviewers noted a significant performance loss on NV3x cards, with only the top-of-the-line variants (5900 and 5950) remaining playable.

Questionable Tactics

NVIDIA's GeForce FX era was one of great controversy for the company. The competition had soundly beaten NVIDIA on the technological front, and the only way to make the FX chips competitive with the Radeon R300 chips was to optimize the drivers to the extreme.

This took several forms. NVIDIA has historically been known for impressive OpenGL driver performance and quality, and the FX series certainly maintained this. However, in both Direct3D and OpenGL image quality, the company began aggressively applying questionable optimization techniques not seen before. It started with filtering optimizations, changing how trilinear filtering operated on game textures and visibly reducing its accuracy, and thus quality. Anisotropic filtering also saw dramatic tweaks to limit its use on as many textures as possible to save memory bandwidth and fillrate. Tweaks to these types of texture filtering can often be spotted in games as a shimmering phenomenon on floor textures as the player moves through the environment (often signifying poor transitions between mip-maps). Changing the driver settings to "High Quality" can alleviate this at the cost of performance.
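For context, full trilinear filtering blends samples from two adjacent mipmap levels with a weight that ramps smoothly from 0 to 1, while the reduced-accuracy variant (nicknamed "brilinear" by reviewers) only blends in a narrow band around each level transition and otherwise falls back to cheaper bilinear filtering. The Python sketch below is a simplified illustration of that difference, with hypothetical function names and window size rather than actual driver behavior; the abrupt weight ramp is what shows up on screen as visible mip transitions and shimmering.

    import math

    def trilinear_weight(lod):
        # Full trilinear: a smooth 0..1 blend between mip levels floor(lod) and floor(lod)+1.
        return lod - math.floor(lod)

    def reduced_trilinear_weight(lod, window=0.25):
        # "Brilinear"-style shortcut (illustrative): pure bilinear over most of the
        # range, with a short blend only near the transition between mip levels.
        frac = lod - math.floor(lod)
        lo, hi = 0.5 - window / 2, 0.5 + window / 2
        if frac <= lo:
            return 0.0   # sample only the lower (sharper) mip level
        if frac >= hi:
            return 1.0   # sample only the upper (blurrier) mip level
        return (frac - lo) / window

    for lod in (2.1, 2.4, 2.5, 2.6, 2.9):
        print(lod, round(trilinear_weight(lod), 2), round(reduced_trilinear_weight(lod), 2))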

NVIDIA also began clandestinely replacing pixel shader code in software with hand-coded, lower-accuracy optimized versions, by detecting which program was being run. These "tweaks" were especially noticed in benchmark software from Futuremark. In 3DMark03 it was found that NVIDIA had gone to extremes to limit the complexity of the scenes, through driver shader replacements and aggressive hacks that prevented parts of the scene from rendering at all. This artificially boosted the scores the FX series received. Side-by-side analysis of screenshots in games and 3DMark03 showed vast differences between what a Radeon 9800/9700 displayed and what the FX series was doing. NVIDIA also publicly attacked the usefulness of these programs and the techniques used within them in order to undermine their influence on consumers.

In essence, NVIDIA programmed its driver to look for specific software and apply aggressive optimizations tailored to the limitations of the poorly designed NV3x hardware. Upon discovery of these tweaks there was a very vocal uproar from the enthusiast community and from several popular hardware analysis websites. Disabling most of these optimizations showed that NVIDIA's hardware was dramatically incapable of rendering the scenes at a level of detail similar to what ATI's hardware was displaying. So most of the optimizations stayed, except in 3DMark, where Futuremark began updating its software and screening driver releases for hacks.

Both NVIDIA and ATI have historically optimized drivers in this way, but NVIDIA went to a new extreme with the FX series. Both companies still optimize their drivers for specific applications today (2006), but a now more educated and aware user community keeps a tight rein and watch on the results of these optimizations.

Competitive Response

By early 2003, ATI had captured a considerable chunk of the high-end graphics market, and its popular Radeon 9600 was dominating the mid-to-high performance segment as well. In the meantime, NVIDIA introduced the mid-range 5600 and low-end 5200 models to address the mainstream market. With conventional single-slot cooling and a more affordable price tag, the 5600 had respectable performance but failed to measure up to its direct competitor, the Radeon 9600. In fact, the mid-range GeForce FX parts did not even advance performance over the chips they were designed to replace, the GeForce 4 Ti. In DirectX 8 applications, the 5600 lost to or merely matched the Ti 4200. Likewise, the entry-level FX 5200 performed only about as well as the GeForce4 MX 460, despite the FX 5200 possessing a far better 'checkbox' feature set. The FX 5200 was easily matched in value by ATI's older R200-based Radeon 9000-9250 series and outperformed by the even older Radeon 8500.

With the launch of the GeForce FX 5900, NVIDIA fixed many of the problems of the 5800. While the 5800 used fast but hot and expensive GDDR-2 on a 128-bit memory bus, the 5900 reverted to slower and cheaper DDR, but more than made up for it with a wider 256-bit memory bus. The 5900 performed somewhat better than the Radeon 9800 in everything not heavily using shaders, and had a quieter cooling system than the 5800, but most cards based on the 5900 still occupied two slots (the Radeon 9700 and 9800 were both single-slot cards). By mid-2003, ATI's top product (the Radeon 9800) was outselling NVIDIA's top-line FX 5900, perhaps the first time that ATI had been able to displace NVIDIA as market leader. [Image: GeForce FX 5950] NVIDIA later attacked ATI's mid-range card, the Radeon 9600, with the GeForce FX 5700 and 5900 XT. The 5700 was a new chip sharing the architectural improvements of the 5900's NV35 core. The FX 5700's use of GDDR-2 memory kept product prices high, leading NVIDIA to introduce the FX 5900 XT. The 5900 XT was identical to the 5900, but was clocked slower and used slower memory.

The final GeForce FX model released was the 5950 Ultra, essentially a 5900 Ultra with higher clock speeds. This model did not prove particularly popular, as it was not much faster than the 5900 Ultra yet commanded a considerable price premium over it. The board was fairly competitive with the Radeon 9800 XT, again as long as pixel shaders were lightly used.

The Way It's Meant To Be Played

[Image: "The Way It's Meant To Be Played" logo] NVIDIA debuted a new campaign to motivate developers to optimize their titles for NVIDIA hardware at the Game Developers Conference (GDC) in 2002. The program offered game developers added publicity in exchange for their games being consciously optimized for NVIDIA graphics solutions. The program aims to deliver the best possible user experience on the GeForce line of graphics processing units.

GeForce FX Models

Name | Codename | Core Design | Clocks core/mem (MHz) | Memory Bus | Architecture Info
FX 5200 | NV34 | 1:2:4 | 250/200 | 64 or 128 bit | Entry-level chip. Replacement for the GeForce4 MX family. The Quadro FX 500 is based on the GeForce FX 5200. Lacked IntelliSample technology; no lossless color compression or Z compression. PCX uses an AGP-to-PCIe bridge chip for use on PCIe motherboards. Has 4 pixel pipelines if no pixel shading is used. Each pixel pipe = 1 FP32 ALU handling 2 TMUs + 2 FX12 mini-ALUs (each can do 2 MULs or 1 ADD or 1 MAD).
FX 5200 Ultra | NV34 | 1:2:4 | 325/325 | 128 bit |
PCX 5300 | NV34 | 1:2:4 | 250/325 | 64 or 128 bit |
FX 5500 | NV34 | 1:2:4 | 275/200 | 128 bit |
FX 5600 | NV31 | 1:2:4 | 325/275 | 64 or 128 bit | Midrange chip. Sometimes slower than the GeForce4 Ti 4200. Quadro equivalents: FX 700, 1000, 1400. Actually has 3 vertex shaders, but 2 are defective. Has 4 pixel pipelines if no pixel shading is used. Each pixel pipe = 1 FP32 ALU handling 2 TMUs + 2 FX12 mini-ALUs (each can do 2 MULs or 1 ADD or 1 MAD).
FX 5600 Ultra | NV31 | 1:2:4 | 350/350 | 128 bit |
FX 5600 XT | NV31 | 1:2:4 | 235/200 | 128 bit |
FX 5700 | NV36 | 3:2:4 | 425/250 | 128 bit | NV36, like NV35, swapped hardwired DirectX 7 T&L units and DirectX 8 integer pixel shader units for DirectX 9 floating-point units. Quadro equivalent is the Quadro FX 1100. Later models were equipped with GDDR-3, clocked higher than the DDR2 modules previously used. On the Ultra, a RAM speed of 475 MHz was also seen. PCX uses an AGP-to-PCIe bridge chip for use on PCIe motherboards. Has 4 pixel pipelines if no pixel shading is used. Each pixel pipe = 1 FP32 ALU handling 2 TMUs + 2 FP32 mini-ALUs (each can do 1 MUL or 1 ADD or 1 FP16 MAD).
FX 5700 LE | NV36 | 3:2:4 | 250/200 | 128 bit |
FX 5700 Ultra | NV36 | 3:2:4 | 475/450 | 128 bit (DDR2/GDDR-3) |
PCX 5700 | NV36 | 3:2:4 | 425/250 | 128 bit |
PCX 5750 | NV36 | 3:2:4 | 475/425 | 128 bit (GDDR-3) |
FX 5800 | NV30 | 3:4:4 | 400/400 | 128 bit (DDR2) | Production was troubled by the migration to the 130 nm process at TSMC. Produced a lot of heat; the cooler was nicknamed the 'Dustbuster', 'Vacuum Cleaner', or 'Hoover' by some sites, and NVIDIA later released a video mocking it. Due to manufacturing delays it was quickly replaced by the on-schedule NV35. Its Quadro sibling, the Quadro FX 2000, was somewhat more successful. Double Z fillrate (helps shadowing). Each pixel pipe = 1 FP32 ALU handling 2 TMUs + 2 FX12 mini-ALUs (each can do 2 MULs or 1 ADD or 1 MAD).
FX 5800 Ultra | NV30 | 3:4:4 | 500/500 | 128 bit (DDR2) |
FX 5900 | NV35 | 3:4:4 | 400/425 | 256 bit | Swapped hardwired DirectX 7 T&L units and DirectX 8 integer pixel shader units for DirectX 9 floating-point units. Introduced a new feature called 'UltraShadow' and upgraded to the CineFX 2.0 specification. Removed the noisy cooler, but still took the PCI slot adjacent to the card by default. Quadro equivalent is the Quadro FX 3000. PCX uses an AGP-to-PCIe bridge chip for use on PCIe motherboards. Double Z fillrate (helps shadowing). Each pixel pipe = 1 FP32 ALU handling 2 TMUs + 2 FP32 mini-ALUs (each can do 1 MUL or 1 ADD or 1 FP16 MAD).
FX 5900 Ultra | NV35 | 3:4:4 | 450/425 | 256 bit |
PCX 5900 | NV35 | 3:4:4 | 350/??? | 256 bit |
FX 5900 XT | NV35 | 3:4:4 | 400/350 | 256 bit |
FX 5950 | NV38 | 3:4:4 | 475/475 | 256 bit | Essentially a speed-bumped GeForce FX 5900, with some antialiasing and shader-unit tweaks in hardware. PCX uses an AGP-to-PCIe bridge chip for use on PCIe motherboards.
PCX 5950 | NV38 | 3:4:4 | 350/475 | 256 bit |

Core Design = # Vertex Shaders : # Pixel Pipelines : # ROPs
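The clock and bus-width columns also give a rough way to estimate peak memory bandwidth. The sketch below (Python, illustrative only) assumes the listed memory clocks are physical clocks of double-data-rate memory, which DDR, DDR2, and GDDR-3 all are, so the effective transfer rate is twice the listed figure; the results are theoretical peaks, not measured throughput.

    # Peak memory bandwidth from the table's "mem" clock and memory bus width,
    # assuming double-data-rate memory (two transfers per memory clock).
    def memory_bandwidth_gb_s(bus_width_bits, mem_clock_mhz):
        bytes_per_transfer = bus_width_bits / 8
        transfers_per_second = 2 * mem_clock_mhz * 1e6
        return bytes_per_transfer * transfers_per_second / 1e9

    print(memory_bandwidth_gb_s(128, 500))   # FX 5800 Ultra (128-bit, 500 MHz): 16.0 GB/s
    print(memory_bandwidth_gb_s(256, 425))   # FX 5900 Ultra (256-bit, 425 MHz): 27.2 GB/s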
