Wednesday, 22 July 2020

Rumors Point Towards Remarkable Gains for AMD’s Upcoming ‘Big Navi’ GPUs

There’s been a lot of debate in the past 12 months over whether RDNA2 would deliver a huge improvement over RDNA. The Radeon 5700 and 5700 XT were significant leaps forward for AMD, but they failed to cleanly beat Turing on absolute power efficiency, and while they challenged Nvidia’s RTX GPUs, they weren’t enough to deliver knockout blows. RDNA was important because it demonstrated that, after years of iterating on GCN, AMD was still capable of delivering significant advances in GPU technology.

AMD raised eyebrows when it claimed RDNA2 would offer a 1.5x performance-per-watt improvement over RDNA, in the same way RDNA had improved dramatically over GCN. Generally speaking, such dramatic improvements only come from node shrinks, not from additional GPUs built on the same node. Nvidia’s Maxwell is probably the best example of a GPU family that improved over its predecessor without a node change, and even then, Maxwell’s gains over Kepler were smaller than Pascal’s gains over Maxwell in terms of both power efficiency and performance.

If you increase something by 1.5x twice, your gain over baseline is 2.25x. AMD’s graph conforms to that relative improvement if you measure the heights of the graph bars in pixels.
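
To make the compounding explicit, here's the trivial check (a sketch, nothing more):

```python
# Two successive 1.5x perf-per-watt gains compound multiplicatively:
print(1.5 * 1.5)   # 2.25 -- RDNA2's claimed cumulative gain over baseline GCN
```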

There are rumors going around that Big Navi might be dramatically faster than expected, with performance estimated at 1.95x – 2.25x higher than the 5700 XT. This would be an astonishing feat, to put it mildly. The slideshow below shows our test results from the 5700 XT and 5700. The 5700 XT matched the RTX 2070 well (and sometimes caught the 2080), while the 5700 was modestly faster than the RTX 2060 for a slightly higher price. A 1.95x – 2.25x speed improvement would catapult Big Navi into playable frame rates even on the most demanding settings we test; 18fps in Metro Exodus at Extreme Detail and 4K becomes 35-41fps, depending on which multiplier you choose. I have no idea how Big Navi would compare against Ampere at that point, but it would handily blow past the RTX 2080 Ti.
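
As a quick sanity check on those projected frame rates, here's a minimal sketch using the 18fps Metro Exodus baseline quoted above (the multipliers are rumored, not confirmed):

```python
# Hypothetical projection of the 5700 XT's 18 fps Metro Exodus (4K, Extreme)
# result under the rumored Big Navi performance multipliers.
BASELINE_FPS = 18

for multiplier in (1.95, 2.25):
    print(f"{multiplier}x -> {BASELINE_FPS * multiplier:.1f} fps")
# 1.95x -> 35.1 fps
# 2.25x -> 40.5 fps
```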

Evaluating the Chances of an AMD Surge

Let’s examine the likelihood of AMD delivering a massive improvement of the sort contemplated by these rumors. On the “Pro” side:

  • AMD has openly declared that it’s trying to deliver a Ryzen-equivalent improvement on the GPU side of its business. As I noted back when RDNA debuted, it’s not fair to judge the GCN-to-RDNA transition the same way we judged Bulldozer-to-Ryzen. AMD had five years to work on Ryzen, while the gap from the RX Vega 64 to RDNA wasn’t even two years.
  • AMD claims to have improved power efficiency by 1.5x with RDNA, and our comparisons between the Radeon RX 5700 and the Radeon Vega 64 back up this claim. The Radeon 5700 delivers 48fps in 1080p in Metro Exodus and draws an average of 256W during the fixed-duration workload. The Radeon Vega 64 hit 43fps and drew an average of 347W. That works out to an overall performance-per-watt improvement of ~1.5x (the arithmetic is sketched in code after this list).
  • Rumors around Big Navi have generally pointed to a GPU with between 72 and 80 CUs. Against the 5700 XT’s 40 CUs, that’s a 1.8x – 2x increase, and it makes the claim of 1.95x – 2.25x more plausible on the face of it. Nvidia has not been increasing its core counts from generation to generation by this much. The 980 Ti had 2,816 GPU cores, the 1080 Ti packed 3,584, and the 2080 Ti has 4,352. Nvidia has been increasing its GPU core count by roughly 1.2x – 1.3x per cycle.
  • The PlayStation 5’s GPU clocks remarkably high, at over 2GHz. If we assume that the PS5’s specified 2.23GHz boost clock is equivalent to the boost clock for RDNA2’s top-end GPUs, with the game clock a little lower, we’d be looking at a 1755MHz game clock on the 5700 XT versus a 2.08GHz game clock on the Radeon RX Next. That’s a 1.18x gain. A 1.18x gain in clock speed multiplied by a 1.8x gain in CU count = a 2.124x performance improvement, pretty much bang on the estimated target (see the sketch after this list). A 1.18x IPC improvement without any clock increase, or a mix of the two, could also deliver this benefit.
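
Both calculations above are easy to verify. Here's a minimal sketch; the perf-per-watt figures come from our own testing, while the Big Navi CU count is rumored, the game clock is assumed, and the linear-scaling assumption is a best case that real GPUs only approximate:

```python
# Perf-per-watt: Radeon RX 5700 vs. Radeon Vega 64 (our Metro Exodus figures).
rdna_ppw = 48 / 256   # 48 fps at an average draw of 256W
vega_ppw = 43 / 347   # 43 fps at an average draw of 347W
print(f"RDNA gain over Vega: {rdna_ppw / vega_ppw:.2f}x")   # ~1.51x

# Naive Big Navi scaling estimate (rumored 72 CUs, assumed 2.08GHz game clock).
cu_gain = 72 / 40           # 1.8x the 5700 XT's 40 CUs
clock_gain = 2080 / 1755    # ~1.18x the 5700 XT's 1755MHz game clock
print(f"Combined estimate: {cu_gain * clock_gain:.2f}x")    # ~2.13x
```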

And the cons?

A 1.5x performance-per-watt improvement is the kind of gain we typically associate with new process nodes. Nvidia pulled off this level of improvement once, with Maxwell. The GTX 980 Ti was an average of 1.47x faster than the GTX 780 Ti at the same power draw. AMD never delivered this kind of performance-per-watt leap with GCN over the seven years that architecture drove its GPUs, though GCN absolutely became more power-efficient over time.

Running GPUs at high clock speeds tends to blow out their power curves, as the Radeon Nano illustrated against the Radeon Fury five years ago. In order for RDNA2 to deliver the kind of improvements contemplated, it needs to be 1.8x – 2x the size of the 5700 XT while simultaneously increasing clocks without destroying its own power-efficiency gains. That’s a difficult, though not impossible, trick.
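
To illustrate why high clocks are dangerous for efficiency, here's a toy model of my own (illustrative assumptions, not measured data): dynamic power in CMOS scales roughly with frequency times voltage squared, and voltage typically has to rise to sustain higher clocks.

```python
# Toy model (illustrative only): P ~ f * V^2 for dynamic power, with the
# assumption that voltage must rise ~5% for every additional 10% of clock.
def relative_power(freq_scale: float, v_per_f: float = 0.5) -> float:
    """Power draw relative to baseline for a given frequency multiplier."""
    voltage_scale = 1 + v_per_f * (freq_scale - 1)
    return freq_scale * voltage_scale ** 2

for f in (1.0, 1.1, 1.18, 1.3):
    print(f"{f:.2f}x clock -> ~{relative_power(f):.2f}x power")
# In this model a 1.18x clock bump costs ~1.40x the power: performance
# rises 18 percent while perf-per-watt falls, unless the process or the
# architecture claws the difference back.
```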

Promising a 1.5x improvement in performance per watt — the one piece of information AMD has confirmed — doesn’t tell us whether that gain is coming from the “performance” side of the equation or the “wattage” side. For example, the GTX 980 Ti and the GTX 780 Ti have virtually the same power consumption under load. In that case, the 1.47x improvement came entirely from better performance in the same power envelope. If AMD delivered a successor to the 5700 XT that drew 197W instead of 295W but offered exactly the same performance, it could also claim a 1.5x improvement in performance-per-watt without having improved its actual real-world performance at all. I don’t think this is actually likely, but it illustrates that improvements to performance per watt don’t necessarily require any performance improvements at all.
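
A minimal sketch of that distinction, using the (hypothetical) 197W successor and our measured 295W 5700 XT average from above:

```python
# Two hypothetical routes to the same 1.5x perf-per-watt claim.
base_fps, base_watts = 60.0, 295.0   # fps baseline is arbitrary
base_ppw = base_fps / base_watts

route1 = (base_fps * 1.5) / base_watts   # same power envelope, 1.5x faster
route2 = base_fps / 197.0                # same performance, power cut to 197W

print(f"Route 1: {route1 / base_ppw:.2f}x perf-per-watt")   # 1.50x
print(f"Route 2: {route2 / base_ppw:.2f}x perf-per-watt")   # ~1.50x
# Both satisfy the marketing claim; only Route 1 is actually faster.
```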

I haven’t addressed the question of IPC at all, but I want to touch on it here. When Nvidia launched Turing, it paid a significant penalty in die size and power consumption relative to a GPU with an equivalent number of cores, TMUs, and ROPs but without the tensor cores and RT cores. What does that mean for AMD? I don’t know.

The Nvidia and AMD / ATI GPUs of any given generation almost always prove to respond differently to certain types of workloads in at least a few significant ways. In 2007, I wrote an article for Ars Technica that mentioned how the 3DMark pixel shader test could cause Nvidia power consumption to surge.

Certain feature tests could cause one company’s GPU power consumption to spike but not the other’s. Image by Ars Technica.

I later found a different 3DMark test (I can’t recall which one, and it may have been in a different version of the application) that caused AMD’s power consumption to similarly surge far past Nvidia.

Sometimes, AMD and Nvidia implement more-or-less the same solution to a problem. Sometimes they build GPUs with fundamental capabilities (like asynchronous compute or ray tracing) that their competitor doesn’t support yet. It’s possible that AMD’s implementation of ray tracing in RDNA2 will look similar to Nvidia’s in terms of complexity and power consumption penalty. It’s also possible that it’ll more closely resemble whatever Nvidia debuts with Ampere, or be AMD’s unique take on how to approach the ray tracing efficiency problem.

The point is, we don’t know. It’s possible that RDNA2’s improvements over RDNA consist of much better power efficiency, higher clocks, more CUs, and ray tracing, as opposed to any further IPC gains. It’s also possible AMD has another IPC jump in store.

The tea leaves and indirect rumors from sources suggest, at minimum, that RDNA2 should sweep past the RTX 2000 family in terms of both power efficiency and performance. I don’t want to speculate on exactly what those gains or efficiencies will be or where they’ll come from, but current scuttlebutt is that it’ll be a competitive high-end battle between AMD and Nvidia this time around. I hope so, if only because we haven’t seen the two companies truly go toe-to-toe at the highest end of the market since ~2013.
