Friday, 16 October 2020

AMD’s A9-9820 8-Core Jaguar APU Won’t Offer Xbox One S Performance

There’s a rumor going around claiming that an AMD APU called the A9-9820 is capable of offering Xbox One S-class performance. This is potentially interesting to folks because the nameless motherboard with this piece of silicon attached is selling for just $125, significantly less than the supposedly equivalent Xbox One S.

As a curiosity, the A9-9820 would be a fabulous buy. It’s (probably) an 8C/8T Jaguar part rumored to be the Xbox One’s actual SoC, and that rumor may well be true. Unfortunately, despite what people are saying, it’ll never match the performance of an Xbox One S. Let’s talk about why.

Core Confusion

I am not willing to absolutely swear that the A9-9820 is a Jaguar CPU. It’s numbered like a member of the Excavator CPU family and there’s no performance evidence that I can find either way, save for some CPU-Z benchmarks that seem much lower than they ought to be. There’s also the fact that one of the lines below refers to 8MB of cache, split between L2 and L3. The Xbox One S doesn’t have an L3 cache. Excavator, however, does.

The Aliexpress listing includes a spec-sheet image; translated, it reads in order:

  1. 8 CPU cores
  2. 8 Threads
  3. Max turbo 2.35GHz
  4. Base Frequency 1.75GHz
  5. TDP 135W (Possibly a typo, intended to read “35W”. The Xbox One S only draws ~90W of power for gaming)
  6. Cache = 8MB split between L2 and L3
  7. Max RAM bandwidth = 68GB/s
  8. Socket: PGA

On the side of it being the Xbox One S’s CPU, there’s the shape of the die, the packaging, and the fact that AMD actually built an eight-core Jaguar part with an attached GPU. We’ve never heard of an eight-core APU based on a Piledriver-derived core. We also know that AMD did scale Jaguar up as high as 2.5GHz on 28nm. In theory, some of the original 32MB of ESRAM could be getting tapped as an L3 cache, though it’s not clear what type of performance boost the chip would get, or what its L3 clock would be.

Given the claims Chuwi has made about its Aerobox, which also uses the A9-9820, it seems much more likely to be Xbox One-related than not.

This spec sheet implies a base and boost clock of 1.75GHz / 2.35GHz, but the actual Aliexpress listing only states 2.35GHz. If the core has a boost mode, it would probably mean the CPU is based on Puma+ rather than the original Jaguar. A 16nm Jaguar with these characteristics and a small slice of its old ESRAM cache makes sense.

(Image: Chuwi’s Aerobox marketing copy.)

If the APU can hold that 2.35GHz clock at full load, the A9-9820 would outperform the Xbox One S CPU (1.75GHz, no Turbo), or at least it would if both CPUs were being tested in standard Windows 10. But the Xbox One runs a specialized variant of the OS designed specifically for the platform. Running on the Xbox One S’s APU isn’t the same thing as having access to the Xbox One S’s operating system.

Anemic GPU Performance

According to both Chuwi and this advertisement, the GPU inside the chip is an R7 350. As a desktop card, the R7 350 featured 512 cores, 32 Texture Mapping Units (TMUs), and 16 Render Output units (ROPs) at 800MHz, with 72GB/s of memory bandwidth. AMD’s A12-9800 Pro APUs carried a similar version of this GPU, in a 512:32:8 configuration. Let’s assume, for the sake of argument, that the R7 350 inside the A9-9820 is identical to the original dGPU, but with 68GB/s of memory bandwidth instead of 72GB/s. Chuwi reports that its version clocks in at 935MHz, just a touch faster than the Xbox One S’s 914MHz GPU clock.

Unfortunately, the desktop version of the R7 350 was never a match for Edmonton (the Xbox One GPU). Edmonton is a 768:48:16 GPU, and the disadvantage in GPU cores is going to drag heavily on this chip. The actual Xbox One S has 1.5x the GPU cores and 1.5x the TMUs, best-case. The tiny advantage in clock isn’t going to offset the stark disadvantage in GPU cores, and the A9-9820 has only a fraction of the cache available to the Xbox One S.
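Some quick napkin math shows how large that gap is. This is just a sketch in Python, using the published shader counts and clocks above and approximating FP32 throughput as shaders × 2 FLOPs per clock:

```python
# Rough FP32 throughput: shaders x 2 FLOPs per clock x clock (GHz) = GFLOPS.
def gflops(shaders, clock_ghz):
    return shaders * 2 * clock_ghz

xbox_one_s = gflops(768, 0.914)  # Edmonton: 768 shaders @ 914MHz -> ~1.4 TFLOPS
a9_9820 = gflops(512, 0.935)     # R7 350-class iGPU: 512 shaders @ 935MHz -> ~0.96 TFLOPS

print(f"Xbox One S: {xbox_one_s:.0f} GFLOPS")
print(f"A9-9820:    {a9_9820:.0f} GFLOPS")
print(f"Console advantage: {xbox_one_s / a9_9820:.2f}x")  # ~1.47x
```

Roughly 1.4 TFLOPS against roughly 0.96 TFLOPS, before you account for the console’s larger cache, is not a gap a 21MHz clock bump can close.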

Could AMD have labeled the onboard GPU an R7 350 and actually included the full Xbox One S core? Yes. So little information is available that I can’t prove it didn’t. But AMD also could have just re-used the R7 360 moniker, since that GPU matches Edmonton exactly. If the company claims it performs like an R7 350, we should treat that as a given until we know differently, and the R7 350 cannot match the Radeon HD 7790 (aka the Xbox One S GPU).

Why Dismiss the Extra CPU Performance?

Neither the Xbox One nor the PS4 ever showed much sign of being limited by CPU performance. The PS4 was faster than the Xbox One/One S in virtually every title. The Xbone had a 1.09x clock advantage over the PS4 (1.75GHz versus 1.6GHz), but we almost never saw real-world performance favor Microsoft. Jaguar (or Puma+) should have the horsepower to do some light gaming at relatively low resolution, and there are undoubtedly at least a few games that would benefit from the faster clock, but this isn’t the Xbox One X APU, and it doesn’t have the more powerful onboard GPU that would require a dramatically faster chip to keep it fed. The R7 350 was a budget card at launch, after all.

Where the A9-9820 Might Shine

If I wanted to build a low-end gaming system for a child (or just a low-end PC for somebody), this little chip+motherboard might be one of the cheapest options around. You’re not going to be gaming above 720p (frankly, you might even need to drop to 540p), but any relatively lightweight task with good threading will run well here. The integrated GPU and its 68GB/s of bandwidth have a shot at matching current AMD APUs, which only offer 51.2GB/s of bandwidth. It would actually be interesting to see the two compared: Would the A9-9820’s 1.32x bandwidth advantage be enough to overcome the 3400G’s advantages in core efficiency, GPU clock (935MHz vs. 1.4GHz), and its larger number of GPU cores? I suspect the 3400G would easily win the fight, despite its lower memory bandwidth, but it’d be interesting to see the differences. The 3400G, it must be noted, is also selling for ~$140-$150, which is significantly more than the total price of this motherboard+APU combo.
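For what it’s worth, the napkin math on that hypothetical matchup looks like this. It’s a sketch, assuming the listing’s 68GB/s and 935MHz figures are accurate and using the 3400G’s stock dual-channel DDR4-3200 (51.2GB/s) and Vega 11 (704 shaders at 1.4GHz) numbers:

```python
# A9-9820 iGPU vs. Ryzen 5 3400G (Vega 11): bandwidth vs. raw compute.
def gflops(shaders, clock_ghz):
    return shaders * 2 * clock_ghz

a9_bw, r5_bw = 68.0, 51.2        # GB/s: listing spec vs. dual-channel DDR4-3200
a9_compute = gflops(512, 0.935)  # ~957 GFLOPS
r5_compute = gflops(704, 1.4)    # ~1971 GFLOPS

print(f"A9-9820 bandwidth advantage: {a9_bw / r5_bw:.2f}x")            # ~1.33x
print(f"3400G compute advantage:     {r5_compute / a9_compute:.2f}x")  # ~2.06x
```

A roughly 2x raw compute deficit is hard to overcome with a third more memory bandwidth, which is why I’d still bet on the 3400G.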

This is a much easier decision if you still have 8GB of DDR3 lying around or can scare some up cheap. For $125, you get a motherboard + CPU, with an integrated GPU that ought to still put up decent numbers compared with AMD chips today. Single-thread performance would be pretty anemic at an estimated 109 in CB20 (compared with ~450 on the Ryzen 3 3100), but the estimated CB20 multi-core score would be 764, which is… also really anemic compared with the Ryzen 3 3100, at ~2350. But the Ryzen 3 3100 is $120 for just the chip, compared with ~$125 for the A9-9820, motherboard, and (theoretically) some old DDR3 RAM. All benchmark numbers are estimates based on the known performance of Jaguar-based chips like the A4-5000 in single and multi-core testing. Cinebench R20 tends to be pretty easy to predict, so I’m reasonably confident in the figures.
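To show how that kind of estimate falls out, here’s a sketch. The ~70-point CB20 single-thread baseline for a 1.5GHz Jaguar core and the 0.87 multi-core scaling factor are assumptions chosen to be consistent with the figures above, not measured results:

```python
# Back-of-envelope Cinebench R20 estimate for the A9-9820, scaled from a slower Jaguar chip.
baseline_1t = 70       # assumed CB20 single-thread score for a 1.5GHz Jaguar core (A4-5000-class)
baseline_clock = 1.5   # GHz
target_clock = 2.35    # GHz, assuming the A9-9820 holds its boost clock under load
cores = 8
mt_scaling = 0.87      # assumed multi-threaded scaling efficiency across 8 Jaguar cores

est_1t = baseline_1t * (target_clock / baseline_clock)
est_nt = est_1t * cores * mt_scaling

print(f"Estimated CB20 1T: {est_1t:.0f}")  # ~110
print(f"Estimated CB20 nT: {est_nt:.0f}")  # ~763
```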

If we compare against the Athlon 3000G ($87), we get an interesting mix of wins and tradeoffs. The R7 350 GPU core is much more powerful than the 192 GPU cores on the 3000G, and clocking the newer GPU core at 1100MHz isn’t going to make up the gap. In Cinebench R20, the Athlon 3000G scores 339 in single-core (roughly 3x our estimated Jaguar score) but just 887 in multi-core (1.16x our estimated figure for the A9-9820). If we assume $87 for the CPU and $50 for a motherboard, the A9-9820 might be a better option for a lightweight gaming system. At the very least, it promises a faster GPU for less money.
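The same napkin math makes the GPU claim concrete. This is a sketch using the 3000G’s stock Vega 3 configuration (192 shaders at 1100MHz); raw FLOPS ignore Vega’s per-shader efficiency advantage, but the gap is wide enough that the point stands:

```python
# A9-9820 (R7 350-class) vs. Athlon 3000G (Vega 3): GPU throughput and CB20 ratios.
def gflops(shaders, clock_ghz):
    return shaders * 2 * clock_ghz

a9_gpu = gflops(512, 0.935)  # ~957 GFLOPS
vega3 = gflops(192, 1.1)     # ~422 GFLOPS

print(f"GPU throughput ratio (A9-9820 favor):   {a9_gpu / vega3:.2f}x")  # ~2.27x
print(f"CB20 single-thread ratio (3000G favor): {339 / 109:.2f}x")       # ~3.1x
print(f"CB20 multi-thread ratio (3000G favor):  {887 / 764:.2f}x")       # ~1.16x
```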

I’m not considering whether the A9-9820 is overclockable because it’s highly unlikely the motherboard offers this option. Any overclocking features would be icing on the cake. DDR3-2400 would probably boost the GPU’s performance another 8-10 percent based on historic scaling patterns, and since AMD shipped Jaguar CPUs at up to 2.5GHz, it might even be possible to squeeze another 5-10 percent out of the CPU cores as well. It wouldn’t change the value proposition much, but any additional performance is helpful, and while the half-speed L2 cache will always drag on performance, bringing the clock up reduces its latency in absolute terms.
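One caveat on the memory math: 68GB/s happens to match a 256-bit DDR3-2133 interface (the Xbox One’s configuration) exactly. Whether this board actually exposes the full bus is unknown, but if it does, the DDR3-2400 arithmetic looks like this sketch:

```python
# Peak DDR3 bandwidth = transfer rate (MT/s) x bus width in bytes.
def ddr3_bandwidth_gbs(mt_per_s, bus_bits):
    return mt_per_s * (bus_bits / 8) / 1000  # GB/s

bw_2133 = ddr3_bandwidth_gbs(2133, 256)  # ~68.3 GB/s, matching the listed spec
bw_2400 = ddr3_bandwidth_gbs(2400, 256)  # ~76.8 GB/s

print(f"DDR3-2133: {bw_2133:.1f} GB/s")
print(f"DDR3-2400: {bw_2400:.1f} GB/s ({(bw_2400 / bw_2133 - 1):.1%} more)")  # 12.5% more
```

A 12.5 percent bandwidth increase translating into an 8-10 percent GPU gain is consistent with how bandwidth-starved integrated graphics have typically scaled.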

There’s an interesting argument for using this chip in a low-end, ultra-low-cost gaming box intended for SD or near-SD resolutions, or for putting it to work on well-threaded lightweight applications. The price point could also make it an attractive option for folks who need a system that still offers reasonable performance at a rock-bottom price.
