Nvidia RTX 50-series rumors: everything we know so far

August 1, 2024

RTX 4090. — Jacob Roach / Digital Trends

Nvidia already makes some of the best graphics cards, but it isn’t resting on its laurels. Although the RTX 40-series, which has been bolstered by a refresh, is still fairly recent, Nvidia is already working on its next-gen GPUs, the RTX 50-series.

The release date of RTX 50-series GPUs is still at least a couple of months away, but various rumors and leaks give us a better idea of what to expect. Here’s everything we know about Nvidia’s upcoming generation of graphics cards.

RTX 50-series: pricing and release date

Nvidia's RTX 4070 graphics cards over a pink background. — Jacob Roach / Digital Trends

We haven’t heard any specifics from Nvidia about the release date just yet, but most estimates pin the launch of Blackwell around the end of 2024 or the beginning of 2025. Some very tentative whispers even mention a possible RTX 50 refresh in 2026, but that’s way too far into the future to pay it much mind.

The latest scoop on the RTX 50-series’ release date is that we may not see any of the GPUs until after CES 2025. This information comes from kopite7kimi, who is often a solid source, but this is still just a rumor at this point. It’s one of the many back-and-forth hush-hush reports we’ve heard in recent months.

Nvidia is also said to be scaling back its production of RTX 40-series cards by as much as 50%, as per UDN, a Taiwanese financial news publication. If this is true, it certainly indicates that RTX 50 cards might be right around the corner, but then, we’ve heard many rumors that contradict that theory.

According to early rumors, Nvidia wasn’t supposed to be ready to launch the new graphics cards until 2025, so that kind of tracks. This would give AMD a major edge, seeing as it’s rumored to launch RDNA 4 GPUs later this year — although there might be delays. However, according to YouTuber and frequent leaker Moore’s Law Is Dead, Nvidia may not give AMD the breathing room it so badly needs.

Moore’s Law Is Dead said in a recent video that a source at Nvidia told him that “Blackwell is being prepared to be ready to launch in the fourth quarter of 2024,” but only if Nvidia wants it to. This depends on whether AMD’s RDNA 4 cards will be competitive enough to take away sales from Nvidia during the holiday season at the end of this year, as well as how Ada (RTX 40) sales are going around that time. However, a newer report claims that AMD may not launch RDNA 4 until the first quarter of 2025, so it’s hard to know what to believe.

No matter what, Nvidia is supposedly planning to “make a big deal about RTX 5000 efficiency at CES 2025.” This means that the GPUs are reportedly launching either at the end of 2024 or near the beginning of 2025, but a new report from UDN tells us that we might see a similar launch strategy for Blackwell as we did for Ada.

According to the report, Nvidia is getting ready to launch the RTX 5090 first in the final quarter of 2024, followed by the RTX 5080 a few weeks later. This mirrors the approach in the RTX 40-series, where the RTX 4090 hit the shelves first, followed by the RTX 4080 shortly after. The more budget-friendly graphics cards only came out early the next year, which is what might happen this time around, too.

Still, the release cadence for Nvidia is a mystery right now. Another report from kopite7kimi implies that the RTX 5080 will launch first, marking a return to Nvidia’s previous strategy, where the xx80 card was released first and the xx90 card followed later. Meanwhile, Moore’s Law Is Dead said that the RTX 5060 may not be launched until a while later when Nvidia can deck it out with more VRAM. What’s the reality? Only Nvidia can say for sure.

It’s not ture. RTX 5080 should be released first.

— kopite7kimi (@kopite7kimi) May 7, 2024

The pricing of these GPUs is pure speculation at this point. In the current generation, Nvidia adopted a pricing strategy that can only be referred to as “expensive.” It might continue down that path and push the prices even higher, especially if the demand for AI GPUs remains as high as it is right now. After all, the current demand pushed the RTX 4090 way above $2,000, even though it launched at an already very high price point of $1,600. This certainly makes the RTX 5090 a worrying prospect, but Nvidia’s price cut in the RTX 40-series Super refresh gave many enthusiasts some hope.

Assuming the flagship 5090 will cost close to $1,800 to $2,000, the rest of the lineup is, unfortunately, likely to follow with price increases across the board. However, for Nvidia to remain the go-to against AMD, the prices can’t keep rising forever. There is some hope that Nvidia will realize this and keep its pricing more reasonable for the next generation, but it’s too early to tell.

RTX 50-series: specs

	Nvidia RTX 50-series
Process node	TSMC 3nm or TSMC 4NP (5nm-class)
Architecture	Blackwell
Chip	GB202, GB203, GB205, GB206, GB207
Memory type	GDDR7
Maximum bus width	384-bit/448-bit/512-bit
Display connectors	DisplayPort 2.1, HDMI 2.1

With the release of RTX 50-series GPUs still a while away, Nvidia hasn’t confirmed any specifications for any of the cards. In fact, we’re not even sure which models are coming. However, piecing together speculation from various hardware leakers gives us some idea of what we can expect. Remember to take the following with a healthy dose of skepticism until Nvidia itself spills the beans.

Process node and chips

TSMC3

— kopite7kimi (@kopite7kimi) November 15, 2023

We know for a fact that the follow-up to Ada Lovelace will be called Blackwell, honoring American mathematician David Blackwell. Rumor has it that it will be manufactured by TSMC based on a 3nm process, but it’s unclear whether Nvidia will be using one of TSMC’s existing 3nm nodes or a custom node.

The release of the Blackwell B200 GPU threw a wrench into the 3nm rumor. The B200, made for high-performance computing (HPC) and data center use cases, is built on a TSMC 4NP (4nm Nvidia Performance) node. If the B200 uses a 4NP node, it’s easy to imagine that the consumer lineup might do the same. However, it’s not a given — Nvidia might use the 3nm node for its RTX 50 lineup instead.

The lineup is said to include chips spanning from the high-end GB202, the equivalent of the AD102 in the RTX 4090, through the GB203, GB205, GB206, and entry-level GB207. Notably, there is no GB204 on that list, which will be an interesting, perhaps worrying, change if proven true. It would mean that the AD104 GPU powering the RTX 4070 would have no direct successor in the next generation. The RTX 5070 and RTX 5070 Ti might, therefore, utilize the GB205 chip.

One of the most talkative sources of information on the RTX 50-series has been kopite7kimi on X (formerly Twitter). The leaker revealed that we can expect the new GPUs to feature support for DisplayPort 2.1, something that the Lovelace lineup doesn’t provide, and also for HDMI 2.1.

Memory interface

I think my persistence is correct. So the difference is that GB202 is 512-bit and AD102 is 384-bit.

— kopite7kimi (@kopite7kimi) March 11, 2024

Kopite’s latest update talks about the memory interface for Blackwell. The leaker now states that the flagship card will indeed have a 512-bit memory bus, despite their previous statements that it would stick to 384-bit. Meanwhile, one user on the Chiphell forum claims that the RTX 5090 will have a 448-bit memory bus. That leaves us with not two but three estimated bus widths for the flagship alone.

The maximum bus width of Blackwell has been a very contentious topic among popular leakers, so it’s hard to know what’s true. However, one thing that they all agree on is that Nvidia will use the new GDDR7 memory standard, which AMD is said not to be using in its upcoming RDNA 4 lineup.

The leaker also updated the expectations for the speed of those GDDR7 memory modules found in the RTX 50-series. Despite previous rumors that we might see 32Gb/s modules right out of the gate, kopite7kimi says that Nvidia will use 28Gb/s for this generation. This still marks a solid upgrade over Ada, delivering up to 1.8TB/s of memory bandwidth on the rumored RTX 5090 — assuming the 512-bit memory bus checks out.
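The bandwidth figures quoted for these rumors follow from simple arithmetic: per-pin data rate multiplied by bus width, divided by eight bits per byte. The snippet below is a minimal sketch of that calculation for the three rumored flagship bus widths; the function name is hypothetical, and the 28Gb/s speed is the rumored figure, not a confirmed spec.

```python
def memory_bandwidth_gbs(pin_speed_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gb/s) x bus width (bits) / 8 bits per byte."""
    return pin_speed_gbps * bus_width_bits / 8


# Rumored (unconfirmed) inputs taken from the leaks discussed in the article.
RUMORED_PIN_SPEED_GBPS = 28
RUMORED_FLAGSHIP_BUS_WIDTHS = (384, 448, 512)

for width in RUMORED_FLAGSHIP_BUS_WIDTHS:
    gbs = memory_bandwidth_gbs(RUMORED_PIN_SPEED_GBPS, width)
    print(f"{width}-bit @ {RUMORED_PIN_SPEED_GBPS}Gb/s: {gbs:.0f} GB/s ({gbs / 1000:.2f} TB/s)")
```

Under these assumptions, a 512-bit bus works out to 1,792GB/s, which is where the roughly 1.8TB/s figure comes from, while a 448-bit bus lands at 1,568GB/s.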

Regardless of bus width, we know that GDDR7 will be an upgrade. Memory maker Micron recently shared some performance figures for its new VRAM, claiming that it will deliver up to a 30% performance increase in gaming scenarios, including pure rasterization and ray tracing. GDDR7 memory starts at 28Gb/s and may offer over 1.5TB/s in system bandwidth.

The amount of VRAM in GPUs has been a hot topic as of late, and to that end, RedGamingTech speculates that we might see up to 36GB of memory in the RTX 5090. However, those numbers aren’t finalized, so we might end up with 24GB, like in the RTX 4090.
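VRAM capacity and bus width are linked, which is one way to sanity-check these rumors. The sketch below assumes one memory package per 32 bits of bus and module capacities of 2GB or 3GB; both are illustrative assumptions, not confirmed details of any RTX 50-series card, and the sketch ignores "clamshell" layouts that put two packages on each 32-bit channel.

```python
BITS_PER_PACKAGE = 32  # assumption: each GDDR7 package occupies 32 bits of the bus


def vram_capacity_gb(bus_width_bits: int, package_capacity_gb: int) -> int:
    """Total VRAM when every 32-bit slice of the bus has exactly one memory package."""
    packages = bus_width_bits // BITS_PER_PACKAGE
    return packages * package_capacity_gb


# Rumored bus widths paired with hypothetical 2GB and 3GB packages.
for width in (384, 448, 512):
    for capacity in (2, 3):
        total = vram_capacity_gb(width, capacity)
        print(f"{width}-bit with {capacity}GB packages: {total}GB")
```

Under these assumptions, a 384-bit bus yields 24GB with 2GB packages or 36GB with 3GB packages, which lines up with both figures mentioned above, while a 448-bit bus with 2GB packages yields 28GB.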

Rumored specs

	GPU	Streaming Multiprocessors (SM)	CUDA cores	Memory interface	Memory bandwidth
RTX 5090	GB202	192	24,576	GDDR7 28GB 448-bit	1.5TB/s
RTX 5080	GB203	84	10,752	GDDR7 16GB 256-bit	896GB/s
RTX 5070	GB205	50	6,400	GDDR7 12GB (?) 192-bit	672GB/s
RTX 5060	GB206	36	4,608	GDDR7 8GB (?) 128-bit	448GB/s
RTX 5050 (?)	GB207	20	2,560	GDDR7 8GB (?) 128-bit	?

So, what can we expect from the RTX 50-series in terms of actual specifications? Everything in the table above is a rumored spec, because rumors are all we have to work with right now. Please take all of the following with a healthy dose of skepticism.

As is often the case, kopite7kimi has been a good source of intel on the topic of RTX 50-series specs. The leaker shared the (suspected) number of streaming multiprocessors (SMs) for each GPU. That’s what gives us the idea that the RTX 5090 might have 192 SMs, which marks an impressive 33% boost compared to the full AD102 chip behind the RTX 4090 (144 SMs); meanwhile, the RTX 5080’s chip would only enjoy a 5% boost over its predecessor, and the RTX 5070’s chip might actually feature fewer SMs than the one it replaces. However, it’s too early to panic.

For one, we don’t know whether these specs are true or not. Even if they are, what Kopite shared was actually the number of SMs in the GPU, which doesn’t necessarily mean that Nvidia will use all of them in the graphics card. In fact, the RTX 4090 doesn’t utilize the full power of the AD102 chip, and that may also be the case with the GB202. YouTuber Graphically Challenged supplemented these SM expectations with some information about the bandwidth and the amount of VRAM for most GPUs.

GB202 12*8 512-bit GDDR7
GB203 7*6 256-bit GDDR7
GB205 5*5 192-bit GDDR7
GB206 3*6 128-bit GDDR7
GB207 2*5 128-bit GDDR6

— kopite7kimi (@kopite7kimi) June 11, 2024
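The leaked figures can be decoded into SM and CUDA core counts with a little arithmetic. The sketch below assumes each entry reads as graphics processing clusters (GPCs) times texture processing clusters (TPCs) per GPC, with two SMs per TPC and 128 CUDA cores per SM, as in Ada. That reading is an assumption rather than something kopite7kimi spelled out, although it does reproduce the numbers in the rumored specs table. The Ada comparison uses the full-die SM counts of AD102, AD103, and AD104.

```python
SMS_PER_TPC = 2          # assumption: carried over from Ada
CUDA_CORES_PER_SM = 128  # assumption: carried over from Ada

# Leaked layout, read here as (GPCs, TPCs per GPC). Unconfirmed.
LEAKED_LAYOUT = {
    "GB202": (12, 8),
    "GB203": (7, 6),
    "GB205": (5, 5),
    "GB206": (3, 6),
    "GB207": (2, 5),
}

# Full-die SM counts of the Ada chips each Blackwell chip is compared against.
FULL_ADA_SMS = {
    "GB202": ("AD102", 144),
    "GB203": ("AD103", 80),
    "GB205": ("AD104", 60),
}


def sm_count(gpcs: int, tpcs_per_gpc: int) -> int:
    """SMs on the full die under the GPC x TPC x 2 reading."""
    return gpcs * tpcs_per_gpc * SMS_PER_TPC


def cuda_cores(sms: int) -> int:
    """CUDA cores implied by an SM count."""
    return sms * CUDA_CORES_PER_SM


def percent_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


for chip, (gpcs, tpcs) in LEAKED_LAYOUT.items():
    sms = sm_count(gpcs, tpcs)
    line = f"{chip}: {sms} SMs, {cuda_cores(sms):,} CUDA cores"
    if chip in FULL_ADA_SMS:
        ada_chip, ada_sms = FULL_ADA_SMS[chip]
        line += f" ({percent_change(sms, ada_sms):+.0f}% vs. full {ada_chip})"
    print(line)
```

Read this way, the percentages are die-to-die comparisons. Shipping cards often disable some SMs, so card-to-card differences could look quite different.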

All the cautionary skepticism aside, at least some of these rumors might check out, as they’ve been circulating for a while from multiple sources. YouTuber RedGamingTech has also said previously that the flagship chip may come with 192 streaming multiprocessors (SMs). However, RedGamingTech predicted that the GB203 (RTX 5080) would have 108 SMs, which is considerably more than the 84 that kopite7kimi now claims. One way or another, we’re looking at a big performance gap between the RTX 5080 and the RTX 5090.

More speculation shared by kopite7kimi corroborates this. According to the leaker, the GB203 chip will be “half of GB202,” which points to a performance gap similar to the one between the RTX 4090 and the RTX 4080. It’s worth noting that RedGamingTech, unlike kopite7kimi, believes that we’re getting a maximum bus width of 384 bits, which would affect performance figures.

I think GB203 is half of GB202, just like GB102 and GB100. But I don’t know if GB202 has a multi chip package.

— kopite7kimi (@kopite7kimi) March 11, 2024

If the RTX 5090 really turns out as beastly as it seems, many would expect a behemoth of a card, but rumor has it that the RTX 5090 will only feature a dual-slot design in the Founders Edition. That’d be a shocking change from the current generation, where the RTX 4090 can take up to four slots.

We’ve even heard rumors of a GPU referred to as Titan AI. In the Lovelace generation, Nvidia didn’t end up using the whole AD102 chip; the RTX 4090 left part of the silicon disabled, and no consumer card ever unlocked it. Nvidia may do something similar with the GB202 chip. As a result, the RTX 5090 is said to be a cut-down version of the GB202 GPU, which will offer a 48% boost over the RTX 4090. Meanwhile, the Titan AI graphics card would likely unlock everything the GB202 chip has to offer, coming in with a 63% performance uplift.

It’s too early to know the specifics of any individual card at this point, and all of this is subject to change. It’s likely that Nvidia will release models ranging from the RTX 5060 to the RTX 5090, with some Ti options added into the mix, and perhaps even the rumored Titan AI graphics card. Let’s hope that it will keep the specs balanced to offer a good spread of cards for enthusiasts and entry-level users alike; otherwise, DLSS 4 might have to be its saving grace in this generation.

RTX 50-series: laptops

A slide showing Nvidia's RTX 50-series road map.

If the desktop versions of the RTX 50-series are an enigma, their laptop counterparts have been even more steeped in shadows, with next to no information about them being shared by leakers. However, laptop gamers can rejoice, because we recently got some big news about the RTX 50-series for laptops. The best part is that it’s not yet another speculative rumor, but instead, it’s an actual leak from Clevo, a Taiwanese laptop maker.

The company was, unfortunately, recently hit by a ransomware attack that resulted in some confidential slides being shared online. According to the slides, Nvidia will launch six mobile GPUs, but they’re all set to launch no sooner than 2025. It’s also implied that Nvidia will not be retiring its older low-end GPUs. The RTX 4050, RTX 3050, and even the RTX 2050 will continue appearing in laptops.

RTX 50-series graphics cards have been given code names in the slide, but it’s easy to figure out which is which, as they’re being compared to their 40-series counterparts. It appears that the laptop version of the RTX 5080 might get a much-needed memory upgrade, now featuring the same 16GB of GDDR7 memory as the RTX 5090. The two cards are also said to share the same GB203 GPU, making the GB202 chip absent from laptops for now.

When exactly are these fancy new laptops going to come out? It might take a while, as Moore’s Law Is Dead claims that Nvidia is currently tweaking the mobile versions of the RTX 5080 and the RTX 5070/Ti. One last anecdote from RedGamingTech states that the GB207, meaning the least performant chip in the lineup, will likely only appear in laptops at first. This tracks with what we’ve seen in the RTX 40-series, where the RTX 4050 has only made an appearance in laptops thus far.

RTX 50-series: architecture

Nvidia is keeping the architecture used in Blackwell chips hush-hush, but it won’t stay that way much longer. With the GPUs a few months away, we’ll learn more as the release date draws closer. For the time being, Nvidia talked about the architecture for its data center Blackwell GPUs, which may not be very indicative of what could happen in the consumer lineup — but there are still some interesting tidbits.

The first curious part is that the enterprise version of Blackwell is built on TSMC’s 4NP node, which is actually a 5nm process. Previous rumors indicated that the RTX 50-series might be built on a 3nm process, but that now seems quite unlikely, given the recent announcement. Moreover, the B200 GPU comes with a dedicated decompression engine. While there’s no telling if that will make it to the consumer GPUs, it could bring a major boost to the graphics cards.

Although Nvidia discussed the Blackwell architecture in relation to enterprises, it stayed silent on its consumer lineup. As a result, all we have is more speculation from various sources, but the information is often somewhat conflicting.

RedGamingTech talked about the Blackwell architecture at length in a recent video. The YouTuber referred to it as “one of the most influential graphics architectures,” predicting that the RTX 50-series will introduce significant improvements to things like path tracing and ray tracing, offering gains for both enthusiast-grade and midrange cards.

To that end, the YouTuber said we might see significant architectural changes, including a major redesign of Nvidia’s SMs. He also mentioned the addition of a denoising accelerator, either as a part of the chip or as a function of Nvidia’s Tensor cores. More importantly, RedGamingTech initially teased that Nvidia may use a multi-chip module (MCM) design. This means a design approach where multiple smaller chips are packaged together to form a single, larger, and more powerful processor. Switching to an MCM design over monolithic could give Nvidia a major edge, including scalability, higher yields, and more design flexibility.

Unfortunately, a recent update from the same YouTuber revealed that Nvidia won’t be using an MCM design in Blackwell. Reportedly, Nvidia initially planned to use dual GB202 dies glued together, possibly with some SMs cut, but ultimately decided against it. The YouTuber remarked that issues such as high prices, the latency between the two dies, and various difficulties in getting it to work made Nvidia stick to a monolithic design.

Take this with a healthy dose of skepticism. It’s possible that Nvidia is planning to switch to MCM in the future, but such architectural changes are never made at the last minute, so that plan for Blackwell may have never existed. It’s also possible that Nvidia will prioritize architectural changes over top performance in this generation, letting the new technology mature before ramping up performance in RTX 6000-series graphics cards a few years from now.

One small hint of what to expect comes from, once again, the B200 data center GPU. Nvidia reworked its Tensor cores in that graphics card. As a result, they now support FP4 and FP6 numerical formats for AI inference natively. We might see this happen in consumer GPUs too, but it’s all speculation at this point.

RTX 50-series: performance

A graph showing the performance of Blackwell HPC GPUs.

As the specifications of RTX 50-series graphics cards are still mostly a mystery, it’s hard to make accurate predictions about their performance. However, many have tried, which is why we have some juicy rumors to dig into while we wait for official benchmarks.

According to Moore’s Law Is Dead, the performance uplift between Ada and Blackwell may not be major. The YouTuber’s source mentioned that “Blackwell’s rasterization uplift over Ada will not be as impressive as [from] Ampere to Ada.” However, the source also said that Nvidia could make the RTX 5090 feel like a similar uplift “if it felt threatened.” That seems unlikely, seeing as AMD is reportedly stepping down from making high-end GPUs in the next generation, potentially leaving Nvidia as the only source of high-end graphics cards for the next couple of years.

Based on the above, we might be looking at performance gains along the lines of 30% to 50% for the flagship. Midrange and entry-level cards typically see a smaller boost in performance gen-on-gen, so those might be even less impressive.

However, on the other end of the spectrum is speculation from sources like RedGamingTech. The YouTuber claims in his video that the RTX 50-series could deliver up to twice the performance of Lovelace, including double the ray tracing performance of the RTX 40-series. RedGamingTech is unsure whether the overall 2x figure refers to rasterization, though, so it’s hard to know the metric by which to measure these gains. He does, however, predict clock speeds of over 3GHz, which would be a sizable boost over Ada, but says that this only applies to overclocked models.

In a later video, RedGamingTech added that we might see an up to 60% boost from one flagship to the next. He then later clarified that we can expect to see the following performance boosts, which should be viewed with some skepticism:

RTX 4090 to RTX Titan AI: 63% faster
RTX 4090 to RTX 5090: 48% faster
RTX 4080 Super to RTX 5080: 29% faster
RTX 4070 Super to RTX 5070: 26% faster

The YouTuber also stressed that Nvidia’s focus was heavily on ray tracing and path tracing, with up to a 2.5x boost in those workloads. Again, approach all of this information with some skepticism.
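Because the Titan AI and RTX 5090 figures share the same RTX 4090 baseline, they also imply a gap between those two cards. The snippet below is a small sketch of that conversion; the function name is hypothetical, and the inputs are the rumored percentages, not measured results.

```python
def relative_uplift(a_pct: float, b_pct: float) -> float:
    """How much faster card A is than card B, in percent, when both
    uplifts are quoted against the same baseline card."""
    return ((1 + a_pct / 100) / (1 + b_pct / 100) - 1) * 100


# Rumored uplifts over the RTX 4090 (unconfirmed).
TITAN_AI_VS_4090 = 63
RTX_5090_VS_4090 = 48

gap = relative_uplift(TITAN_AI_VS_4090, RTX_5090_VS_4090)
print(f"Titan AI vs. RTX 5090: about {gap:.0f}% faster")
```

If both rumored numbers held, the Titan AI would be roughly 10% faster than the RTX 5090, not 15%, because percentage uplifts over a shared baseline divide rather than subtract.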

The only real hint of performance figures we have right now comes from a slide made by Nvidia, but unfortunately, the slide talks about its next-gen high-performance computing (HPC) graphics card used in data centers. The graph, which measures GPU performance in GPT-3 175B inference, shows that the H200 GPU will be up to 18 times faster than the A100 — but that’s not Blackwell architecture yet. B100, the first Blackwell graphics card on the list, offers significantly higher performance, although Nvidia didn’t put a number on it. It looks to be about twice as fast as the H200.

While that’s exciting for those in need of an HPC GPU, gamers and other consumers will need to wait to find out the reality about the capabilities of RTX 50-series GPUs.

RTX 50-series: power draw

The RTX 4090 graphics card on a table alongside a set of cables held in hand. — Jacob Roach / Digital Trends

Prior to the release of the RTX 40-series, the flagship RTX 4090 was the subject of a lot of rumors, and its power draw was an especially hot topic. Some sources claimed that the GPU would have truly monstrous power consumption, even reaching up to 900 watts. We now know that those claims were false, as the RTX 4090 consumes 450 watts, and its connector supports up to 600W — while occasionally melting. It’s hard to imagine that Nvidia will push those numbers even higher in the next generation of GPUs.

The RTX 50-series has been unable to avoid some power-related controversies, though. Moore’s Law Is Dead recently revealed that Nvidia is planning to use a whole new connector, which would mark the fourth such change in a span of just three years. The YouTuber cites anonymous sources, claiming that Nvidia is switching to a 16-pin connector, all dedicated to 12V power delivery. However, many other sources are pointing out that this is unlikely.

Hardware Busters reached out to its own sources and confirmed that “no one is aware of a new connector.” Nvidia would have to be working with major PSU brands, especially after the issues with the 12VHPWR connector. If these brands don’t know anything about it, Nvidia might not be making these changes in this generation yet.

In fact, Nvidia may actually double down on the choice to use the 12VHPWR connector. According to TechRadar, Nvidia might make the 12VHPWR connector mandatory for every GPU across the entire RTX 50-series stack, even the entry-level RTX 5060. This is also said to apply to cards made by Nvidia’s board partners.

Assuming Nvidia sticks to the (somewhat controversial) 12VHPWR connector that it’s currently using, the maximum power consumption will remain at 600W. The flagship RTX 5090 might go on to see an increase in power draw if it offers significantly more performance, but it’ll still need to leave some room for potential overclocking, so a maximum of 500W seems reasonable.

For the rest of the lineup, it’s possible that Nvidia will try to keep things more conservative instead of pushing for higher power consumption. As pointed out by NotebookCheck, Nvidia’s current trend of increasing total board power (TBP) is still fairly new — especially on cards like the RTX 4080. Historically, xx80 cards stayed well under 300W, even dipping below 200W at times. In the last couple of generations, the RTX 3080 and the RTX 4080 both pushed the TBP to new heights, with each requiring up to 320W.

With power consumption as high as this, it doesn’t make a lot of sense for Nvidia to keep pushing for even higher wattages, especially seeing as AMD is likely to keep it more conservative in RDNA 4. If Nvidia dials it back a little, we might see the RTX 5080 with a TBP of around 250W to 280W. However, if Nvidia sticks to its current scheme, it might go in the other direction and hit as high as 350W.
