For many years, Intel has had better fabs than AMD, which gave them a considerable advantage in the CPU competition. This was obvious when AMD had their own fabs, and remained obvious after AMD sold their fabs to Global Foundries. It was a huge advantage for Intel back when we were still seeing rapid advancements in CPUs, as having a process node available a year sooner meant Intel could bring a new tier of performance to market a year sooner.
It took a combination of a very good AMD architecture and a terrible Intel one for AMD's Athlon 64 to finally claim the performance crown in 2003, overcoming Intel's considerable foundry advantage. Even then, Intel remained tremendously profitable in that era, while a good Intel architecture coinciding with a bad AMD one in 2011, stacked on top of Intel's foundry advantage, was enough to drive AMD to the brink of bankruptcy.
But GPUs were different. For a number of years, AMD and Nvidia both had their GPUs fabricated at TSMC. A process node was available to both GPU vendors at the same time. However good or bad a node was, it was equally good or bad for both GPU vendors. TSMC doing their job well or poorly meant gamers could get new generations of performance sooner or later, but didn't directly advantage AMD or Nvidia relative to each other.
But now AMD has moved their GPU production to Global Foundries, the company that originally bought AMD's fabs. This is partially because, as part of the agreement for that sale, AMD was required to buy many wafers from Global Foundries for a number of years. Abu Dhabi didn't just want empty foundries with no customers.
But AMD moving GPU production to Global Foundries while Nvidia stayed with TSMC means that if one foundry has a better process node than the other, its GPU customer gets a big advantage in the GPU wars. So far, that hasn't seemed to matter very much, as it's not at all clear whether Global Foundries 14 nm (actually licensed from Samsung) is better or worse than TSMC 16 nm.
It's not at all difficult to imagine a near future where that difference is enormous. If one goes purely by the public roadmaps and assumes that every foundry will deliver what it promises when it promises it, that could mean a considerable advantage for AMD as soon as next year. Whereas Samsung and TSMC focus their early processes at a given node on low power devices such as cell phone chips, the LP in Global Foundries 7 nm LP stands for "leading performance", not "low power". Global Foundries is jumping straight to a high performance, high power process node, rather than having that arrive a year after a node aimed at cell phones.
Furthermore, Global Foundries is pushing aggressively for 7 nm. TSMC is focusing on 10 nm now and only moving to 7 nm later, and Samsung is taking the same path. Global Foundries doesn't even have a 10 nm process node, but is betting heavily on 7 nm, in part using resources acquired in purchasing IBM's foundries.
I hope that "if all foundries deliver what they promise" clause above threw up red flags for you, because it's a huge if. Global Foundries' history on this isn't terribly promising. 32 nm was delayed and had considerable yield problems at first. 28 nm was badly delayed. 20 nm was irrelevant. 14 nm was canceled in favor of licensing Samsung's 14 nm process node. With that kind of history, this has the potential to blow up in AMD's face catastrophically.
Global Foundries is hardly the only fab to have problems. TSMC 40 nm was seriously troubled for a while. TSMC 32 nm was canceled outright. TSMC 20 nm was terrible. But TSMC's 28 nm and 16 nm process nodes delivered well, meaning that TSMC has some real victories in its history of delivering on promises, which Global Foundries thus far lacks. Past performance does not necessarily predict future results, of course, but I'd argue that there's a lot more uncertainty in how Global Foundries will perform than in how TSMC or Samsung will.
But if TSMC or Global Foundries has markedly better process nodes for a while, that could mean that AMD or Nvidia GPUs win a generation by default, simply because the other vendor doesn't have access to a competitive process node. Nvidia could switch to Global Foundries if so inclined, of course. But moving a design to a different foundry than planned imposes such an enormous delay that you effectively lose that generation and don't bring anything to market until the next one. Don't be shocked if, as soon as next year, either Nvidia or AMD wins a generation just by virtue of having partnered with the foundry that won it.
Comments
But a process node being canceled is hardly the only thing that can go wrong. What if Nvidia and AMD are both planning to launch new GPUs, and six months out it looks like they'll launch at about the same time? Then one of the foundries has its process node delayed, and delayed, and delayed again. Think of the serial delays that afflicted TSMC 40 nm, Global Foundries 32 nm, or Intel 10 nm, for example. Finally the new GPUs are able to launch a year later than intended. If that happens to only one foundry and not both, the GPU vendor relying on it basically misses a generation.
It is possible to design the same chip for multiple foundries as insurance against one foundry having problems. Apple has done that at least once. But that's a massive development cost increase and neither Nvidia nor AMD has the volume to justify that expense.
It's a good part of the reason we see some generations end up being mostly (or entirely) respins. We've had a lot of generation skips, on both sides of the aisle, and they haven't all been due to issues at the fab.
It was inferred - not announced, admittedly - that since Polaris would not include a high-end model because Vega was the high-end model, Vega would launch shortly after Polaris.
And here we are on our second iteration of Polaris now.
So if you go strictly based on what their PR says, sure, there's no delay. If you move that to the court of public opinion though, AMD is a year or more late with Vega already.
I think that AMD just looked at Polaris and saw that their architecture wasn't competitive. Big Polaris could have been a 300 W chip that struggled to keep pace with a GTX 1080. Rather than putting a ton of resources into launching a huge, expensive product that wasn't competitive, they decided to allocate their development resources toward better products that wouldn't arrive until later. Ryzen and Epyc certainly fit that description, and Vega likely will, too.
If you spend $600 on a 1080 Ti but it only mines 1.5 times as fast as a 1070 that costs $300, you would pay off the 1070 faster than the 1080 Ti. That's why we see these mid-tier cards getting gobbled up while miners avoid the very low end and the very high end - the mid-tier cards represent the best "bang for the buck," as we like to say (there's a rough payback sketch at the end of this comment).
Also, it seems that most of these algorithms are memory sensitive - I see a lot of chatter about overclocking VRAM and underclocking the GPU itself in order to maximize hash rates while minimizing power draw (these guys do consider the price of electricity as an operating cost, if they are doing it right).
*edit* Or I should say, the ~smart~ miners are looking at return on investment. There are a whole lot of folks who are just looking online and see this or that is the best card, and they are going and blindly buying those cards at any price. The lemming crowd that's into mining is why we see insane prices right now - they aren't actually evaluating price/performance or return on investment.
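To make that payback reasoning concrete, here's a minimal Python sketch. The card prices and the 1.5x hash-rate ratio come from the comment above; the revenue per MH/s, power draws, and electricity price are illustrative assumptions, not measured figures.

```python
# Rough payback-period sketch for the "bang for the buck" argument above.
# All rates are illustrative assumptions, not real mining benchmarks.

def payback_days(card_price_usd, hash_rate_mhs, power_watts,
                 usd_per_mhs_per_day=0.10, electricity_usd_per_kwh=0.12):
    """Days until mining revenue, net of electricity cost, repays the card."""
    daily_revenue = hash_rate_mhs * usd_per_mhs_per_day
    daily_power_cost = power_watts / 1000 * 24 * electricity_usd_per_kwh
    daily_profit = daily_revenue - daily_power_cost
    if daily_profit <= 0:
        return float("inf")  # never pays for itself at these rates
    return card_price_usd / daily_profit

# Hypothetical figures echoing the comment: the 1080 Ti costs twice as much
# but only hashes 1.5x as fast as the 1070.
cards = {
    "GTX 1070":    payback_days(300, hash_rate_mhs=30, power_watts=120),
    "GTX 1080 Ti": payback_days(600, hash_rate_mhs=45, power_watts=220),
}

for name, days in cards.items():
    print(f"{name}: ~{days:.0f} days to pay off")
```

At these assumed rates, the 1070 pays for itself in roughly 113 days versus roughly 155 days for the 1080 Ti, which is exactly the mid-tier effect the comment describes.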