So, fairly recently there was some discussion about multiGPU, with a lot of people saying how it sucks and whatnot. But here's the thing.
You can now see that 2xRX480 ($400-500) beat or are on par with a $1200 single GPU when multiGPU is properly used.
So next time you go ranting about how "multiGPU sucks", stop and think about your own interests, and instead of ranting about multiGPU, demand better multiGPU support and proper usage of multiGPU.
And this was the initial idea of multiGPU: instead of making insanely expensive single-chip cards, use smaller, much cheaper chips that perform the same for much less money. It's completely irrelevant which vendor it is (although it's nice to see the RX480 performing so well), but NVidia is going in the completely opposite direction by removing SLI capability from the GTX1060 entirely.
The point is: if you bought a $1200 GPU... or a $700 GPU... why in the world would you need another? Aren't those supposed to be fast enough? And why go against the customer and remove the ability to pair 2 lower-priced GPUs to get good performance? That was the actual idea, and it actually makes sense.
Comments
The issue has always been that they usually don't.
Maybe DX12 will help with that, but right now, if you want to support SLI/CF, the game developer and the driver developer have to work together to do a lot of optimization. That doesn't happen very often, it's very time-consuming (therefore expensive), and to date there hasn't been a good method of homogenizing it across different games - even those using the same game engine.
Therefore, multiGPU has sucked. For a long while now. With only a few exceptions - those exceptions run exceptionally well, but they are still very much exceptions.
People think that you can go Crossfire or SLI and it just works with everything, but that is not the case.
"Be water my friend" - Bruce Lee
Well, for the exact reason you bring up - they don't want people buying lots of low margin cards. They want people buying the high margin cards.
And then, for those people for whom the high margin cards still aren't enough, that's the niche SLI/CF has found itself in lately.
Maybe DX12 changes that, but I sorta doubt it.
CrossFire and SLI previously relied on driver magic, which required game-specific optimizations because how to make it work varied by game. DirectX 12 and Vulkan give the developer the ability to use multiple GPUs in a way that makes sense in their particular game, rather than relying on Nvidia and AMD to make custom driver optimizations for them. But that doesn't fix the basic problem that there simply isn't a nice way to spread rendering a game across multiple GPUs.
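To give a sense of what "the developer does it themselves" means in practice, here's a rough C++ sketch of the enumeration step under D3D12's explicit multi-adapter path: the application finds every GPU on its own and builds a device and queue for each. The structure and names are mine and error handling is stripped, so treat it as an illustration, not production code.

```cpp
// Minimal sketch: enumerate every GPU and create an independent D3D12 device
// and command queue for each. Under DX12/Vulkan the application owns this
// step; the driver no longer hides the second card behind SLI/CF "magic".
#include <windows.h>
#include <d3d12.h>
#include <dxgi1_4.h>
#include <wrl/client.h>
#include <vector>
#include <cstdio>
#pragma comment(lib, "d3d12.lib")
#pragma comment(lib, "dxgi.lib")

using Microsoft::WRL::ComPtr;

struct GpuContext {               // illustrative name, not a real D3D12 type
    ComPtr<ID3D12Device>       device;
    ComPtr<ID3D12CommandQueue> queue;
};

int main() {
    ComPtr<IDXGIFactory1> factory;
    CreateDXGIFactory1(IID_PPV_ARGS(&factory));

    std::vector<GpuContext> gpus;
    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
        DXGI_ADAPTER_DESC1 desc = {};
        adapter->GetDesc1(&desc);
        if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) continue;  // skip software adapters

        GpuContext ctx;
        if (FAILED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                     IID_PPV_ARGS(&ctx.device))))
            continue;

        D3D12_COMMAND_QUEUE_DESC qdesc = {};
        qdesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
        ctx.device->CreateCommandQueue(&qdesc, IID_PPV_ARGS(&ctx.queue));
        gpus.push_back(ctx);
        wprintf(L"GPU %u: %s\n", i, desc.Description);
    }
    // From here on it is entirely up to the engine how (or whether) it splits
    // work across gpus[0], gpus[1], ... -- which is exactly the hard part.
    return 0;
}
```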
There are some things you can try. For example, you could have the left half of the frame on one GPU and the right half on the other. But you don't find out which half of the frame a vertex is in until after four of the five programmable shader stages are done. You can do some culling host side, but if you try to go this route, you're guaranteed to have a lot of work replicated on both GPUs before you learn which GPU "should" have done it.
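As a purely illustrative toy (nothing here comes from a real engine), the following C++ snippet mimics that split: random triangles, a stand-in "vertex stage", and a count of how many triangles straddle the left/right seam and therefore land on both GPUs anyway.

```cpp
// Toy illustration of split-frame rendering: which half of the screen a
// triangle lands in is only known *after* its vertices have been transformed,
// so both GPUs end up transforming geometry they will ultimately throw away.
#include <cstdio>
#include <random>
#include <array>

struct Vec3 { float x, y, z; };

// Stand-in for the vertex shader + projection: world space -> NDC x in [-1, 1].
static float ndc_x(const Vec3& v) { return v.x / (v.z + 10.0f); }

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<float> pos(-5.0f, 5.0f);

    int left_only = 0, right_only = 0, straddle = 0;
    const int kTriangles = 100000;

    for (int t = 0; t < kTriangles; ++t) {
        std::array<Vec3, 3> tri;
        for (auto& v : tri) v = { pos(rng), pos(rng), pos(rng) + 6.0f };

        // Only now, after the (simulated) vertex stage, do we know which
        // side of the screen each vertex falls on.
        bool any_left = false, any_right = false;
        for (const auto& v : tri) (ndc_x(v) < 0.0f ? any_left : any_right) = true;

        if (any_left && any_right) ++straddle;   // must be rendered by BOTH GPUs
        else if (any_left)         ++left_only;
        else                       ++right_only;
    }

    std::printf("left-only: %d  right-only: %d  straddling the split: %d\n",
                left_only, right_only, straddle);
    std::printf("Both GPUs had to run the vertex work for all %d triangles "
                "just to find this out.\n", kTriangles);
    return 0;
}
```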
Another approach is to simply have different GPUs handle different draw calls, and then piece together the final frame at the end. If you go this route, then not only is there extra work at the end to combine the two partial frames, but you also have no reliable way to divide the work evenly between the two GPUs. Furthermore, both GPUs have to do extra work that would have been skipped if they had the other GPU's depth buffer to discard fragments sooner. And both have to finish and merge before you can do any post processing effects.
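Here's a made-up software model of that approach, just to make the costs concrete: every "draw" is a rectangle at some depth, even draws go to GPU A and odd draws to GPU B, and the partial frames are merged with a per-pixel depth compare at the end. The fragment counts it prints show the extra shading work each GPU does because it can't see the other's depth buffer.

```cpp
// Toy model of "each GPU takes half the draw calls, then merge". All numbers
// are invented for illustration.
#include <cstdio>
#include <vector>
#include <random>
#include <limits>

constexpr int W = 320, H = 180, S = 60;   // tiny frame, 60x60 quads

struct Buffers {
    std::vector<float> depth = std::vector<float>(W * H, std::numeric_limits<float>::max());
    std::vector<int>   color = std::vector<int>(W * H, -1);  // which draw won the pixel
    long shaded = 0;                                          // fragments actually shaded
};

struct Draw { int x, y; float z; };        // S x S quad at (x, y), depth z

static void render(Buffers& buf, const Draw& d, int id) {
    for (int y = d.y; y < d.y + S; ++y)
        for (int x = d.x; x < d.x + S; ++x) {
            float& z = buf.depth[y * W + x];
            if (d.z < z) {                 // early depth test against OUR buffer only
                ++buf.shaded;              // fragment shader runs
                z = d.z;
                buf.color[y * W + x] = id;
            }
        }
}

int main() {
    std::mt19937 rng(7);
    std::uniform_int_distribution<int> px(0, W - S), py(0, H - S);
    std::uniform_real_distribution<float> pz(0.0f, 1.0f);

    std::vector<Draw> draws(200);
    for (auto& d : draws) d = { px(rng), py(rng), pz(rng) };

    // Single GPU baseline: one depth buffer sees every draw.
    Buffers single;
    for (size_t i = 0; i < draws.size(); ++i) render(single, draws[i], (int)i);

    // Two GPUs: even draws on A, odd draws on B, each blind to the other.
    Buffers a, b;
    for (size_t i = 0; i < draws.size(); ++i) render(i % 2 ? b : a, draws[i], (int)i);

    // The extra merge pass: per-pixel depth compare of the two partial frames.
    long disagree = 0;
    for (int i = 0; i < W * H; ++i)
        if ((a.depth[i] <= b.depth[i] ? a.color[i] : b.color[i]) != single.color[i]) ++disagree;

    std::printf("fragments shaded, single GPU: %ld\n", single.shaded);
    std::printf("fragments shaded, GPU A + GPU B: %ld (A=%ld, B=%ld)\n",
                a.shaded + b.shaded, a.shaded, b.shaded);
    std::printf("merge pass touched all %d pixels; matches single-GPU image: %s\n",
                W * H, disagree == 0 ? "yes" : "no");
    return 0;
}
```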
You could also try doing the initial rendering on one GPU, and then post processing on the other, using the two GPUs in a pipeline of sorts. This again creates the problem that you can't divide the workload evenly between the two GPUs, on top of the extra overhead of transferring frame buffers and whatnot.
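A quick back-of-the-envelope model of that pipelining idea (all timings invented) shows both problems at once: throughput is capped by the slower stage plus the transfer, and latency goes up rather than down.

```cpp
// GPU0 renders frame N while GPU1 post-processes frame N-1. Throughput is set
// by the slowest stage, latency by the sum of all stages. Figures are made up.
#include <algorithm>
#include <cstdio>

int main() {
    const double render_ms   = 14.0;  // geometry + shading on GPU0 (assumed)
    const double post_ms     = 4.0;   // post-processing on GPU1 (assumed)
    const double transfer_ms = 3.0;   // copying the frame buffer between cards (assumed)

    const double single_gpu_frame  = render_ms + post_ms;                        // one card does it all
    const double pipelined_frame   = std::max(render_ms, post_ms + transfer_ms); // steady-state step
    const double pipelined_latency = render_ms + transfer_ms + post_ms;          // frame start -> on screen

    std::printf("single GPU:  %.1f ms/frame (%.0f fps), latency %.1f ms\n",
                single_gpu_frame, 1000.0 / single_gpu_frame, single_gpu_frame);
    std::printf("pipelined:   %.1f ms/frame (%.0f fps), latency %.1f ms\n",
                pipelined_frame, 1000.0 / pipelined_frame, pipelined_latency);
    std::printf("GPU1 sits idle %.0f%% of the time because the stages can't be balanced.\n",
                100.0 * (1.0 - (post_ms + transfer_ms) / pipelined_frame));
    return 0;
}
```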
So maybe you can make two of GPU X offer 50% or 70% or whatever better performance than one of GPU X. And you can do this for all recent and future GPUs, not just whichever ones AMD and Nvidia are willing to provide custom driver support for. But the development effort spent on this could have been spent on something else, rather than on a feature that is cool in some esoteric sense but that only 1% of your playerbase will ever benefit from.
A modern GPU is made up of a handful of Streaming Multiprocessors, or GCN cores, depending on which brand you choose. They aren't exactly the same, but conceptually, work with me here. Each of those is made up of a bunch of shader cores, some scheduling logic, some texture handling, and some register space - to extremely simplify the thing.
The difference between a $170 GTX1060 and a $700 GTX1080, aside from some video RAM, is pretty much that the 1060 has 9 SMs, whereas the 1080 has 20 SMs. Those SMs aren't all that dissimilar. The real difference in the power of a GPU within a generation mostly comes down to those SM or GCN core counts, and lower-tier cards are often cut-down higher-tier cards that just have some of those units disabled.
At its heart, that's what SLI/CF should be - allowing two cards to merge their resources into one pool. In reality, it's what Quiz describes - you have 2 (or more) physical cards, each acting very independently and trying to share workloads. Rather than, say, having a 1080 and a 1060, pooling 29 total SMs (I know, a pipe dream; there are other things like core frequency and VRAM besides SMs, but bear with me here), and getting a better experience than either card individually.
So I do get Malabooga's point, I really do. GPU architectures are highly scalable. And it ~seems~ like it should be an easy thing to just extend that past a die to multiGPU. It's really a shame that real life hasn't panned out that way. Maybe it's a good case for a third party to swoop in, come up with a scalable architecture (although, it seems to me that SMs and GCN cores are pretty scalable already, they just need a more flexible controlling architecture and maybe higher bandwidth between physical cards), that really can rock the socks off of current multiGPU implementations.
One issue specific to rasterization is that you don't find out where a vertex is on the screen, or even if it's on the screen, until far into the rendering pipeline. Try to spread a single framebuffer and depth buffer across two GPUs and each very often needs to grab the other's buffer. You'd need a ton of bandwidth to make that alone work. And it's not just bandwidth; you'd need both GPUs to be able to atomically access those buffers and somehow mitigate the race conditions that would happen with a naive implementation.
The problems with multi GPU scaling basically come down to trying to find ways to work around the inability to spread a depth buffer and frame buffer across two GPUs.
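To make that concrete, here's a toy in C++ where two threads stand in for two GPUs updating one shared depth buffer: every depth test becomes an atomic read-compare-write. On real hardware every one of those atomics would have to cross the inter-GPU link, which is the part that doesn't scale. Illustrative only.

```cpp
// Toy of what "sharing one depth buffer between two GPUs" would require:
// every depth test is a read-compare-write that must be atomic, or the two
// sides race and corrupt each other's results.
#include <atomic>
#include <cstdint>
#include <cstdio>
#include <random>
#include <thread>
#include <vector>

constexpr int kPixels = 1 << 16;

// Atomic "keep the nearest depth" via a compare-exchange loop.
static void depth_min(std::atomic<uint32_t>& slot, uint32_t z) {
    uint32_t cur = slot.load(std::memory_order_relaxed);
    while (z < cur &&
           !slot.compare_exchange_weak(cur, z, std::memory_order_relaxed)) {
        // cur was reloaded by compare_exchange_weak; retry while we still win
    }
}

int main() {
    std::vector<std::atomic<uint32_t>> depth(kPixels);
    for (auto& d : depth) d.store(UINT32_MAX);

    auto gpu = [&](unsigned seed) {
        std::mt19937 rng(seed);
        std::uniform_int_distribution<int> pixel(0, kPixels - 1);
        std::uniform_int_distribution<uint32_t> z(0, 1u << 24);
        for (int i = 0; i < 2000000; ++i)
            depth_min(depth[pixel(rng)], z(rng));  // one "fragment" per iteration
    };

    std::thread a(gpu, 1), b(gpu, 2);  // the two "GPUs"
    a.join();
    b.join();

    std::printf("done: %d pixels, every depth test was an atomic RMW on shared memory\n",
                kPixels);
    return 0;
}
```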
If on the other hand you want to be able to play all games, then you're going to need a single GPU solution.
If it were otherwise, and two RX 480s could reliably match a Titan X, AMD would attach them together at the factory and sell them as a package. AMD are not complete idiots. If they had an economical way to match the speed of a Titan X, they would be selling it.
So there are rumors that AMD will make a small ARM chip and put it on a dual-GPU card to manage the GPUs' resources. From any point of view outside the card, it would be seen as a single card, since everything would communicate through the ARM chip, which would then control the two graphics chips' resources internally.
And with Microsoft pushing into mGPU, the new Scorpio might be based on such a design, as well as future AMD cards.
The thing with that is that it wouldn't have to stop at only 2 chips; theoretically you could put any number of chips on a PCB and tie them together with the ARM controller so they're seen as a single resource from the outside.
https://www.techpowerup.com/reviews/NVIDIA/GeForce_GTX_1080_SLI/14.html
Out of the 20 games they tested, there is almost always a performance boost with SLI.
The dual 460s would flat out refuse to play some games like World of Tanks, which would crash immediately on launching a match. The dual 560s had so many issues that I frequently had to disable one GPU to play several different games. Issues ranging from micro-stuttering to horrible screen tearing to FPS that was worse than with one GPU turned off.
Never again.
That review found that at 4K, SLI increased frame rates by an average of 52%. Even if you exclude the games that don't scale at all, it was only an average of 71%. That's not double, and not terribly close to it. Furthermore, because Nvidia and AMD tend to heavily optimize their drivers specifically for the games that appear in reviews, you can expect other games to have worse SLI scaling, and games not released near the time that Pascal was Nvidia's latest architecture likely won't scale with SLI at all, as even the most popular games released a few years from now will have their game-specific driver optimizations go to Volta or whatever rather than the then-older Pascal cards.
And that's even assuming that SLI works flawlessly, with no micro-stutter or added latency. Which is, of course, a false assumption. The micro-stutter problem has lessened considerably with improved frame pacing over the last few years, but there's nothing that can be done about the added latency from using SLI. Even with perfect frame pacing, x frames per second with a single GPU will tend to give you a better gaming experience than 1.2x frames per second with SLI, as a lower net frame rate is compensated for by lower latency. With imperfect frame pacing, that 1.2 factor can get much larger and even approach 2 in degenerate cases.
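For anyone who wants numbers, here's a tiny C++ illustration (all frame times invented) of how a higher SLI average frame rate can still feel worse than an evenly paced single GPU once the pacing goes lopsided.

```cpp
// With alternate-frame rendering the frames can arrive in uneven pairs, and
// what you perceive is closer to the LONG gaps than to the average.
#include <algorithm>
#include <cstdio>
#include <vector>

static void report(const char* name, const std::vector<double>& frame_ms) {
    double total = 0, worst = 0;
    for (double ms : frame_ms) { total += ms; worst = std::max(worst, ms); }
    double avg = total / frame_ms.size();
    std::printf("%-22s avg %5.1f ms (%5.1f fps)   worst gap %5.1f ms (feels like ~%4.1f fps)\n",
                name, avg, 1000.0 / avg, worst, 1000.0 / worst);
}

int main() {
    // Single GPU: 60 fps, perfectly even pacing.
    std::vector<double> single(60, 16.7);

    // SLI/AFR: 20% more frames on average (72 fps), but the frames arrive in
    // lopsided pairs of roughly 5 ms and 22.8 ms.
    std::vector<double> sli;
    for (int i = 0; i < 36; ++i) { sli.push_back(5.0); sli.push_back(22.8); }

    report("single GPU, 60 fps:", single);
    report("SLI, 72 fps average:", sli);
    return 0;
}
```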
The reason there is better scaling at higher resolutions is simple and obvious: higher resolutions add far more GPU work but not much more CPU work, making CPU bottlenecks less common and less severe.
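A minimal model makes the same point, if you assume frame time is roughly max(CPU time, GPU time / GPU count) and pretend the GPU side scales perfectly; the millisecond figures below are made up.

```cpp
// Why multi-GPU scales better at high resolutions: raising the resolution
// inflates the GPU term but barely touches the CPU term, so the CPU ceiling
// matters less at 4K. Figures invented for illustration only.
#include <algorithm>
#include <cstdio>

static void scenario(const char* name, double cpu_ms, double gpu_ms) {
    double one = std::max(cpu_ms, gpu_ms);
    double two = std::max(cpu_ms, gpu_ms / 2.0);
    std::printf("%-6s  1 GPU: %5.1f fps   2 GPUs: %5.1f fps   scaling: +%.0f%%\n",
                name, 1000.0 / one, 1000.0 / two, 100.0 * (one / two - 1.0));
}

int main() {
    scenario("1080p", 10.0 /*CPU*/,  8.0 /*GPU*/);  // CPU-bound: adding a GPU does nothing
    scenario("1440p", 10.0,         14.0);          // partly CPU-bound: some gain
    scenario("4K",    10.0,         33.0);          // GPU-bound: close to the ideal 2x
    return 0;
}
```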
The problem, besides the extra cost of the components, is dealing with the bugs to get it working, and most games don't launch with proper SLI/Crossfire support. It can take weeks or months for drivers to be released.
https://www.pcper.com/news/Graphics-Cards/DX12-Multi-GPU-scaling-and-running-Deus-Ex-Mankind-Divided
And in DX12 with SFR, the rendering process works exactly the same as on a single card: the 2 cards act as one, both rendering the same frame at the same time. Everything is pooled together, even VRAM, so if you have two 8GB cards you effectively have 16GB of VRAM, not 8GB like you have with AFR.
And games release in a broken state for single cards too, just look at Watch Dogs 2 / Dishonored 2 / Mafia 3. Completely broken, with a plethora of problems.
Now ask yourself why you can't buy 2xRX480 for $400 and get the same performance as a $1200 Titan XP (now proven in 3 games; ~90% scaling needed). Or why NVidia removed SLI capability from the GTX1060 (a card in the same class, a little slower than the RX480, but a pair would be as fast as a Titan XP, just as 2xRX480s are).
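For reference, the arithmetic behind that "~90% scaling needed" figure, with placeholder frame rates (not numbers taken from the linked reviews):

```cpp
// If a Titan X Pascal is roughly 1.9x as fast as one RX 480 in a given game,
// two RX 480s need about 90% CrossFire scaling to tie it. The frame rates
// below are hypothetical placeholders.
#include <cstdio>
#include <initializer_list>

int main() {
    const double rx480_fps   = 40.0;  // one RX 480 (assumed)
    const double titanxp_fps = 76.0;  // Titan X Pascal in the same game (assumed)

    // "Scaling" = how much of the second card's theoretical 100% boost you get.
    double scaling_needed = (titanxp_fps / rx480_fps - 1.0) * 100.0;
    std::printf("scaling needed for 2x RX 480 to match: %.0f%%\n", scaling_needed);

    // And what you actually get at a given scaling figure:
    for (double s : {50.0, 70.0, 90.0, 100.0})
        std::printf("at %3.0f%% scaling: %.0f fps vs Titan X Pascal's %.0f fps\n",
                    s, rx480_fps * (1.0 + s / 100.0), titanxp_fps);
    return 0;
}
```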
And yeah, GTX1080 SLI? Shouldn't a $700 card run everything great WITHOUT the need for a second $700 card?
Back to the OP. You might not always get the boost you want out of dual cards. But sometimes you will, and you never actually get less.
Smart people ignore Malabooga and instead look at an RX480 CrossFire review where the games were picked more randomly, like this one:
https://www.techpowerup.com/reviews/AMD/RX_480_CrossFire/
Besides, Win10 adoption isn't going all that great (about 23%), and almost 50% of users are still running Win7, which means no DX12. There's no incentive for devs to support multi-GPU (Vulkan could help here, being more broadly available, but I have no info about its multi-GPU support).
Really, though, if a game can't run well on 8 GB of video memory, multi-GPU scaling is the least of their worries.
GL running those games on a dual GPU system.
"going into arguments with idiots is a lost cause, it requires you to stoop down to their level and you can't win"
Well, that's what the API and drivers are for, along with devs fine-tuning. And since DX12 mGPU is vendor agnostic, you can use any 2 cards....
Memory is just an example, since up until now there was no pooling: if your cards had 8GB each, you got 8GB. Having it pooled to 16GB is just a bonus in the process ;P