With high-end mobile GPUs shipping with utterly ridiculous amounts of VRAM, I can't help but wonder whether it could be used as extra RAM - with PCIe 3.0 x16's bandwidth, it could come close to single-channel DDR3-1600 in performance.
[Edit] It seems there is a lot of confusion about what I am asking, so allow me to clarify: This is not about replacing main memory. This is about speeding up specific programs that both require a swap file to operate and aggressively page out their contents to swap, in scenarios where there is an abundance of main memory.
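For what it's worth, the raw numbers behind that comparison can be sketched quickly (a back-of-the-envelope Python calculation using assumed theoretical peak figures - sustained throughput over PCIe is considerably lower in practice):

```python
# Peak-bandwidth comparison (theoretical figures, not measured):
# PCIe 3.0: 8 GT/s per lane with 128b/130b encoding, 16 lanes.
PCIE3_X16_GBPS = 16 * 8 * (128 / 130) / 8          # ~15.75 GB/s
# DDR3-1600, single channel: 64-bit bus at 1600 MT/s.
DDR3_1600_SINGLE_GBPS = 1600e6 * 8 / 1e9           # 12.8 GB/s

print(f"PCIe 3.0 x16:             {PCIE3_X16_GBPS:.2f} GB/s")
print(f"DDR3-1600 single channel: {DDR3_1600_SINGLE_GBPS:.2f} GB/s")
```

On paper the link is actually a bit faster than a single DDR3-1600 channel, though latency and protocol overhead would cut heavily into that.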
-
I lol'd!
-
It'd be even funnier if he said nein nein's.
-
-
Because nobody has yet....
//www.youtube.com/embed/WWaLxFIVX1s
-
Theoretically yes, but you'd need a very deep understanding of OpenCL and a specialized application that can span both memory arrays, and I don't think anyone has ever managed to do more than just mirroring the data. Your biggest bottleneck is the PCIe bus: even at PCIe 3.0 x16 it is still very slow compared to local memory, not to mention high latency. This might become feasible if AMD's HSA model becomes more common.
-
When the next best option is an M.2 SSD, I'd say VRAM over PCIe 3.0 x16 is very fast, not very slow...
-
Even with two 880M/K5100M cards you only have 16 GB of VRAM, so there's not much to utilize in the first place.
And swapping through existing open APIs with the driver would add tons of overhead.
-
not very useful even if it was possible..
-
Doesn't the way SLI works mirror the data on both cards, so you'd only have 8 GB usable anyways?
-
Correct - only if you disengage SLI can you use the VRAM independently, to my knowledge.
-
Not related to using VRAM as swap, but would there be any reason to do this aside from having the cards working on two completely independent workloads?
-
You could potentially use that to have one card working on one task (say, rendering) and the other for gaming. A desktop with a Quadro/FirePro + GeForce/Radeon comes to mind.
This, of course, assumes the CPU can handle both tasks at once.
------------
IIRC, it used to be the case that one GPU would handle PhysX and the other would handle the rest of the game's workload(?).
-
Doesn't that fall in the two completely independent workloads scenario I mentioned though?
-
Yeah, I just can't think of any other reason to use such a setup. That, or BTC mining back when that was a thing. Don't know if that used independent cards or Crossfired cards.
-
Not that I can think of. I do it with a pair of Quadros on my render server regularly, when I allocate one to Maya and one to Premiere Pro. But that is a dual-workload situation.
-
I guess it would be possible from a technical standpoint, but you'd have to decouple part of the workload, send it to the GPUs, then recouple the data, and those operations take time. It's like when you're crunching numbers on a lot of CPUs and it ends up taking more time for the CPUs to communicate with each other than it takes to actually crunch the numbers. I've had colleagues working on CFD who ran into exactly that problem.
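The communication-bound effect described above can be illustrated with a toy model (illustrative numbers only, not from any real benchmark):

```python
def speedup(n_workers, compute_s, comm_s_per_worker):
    """Speedup of a job whose compute splits n ways but whose
    communication cost grows linearly with the worker count."""
    parallel_time = compute_s / n_workers + comm_s_per_worker * n_workers
    return compute_s / parallel_time

# With 1 ms of sync cost per worker, adding workers eventually hurts:
for n in (2, 8, 32, 128):
    print(n, round(speedup(n, compute_s=1.0, comm_s_per_worker=0.001), 2))
```

Past some worker count the communication term dominates and the speedup collapses - the same trap would apply to shuttling workload chunks between GPUs over PCIe.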
-
SLI/CF is for real time viewport/game rendering only. Any other multi GPU work doesn't involve that.
And no, you don't have to disable SLI/CF to use the VRAM on each card separately. You can still throw as much data onto different cards as you like, via any GPU API - the VRAM management is still completely independent. Only data used in the specific context with SLI/CF enabled gets duplicated by the driver.
BTW, theoretically you don't have to use SLI/CF at all for multi-GPU rendering. You could just push your workload to different GPUs independently, with different contexts, and then sync the framebuffers over PCIe, like how muxless switchable graphics works. AMD's CF on the Hawaii GPUs actually runs in a similar manner; they have given up on CF connector designs.
-
CFD always has decoupling issues, as the coupling range expands with every loop, and the smaller the chunks are, the more decoupling overhead you suffer. Scanline rendering is probably easier in general, and if the pipeline design allows it, the overhead could be masked completely. Ray-traced rendering would be purely decoupled after the initial model data transfer.
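The "smaller chunks, more coupling overhead" point is essentially a surface-to-volume argument, which a few lines of Python can make concrete (a hypothetical 3D grid split into cubic chunks with a one-cell halo; the sizes are arbitrary):

```python
def halo_fraction(side):
    """Fraction of a cubic chunk's cells that sit on the boundary
    and must be exchanged with neighbouring chunks each iteration."""
    interior = (side - 2) ** 3
    return 1 - interior / side ** 3

# Smaller chunks -> a larger share of each chunk is coupling overhead:
for side in (64, 16, 4):
    print(side, round(halo_fraction(side), 3))
```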
Use VRAM as swap?
Discussion in 'Gaming (Software and Graphics Cards)' started by Peon, Aug 23, 2014.
