OK guys, with the recent posts on these technologies I thought I would make a post on why the performance will still be poor.
Sure, memory speeds have come along since I last made a post like this (not by a huge amount for notebooks), yet the same points apply.
Hypermemory and Turbocache use the PCI Express bus to connect to the RAM (PCI Express has a more direct connection anyway compared to AGP), and because each PCI Express bus is separate this takes nothing away from the PCI bandwidth. HOWEVER, that is where the advantages end, and the same pitfalls remain:
1. The RAM only has so much bandwidth. The CPU and the rest of the system need bandwidth too, and if you have a gfx card stealing most of it the rest of the system will really suffer (as will your fps).
2. This still takes away RAM, so if you're running 512MB it's going to hurt performance; even with a gig, if the chip is taking 192MB you are losing about a fifth of your memory.
3. The RAM in the system is at best 233MHz 128-bit (in Centrino notebooks). Compare that to 350MHz 128-bit video memory and you can see that even if you did have full access to it, it's still going to be slow.
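To put rough numbers on point 3, here is a back-of-the-envelope sketch. It assumes both figures are base clocks of double-data-rate memory (two transfers per clock), which is not spelled out above, and the function name is made up for illustration.

```python
def peak_bandwidth_gb_s(clock_mhz, bus_bits, transfers_per_clock=2):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=2 assumes DDR memory; that is an assumption
    of this sketch, not a figure taken from the post.
    """
    bytes_per_transfer = bus_bits / 8
    return clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


system_ram = peak_bandwidth_gb_s(233, 128)  # notebook system RAM from point 3
video_ram = peak_bandwidth_gb_s(350, 128)   # dedicated video memory from point 3

print(f"system RAM: {system_ram:.2f} GB/s")
print(f"video RAM:  {video_ram:.2f} GB/s")
print(f"system RAM reaches {system_ram / video_ram:.0%} of the video RAM figure")
```

Under those assumptions the system RAM tops out at roughly two thirds of the dedicated memory's peak, and that is before the CPU takes its share (point 1).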
-
Meaker@Sager Company Representative
-
This is the exact reason I got 2 gigs of 667MHz RAM in my system. Yeah, it has some negative effects on system performance, but if it didn't help more than it hurt it wouldn't exist. Overall it still helps gaming performance to have a Turbocache/Hypermemory card than not.
-
Nice post. You might want to add that the purpose of these technologies is not to offer "good" performance. It is a cost-saving measure more than anything else. It allows them to sacrifice onboard memory (which is expensive), while still having a working card.
It's not intended as a performance-enhancing technology in the first place, for the reasons you mention.
So it might not even be fair to call the technology "poor". It does what it's designed to do very well. But it doesn't, and won't ever, offer good performance. -
Then again, you'd still probably get bad performance without it. I wouldn't really recommend it to anyone.
-
OMG, seriously, don't you people think? What do you think would give you better performance: having 64MB of VRAM and that's it, or having 64MB of dedicated VRAM and another 192MB taken from your system RAM to help your GPU? Seriously, TC/HM aren't perfect, but they still HELP video performance, and they help more than they hurt, so why not? No, they don't make super powerful cards, but they make low-mid range cards a little better.
-
Also...this has been beaten to death...but it DOES NOT steal system memory away that can be used. If your computer has 2GB of ram and you're using 1.5GB right now...that means up to 512MB can be allocated to Hypermemory. Let's say your system will allocate 128MB to Hypermemory or turbocache. Now, in Hypermemory, if you suddenly spike in memory use to 1.95GB, the system will immediately release that 128MB of ram from Hypermemory, back to the system ram as it needs it. It's "dynamic". I assume turbocache works the same way.
So it will NOT lead to any noticeable performance degradation because it's being allocated to video use. It only takes it when it needs it, and only if it's available.
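As a toy illustration of the "dynamic" behaviour described above, something like this captures the idea. It is only a sketch of the policy as I understand it, not how the actual driver decides; the function name and the MB figures for "1.5GB" and "1.95GB" are my own assumptions.

```python
def shared_video_memory_mb(total_mb, system_in_use_mb, requested_mb):
    """Toy model: the GPU is only granted system RAM that is currently free.

    This is a simplification for illustration, not the real
    Hypermemory/Turbocache allocation policy.
    """
    free_mb = max(total_mb - system_in_use_mb, 0)
    return min(requested_mb, free_mb)


TOTAL_MB = 2048  # 2GB system

normal = shared_video_memory_mb(TOTAL_MB, 1536, 128)  # about 1.5GB in use
spike = shared_video_memory_mb(TOTAL_MB, 1997, 128)   # about 1.95GB in use

print(f"normal load: {normal}MB lent to the GPU")
print(f"memory spike: {spike}MB lent to the GPU")
```

In this model the grant shrinks as soon as the system needs the memory back, which is the point being made: the memory is borrowed, not permanently taken.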
Frankly...I think it helps more than many give it credit for. Yes, it's not the same as having physical video memory, but it does provide a noticeable performance bump and also allows systems to utilize it for more texture memory while lowering costs for various applications. -
Dustin Sklavos Notebook Deity NBR Reviewer
It's also worth noting that nVidia's TurboCache implementation is remarkably better and more efficient than ATI's HyperMemory. TC is done in hardware, while HM is usually done in software. If you need proof, look no further than the performance of a 64MB dedicated Go 7400.
Ultimately, these technologies are a good cheap idea to get more dedicated parts into more notebooks, which is ALWAYS better than getting an IGP. -
here's an example. if you have 64MB onboard memory, you are better off disabling TC/HM because constantly kicking something out of onboard memory and putting something new in (which is what happens when you have too little onboard memory) is still faster than constantly accessing system RAM for buttloads of data across a much slower interface (which is what happens when you use system RAM as video memory). yes, PCIe is super fast compared to AGP, but it is still several hundred times slower than access to onboard memory.
unless you have an application that is solely dependent on how much memory you have and cares nothing for the speed at which it runs, hypermemory and equivalents are of no value. -
Meaker@Sager Company Representative
A 64MB card and a 64MB card with 192MB of Turbocache would perform EXACTLY THE SAME, and the 64MB card will perform almost identically to the 64MB cards of old. It helps so little.
It's pure BS advertising. In the end it's all down to how much dedicated memory you have; if a game needs more than that then you're really going to start seeing a drop in performance.
-
As for "proof", well, you could look up some of the benchmarks and tests of the technology.
Without HM/TC, the CPU would run slightly better, and the GPU would run worse. But since the GPU (with these low low-end cards) would be a bigger bottleneck, you'd get overall lower performance.
It can help average out performance a bit, in cases where you have a lousy GPU, and a decent CPU/RAM, by taking a bit of RAM, a bit of RAM bandwidth, and using it to assist the GPU.
You just have to keep in mind, it's a solution for the cheapest, slowest GPUs only. For higher-end cards, the penalties might well outweigh the benefits.
Quite simply, a card with 32MB onboard memory would be useless. It just wouldn't be able to run games. TC/HM allows it to run games, but it doesn't allow you to do so with good performance. But it's still an improvement. -
I would like to add that Windows Vista's Aero Glass isn't supported on cards with only 64 MB of memory (128 MB or more required). Turbocache and Hypermemory are useful in that they enable Aero Glass support on many low end cards with little dedicated memory.
-
Meaker@Sager Company Representative
It is, but this technology has been around for ages; it's nothing new, and TC and HM are just tweaks to it. Leaving it dynamic is best, and I wish advertisements were banned from incorporating it in the memory figure, as it's not shipping with their product.
-
Meaker, any proof of performance hits if the system RAM is used on current high-end systems, i.e. Duo laptops running 566mhz? Obviously it's still not the best option, but most mid range dedicated video cards don't even run their memory at that.
Also, when I think about it, most memory transfer is always to the video card. The CPU computes some math, sound gets a small stream, the hard drive should only need to send the data for a certain area once, and the rest is constant texture movement between video and RAM.
I think people downplay it too much. -
The reason why TC and HM are again surfacing after the 'original' AGP implementation sunk is because PCI-e has as much BW to the processor as from.
Also: I was under the impression that the dynamic character of the memory mapping was about it not taking memory in the first place if the system needed it. I don't think it will just release memory if the system asks for it; that will probably depend on priority mapping between GPU/CPU. -
Meaker@Sager Company Representative
-
I think mine helps! It's running in dual channel, 1066MHz is the MAX system bottleneck atm on my m1210
-
I've never liked Hypermemory and Turbocache. I am crazy for optimal system performance and these technologies just hurt it. I notice systems are much less peppy with this enabled and I find it unacceptable.
Another thing, I use a lot of system intensive software like Maya and 3DStudio Max. When rendering, these stress the RAM and CPU to no end. I can't have that much of my RAM going into video performance when I NEED it going into the system so I can get something rendered quicker. Friends and I have run tests on this theory. It takes much longer to complete a render with HM/TC on than it does with it off.
In gaming performance the boost isn't as great as one would think. You're still working with a lower end card (usually) and will get performance that still matches that card. Sometimes less, if they put an extra small amount of memory on the card.
On higher end cards this REALLY hurts. When enabling Hypermemory on my x700 I have 768MB of total video RAM. Now, an x700 can't even use the 256MB that I have on it (in most cases). I lose about 250 points off my 3DMark scores with it enabled. Oblivion plays extremely choppy with it on. Overall FPS in most newer games is way down. Not to mention Windows doesn't have the "pep" it usually does with 1.5GB RAM. That's not acceptable. -
Meaker, hmm yeah, I never did take into account that video memory speeds would be quoted before DDR's x2 multiplier. Everyone quotes system memory after the fact.
I'd do some kind of extensive testing but my x1300 can't really power the games that need lots of memory, Hypermemory or not. I do know that I'm running Guild Wars and WoW quite well, almost as well as on my 9600xt. -
Meaker@Sager Company Representative
The x1300 core has a few improvements over the 9600 core; it's two generations ahead, after all. They don't amount to much though.
-
Excellent post. -
usapatriot Notebook Nobel Laureate
What I don't get is that if you have 2 gigs of 667MHz RAM, why can't your GPU with Hypermemory use some of it running @ 667MHz?
-
Meaker@Sager Company Representative
(Yes I am using the search feature)
The bandwidth is finite, i.e. in games your system is already using all of the data that can be read from and written to the memory. While not all of the memory might be in use, if the bandwidth-intensive GPU wants to use some of it, the rest of the system has to wait for the GPU to finish before it can get back to what it was doing. -
ltcommander_data Notebook Deity
I think it should be pointed out that due to the FSB frequency, only 1 single channel of memory bandwidth is theoretically needed, such as a 667MHz FSB only using 1 DDR2 667 channel. Of course, nothing ever reaches theoretical, but with dual channel DDR2 667 on a 667MHz FSB or dual channel DDR2 533 on a 533MHz FSB there is certainly bandwidth left over because the FSB is the bottleneck. How effectively that extra bandwidth can be used by the GPU will depend on how smart the memory controller is at fitting GPU memory requests into the empty gaps between CPU memory requests. The 965 series chipsets include Fast Memory Access, which specifically improves this load balancing, but it won't be available for mobile until Santa Rosa in Q2 2007.
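A quick sketch of the theoretical numbers behind that argument, assuming a 64-bit FSB and 64-bit DDR2 channels with the quoted figures treated as effective transfer rates. The function names are made up for illustration, and real-world throughput will be lower than these peaks.

```python
def bus_bandwidth_gb_s(mega_transfers_per_s, width_bits=64):
    """Peak bandwidth in GB/s for a bus of the given width (1 GB = 1e9 bytes)."""
    return mega_transfers_per_s * 1e6 * (width_bits / 8) / 1e9


def theoretical_headroom_gb_s(rate):
    """Dual channel memory bandwidth minus what the FSB can consume at most."""
    fsb = bus_bandwidth_gb_s(rate)               # assumed 64-bit front-side bus
    dual_channel = 2 * bus_bandwidth_gb_s(rate)  # two assumed 64-bit DDR2 channels
    return dual_channel - fsb


for rate in (667, 533):
    print(f"{rate}MHz FSB: {bus_bandwidth_gb_s(rate):.2f} GB/s, "
          f"dual channel DDR2 {rate}: {2 * bus_bandwidth_gb_s(rate):.2f} GB/s, "
          f"headroom: {theoretical_headroom_gb_s(rate):.2f} GB/s")
```

On paper the second channel is entirely spare, since the CPU cannot pull more than the FSB allows; how much of it the GPU can actually use is the memory controller question raised above.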
Now it's been said that TC is better than HM since it's in hardware, but I wonder how Intel's shared memory architecture performs in comparison? Intel's GPUs are designed from the ground up to use shared memory since they will never see dedicated memory (since the 815G anyways) so you'd think if anyone has the motivation or has had the time to figure out a better implementation it would be Intel. In any case, Intel's IGPs do have the advantage over dedicated GPUs using TC or HM since the IGP is located right on the northbridge for low latency and high potential bandwidth (in comparison to worrying about PCIe overhead and bandwidth competition). Also brings up the question of how well Intel's IGPs share memory in comparison to nVidia and ATI IGPs. -
Meaker@Sager Company Representative
In a word, no. I disagree with everything you just said.
The only platform getting close to what you're saying is the A64, but the latencies involved are still huge for the GFX chip. All Core and Core 2 CPUs need all the bandwidth they can get. -
There's a big difference between bandwidth and latency. As said above, with a decent dual channel DDR2 setup, you certainly have bandwidth to spare (although it's not much compared to how much GPUs typically use).
This is easily illustrated by testing the performance difference between single and dual channel DDR2. In most cases, it doesn't really make a difference, just like 667MHz vs 533MHz doesn't really do much.
But the latency is horrible when accessing system memory, even on the a64 (Onboard memory controller only really benefits the CPU. As far as the GPU is concerned, it's still located far far away) -
Meaker@Sager Company Representative
EDIT: x1900gt mem bandwidth = 38.4GB/sec, system mem bandwidth (on a good day for my system, laptops are less) = 7GB/sec, x1600xt = 22.2GB/sec, and even the 7300GS has 6.4GB/sec. -
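For scale, the figures in the EDIT above can be put side by side like this. The numbers are the ones quoted in the post; the helper name is made up, and the sketch optimistically pretends the GPU gets all of the system memory bandwidth to itself.

```python
# Local memory bandwidth per card in GB/s, as quoted in the post.
card_bandwidth_gb_s = {
    "x1900gt": 38.4,
    "x1600xt": 22.2,
    "7300GS": 6.4,
}

# "On a good day" system memory figure from the post; laptops are lower.
SYSTEM_RAM_GB_S = 7.0


def system_share(card_gb_s, system_gb_s=SYSTEM_RAM_GB_S):
    """Fraction of a card's local bandwidth that system RAM could match at best."""
    return system_gb_s / card_gb_s


for card, bandwidth in card_bandwidth_gb_s.items():
    print(f"{card}: system RAM is {system_share(bandwidth):.0%} "
          f"of its local memory bandwidth")
```

Only the 7300GS comes out at parity, and that is with nothing left over for the CPU.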
ltcommander_data Notebook Deity
http://www.anandtech.com/memory/showdoc.aspx?i=2800&p=7
The reasoning seems to be that with a good cache subsystem the times that you need to randomly hit RAM through the FSB are decreased making the memory subsystem have less determination on overall performance. Most memory access is then prefetching which can be organized and optimized around bandwidth and latency concerns. Having to hit the memory subsystem less should also be to the GPU's advantage since there are more empty request slots that it can use. -
Meaker@Sager Company Representative
Yeah, you're seeing 10-20% performance increases from the memory alone.
Let's examine this more closely. You have a graphics chip that, to perform up to standard (we are talking 7300 level here), needs around 6-7GB/sec of bandwidth. Now if we are using a purely TC system then that would swamp the bus; management in this area is never going to be very efficient, as the right priority between the CPU and GPU changes from scenario to scenario.
Now most laptops have a 533MHz FSB, which will land 5-6GB/sec (just a guess), and say we have a half/half Turbocache system where, due to clever management, 70% of accesses occur in the local gfx RAM (the most used stuff). 30% of the needed bandwidth goes to the system RAM; that's still around 2GB/sec lopped off. You have just taken an effective 1/3rd of the RAM allowance, so we might as well be running a 355MHz FSB.
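The arithmetic in that paragraph can be sketched as below. All inputs are the rough guesses from the post (6GB/sec for the bus, 6.5GB/sec for a 7300-level chip, 30% of accesses going to system RAM), and the function name is made up for illustration.

```python
def effective_fsb_mhz(fsb_mhz, fsb_gb_s, gpu_need_gb_s, system_ram_fraction):
    """Return (GPU traffic to system RAM in GB/s, equivalent FSB speed left in MHz).

    Simple proportional model: whatever bandwidth the GPU pulls from system
    RAM is subtracted from what the rest of the system can use.
    """
    gpu_traffic = gpu_need_gb_s * system_ram_fraction
    remaining = max(fsb_gb_s - gpu_traffic, 0.0)
    return gpu_traffic, fsb_mhz * remaining / fsb_gb_s


# 7300-level chip: midpoint of the 6-7GB/sec guess, 30% served from system RAM.
traffic, equivalent = effective_fsb_mhz(533, 6.0, 6.5, 0.30)
print(f"GPU traffic over the bus: {traffic:.2f} GB/s")
print(f"CPU is left with the equivalent of a {equivalent:.0f}MHz FSB")
```

With these inputs it lands at about 360MHz, in line with the roughly 355MHz figure above. Plugging in a hungrier chip makes the remainder shrink quickly.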
Now if those graphs were continued downwards they would rapidly tank towards 0. Not only that, but we have also made the gfx pipeline wait MANY times longer for that information to arrive (imagine lag in an online game), so even though it arrives in a steady stream the information is quite old, and the rest of the scene has to wait for it before it can be output. So we get a drop off in performance from two sides.
Now if we take one step up to, say, the x1600, which normally likes (for the laptop version) around 14-15GB/sec of bandwidth, take a third of that and that's all your FSB gone. -
Well, I don't use Turbocache on my card obviously. But in Vista running DirectX 9.0L, a version of Turbocache or Hypermemory called "Dynamic system memory usage" or something like that is initiated and used by Vista on the graphics card. If I look in the video card settings, Vista says that I have 512MB of video memory available, but the 1500M is only a 256MB card. And according to all current reports, gaming takes about a 10% hit in Vista as compared with XP. Now, I don't know if this is due to the dynamic memory usage that Vista employs or simply due to the new DirectX features. Just an observation. It would be interesting to see performance of low end turbocache/hypermemory cards versus high-end cards in Vista compared to XP.
Hypermemory, turbocache, why its still so poor.
Discussion in 'Gaming (Software and Graphics Cards)' started by Meaker@Sager, Aug 8, 2006.