The Notebook Review forums were hosted by TechTarget, who shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information that had been posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.
Problems? See this thread at archive.org.

    Hypermemory, turbocache, why its still so poor.

    Discussion in 'Gaming (Software and Graphics Cards)' started by Meaker@Sager, Aug 8, 2006.

  1. Meaker@Sager

    Meaker@Sager Company Representative

    Reputations:
    9,441
    Messages:
    58,200
    Likes Received:
    17,916
    Trophy Points:
    931
    OK guys, with the recent posts on these technologies, I thought I would make a post about why the performance will still be poor.

    Sure, since I last made a post like this, memory speeds have come along (not a huge amount for notebooks), yet the same things apply.

    HyperMemory and TurboCache use the PCI Express bus to connect to the RAM (PCI Express has a more direct connection than AGP anyway), and because each PCI Express link is separate, this takes nothing away from the PCI bandwidth. HOWEVER, that is where the advantages end, and here are the same pitfalls:

    1. The RAM only has so much bandwidth. The CPU and the rest of the system need bandwidth too, and if you have a gfx card stealing most of it, the rest of the system will really suffer (as will your fps).

    2. This still takes away RAM, so if you're running 512MB it's going to hurt performance. Even with a gig, if the chip is taking 192MB you are losing about 1/5th of your memory.

    3. The RAM in the system is at best 233MHz 128-bit (in Centrino notebooks). Compare that to 350MHz 128-bit video memory and you can see that even if you did have full access to it, it's still going to be slow.
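
    For anyone who wants to put rough numbers on points 2 and 3, here is a minimal sketch of the arithmetic. It assumes DDR signalling (two transfers per clock), which the post does not state, and the function names are made up for illustration. These are theoretical peaks, not measured figures.

```python
def peak_bandwidth_gb_s(clock_mhz, bus_bits, transfers_per_clock=2):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock=2 assumes DDR signalling; that is an assumption,
    since the post only gives a clock speed and a bus width.
    """
    bytes_per_transfer = bus_bits / 8
    return clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


def share_taken(shared_mb, total_mb):
    """Fraction of system RAM handed over to the graphics chip."""
    return shared_mb / total_mb


system_ram = peak_bandwidth_gb_s(233, 128)  # shared system RAM from point 3
video_ram = peak_bandwidth_gb_s(350, 128)   # the faster memory it is compared to

print(f"system RAM peak: {system_ram:.2f} GB/s")
print(f"video RAM peak:  {video_ram:.2f} GB/s")
print(f"192MB out of 1GB:   {share_taken(192, 1024):.1%}")
print(f"192MB out of 512MB: {share_taken(192, 512):.1%}")
```

    Even on these best-case numbers the shared memory has roughly two thirds of the peak bandwidth, and that is before the CPU takes its share.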
     
  2. sheff159

    sheff159 Notebook Deity

    Reputations:
    77
    Messages:
    880
    Likes Received:
    0
    Trophy Points:
    30
    This is the exact reason I got 2 gigs of 667MHz RAM in my system. Yeah, it has some negative effects on system performance, but if it didn't help more than it hurt, it wouldn't exist. Overall it still helps gaming performance to have a TurboCache/HyperMemory card rather than not.
     
  3. Jalf

    Jalf Comrade Santa

    Reputations:
    2,883
    Messages:
    3,468
    Likes Received:
    0
    Trophy Points:
    105
    Nice post. You might want to add that the purpose of these technologies is not to offer "good" performance. It is a cost-saving measure more than anything else. It allows them to sacrifice onboard memory (which is expensive), while still having a working card.
    It's not intended as a performance-enhancing technology in the first place, for the reasons you mention.

    So it might not even be fair to call the technology "poor". It does what it's designed to do very well. But it doesn't, and won't ever, offer good performance. :)
     
  4. dagamer34

    dagamer34 Notebook Evangelist NBR Reviewer

    Reputations:
    41
    Messages:
    642
    Likes Received:
    0
    Trophy Points:
    30
    Then again, you'd still probably get bad performance without it. I wouldn't really recommend it to anyone.
     
  5. sheff159

    sheff159 Notebook Deity

    OMG, seriously, don't you people think? What do you think would give you better performance: having 64MB of VRAM and that's it, or having 64MB of dedicated VRAM and another 192MB taken from your system RAM to help your GPU? Seriously, TC/HM aren't perfect, but they still HELP video performance, and they help more than they hurt, so why not? No, they don't make super powerful cards, but they make low-to-mid-range cards a little better.
     
  6. tullnd

    tullnd Notebook Evangelist

    Reputations:
    83
    Messages:
    446
    Likes Received:
    0
    Trophy Points:
    30
    Also...this has been beaten to death...but it DOES NOT steal system memory away that can be used. If your computer has 2GB of ram and you're using 1.5GB right now...that means up to 512MB can be allocated to Hypermemory. Let's say your system will allocate 128MB to Hypermemory or turbocache. Now, in Hypermemory, if you suddenly spike in memory use to 1.95GB, the system will immediately release that 128MB of ram from Hypermemory, back to the system ram as it needs it. It's "dynamic". I assume turbocache works the same way.

    So it will NOT lead to any noticeable performance degradation because it's being allocated to video use. It only takes it when it needs it, and only if it's available.

    Frankly...I think it helps more than many give it credit for. Yes, it's not the same as having physical video memory, but it does provide a noticeable performance bump and also allows systems to utilize it for more texture memory while lowering costs for various applications.
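
    The "dynamic" behaviour described in this post can be sketched as a toy model. This is only an illustration of the idea, not how any real driver works: the class, its method names, and the 512MB cap are all hypothetical, and the sketch hands back only as much as the system is short, rather than the whole block.

```python
class SharedMemoryPool:
    """Toy model of lending free system RAM to the GPU on demand.

    Hypothetical illustration only; real drivers are more involved.
    """

    def __init__(self, total_mb, gpu_cap_mb):
        self.total_mb = total_mb
        self.gpu_cap_mb = gpu_cap_mb  # most the GPU may ever borrow
        self.system_used_mb = 0
        self.gpu_borrowed_mb = 0

    def free_mb(self):
        return self.total_mb - self.system_used_mb - self.gpu_borrowed_mb

    def gpu_request(self, mb):
        """Lend the GPU up to `mb`, limited by the cap and by free RAM."""
        room_under_cap = self.gpu_cap_mb - self.gpu_borrowed_mb
        grant = max(0, min(mb, room_under_cap, self.free_mb()))
        self.gpu_borrowed_mb += grant
        return grant

    def system_set_usage(self, mb):
        """System usage changes; take RAM back from the GPU if it no longer fits."""
        self.system_used_mb = min(mb, self.total_mb)
        shortfall = self.system_used_mb + self.gpu_borrowed_mb - self.total_mb
        if shortfall > 0:
            self.gpu_borrowed_mb -= min(shortfall, self.gpu_borrowed_mb)


pool = SharedMemoryPool(total_mb=2048, gpu_cap_mb=512)
pool.system_set_usage(1536)          # 1.5GB in use by the system
print(pool.gpu_request(128))         # GPU borrows 128MB of the free 512MB
pool.system_set_usage(1997)          # spike to roughly 1.95GB
print(pool.gpu_borrowed_mb)          # GPU is left with whatever still fits
```

    Whether real hardware releases memory this readily is exactly what gets questioned further down the thread, so treat the reclaim step as an assumption.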
     
  7. Dustin Sklavos

    Dustin Sklavos Notebook Deity NBR Reviewer

    Reputations:
    1,892
    Messages:
    1,595
    Likes Received:
    3
    Trophy Points:
    56
    Hear, hear. TurboCache is just as dynamic as HyperMemory is, from my understanding.

    It's also worth noting that nVidia's TurboCache implementation is remarkably better and more efficient than ATI's HyperMemory. TC is done in hardware, while HM is usually done in software. If you need proof, look no further than the performance of a 64MB dedicated Go 7400.

    Ultimately, these technologies are a good cheap idea to get more dedicated parts into more notebooks, which is ALWAYS better than getting an IGP.
     
  8. WeelyTM

    WeelyTM Notebook Consultant

    Reputations:
    65
    Messages:
    118
    Likes Received:
    0
    Trophy Points:
    30
    There is no proof that TC/HM help at all... as some have said in an infinite number of posts before this thread, it's a marketing tool, nothing more.

    Here's an example. If you have 64MB onboard memory, you are better off disabling TC/HM, because constantly kicking something out of onboard memory and putting something new in (which is what happens when you have too little onboard memory) is still faster than constantly accessing system RAM for buttloads of data across a much slower interface (which is what happens when you use system RAM as video memory). Yes, PCIe is super fast compared to AGP, but it is still several hundred times slower than access to onboard memory.

    Unless you have an application that is solely dependent on how much memory you have and cares nothing for the speed at which it runs, HyperMemory and equivalents are of no value.
     
  9. Meaker@Sager

    Meaker@Sager Company Representative

    But the advertising is FALSE, simply because ANY card out there, when it runs out of memory, will use HM or TC up to the amount it needs. Some laptops set a fixed amount, and that's always going to be worse than the dynamic system.

    A 64MB card and a 64MB card with 192MB of TurboCache would perform EXACTLY THE SAME, and the 64MB card will perform almost identically to the 64MB cards of old. It helps so little.

    It's pure BS advertising. In the end it's all down to how much dedicated memory you have; if a game needs more than that, then you're really going to start seeing a drop in performance.

    INCORRECT on some notebooks: they set a MINIMUM amount which you can't change, so it's always taking at least that amount. Some games run better with 1GB and can't afford to give any RAM or bandwidth away from the CPU, and when TurboCache takes what it needs away from that, you're starving your CPU. Therefore, on games like Doom 3 and Quake 4, you're going to get a MASSIVE drop in performance, to what most people would consider unacceptable.
     
  10. Jalf

    Jalf Comrade Santa

    Not true. It won't constantly shuffle data between the two. It works more like a swap file. When it runs out of memory, the least used data is pushed to system memory, on the basis that it probably won't be used again as soon. It also tries to prefetch data before it's needed, to hide this latency.
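
    The swap-file comparison above can be sketched as a small two-tier store. This is a toy model under stated assumptions: it uses least-recently-used eviction as a stand-in for "least used", it ignores prefetching entirely, and the class and method names are invented for the example.

```python
from collections import OrderedDict


class VideoMemory:
    """Toy two-tier store: fast local VRAM backed by slower system RAM."""

    def __init__(self, local_slots):
        self.local_slots = local_slots
        self.local = OrderedDict()  # least recently used entry comes first
        self.system = {}            # overflow pushed out to system RAM

    def access(self, name):
        """Touch a texture; return 'hit' if it was already in local memory."""
        if name in self.local:
            self.local.move_to_end(name)  # mark as most recently used
            return "hit"
        # Miss: bring it in (from system RAM if it was spilled there).
        self.system.pop(name, None)
        self.local[name] = True
        if len(self.local) > self.local_slots:
            # Push the least recently used entry out to system RAM.
            evicted, _ = self.local.popitem(last=False)
            self.system[evicted] = True
        return "miss"


vm = VideoMemory(local_slots=2)
for texture in ["a", "b", "a", "c"]:
    print(texture, vm.access(texture))
print("local:", list(vm.local), "spilled:", list(vm.system))
```

    As long as the frequently used data fits locally, most accesses are hits and the slow path is only paid on the occasional miss, which is the point being made about it not shuffling constantly.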

    As for "proof", well, you could look up some of the benchmarks and tests of the technology.

    Correct, apart from one thing. In these cases, your GPU will be the main bottleneck. If we're down at the bottom of the range, where HM/TC is actually used, then that's a much more important issue than whether you have a full 1GB RAM free, or how much bandwidth is available to the CPU.
    Without HM/TC, the CPU would run slightly better, and the GPU would run worse. But since the GPU (with these low low-end cards) would be a bigger bottleneck, you'd get overall lower performance.

    It can help average out performance a bit, in cases where you have a lousy GPU, and a decent CPU/RAM, by taking a bit of RAM, a bit of RAM bandwidth, and using it to assist the GPU.

    You just have to keep in mind, it's a solution for the cheapest, slowest GPUs only. For higher-end cards, the penalties might well outweigh the benefits.

    Quite simply, a card with 32MB onboard memory would be useless. It just wouldn't be able to run games. TC/HM allows it to run games, but it doesn't allow you to do so with good performance. But it's still an improvement.
     
  11. KrispyKreme50

    KrispyKreme50 Notebook Evangelist

    Reputations:
    41
    Messages:
    678
    Likes Received:
    0
    Trophy Points:
    30
    I would like to add that Windows Vista's Aero Glass isn't supported on cards with only 64 MB of memory (128 MB or more required). Turbocache and Hypermemory are useful in that they enable Aero Glass support on many low end cards with little dedicated memory.
     
  12. Meaker@Sager

    Meaker@Sager Company Representative

    It is, but this technology has been around for ages; it's nothing new, and TC and HM are just tweaks to it. Leaving it dynamic is best, and I wish advertisements were banned from incorporating it, as it's not shipping with their product.
     
  13. jeffmd

    jeffmd Notebook Evangelist

    Reputations:
    65
    Messages:
    554
    Likes Received:
    20
    Trophy Points:
    31
    Meaker, any proof of performance hits if the system RAM is used on current-end systems, i.e. Duo laptops running 566MHz? Obviously it's still not the best option, but most mid-range dedicated video cards don't even run their memory at that.

    Also, when I think about it, most memory transfer is always to the video card. The CPU computes some math, sound gets a small stream, the hard drive should only need to send the data of a certain area once, and the rest is constant texture movement between video and RAM.

    I think people downplay it too much.
     
  14. Ice-Tea

    Ice-Tea MXM Guru NBR Reviewer

    Reputations:
    476
    Messages:
    1,260
    Likes Received:
    0
    Trophy Points:
    55
    The reason why TC and HM are surfacing again after the 'original' AGP implementation sank is that PCI-e has as much bandwidth to the processor as from it.

    Also: I was under the impression that the dynamic character of the memory mapping meant that it would not take memory unless it needed it. I don't think it will just release memory if the system asks for it; that will probably depend on priority mapping between GPU and CPU.
     
  15. Meaker@Sager

    Meaker@Sager Company Representative

    All the mid-range cards I have listed have at LEAST 300MHz (600 DDR) RAM, so I don't know what you're going on about. Remember, the CPU (usually, in a game like Doom 3) and all the other devices use ALL of the available bandwidth, so ANY you take will hurt performance.
     
  16. Emotion

    Emotion Notebook Guru

    Reputations:
    9
    Messages:
    72
    Likes Received:
    0
    Trophy Points:
    15
    I think mine helps :D! It's running in dual channel; 1066MHz is the MAX system bottleneck atm on my M1210.
     
  17. TwilightVampire

    TwilightVampire Notebook Deity

    Reputations:
    362
    Messages:
    1,376
    Likes Received:
    0
    Trophy Points:
    55
    I've never liked HyperMemory and TurboCache. I am crazy for optimal system performance and these technologies just hurt it. I notice systems are much less peppy with this enabled and I find it unacceptable.

    Another thing: I use a lot of system-intensive software like Maya and 3D Studio Max. When rendering, these stress the RAM and CPU to no end. I can't have that much of my RAM going into video performance when I NEED it going into the system so I can get something rendered quicker. Friends and I have run tests on this theory. It takes much longer to complete a render with HM/TC on than it does with it off.

    In gaming performance the boost isn't as great as one would think. You're still working with a lower-end card (usually) and will get performance that still matches that card. Sometimes less, if they put an extra small amount of memory on the card.

    On higher-end cards this REALLY hurts. When enabling HyperMemory on my X700 I have 768MB of total video RAM. Now, an X700 can't even use the 256MB that I have on it (in most cases). I lose about 250 points off my 3DMark scores with it enabled. Oblivion plays extremely choppy with it on. Overall FPS in most newer games is way down. Not to mention Windows doesn't have the "pep" it usually does with 1.5GB RAM. That's not acceptable.
     
  18. jeffmd

    jeffmd Notebook Evangelist

    Meaker, hmm, yeah, I never did take into account that video memory clocks would be quoted before DDR's x2 multiplier. Everyone quotes system memory after the fact.

    I'd do some kind of extensive testing, but my X1300 can't really power the games that need lots of memory, HyperMemory or not. I do know that I'm running Guild Wars and WoW quite well, almost as well as on my 9600 XT.
     
  19. Meaker@Sager

    Meaker@Sager Company Representative

    The X1300 core has a few improvements over the 9600 core; it's two generations ahead, after all. They don't amount to much, though.
     
  20. HavoK

    HavoK Registered User

    Reputations:
    706
    Messages:
    1,719
    Likes Received:
    0
    Trophy Points:
    55

    Excellent post. ;)
     
  21. usapatriot

    usapatriot Notebook Nobel Laureate

    Reputations:
    3,266
    Messages:
    7,360
    Likes Received:
    14
    Trophy Points:
    206
    What I don't get is that if you have 2 gigs of 667MHz RAM, why can't your GPU with HyperMemory use some of it running @ 667MHz?
     
  22. Meaker@Sager

    Meaker@Sager Company Representative

    (Yes I am using the search feature)

    The bandwidth is finite, i.e. your system in games is already using all of the data that can be read from and written to the memory. While not all the memory might be in use, if the bandwidth-intensive GPU wants to use some memory, the rest of the system has to wait for it to finish before it can carry on with what it was doing.
     
  23. ltcommander_data

    ltcommander_data Notebook Deity

    Reputations:
    408
    Messages:
    1,398
    Likes Received:
    0
    Trophy Points:
    55
    I think it should be pointed out that, due to the FSB frequency, only a single channel of memory bandwidth is theoretically needed, such as a 667MHz FSB only using one DDR2 667 channel. Of course, nothing ever reaches theoretical, but with dual channel DDR2 667 on a 667MHz FSB or dual channel DDR2 533 on a 533MHz FSB there is certainly bandwidth left over, because the FSB is the bottleneck. How effectively that extra bandwidth can be used by the GPU will depend on how smart the memory controller is at fitting GPU memory requests into the empty gaps between CPU memory requests. The 965 series chipsets include Fast Memory Access, which specifically improves this load balancing, but it won't be available for mobile until Santa Rosa in Q2 2007.
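
    The "one channel is theoretically enough" claim can be checked with a quick calculation. This is a sketch using nominal figures only: it assumes a 64-bit wide FSB and 64-bit memory channels, and treats the quoted MHz figures as effective transfer rates. The function name is invented for the example.

```python
def bus_bandwidth_gb_s(effective_mhz, width_bits=64, channels=1):
    """Theoretical peak in GB/s from effective transfer rate and bus width.

    The 64-bit default width is an assumption of this sketch.
    """
    return effective_mhz * 1e6 * (width_bits / 8) * channels / 1e9


fsb = bus_bandwidth_gb_s(667)                   # 667MHz FSB
single = bus_bandwidth_gb_s(667)                # one DDR2 667 channel
dual = bus_bandwidth_gb_s(667, channels=2)      # dual channel DDR2 667
headroom = dual - fsb                           # what the CPU cannot reach anyway

print(f"FSB:            {fsb:.2f} GB/s")
print(f"single channel: {single:.2f} GB/s")
print(f"dual channel:   {dual:.2f} GB/s")
print(f"left over:      {headroom:.2f} GB/s")
```

    On paper, then, a dual channel setup has about one channel's worth of bandwidth that the CPU cannot pull through the FSB. How much of that a GPU could actually use in practice depends on the memory controller, as the post says.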

    Now it's been said that TC is better than HM since it's in hardware, but I wonder how Intel's shared memory architecture performs in comparison? Intel's GPUs are designed from the ground up to use shared memory since they will never see dedicated memory (since the 815G anyways) so you'd think if anyone has the motivation or has had the time to figure out a better implementation it would be Intel. In any case, Intel's IGPs do have the advantage over dedicated GPUs using TC or HM since the IGP is located right on the northbridge for low latency and high potential bandwidth (in comparison to worrying about PCIe overhead and bandwidth competition). Also brings up the question of how well Intel's IGPs share memory in comparison to nVidia and ATI IGPs.
     
  24. Meaker@Sager

    Meaker@Sager Company Representative

    In a word, no. I disagree with everything you just said :p

    The only platform getting close to what you're saying is the A64, but the latencies involved are still huge for the GFX chip. All Core and Core 2 CPUs need all the bandwidth they can get.
     
  25. Jalf

    Jalf Comrade Santa

    There's a big difference between bandwidth and latency. As said above, with a decent dualchannel DDR2 setup, you certainly have bandwidth to spare (although it's not much compared to how much GPU's typically use).
    This is easily illustrated by testing the performance difference between single- and dualchannel DDR2. In most cases, it doesn't really make a difference, just like 667MHz vs 533 mhz doesn't really do much.

    But the latency is horrible when accessing system memory, even on the a64 (Onboard memory controller only really benefits the CPU. As far as the GPU is concerned, it's still located far far away)
     
  26. Meaker@Sager

    Meaker@Sager Company Representative

    Just look at how far away the onboard mem chips are (physically and electrically) from the gfx core to see how much this is an issue ;)

    EDIT: X1900 GT memory bandwidth = 38.4GB/sec; system memory bandwidth (on a good day for my system, laptops are less) = 7GB/sec; X1600 XT = 22.2GB/sec; even the 7300 GS has 6.4GB/sec.
     
  27. ltcommander_data

    ltcommander_data Notebook Deity

    My understanding is that, given the large 4MB L2 cache, the fact that it is low latency compared to Netburst's L2, the shared architecture saving FSB cache coherency traffic, and the strong prefetching algorithms, simply increasing memory bandwidth without changing the FSB doesn't improve performance much. On a Conroe, for example, with a 1067MHz FSB, going between dual channel DDR2 667 (even DDR2 533) and DDR2 1067, with all the various latency possibilities in between, doesn't yield worthwhile differences at all. This is in comparison to K8, which is far more sensitive to latency differences, which is why DDR2 800 with 4-4-4 timings is recommended to show decent performance differences between S939 and AM2.

    http://www.anandtech.com/memory/showdoc.aspx?i=2800&p=7

    The reasoning seems to be that with a good cache subsystem the times that you need to randomly hit RAM through the FSB are decreased making the memory subsystem have less determination on overall performance. Most memory access is then prefetching which can be organized and optimized around bandwidth and latency concerns. Having to hit the memory subsystem less should also be to the GPU's advantage since there are more empty request slots that it can use.
     
  28. Meaker@Sager

    Meaker@Sager Company Representative

    Yeah, you're seeing 10-20% performance increases from the memory alone.

    Let's examine this more closely. You have a graphics chip that, to perform up to standard (we are talking 7300 level here), needs around 6-7GB/sec of bandwidth. Now, if we are using a purely TC system, that would swamp the bus. Management in this area is never going to be very efficient, as the right priority between the CPU and GPU changes from scenario to scenario.

    Now, most laptops have a 533MHz FSB, which will land 5-6GB/sec (just a guess), and say we have a half/half TurboCache system where, due to clever management, 70% of accesses occur in the local gfx RAM (most-used stuff). 30% of the needed bandwidth goes to the system RAM, and that's still around 2GB/sec lopped off. You have just taken an effective 1/3rd of the RAM allowance, so we might as well be running a 355MHz FSB.
    Now, if those graphs were continued downwards they would rapidly tank towards 0. Not only that, but we have also made the gfx pipeline wait MANY times longer for that information to arrive (imagine lag in an online game), so even though it arrives in a steady stream, the information is quite old, and the rest of the scene has to wait for it before it can be output. So we get a drop-off in performance from two sides.
    Now, if we take one step up to, say, the X1600, which normally likes (for the laptop version) around 14-15GB/sec of bandwidth, take a third of that and that's all your FSB gone.
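
    The back-of-envelope numbers in this post can be reproduced with a short sketch. The inputs are the post's own guesses (midpoints of the quoted ranges where a range is given), and the function names are invented for illustration, so treat the outputs as rough estimates only.

```python
def spill_gb_s(gpu_need_gb_s, system_fraction):
    """Bandwidth the GPU pulls from system RAM, given the share of accesses that miss local memory."""
    return gpu_need_gb_s * system_fraction


def effective_fsb_mhz(fsb_mhz, fsb_gb_s, spill):
    """FSB speed that would give the CPU the same bandwidth it has left after the GPU's share."""
    return fsb_mhz * (1 - spill / fsb_gb_s)


# 7300-class chip: needs 6-7 GB/s, 30% of it served from system RAM.
low_end_spill = spill_gb_s(6.5, 0.30)
print(f"7300-class spill: {low_end_spill:.2f} GB/s")

# Using the post's rounded figures: about 2 GB/s taken out of about 6 GB/s.
print(f"effective FSB: {effective_fsb_mhz(533, 6.0, 2.0):.0f} MHz")

# X1600-class chip: needs 14-15 GB/s, a third of it from system RAM.
mid_range_spill = spill_gb_s(14.5, 1 / 3)
print(f"X1600-class spill: {mid_range_spill:.2f} GB/s")
```

    With a third of an X1600's traffic going over the bus, the spill alone is close to the whole 5-6GB/sec guessed for the FSB, which is the "all your FSB gone" case.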
     
  29. Paul

    Paul Mom! Hot Pockets! NBR Reviewer

    Reputations:
    759
    Messages:
    2,637
    Likes Received:
    0
    Trophy Points:
    55
    Well, I don't use Turbocache on my card obviously. But in Vista running DirectX 9.0L, a version of Turbocache or Hypermemory called "Dynamic system memory usage" or something like that is initiated and used by Vista on the graphics card. If I look in the video card settings, Vista says that I have 512MB of video memory available, but the 1500M is only a 256MB card. And according to all current reports, gaming takes about a 10% hit in Vista as compared with XP. Now, I don't know if this is due to the dynamic memory usage that Vista employs or simply due to the new DirectX features. Just an observation. It would be interesting to see performance of low end turbocache/hypermemory cards versus high-end cards in Vista compared to XP.