The Notebook Review forums were hosted by TechTarget, which shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.
Problems? See this thread at archive.org.

    On the subject of Maxwell GPUs' efficiency

    Discussion in 'Gaming (Software and Graphics Cards)' started by n=1, Mar 16, 2015.

  1. n=1

    n=1 YEAH SCIENCE!

    Reputations:
    2,544
    Messages:
    4,346
    Likes Received:
    2,600
    Trophy Points:
    231
    I'm adapting a post I made in another thread, since the topic of Maxwell's efficiency greatly interests me and pisses me off at the same time (you'll see why later).

    As you may know, Maxwell's efficiency comes from its aggressive dynamic micro throttling, which adjusts the GPU core voltage based on load. This micro adjustment happens so rapidly that on a macro (second-long) timescale, we only see the "averaged out" value that's reported in Afterburner. Case in point:

    [​IMG]

    See how the power consumption constantly fluctuates, going from over 250W (!) to less than 100W in the timespan of 1 second? Yeah that's what I mean by micro adjustment. This constant, rapid adjustment of voltage is the source of Maxwell's efficiency, but unfortunately is also its Achilles heel. Let me explain why with the help of Maxwell BIOS Tweaker.
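
    (If it helps to see this in numbers rather than a screenshot, here's a rough Python sketch with completely made-up samples, just to show why a once-per-second poll only ever displays the averaged-out value:)

        import random

        random.seed(0)

        # Hypothetical 10 ms power samples over one second, swinging between ~90W and ~255W
        # (made-up numbers that only mimic the kind of swings in the screenshot above)
        samples_w = [random.uniform(90, 255) for _ in range(100)]

        # What a once-per-second OSD poll would show: simply the average of all those swings
        reported_w = sum(samples_w) / len(samples_w)
        print(f"min {min(samples_w):.0f}W  max {max(samples_w):.0f}W  reported ~{reported_w:.0f}W")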

    I'm going to show a few pictures of the boost and voltage tables of my Gigabyte 970's stock vbios. Let's go step by step and start with the boost table:

    [​IMG]

    This one is fairly straightforward and doesn't really need much explanation. Each value in the table represents a particular boost/clock state. The only thing to note here is that clock states #35 through #74 - highlighted in yellow - belong to the P0 (full load) state of the GPU.
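
    (For anyone who'd rather see it as data than squint at a screenshot, here's a toy Python sketch of how I think of the boost table - every MHz value except the CLK 40/63/65 ones I quote below is a placeholder, not a dump of my actual vbios:)

        # Toy model of the boost table: clock state index -> boost clock in MHz.
        boost_table_mhz = {
            35: 1000,   # placeholder value
            40: 1076,   # real value from my Gigabyte 970 vbios (quoted below)
            63: 1380,   # real value (one card's stock boost)
            65: 1405,   # real value (the other card's stock boost)
            74: 1500,   # placeholder value
        }

        # Clock states #35 through #74 belong to the P0 (full load) state
        P0_STATES = range(35, 75)

        def is_p0(clk: int) -> bool:
            return clk in P0_STATES

        print(is_p0(63), boost_table_mhz[63])   # True 1380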

    Now let's look at some voltage tables:
    [​IMG]
    [​IMG]
    [​IMG]
    [​IMG]
    [​IMG]

    Let me go a bit slower here so everyone can follow. Each entry in this voltage table corresponds exactly to a particular clock state in the boost table. So for example, CLK 40 corresponds to a boost clock of 1076 MHz according to the boost table. Now you'll see that each CLK state has a fairly wide allowed voltage range, with the gap between the lower and upper limit starting at about 106mV (CLK 35) and going all the way up to a ridiculous 219mV (CLK 60).

    For my particular 970s, running each of them solo, one boosts to 1380 (CLK 63), and the other 1405 (CLK 65).

    Now let's look at the corresponding voltage entries for CLK 63 and CLK 65.
    [​IMG]

    You can see both states have a defined upper limit of 1.281V, while the lower limit is 1.075V for CLK 63 (1380 boost), and 1.081V for CLK 65 (1405 boost).
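
    (Quick bit of arithmetic as a Python sketch, just to show how much slack the driver has to play with for those two states - the limits are the ones from the screenshot above:)

        # Allowed voltage window per clock state, in volts (from the vbios screenshot)
        voltage_limits = {
            63: (1.075, 1.281),   # CLK 63 -> 1380 MHz boost
            65: (1.081, 1.281),   # CLK 65 -> 1405 MHz boost
        }

        for clk, (vmin, vmax) in voltage_limits.items():
            width_mv = (vmax - vmin) * 1000
            print(f"CLK {clk}: {vmin:.3f}V to {vmax:.3f}V -> {width_mv:.0f}mV of wiggle room")
        # CLK 63: 206mV of wiggle room; CLK 65: 200mV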

    As you can imagine, trying to push almost 1400MHz on the GPU core with a measly 1.081V is simply going to end in tears. Now what I don't know is how exactly, through what algorithm, the vbios picks the exact voltage to use for each clock state. Actually on second thought, the algorithm is most likely programmed into the driver, and the vbios simply delineates what voltages are "allowed" for what boost clocks. In any case I'm going to wager a guess that the driver adjusts the voltage dynamically based on load, the "micro adjustment" I mentioned in the opening paragraph. I'm also going to presume the voltage range is set so wide for each clock state to maximize efficiency, by giving the core as little power as possible based on what nVidia engineers have somehow determined to be the minimum stable working voltage for each clock state. (pure speculation on my part here)
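
    (To make that speculation concrete, here's roughly what I imagine the driver-side pick looks like - pure guesswork in Python, the pick_voltage() function and its load scaling are entirely made up, not reverse-engineered from anything:)

        # Pure speculation: a made-up stand-in for whatever the driver actually does.
        # The vbios only defines the allowed window; something like this would then
        # pick a point inside that window based on instantaneous load.
        def pick_voltage(vmin: float, vmax: float, load: float) -> float:
            """Scale the voltage within [vmin, vmax] according to load (0.0 to 1.0)."""
            load = max(0.0, min(1.0, load))
            return vmin + (vmax - vmin) * load

        # CLK 65 (1405 MHz boost): 1.081V floor, 1.281V ceiling
        print(pick_voltage(1.081, 1.281, 0.10))   # light load -> ~1.101V
        print(pick_voltage(1.081, 1.281, 0.95))   # heavy load -> ~1.271V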

    The very unfortunate thing here is that, because the voltage range for each clock state is set so wide, sometimes the voltage simply gets stuck at the lower limit of that clock state and doesn't ramp up fast enough to keep up with the GPU core under load, which results in crashing without any forewarning (artifacts, glitches, etc).

    From my own experience, this is especially prone to happen right after a non-demanding cutscene, where the core is basically chilling out, and then is immediately thrown back into action when the cutscene ends. What typically happens - as I've observed from Afterburner's OSD - is that the boost clock shoots right to where it should be due to the suddenly increased load, but the voltage is either stuck at the lower limit of that particular clock state, or worse, stuck in the voltage range of a lower clock state. This is what is referred to as a "boost/voltage table crossover" in the desktop community, and is CRAZY ANNOYING.

    So suffice it to say, the only fix for this dynamic micro throttling garbage is to clamp both the lower and upper voltage limits to the same value for each clock state #35 through #74. This way it's pretty much guaranteed that no matter where the GPU is in the boost table, it will always be delivered a constant voltage (I set mine at 1.25V), ensuring 100% stability regardless of load and whatnot.
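
    (In Maxwell BIOS Tweaker terms the edit boils down to something like this - a Python sketch of the table change only, it doesn't actually read or write a vbios:)

        # Sketch of the fix: force min = max for every P0 clock state so the driver
        # has zero wiggle room left. Placeholder limits stand in for the real table.
        CLAMP_V = 1.25   # the constant voltage I settled on

        voltage_table = {clk: (1.075, 1.281) for clk in range(35, 75)}   # placeholder windows

        clamped = {clk: (CLAMP_V, CLAMP_V) for clk in voltage_table}     # states #35-#74

        print(clamped[63])   # (1.25, 1.25) -> constant 1.25V no matter the boost state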

    But of course in doing so, efficiency goes out the window, and the TDP numbers don't look as nice anymore. Not that I give a damn, but it does make me suspect this is yet another trick nVidia pulled with Maxwell to make it look more impressive from a performance-per-watt standpoint at the expense of 100% stability. And DO NOT get me started on the 970 3.5GB vram + 500MB G-Cache™ issue.

    Hopefully all that made sense to you guys. So what do you think, have I lost my marbles, or is this really what it is?

    Hey look I wrote a book. @D2 Ultima: You're not the only one writing books ha. (although you might still be the only one writing books at 2am XD)
     
    Last edited: Mar 17, 2015
    Vasudev, Robbo99999, E.D.U. and 6 others like this.
  2. D2 Ultima

    D2 Ultima Livestreaming Master

    Reputations:
    4,335
    Messages:
    11,803
    Likes Received:
    9,751
    Trophy Points:
    931
    I LOVE BOOKS
     
    Vasudev and n=1 like this.
  3. n=1

    n=1 YEAH SCIENCE!

    Reputations:
    2,544
    Messages:
    4,346
    Likes Received:
    2,600
    Trophy Points:
    231
    nobody else loves books it seems

    me = sad
     
    Vasudev likes this.
  4. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    I read it :D
     
  5. Zymphad

    Zymphad Zymphad

    Reputations:
    2,321
    Messages:
    4,165
    Likes Received:
    355
    Trophy Points:
    151
    Not surprising. Intel and AMD do this too. They're all chasing numbers on paper while sacrificing actual use. Look at AMD: their hybrid parallel CPU-GPU APUs seem great on paper, but suck in practice. Intel's switchable graphics is great on paper; implementation and actual results? Rubbish. Don't see why it would be any different with NVidia.

    At these levels of performance, I always assume some gimmick is being used for efficiency. I don't think any genuinely real improvements and advances have happened in graphics technology in 10 years; it's just one gimmick after another. Thankfully the performance improvements are real, but the battery-saving-without-sacrificing-performance claims are BS all around.

    You are right. Whether I'm using the dGPU or Switchable, the fluctuation between min, max and avg has been out of control for years now. The mins are really low, the highs really high, and the averages are meh at best because the mins are so low. Demanding games like Crysis 3 suffer: a max of 90? Wow... but lows in the 30s-40s keep my avg framerate under 60 with an overclocked 980M.
     
    Last edited: Mar 17, 2015
    Vasudev likes this.
  6. D2 Ultima

    D2 Ultima Livestreaming Master

    Reputations:
    4,335
    Messages:
    11,803
    Likes Received:
    9,751
    Trophy Points:
    931
    Then again, this also brings to light the whole DX12 trouble. If DX12 can pull everything it can from the CPU, what's going to happen to mobile Maxwell users on small power bricks? The P650SG might as well be running on a 20W brick when that happens.
     
    Vasudev likes this.
  7. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    DX12 decreases CPU usage and increases GPU usage by eliminating the CPU bottleneck. If our power bricks can handle a much more stressful load like FurMark+Prime95 without issue, DX12 apps will be a walk in the park.
     
  8. Zymphad

    Zymphad Zymphad

    Reputations:
    2,321
    Messages:
    4,165
    Likes Received:
    355
    Trophy Points:
    151
    My understanding is that DX12 will allow more instructions to be sent from the CPU to the GPU, breaking down the bottleneck that current DX overhead has compared to the "bare metal" method that the PS3 and X360 used previously. I was not aware that CPU power usage would be increased?

    Anyway, I went with 330W just in case of power issues for the 4790K.
     
  9. Cakefish

    Cakefish ¯\_(?)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    Haha no. I doubt it. Microsoft isn't so stupid as to kill countless laptop PSUs outright. 180W is and will always be fine for a 980M whilst non-overvolted. I'm sure of that. CPU power usage will be reduced to compensate for any GPU power increase anyway.

    Worst case scenario Clevo will have to refund all of us owners for selling us faulty hardware. We'll still be covered under warranty. Or Microsoft will have to foot the bill for providing a dangerous, faulty update.
     
  10. Ethrem

    Ethrem Notebook Prophet

    Reputations:
    1,404
    Messages:
    6,706
    Likes Received:
    4,735
    Trophy Points:
    431
    Clevo wouldn't be on the hook for anything unless they specifically advertised Windows 10 support for the machine. DX12 has already been shown to increase overall power consumption; it's stupid to think that wouldn't apply to mobile as well. I would hope, however, that Microsoft has thought about the woefully underpowered power bricks out there, but when has Microsoft actually had any foresight......
     
    D2 Ultima likes this.
  11. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    ...in previously CPU-bound workloads where GPU usage was low. With the CPU bottleneck removed, GPU usage goes up and naturally so does power consumption.
     
  12. Ethrem

    Ethrem Notebook Prophet

    Reputations:
    1,404
    Messages:
    6,706
    Likes Received:
    4,735
    Trophy Points:
    431
    I don't see how that changes things; an increase is an increase. The power draw on the brick is the same whether it's the CPU or the GPU sucking down the watts.
     
  13. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    It doesn't increase though if you already had 99% GPU usage before. More likely, the reduction of CPU usage in DX12 results in lower system power consumption.

    You're basing your assertion on that Star Swarm graph. What it didn't show was that under DX11, the GPUs were being bottlenecked badly, so usage was much lower. Remove the CPU bottleneck in DX12/Mantle, GPU usage shoots up to 99%, and system power consumption goes up as a result.
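
    (Back-of-the-envelope Python with completely made-up wattages, just to show the shape of the argument:)

        # Hypothetical numbers only -- not measurements from any review.
        # Case 1: a game that was already at 99% GPU usage under DX11
        dx11 = {"cpu": 80, "gpu": 165}   # CPU busy feeding draw calls, GPU already maxed
        dx12 = {"cpu": 45, "gpu": 165}   # same GPU load, less CPU overhead
        print(sum(dx11.values()), "->", sum(dx12.values()))   # 245 -> 210: total goes DOWN

        # Case 2: a Star Swarm style CPU-bound scenario
        dx11 = {"cpu": 80, "gpu": 100}   # GPU starved, sitting at low usage
        dx12 = {"cpu": 60, "gpu": 165}   # bottleneck gone, GPU shoots up to 99%
        print(sum(dx11.values()), "->", sum(dx12.values()))   # 180 -> 225: total goes UP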
     
    Cakefish likes this.
  14. D2 Ultima

    D2 Ultima Livestreaming Master

    Reputations:
    4,335
    Messages:
    11,803
    Likes Received:
    9,751
    Trophy Points:
    931
    No, GPU power usage will be increased, because it gives pure access to the GPU without going through the CPU. This means that the voltage adjustment "efficiency" benefit that Maxwell uses might be killed off. Which means our "100W 980M" cards might become "140W 980M" cards, etc.
     
  15. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    How can the GPU render anything without the CPU sending it draw calls?
     
  16. Zymphad

    Zymphad Zymphad

    Reputations:
    2,321
    Messages:
    4,165
    Likes Received:
    355
    Trophy Points:
    151
    I'm not going to speculate on something like that when I haven't seen anything to suggest that yet.
     
  17. n=1

    n=1 YEAH SCIENCE!

    Reputations:
    2,544
    Messages:
    4,346
    Likes Received:
    2,600
    Trophy Points:
    231
    Future GPUs will become so advanced they gain sentience
     
    TomJGX and D2 Ultima like this.
  18. D2 Ultima

    D2 Ultima Livestreaming Master

    Reputations:
    4,335
    Messages:
    11,803
    Likes Received:
    9,751
    Trophy Points:
    931
    Well the CPU won't need to draw the "thin frames" or whatever it is for the GPU to fill out. Even if it's sending draw calls, it isn't doing the "pre-render", so the full GPU power can run free like Bullseye from Toy Story.
     
  19. Cakefish

    Cakefish ¯\_(?)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    Well Clevo will likely go out of business then. Producing enthusiast products in 2014/2015 that don't support Windows 10 - that's suicide. No one will ever trust them again. They will be doomed.

    Microsoft will also face severe backlash if Windows 10 literally kills PCs. Even desktops won't be safe from increased power draw. Many PSUs could fail and it will be the biggest scandal in the PC industry in years. Windows 10 will go down in history as an even bigger disaster than Windows 8. Their reputation will be irreversibly damaged.

    NVIDIA will face lawsuits if they falsely advertise the TDP of their products. A bigger scandal than the GTX 970 fiasco.

    So... basically it'll spell doom for Clevo, other OEMs that use small PSUs, desktop PCs, Microsoft, NVIDIA and AMD. Considering what's at stake I can't see this coming to fruition.

    I'm sure my 980M will keep on chugging along quite nicely under DX12 - it has to for the sake of PC gaming :)
     
  20. D2 Ultima

    D2 Ultima Livestreaming Master

    Reputations:
    4,335
    Messages:
    11,803
    Likes Received:
    9,751
    Trophy Points:
    931
    They did NOT advertise or make public the TDP of their mobile GPUs. Not even close. Their recommended system PSU size for desktops is always FAR larger than whatever they say the card draws. They are safe.
     
  21. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    The CPU is still doing work. Lower API overhead means more draw calls for the same CPU usage, or the same number of draw calls with less CPU usage. That's the point of the Star Swarm demo: to show how DX12 and Mantle enable tens of thousands of draw calls which would otherwise cripple performance on the same hardware in DX11.
     
  22. Cakefish

    Cakefish ¯\_(?)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    They must have internally communicated a TDP to the OEMs though, behind closed doors. So they may well face a backlash there.

    And the affected laptop OEMs will certainly suffer the consequences regardless. How could consumers trust them ever again if suddenly our PSUs (and potentially whole systems) end up being bricked en masse just because of a simple OS update? I personally would be absolutely furious.

    However, I doubt they would let such a scandal ensue. Microsoft, NVIDIA and AMD must have done extensive testing of the API. Any issues should be resolved before release.

    It'd be dark days indeed for a simple API update to brick laptops.
     
  23. D2 Ultima

    D2 Ultima Livestreaming Master

    Reputations:
    4,335
    Messages:
    11,803
    Likes Received:
    9,751
    Trophy Points:
    931
    No machine currently out is "Windows 10 certified", and even if they release drivers for Windows 10, the machines (and their supplied power bricks) were designed for Windows 7 and 8.1, or just 8.1. Maybe Clevo's NEXT set of machines will have 230W as standard, which, according to @LunaP's research, is going to be the case because of desktop-CPU-only usage, and those bricks are rated for far more watts.
     
    Last edited: Mar 17, 2015
  24. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    Much ado about nothing
     
  25. hfm

    hfm Notebook Prophet

    Reputations:
    2,264
    Messages:
    5,296
    Likes Received:
    3,048
    Trophy Points:
    431
    If you can run Furmark/Kombuster you're probably safe. GPU usage will be much more efficient, but the game still has to use the CPU for countless other tasks.
     
  26. Cakefish

    Cakefish ¯\_(?)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    Clevo's reputation (as well as that of all other involved OEMs) will dive-bomb if they sell brand-new laptops incapable of being upgraded to Windows 10. The resellers will suffer too, thinking about it. I know I personally would be livid. It'd be completely unacceptable. An entire less-than-a-year-old premium enthusiast laptop series being made entirely redundant in under a year? Unheard of in all of history.

    I'm using dramatic language, but it really would be a huge deal in my opinion.
     
  27. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    In one year's time I will read this thread and laugh as I play DX12 games on my overclocked P650SG with 240W brick :D
     
  28. D2 Ultima

    D2 Ultima Livestreaming Master

    Reputations:
    4,335
    Messages:
    11,803
    Likes Received:
    9,751
    Trophy Points:
    931
    in 1 year's time you'll likely have forgotten about this XD
     
  29. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    Exactly, because all this scaremongering and then it turns out to be a non-issue. ;)

    BTW 3DMark14 low overhead API (DX12, Mantle) test inbound. Let's put these power consumption fears to rest once and for all.

     
  30. heibk201

    heibk201 Notebook Deity

    Reputations:
    505
    Messages:
    1,307
    Likes Received:
    341
    Trophy Points:
    101
    So people are worried about overloading the PSU when the API got MORE efficient...? This shouldn't be happening....
     
  31. Cakefish

    Cakefish ¯\_(?)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    Well NVIDIA have most certainly advertised TDPs for their desktop cards, so DX12 can't increase the TDP of GPUs or else they will be sued for false advertising (again).

    You can't judge power draw from Star Swarm - which is the DX12 equivalent of Furmark. It's designed to push the system way beyond reasonable workloads.
     
    Last edited: Mar 18, 2015
  32. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    In what way? There is no proof of this.

    Oh wait, look:

    [​IMG]

    [​IMG]

    [​IMG]

    Crysis 3 and FurMark on the exact same test bench use more power. :rolleyes:
     
  33. Cakefish

    Cakefish ¯\_(?)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    Ah OK then, I stand corrected!

    Still, I really doubt all parties involved would be so foolish so as to intentionally brick countless laptops and desktops. The backlash would be catastrophic. We'll be fine :)
     
  34. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    It's easy to smell FUD from a mile away
     
  35. Cakefish

    Cakefish ¯\_(?)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    I just learnt a new acronym!

    B-b-but DirectX 12 is supposed to be the messiah of PC gaming, not the harbinger of its doom :eek:
     
  36. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    What are you talking about? Spreading FUD again? ;)
     
  37. n=1

    n=1 YEAH SCIENCE!

    Reputations:
    2,544
    Messages:
    4,346
    Likes Received:
    2,600
    Trophy Points:
    231
    FUDGE is delicious though

    Fear
    Uncertainty
    Doubt
    Generalization
    Exaggeration
     
    Last edited: Mar 19, 2015
    hfm, TomJGX and octiceps like this.
  38. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
  39. CptXabaras

    CptXabaras Overclocked, Overvolted, Liquid Cooled

    Reputations:
    1,024
    Messages:
    1,335
    Likes Received:
    236
    Trophy Points:
    81
    That's the reason why I modded my 980s' BIOS and, from clock bin 74 down to 34, set min/max voltage at 1.250V / 1.250V.

    No fluctuation at all.

    Under heavy benchmarks, I've seen almost 675 watts pulled from the power outlet with the config in my sig. Not that different from when I was running crossfired 7970 GHz Edition cards.
     
  40. octiceps

    octiceps Nimrod

    Reputations:
    3,147
    Messages:
    9,944
    Likes Received:
    4,194
    Trophy Points:
    431
    Damn, and the 7970 GHz was a 275W card
     
  41. CptXabaras

    CptXabaras Overclocked, Overvolted, Liquid Cooled

    Reputations:
    1,024
    Messages:
    1,335
    Likes Received:
    236
    Trophy Points:
    81
    Exactly, there is maybe a 50-75W peak difference. Keep in mind that I'm measuring with the integrated watt meter on my UPS.

    Anyway, I don't really care; I went with the 980s without thinking about power efficiency, just performance. Watercooled them and they hardly pass 42 degrees Celsius under full load. (happy panda!! :D)