[Fixed Workaround] Asus G51J(x) CPU throttling investigation | Page 5 | NotebookReview

unclewebb ThrottleStop Author

Reputations:: 7,810

Messages:: 6,414

Likes Received:: 6,726

Trophy Points:: 681

The Clock Modulation register 0x19A is pretty easy to understand. The lower bits in EAX are used while the upper bits are Reserved by Intel. Bit[0] is also reserved and is not used.

Intel® 64 and IA-32 Architectures Software Developer's Manual
Volume 3A: System Programming Guide

14.5.3 Software Controlled Clock Modulation
http://www.intel.com/Assets/PDF/manual/253668.pdf

Bit[4] is the one that controls whether Clock Modulation is enabled. When this bit is set, clock modulation is enabled. Keep that bit at zero and there is no more clock modulation. That's all ThrottleStop does to disable this.

Bits [3:1] select the clock modulation duty cycle. The three bits give 8 encodings, but 000 is reserved, so that leaves 7 usable levels from 12.5% to 87.5% in 12.5% steps. The Intel documentation is very good at explaining this.
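As a rough illustration of the bit layout described above, here is a minimal Python sketch that decodes a raw EAX value from register 0x19A. The function names and the sample value 0x16 are made up for the example; only the bit positions come from the post and the Intel manual.

```python
CLOCK_MOD_ENABLE = 1 << 4   # bit 4: clock modulation enable
DUTY_MASK = 0b111 << 1      # bits 3:1: duty-cycle field (bit 0 is reserved)

def decode_clock_modulation(eax):
    """Return (enabled, duty_percent) for a raw 0x19A EAX value."""
    enabled = bool(eax & CLOCK_MOD_ENABLE)
    field = (eax & DUTY_MASK) >> 1
    # Encoding 000 is reserved, so there is no duty cycle to report for it.
    duty_percent = field * 12.5 if field else None
    return enabled, duty_percent

def disable_clock_modulation(eax):
    """Clear bit 4 and leave every other bit unchanged."""
    return eax & ~CLOCK_MOD_ENABLE

# Hypothetical sample value: bit 4 set, duty field = 011
print(decode_clock_modulation(0x16))
print(hex(disable_clock_modulation(0x16)))
```

Clearing only bit 4 rather than writing zero to the whole register keeps the duty-cycle field as it was, which is harmless once modulation is off.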

You can use my MSR Tool to have a look at how this register changes. Use ThrottleStop to play around with clock modulation and then use MSR Tool to read what is in register 0x19A.

The register beside that one, 0x199, is very simple for Core i7 CPUs. For the i7-720 the default multiplier is 12, so you add one to that, which tells the CPU to use as much turbo boost as is available. Keep this register at 13 for the 720 and 14 for the 820 and there will be no more multiplier throttling.
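The "default multiplier plus one" rule can be written out as a small sketch. The 133.33 MHz base clock is an assumption added here (it is the usual figure for these CPUs, but the post does not state it), and the function names are hypothetical.

```python
BCLK_MHZ = 133.33  # assumed base clock; not stated in the post

# Default multipliers as given in the post (14 - 1 = 13 for the 820)
DEFAULT_MULTIPLIER = {"i7-720": 12, "i7-820": 13}

def turbo_request(default_multiplier):
    """Value to keep in 0x199: one above the default multiplier."""
    return default_multiplier + 1

def frequency_mhz(multiplier):
    """Core frequency for a given multiplier at the assumed base clock."""
    return multiplier * BCLK_MHZ

for model, mult in DEFAULT_MULTIPLIER.items():
    print(model, "write", turbo_request(mult),
          "default ~%.0f MHz" % frequency_mhz(mult))
```

The same arithmetic explains a multiplier of 7 working out to roughly 933 MHz.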

Fixing this at the BIOS level is much better than the way ThrottleStop tries to fix things after it detects throttling. It's better to stop throttling before it happens than to try to reverse it after the fact.

unclewebb, Feb 27, 2010

#189

thalanix Notebook Deity

Reputations:: 353

Messages:: 1,012

Likes Received:: 0

Trophy Points:: 55

unclewebb to the rescue...

you don't happen to recall anything about other methods it uses to control the multiplier, eg. available power?
unless there's a master register (or different set of registers) that intel's PowerManagement module writes to, our nop replacement won't work.

thalanix, Feb 28, 2010

#190

Mishax1 Notebook Enthusiast

Reputations:: 0

Messages:: 15

Likes Received:: 0

Trophy Points:: 5

just a random question. is that a normal warning in Event Viewer ?
"The speed of processor 0 in group 0 is being limited by system firmware. The processor has been in this reduced performance state for 71 seconds since the last report."
There's one for each processor.

Mishax1, Feb 28, 2010

#191

PJPeter Notebook Deity

Reputations:: 110

Messages:: 1,122

Likes Received:: 35

Trophy Points:: 66

thanks again thalanix and thanks unclewebb for the assistance as well.

Mishax1 said: ↑

just a random question. is that a normal warning in Event Viewer ?
"The speed of processor 0 in group 0 is being limited by system firmware. The processor has been in this reduced performance state for 71 seconds since the last report."
There's one for each processor.

Click to expand...

Interesting, I have hundreds of those "Kernel-Processor-Power" Warning-level events going back to within 30 minutes of when I turned on my G51Jx for the first time, 3/2/10 at 10pm. I also have about 40 ACPI Error events:

": The embedded controller (EC) did not respond within the specified timeout period. This may indicate that there is an error in the EC hardware or firmware or that the BIOS is accessing the EC incorrectly. You should check with your computer manufacturer for an upgraded BIOS. In some situations, this error may cause the computer to function incorrectly."

and a handful of some similar warnings from the same time periods:

": The embedded controller (EC) returned data when none was requested. The BIOS might be trying to access the EC without synchronizing with the operating system. This data will be ignored. No further action is necessary; however, you should check with your computer manufacturer for an upgraded BIOS."

Hmmm...
Peter

PJPeter, Feb 28, 2010

#192

unclewebb ThrottleStop Author

thalanix: I sort of understand how these CPUs operate at the user level but I don't understand anything beyond that or how the bios controls throttling, etc. The publicly available Intel documentation is full of Reserved registers. I'd give my right arm to get the full docs from Intel.

Usually you only get those warning messages when the CPU is throttling. If you are getting these messages when you're not working your computer very hard then something else must be wrong.

unclewebb, Feb 28, 2010

#193

thalanix Notebook Deity

hopefully you can get that mass-msr-dump implemented, if not, no worries. thanks for all the help.

for now i suggest we look through http://developer.intel.com/Assets/PDF/manual/253669.pdf appendix B.5 for power-related MSRs

thalanix, Feb 28, 2010

#194

Mishax1 Notebook Enthusiast

CCz_Cataphract said: ↑

thanks again thalanix and thanks unclewebb for the assistance as well.

Interesting, I have hundreds of those "Kernel-Processor-Power" Warning-level events going back to within 30 minutes of when I turned on my G51Jx for the first time, 3/2/10 at 10pm. I also have about 40 ACPI Error events:

": The embedded controller (EC) did not respond within the specified timeout period. This may indicate that there is an error in the EC hardware or firmware or that the BIOS is accessing the EC incorrectly. You should check with your computer manufacturer for an upgraded BIOS. In some situations, this error may cause the computer to function incorrectly."

and a handful of some similar warnings from the same time periods:

": The embedded controller (EC) returned data when none was requested. The BIOS might be trying to access the EC without synchronizing with the operating system. This data will be ignored. No further action is necessary; however, you should check with your computer manufacturer for an upgraded BIOS."

Hmmm...
Peter

Click to expand...

Actually this is the G51J-A1 that had the bsod errors and now I see that this happened both on the 207.10T and the 208 bios.

Mishax1, Feb 28, 2010

#195

PJPeter Notebook Deity

Mishax1 said: ↑

Actually this is the G51J-A1 that had the bsod errors and now I see that this happened both on the 207.10T and the 208 bios.

Click to expand...

Not to go too far off topic, but what do you mean? I did a search and can't find anyone linking the BSOD issues with any of those 3 errors/warning messages. None of those errors resulted in a BSOD for me - I have a few critical errors that did lead to BSODs/shut downs (none of which were 0x124 BSODs) but none of them line up with those messages. I have a G51Jx which has a latest BIOS version of 204 without the BSOD fix - but there is no connection with it at this point that I have found...

I think it may be more likely that the errors may be linked to CPU throttling - and I may even have started getting those EC errors while running tests with ThrottleStop - they only appear in the log during the 5 hours or so that I was testing with it, not before or since. The Processor "reduced performance state" messages have always been there though, since I got my G51Jx.

Peter

PJPeter, Feb 28, 2010

#196

thalanix Notebook Deity

i had those messages as well before disabling performance counters. may or may not be related, i don't think it matters.

now for some potential bad news, disabling ACPI does not stop throttling. i did this via acpi=off boot stanza and made sure lsmod didn't have it loaded. keyboard and backlight controls didn't work so i assume it was off. not sure why i have 2 CPUs listed when groceries only had 1. the results:

full screenshots, same thing as above.
http://i137.photobucket.com/albums/q228/HoldFire/ss28.jpg
http://i137.photobucket.com/albums/q228/HoldFire/ss29.jpg
http://i137.photobucket.com/albums/q228/HoldFire/ss30.jpg
http://i137.photobucket.com/albums/q228/HoldFire/ss31.jpg

thalanix, Feb 28, 2010

#197

DCMAKER Notebook Deity

Reputations:: 116

Messages:: 934

Likes Received:: 0

Trophy Points:: 0

how did u get non-throttled and throttled test scores?

DCMAKER, Feb 28, 2010

#198

thalanix Notebook Deity

non-throttled: running the benchmark normally. throttled: limiting CPU to 7x with cpufreqd. the right three were without ACPI and running the bench alongside phoronix-test-suite unigine-sanctuary.

thalanix, Feb 28, 2010

#199

DCMAKER Notebook Deity

how much cpu is being used when trying to cause throttling by the other app.

DCMAKER, Mar 1, 2010

#200

thalanix Notebook Deity

by pts, the monitor says 100% on one thread. i don't think that would be possible, so it's most likely throttling.

thalanix, Mar 1, 2010

#201

PJPeter Notebook Deity

thalanix said: ↑

i had those messages as well before disabling performance counters. may or may not be related, i don't think it matters.

now for some potential bad news, disabling ACPI does not stop throttling. i did this via acpi=off boot stanza and made sure lsmod didn't have it loaded. keyboard and backlight controls didn't work so i assume it was off. not sure why i have 2 CPUs listed when groceries only had 1. the results:

full screenshots, same thing as above.
http://i137.photobucket.com/albums/q228/HoldFire/ss28.jpg
http://i137.photobucket.com/albums/q228/HoldFire/ss29.jpg
http://i137.photobucket.com/albums/q228/HoldFire/ss30.jpg
http://i137.photobucket.com/albums/q228/HoldFire/ss31.jpg

Click to expand...

thalanix said: ↑

non-throttled: running the benchmark normally. throttled: limiting CPU to 7x with cpufreqd. the right three were without ACPI and running the bench alongside phoronix-test-suite unigine-sanctuary.

Click to expand...

Hey thalanix,

I'm trying to understand all these results. Do you have a test with ACPI and running the bench alongside phoronix-test-suite Unigine-Sanctuary to compare to? Ah I see, you did a test without ACPI but forcing the CPU to 7x with cpufreqd to get the Throttled Score - and then compared that to the score when the test was being run concurrently with some other process but (supposedly) still non throttled?

The results tables don't really make any sense - some runs of Prime95 (especially in the last column) are going faster at higher FFT lengths than at lower ones. If the effect of running with throttling = low score, without throttling = good score and without throttling and with other load = ~throttled score - wouldn't that be a better thing since it indicates we are getting more out of the system when under heavier load?

The tests with Unigine-Sanctuary show 3 cores in use as opposed to the 1 in the lone Prime95 test, which would essentially put Turbo Boost at a low level since it cannot push up all the speed on the one core (something it also couldn't do in the simulated Throttle test since you locked the CPU Freqs at 7x).

As well, we know that the GPU/CPU will operate more slowly as it reaches the actual wattage limits of the 120W adapter - something it seems able to max out with ACPI off quite easily. Could that be where you are seeing this throttling in the later tests? Edit: I don't think this is the case, more likely the option I mentioned in the last paragraph - but I left this here since it is something to keep in mind as we push these systems to their limits.

Sorry, I may be missing something here.

Thanks very much,
Peter

PJPeter, Mar 1, 2010

#202

thalanix Notebook Deity

lower/faster times = better. throttled was with ACPI (else cpufreqd wouldn't work), but the same would work by running prime95 with it to keep consistency.

ideally, the scores should be falling somewhere between the throttled and non-throttled score.

eg 1280k - best time is 29, worst is 73, and in three tests it falls at 72, 72 and 48. sometimes it falls between, sometimes it hits rock bottom. when it hits the lower threshold, it matches the score i get while throttled (2 threads @ 933 < 1 core @ 2.8)
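To make the 1280k example concrete, here is a small sketch that places each measured time on a scale from 0 (matches the non-throttled best) to 1 (matches the throttled worst). The function name is made up; the numbers are the ones quoted above.

```python
def throttle_position(time, best, worst):
    """0.0 = as fast as the non-throttled run, 1.0 = as slow as the throttled run."""
    return (time - best) / (worst - best)

# 1280k FFT: best 29, worst 73, three measured runs of 72, 72 and 48
for t in (72, 72, 48):
    print(t, round(throttle_position(t, best=29, worst=73), 2))
```

The two runs at 72 sit almost at the throttled end of the scale, while the run at 48 lands a little under halfway.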

without ACPI, i have one core available with hyperthreading.

thalanix, Mar 1, 2010

#203

nfshp253 Notebook Evangelist

Reputations:: 18

Messages:: 608

Likes Received:: 0

Trophy Points:: 30

@thalanix, is there a program that you used to prevent throttling, if not how did you get the non-throttled scores?

nfshp253, Mar 1, 2010

#204

thalanix Notebook Deity

without running pts

if the 3 unigine scores are confusing, those are sequentially (not parallel).

thalanix, Mar 1, 2010

#205

nfshp253 Notebook Evangelist

what's "pts"? Sorry, I was not reading the last 22 pages.

nfshp253, Mar 1, 2010

#206

thalanix Notebook Deity

phoronix test suite. benchmark utility on linux.

thalanix, Mar 1, 2010

#207

thalanix Notebook Deity

one final bump for victory: i don't know who added the last edit, but you are my hero. i love you with all my manly heart.

forget the BIOS editing. the fix is simply to disable thermal monitoring in 0x1FC, by disabling bit 0 (eg. using the MSR tool to write 0x02 into a single core; it propagates on its own to the other three).
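A minimal sketch of the bit change itself, assuming the register reads 0x03 before the fix (the thread only says that 0x02 is what gets written, so the starting value is a guess). The helper names are hypothetical.

```python
BIT0 = 1 << 0  # the bit in 0x1FC that the fix clears

def clear_bit0(value):
    """Apply the fix: clear bit 0, keep all other bits."""
    return value & ~BIT0

def set_bit0(value):
    """Undo the fix: set bit 0 again, keep all other bits."""
    return value | BIT0

before = 0x03                 # assumed starting value, not confirmed in the thread
after = clear_bit0(before)
print(hex(after))             # the value the post says to write
print(hex(set_bit0(after)))   # back to the assumed starting value
```

Working from the current register contents instead of writing a fixed constant avoids disturbing any other bits that might be set on a different machine.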

initial testing:
furmark + stock nvidia clocks can now run
furmark + stock nvidia clocks + prime95 = shut down the power brick, meaning that we are now at the PSU level. when this happens, you have to unplug/replug it and pray it works (did twice so far).

temps can get extremely high and there is no automatic shutdown; use at your own risk and keep an eye on them whenever possible

thalanix, Mar 1, 2010

#208

PJPeter Notebook Deity

I see - thanks for the clarification. So AllurGroceries was getting the same sorts of results with no ACPI but the Hyperthreaded core did not come up in the stats?

It's interesting looking at those stats. They cover a period of only just over 100 ms if I read it right, while Turbo Boost can take a few seconds to fully cycle up (or get throttled down). Some of the test results get better as the test continues, while others remain low. Since these all take only the best time out of multiple runs of a few ms, we can assume that for the runs with good timings Turbo Boost was in effect for at least part of the run, or nearly all of it, whereas other runs might have been half or entirely throttled.

It's too bad we don't have some times for all those iterations to get data from a longer time window. If there is even a slight pause in UE then it could have a profound impact on those timings - and that might be why some of the #s don't match the trend (768K FFT length. Best time: 43.801 ms, 1280K FFT length. Best time: 48.125 ms, 1536K FFT length. Best time: 41.762 ms). I like how superpi does it, showing us the data for more than a single iteration in most cases.

To confirm, you could get that Non-Throttled score with ACPI on or off and it would be the same in both cases right? As long as nothing else is running concurrently you are not seeing any throttling?

By the way, do you have any Kill-A-Watt meter? Or just AllurGroceries?

Thanks,

Peter

PJPeter, Mar 1, 2010

#209

ryzeki Super Moderator Super Moderator

Reputations:: 6,547

Messages:: 6,410

Likes Received:: 4,075

Trophy Points:: 431

So this has been "fixed"? What temps are we talking about? And I can test the 150w with furmark+stock clocks+Prime95 to see if I can maintain it.

Can you tell me how to disable the thermal monitor? I am not as versed as some of you haha. If possible, also tell me how to re-enable it later

thalanix said: ↑

one final bump for victory: i don't know who added the last edit, but you are my hero. i love you with all my manly heart.

forget the BIOS editing. the fix is simply to disable thermal monitoring in 0x1FC, by disabling bit 0 (eg. using the MSR tool to write 0x02 into a single core; it propagates on its own to the other three).

initial testing:
furmark + stock nvidia clocks can now run
furmark + stock nvidia clocks + prime95 = shut down the power brick, meaning that we are now at the PSU level. when this happens, you have to unplug/replug it and pray it works (did twice so far).

temps can get extremely high and there is no automatic shutdown; use at your own risk and keep an eye on them whenever possible

Click to expand...

ryzeki, Mar 1, 2010

#210

PJPeter Notebook Deity

thalanix said: ↑

one final bump for victory: i don't know who added the last edit, but you are my hero. i love you with all my manly heart.

forget the BIOS editing. the fix is simply to disable thermal monitoring in 0x1FC, by disabling bit 0 (eg. using the MSR tool to write 0x02 into a single core; it propagates on its own to the other three).

initial testing:
furmark + stock nvidia clocks can now run
furmark + stock nvidia clocks + prime95 = shut down the power brick, meaning that we are now at the PSU level. when this happens, you have to unplug/replug it and pray it works (did twice so far).

temps can get extremely high and there is no automatic shutdown; use at your own risk and keep an eye on them whenever possible

Click to expand...

lol, I took a few minutes to re-read my post and I missed my window - oh well - great to hear you found a fix.

Does seem we are at the limits of the PSU then - can you tell me how you disabled thermal monitoring and I'll run a test and see what my Power Meter readings are giving me. Sounds like this might be a fix to show the potential of the system to show ASUS we need a higher limit/stronger PSU to make use of our laptop fully.

Peter

PJPeter, Mar 1, 2010

#211

thalanix Notebook Deity

http://g51jbsod.wikia.com/wiki/CPU_throttling#fix
* grab the MSR tool at the bottom of the page
* in MSR Number, type in 0x1FC
* in the box under -0003 (EAX), type in 2 and Write MSR
* Read MSR to make sure all cores are set to 0x02 and enjoy
using http://www.fileden.com/files/2008/3/3/1794507/MSR.zip
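For anyone testing from Linux instead of the Windows MSR tool, the same steps could be sketched with the kernel's msr driver, which exposes each core's registers as /dev/cpu/N/msr once the msr module is loaded. This is a swapped-in technique, not what the steps above use. It needs root, the helper names are made up, and writing the full 64-bit value 0x02 assumes the upper half (EDX) is zero.

```python
import glob
import os
import struct

MSR_ADDR = 0x1FC  # register named in the steps above

def read_msr(cpu, reg):
    """Read one 64-bit MSR from one core via the Linux msr driver (root only)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)

def write_msr(cpu, reg, value):
    """Write one 64-bit MSR on one core via the Linux msr driver (root only)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_WRONLY)
    try:
        os.pwrite(fd, struct.pack("<Q", value), reg)
    finally:
        os.close(fd)

def cpu_ids():
    """Logical CPU numbers that currently have an msr device node."""
    return sorted(int(p.split("/")[3]) for p in glob.glob("/dev/cpu/[0-9]*/msr"))

def all_cores_match(values, expected):
    """Last step of the list: confirm every core reports the expected value."""
    return all(v == expected for v in values.values())

# Usage sketch, run as root after loading the msr module:
#   write_msr(0, MSR_ADDR, 0x02)
#   values = {cpu: read_msr(cpu, MSR_ADDR) for cpu in cpu_ids()}
#   print(all_cores_match(values, 0x02))
```

Reading every core back mirrors the final "Read MSR" step and checks the claim that a write to one core shows up on the others.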

PJPeter said: ↑

To confirm, you could get that Non-Throttled score with ACPI on or off and it would be the same in both cases right? As long as nothing else is running concurrently you are not seeing any throttling?

Click to expand...

that is correct. the stats don't mean much any more though

the brick turning off can be a little scary the first time, everything goes blank. this was at nvidia clocks with furmark and prime95, i don't suppose it'll happen on stock (hopefully).

thalanix, Mar 1, 2010

#212

cookieofdoom Notebook Consultant

Reputations:: 9

Messages:: 135

Likes Received:: 0

Trophy Points:: 30

Wow! Great work.

So this would stop the laptop from shutting itself off if it overheats?... is there any way a 3rd party program could be used to do the same thing, or is temperature monitoring completely disabled using this method?

cookieofdoom, Mar 1, 2010

#213

thalanix Notebook Deity

yep. you can still read your temperatures, but the auto-shutdown and auto-throttling when DTS hits 0 are no longer enabled.

they aren't much higher than just furmark, but they are higher. i don't think we'll be getting to the furmark+prime95 level any time soon, so i wouldn't worry about temperatures too much.

we can change the default bits in PowerManagement and Intelligent Power Sharing, but that's far riskier than just setting it before playing.

thalanix, Mar 1, 2010

#214

unclewebb ThrottleStop Author

Thermal shut down is not supposed to happen in Intel CPUs until approximately 125C. When the DTS=0, the CPU is only at about 100C.

When the core temperature is between 100C and 125C, it is no longer possible for any software to monitor this because the DTS will keep showing zero.
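The relationship between the DTS reading and the core temperature can be sketched as below. The 100C reference point is taken from the "about 100C" figure above and will differ between CPU models; the function name is hypothetical.

```python
def core_temp_c(dts, tjmax_c=100):
    """Return (temperature_c, is_exact) from a DTS reading.

    DTS counts down toward zero as the core heats up, so the estimate is
    tjmax_c - dts. Once DTS reaches 0 it stays there, so the result is only
    a lower bound from that point on.
    """
    return tjmax_c - dts, dts > 0

print(core_temp_c(30))  # a core 30 counts below the reference point
print(core_temp_c(0))   # at or above the reference point: exact value unknown
```

This is why software cannot report anything in the 100C to 125C range: every temperature in that window produces the same reading of zero.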

If this truly is a fix then I can write a very simple tool with a single button to disable MSR 0x1FC.

unclewebb, Mar 1, 2010

#215

ryzeki Super Moderator Super Moderator

thalanix said: ↑

yep. you can still read your temperatures, but the auto-shutdown and auto-throttling when DTS hits 0 are no longer enabled.

they aren't much higher than just furmark, but they are higher. i don't think we'll be getting to the furmark+prime95 level any time soon, so i wouldn't worry about temperatures too much.

we can change the default bits in PowerManagement and Intelligent Power Sharing, but that's far riskier than just setting it before playing.

Click to expand...

Well, I only did a test for 4 minutes, my CPU cores reached 79-80°C and my GPU reached 100°C

I will check in a bit.

Thalanix or anyone else, can you guys test with 120w and stock clocks (not the nvidia reference clocks) to see if there is also a shutdown?

I can help with Asus to get 150w in case 120w is not sufficient. But Asus won't accept overclocked results.

ryzeki, Mar 1, 2010

#216

thalanix Notebook Deity

after having the brick shut off 3 times, i'm a little bit of a chicken
the irony if the power brick becomes a brick...

this is breaking asus' throttle scheme, so i doubt they'll hand out 150's anyway.

edit: we already claimed the top of google with MSR 0x1FC, so there has to be a drought of info about it.

thalanix, Mar 1, 2010

#217

PJPeter Notebook Deity

I can confirm with my G51Jx running with these settings, the CPU temps aren't going abnormally high - but I am getting 2.96GHz constant Turbo Boost while running FurMark at ASUS GTS360M stock clocks.

Usually Turbo Boost would cease working when I reached 80*C GPU temps in my previous tests - but now it's still going at 89*C and it's going at 2.96GHz almost 95% of the time (occasionally going down to 2.82 GHz) - much, much better than before.

The CPU Core 0 temp never passed 71*C - the only thing that reached the limit was the PSU - it was peaking at 130W constant on my Power Meter (something I never was able to reach before). The system kept at this level, with nothing changing except the GPU temp inching up minute by minute for 5 mins when I shut it off to write this report.

This was just with only Furmark running, GPU at stock, CPU w/Extreme Turbo enabled and with no USB devices plugged in and the screen brightness at 20%.

Peter

PJPeter, Mar 1, 2010

#218

unclewebb ThrottleStop Author

This looks very exciting. I'm going to go develop a quick tool with a single button so it will be easy for any user to toggle this on and off for testing purposes.

unclewebb, Mar 1, 2010

#219

thalanix Notebook Deity

130W without prime95? if that's true, then these are some mighty PSUs... i was able to handle furmark @ nvidia clocks for more than a few minutes on highest fan state.

thalanix, Mar 1, 2010

#220

PJPeter Notebook Deity

Whoa, at a very mild GPU overclock (575/1900/1450) and brightness at 100% when I started FurMark I hit 135W within 3 seconds with the fan speeds never having kicked up. It seems there really is no limit now outside of the PSU!

Peter

PJPeter, Mar 1, 2010

#221

thalanix Notebook Deity

unclewebb said: ↑

This looks very exciting. I'm going to go develop a quick tool with a single button so it will be easy for any user to toggle this on and off for testing purposes.

Click to expand...

i've toggled furmark on and off several times, and so far it's been persistent. any checking function you add won't need to run more than once every half a minute.

CCz_Cataphract said: ↑

Whoa, at a very mild GPU overclock (575/1900/1450) and brightness at 100% when I started FurMark I hit 135W within 3 seconds with the fan speeds never having kicked up. It seems there really is no limit now outside of the PSU!

Peter

Click to expand...

until you see electricity discharging out the vent

thalanix, Mar 1, 2010

#222

PJPeter Notebook Deity

thalanix said: ↑

until you see electricity discharging out the vent

Click to expand...

I just put the screen at full LCD brightness w/ backlight KB on and my USB mouse plugged in - by the time I got to 82*C GPU temp the Power Meter was showing 134W constant draw from the wall. I'm sure I could get a shutdown at stock clocks with this PSU with this setup - and that's not even taking into account having my phone or any other USB device plugged in to recharge, or any joysticks/keyboards, external HDDs, ODD, Camera or any other part of the laptop active.

The G51Jx needs 150W at Stock ASUS clocks.

Peter

PJPeter, Mar 1, 2010

#223

thalanix Notebook Deity

after 15 minutes of just furmark on stock clocks, it peaked at around 99-100C GPU for me, active core at 70. i can see why they put in the throttling.

thalanix, Mar 1, 2010

#224

PJPeter Notebook Deity

What would be ideal is if we can have an app that we can put in what we feel are our max temps for GPU and CPU and when we reach those temps, have it reset the MSR to what it was originally until the temps get to some other level we decide - and then reactivate the non-throttled state.

That way we can track what temps we feel are OK as well as knowing by our own testing how much power the PSU can supply safely (since some people with cooling pads, better paste, etc... know their temps can get higher than their PSU can allow).

What do you think?

Peter

PJPeter, Mar 1, 2010

#225

thalanix Notebook Deity

not needed. the GPU will still throttle at 105, and the CPU will never reach that high unless the fan is broken (which would be pretty obvious).

thalanix, Mar 1, 2010

#226

PJPeter Notebook Deity

CCz_Cataphract said: ↑

What would be ideal is if we can have an app that we can put in what we feel are our max temps for GPU and CPU and when we reach those temps, have it reset the MSR to what it was originally until the temps get to some other level we decide - and then reactivate the non-throttled state.

That way we can track what temps we feel are OK as well as knowing by our own testing how much power the PSU can supply safely (since some people with cooling pads, better paste, etc... know their temps can get higher than their PSU can allow).

What do you think?

Peter

Click to expand...

thalanix said: ↑

not needed. the GPU will still throttle at 105, and the CPU will never reach that high unless the fan is broken (which would be pretty obvious).

Click to expand...

The point is to prevent the PSU from causing a shut down by monitoring the temps and thereby estimating the total system power load. By leaving it customizable then people can know how many USB devices, ODD, Camera, etc.. they want to be able to safely use while maxing out the performance without risking a system power shutdown. So with the 120W adapter you set the temps lower, with a 150W then you set them higher.

Peter

PJPeter, Mar 1, 2010

#227

thalanix Notebook Deity

still not worth it imo. power consumption will vary day-to-day depending on what you have plugged in, and that's discounting the possibility of the PSU degrading from consecutive beatings like these.

it'll be shutting down long before it hits temperatures over 90.

thalanix, Mar 1, 2010

#228

PJPeter Notebook Deity

thalanix said: ↑

still not worth it imo. power consumption will vary day-to-day depending on what you have plugged in, and that's discounting the possibility of the PSU degrading from consecutive beatings like these.

it'll be shutting down long before it hits temperatures over 90.

Click to expand...

Again - I repeat, the point is not to limit the temps but the power. We can customize our limits to safe enough settings and adjust them when we want, that would be my ideal case.

I want to avoid as many system shut downs due to power as I can - I think you do too - without being able to monitor the power input all the time this is one way to avoid that imho - do you know a better one?

Thanks,
Peter

PJPeter, Mar 1, 2010

#229

thalanix Notebook Deity

as nice as it would be, it's not possible in any way i know.

to measure power, is to measure temps. but to get temps to extremely high levels, means getting the power extremely high. power gives before temps. catch 22.

edit: to clarify, stuff like the ODD and camera will be negligible compared to the 50-70W of the GPU. the difference might even be attributed to the power brick preferring colder weather. going simply by GPU clocks isn't enough, especially not on windows7, so that leaves only going by temperatures. unfortunately there are too many variables that can change the "safe" point, so unless the goal is to set 110W, it may as well be right back down with asus' 90-100.

thalanix, Mar 1, 2010

#230

PJPeter Notebook Deity

In my tests, at stock levels, the PSU was maxed out even at ASUS stock clocks with very little extra power drain.

If it were possible to set a limit to re-enable the throttling system when the GPU reached 80*C and one of the various CPU 70*C (where I know between the fan levels, GPU power use, and Turbo Boost I'm at ~130W on my PSU) then I could save myself from a system shut down and still have 5W of USB and other additional power drains factored in.

Then if I get a 150W PSU I would increase that GPU limit to 90*C and the CPU cores likewise to 80*C for example. That's why the app would need to be completely customizable in letting us put in the limits - otherwise we would be back to the ASUS/Dell 90/100/110W type of limits.
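The idea of user-set limits with a separate level for going back to the unthrottled state amounts to a two-threshold (hysteresis) switch. Here is a minimal sketch of that logic only. The class and field names are invented, the trip points are the 80/70 figures from the post, and the release points of 72/62 are placeholders, since no values for them were suggested. It does not touch any hardware.

```python
from dataclasses import dataclass

@dataclass
class Limits:
    gpu_trip: float      # re-enable throttling at or above this GPU temp
    cpu_trip: float      # ...or at or above this CPU temp
    gpu_release: float   # go back to unthrottled only once the GPU is at or below this
    cpu_release: float   # ...and the CPU is at or below this

class ThrottleGuard:
    def __init__(self, limits):
        self.limits = limits
        self.throttling_enabled = False  # start in the unthrottled state

    def update(self, gpu_c, cpu_c):
        """Feed in the latest temps; returns True while throttling should be on."""
        lim = self.limits
        if not self.throttling_enabled:
            if gpu_c >= lim.gpu_trip or cpu_c >= lim.cpu_trip:
                self.throttling_enabled = True
        elif gpu_c <= lim.gpu_release and cpu_c <= lim.cpu_release:
            self.throttling_enabled = False
        return self.throttling_enabled

# Trip points from the post; release points are hypothetical placeholders
guard = ThrottleGuard(Limits(gpu_trip=80, cpu_trip=70, gpu_release=72, cpu_release=62))
for gpu, cpu in [(75, 60), (81, 65), (76, 64), (71, 60)]:
    print(gpu, cpu, guard.update(gpu, cpu))
```

Keeping the release points below the trip points stops the state from flipping back and forth when temperatures hover around a single limit. Moving to a 150W adapter would just mean constructing the guard with higher numbers.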

If I have a system shut down with my safe levels inserted, then I would lower those safe levels some more.

I know it's not perfect, but it would make me feel safe gaming with this MSR fix you discovered - without any sort of protection I would feel I'm on pins and needles all the time I have it active unless we find the absolute max power drain and confirmed it was safe and I have a PSU that can support that.

Thanks,
Peter

PJPeter, Mar 1, 2010

#231

thalanix Notebook Deity

Reputations:: 353

Messages:: 1,012

Likes Received:: 0

Trophy Points:: 55

that would give a false sense of security.

if you set the limit too tight, then a harddrive spinning up after being inactive, plugging in your phone, or loading a map can each push it over. these events are unpredictable and unmeasurable.

if you set it too light, then you will go back into the state that asus has it in but with a lot more cycling. it's at the very limit with furmark as it is. the simple truth is that asus should ship a 150W adapter with this notebook and they don't. (it should also come with a better cooling system, but that's a different topic...)

thalanix, Mar 1, 2010

#232

PJPeter Notebook Deity

Reputations:: 110

Messages:: 1,122

Likes Received:: 35

Trophy Points:: 66

thalanix said: ↑

that would give a false sense of security.

if you set the limit too tight, then a harddrive spinning up after being inactive, plugging in your phone, or loading a map can each push it over. these events are unpredictable and unmeasurable.

if you set it too light, then you will go back into the state that asus has it in but with a lot more cycling. it's at the very limit with furmark as it is. the simple truth is that asus should ship a 150W adapter with this notebook and they don't. (it should also come with a better cooling system, but that's a different topic...)


That's why it has to be fully customizable with no preset values for a user - just values they choose and test themselves. The settings are only as good as the user is willing to set - and if they don't want to risk it then they can just leave the MSR limited as it is. But if they do have a 150W adapter or want to push their 120W one to the max without going over, this can be one way of helping them do it without shutting down all the time.

Peter

PJPeter, Mar 1, 2010

#233

thalanix Notebook Deity

Reputations:: 353

Messages:: 1,012

Likes Received:: 0

Trophy Points:: 55

even if we can figure out a formula and method to estimate (within 5W, any more or less and it's for nothing) the actual consumption, event-based cuts like that will have to be made by ACPI. i don't think anyone (except maybe unclewebb) here can write a full driver to replace the one asus provides.

thalanix, Mar 1, 2010

#234

PJPeter Notebook Deity

Reputations:: 110

Messages:: 1,122

Likes Received:: 35

Trophy Points:: 66

Wouldn't it be possible to have an app run in the background that monitors the temps every 5 seconds or so, with a trigger based on what you input that either brings up a message or auto-throttles you when the temps reach the limits you set? Unclewebb already offered to make a button that can change the MSR at a click - we would be doing that, but adding the extra dimension that it would also change the MSRs when the temps reach those user defined levels.
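A rough sketch of what such a background check could look like, just to make the idea concrete. Everything here is an assumption for illustration: `read_gpu_temp`, `read_cpu_temps` and `on_limit` are hypothetical callbacks the user would have to supply (this is not how ThrottleStop or any ASUS tool works), and the limits are whatever the user types in.

```python
import time
from typing import Callable, Optional, Sequence


def watch_temps(
    read_gpu_temp: Callable[[], float],             # hypothetical sensor reader
    read_cpu_temps: Callable[[], Sequence[float]],  # hypothetical, one value per core
    gpu_limit: float,
    cpu_limit: float,
    on_limit: Callable[[float, Sequence[float]], None],  # e.g. show a message or throttle
    interval_s: float = 5.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the sensors until both user limits are reached, then fire on_limit.

    Returns True if the trigger fired, False if max_polls ran out first.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        gpu = read_gpu_temp()
        cpus = read_cpu_temps()
        # Trigger only when the GPU AND at least one CPU core are at their limits.
        if gpu >= gpu_limit and max(cpus) >= cpu_limit:
            on_limit(gpu, cpus)
            return True
        polls += 1
        sleep(interval_s)
    return False
```

With a 5 second interval the check can obviously miss anything that happens between two polls, so the limits would need to be set with that delay in mind.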

It seems possible to me, and not too difficult imho...

We could come up with a formula, it's true, taking into account all the different cores combined with the GPU temp, adding in a ceiling, and trying to account for extra user usage (or using some modifier given by the user based on what extra loads they anticipate) - but just allowing the user to input the temps would be straightforward and give us a starting point imho.

Peter

PJPeter, Mar 1, 2010

#235

thalanix Notebook Deity

Reputations:: 353

Messages:: 1,012

Likes Received:: 0

Trophy Points:: 55

the only input we have is clocks, temperatures, and device IDs.

if you go by temperature, then as i said before, the power draw will get there first every time. i can run furmark + prime95 while oc'd, but the temps never go over 87 (when the fan should spin up a notch) because the brick will shut off the instant they do. 5 seconds is nowhere near fast enough. reading the fan speed is beyond our reach.

if you go by clocks, those are too unstable to provide any meaning. the GPU likes to jump up on moving a window and the CPU is all over the place.

if you go by connected devices, i don't think it's worth the trouble to go through every device you have and fill in its max usage (which is still very variable).

adding everything together assuming the potential max draw will get you the limit asus set.

thalanix, Mar 1, 2010

#236

PJPeter Notebook Deity

Reputations:: 110

Messages:: 1,122

Likes Received:: 35

Trophy Points:: 66

We ignore clocks (we are dealing with a case when they are the max that the user defined anyway), and we ignore connected devices (we just trust the user to put the temps at a level that takes into account extra devices loaded). Users can use Kill-A-Watt meters to get precise power usage amounts from the wall, and if they observe when the adapter shuts down (~137W in my case) then they can see how far below that they are and determine what safety margin they themselves want to allow.
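The margin arithmetic is simple enough to write down. A small sketch, using the ~137W shutdown point and the ~130W load figure from my own readings, plus the 5W I wanted to keep for USB and other drains; the function names are made up for illustration and other users would plug in their own wall-meter numbers.

```python
def headroom_watts(shutdown_w: float, measured_w: float) -> float:
    """Watts left between the current wall reading and the adapter's shutdown point."""
    return shutdown_w - measured_w


def within_margin(shutdown_w: float, measured_w: float, margin_w: float) -> bool:
    """True if the reading still leaves at least the user's chosen safety margin."""
    return headroom_watts(shutdown_w, measured_w) >= margin_w


# Numbers from this thread: adapter cuts out at ~137W, load observed at ~130W,
# and 5W reserved for USB and other small drains.
print(headroom_watts(137, 130))    # 7
print(within_margin(137, 130, 5))  # True
print(within_margin(137, 134, 5))  # False
```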

We are not trying to calculate based on Dev-IDs - that is all left to the user to test and determine themselves.

The system I'm suggesting would be quite simple. You can definitely set a GPU value lower than 87, combined with a CPU temp that you observed alongside it, so that the throttle kicks in before the shut down happens, since it is only when both are running at high levels that we see the crash. We could allow people to input different temps for different combinations of usage across the cores - but even that is refining things beyond what I suggested. The GPU max temp works as well, since it is well beyond the spikes you get when you move the mouse - and as I said, you could set it so that when the GPU gets back to 10*C or so below the cutoff, you re-activate the MSR de-throttle. You could also wait for the GPU to be at the max temp for 3 seconds or so before doing the MSR throttling - because the fan activity is part of the power drain.
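To show the "10*C below the cutoff" and "3 seconds" ideas together, here is a minimal sketch of the decision logic only. The class name, field names and defaults are hypothetical, temperatures and timestamps are passed in by the caller, and nothing here touches an MSR; it only decides whether the de-throttle should be on or off.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ThrottleGuard:
    gpu_cutoff: float           # user-chosen GPU limit (the "XX" value)
    cpu_cutoff: float           # user-chosen CPU core limit (the "YY" value)
    release_gap: float = 10.0   # GPU must fall this far below cutoff to re-enable
    dwell_s: float = 3.0        # limits must hold this long before throttling
    dethrottled: bool = True    # True means the MSR de-throttle is active
    _hot_since: Optional[float] = field(default=None, init=False, repr=False)

    def update(self, now: float, gpu: float, cpu_max: float) -> bool:
        """Feed one reading; returns whether the de-throttle should stay active."""
        hot = gpu >= self.gpu_cutoff and cpu_max >= self.cpu_cutoff
        if self.dethrottled:
            if hot:
                if self._hot_since is None:
                    self._hot_since = now          # start the dwell timer
                if now - self._hot_since >= self.dwell_s:
                    self.dethrottled = False       # sustained heat: throttle again
                    self._hot_since = None
            else:
                self._hot_since = None             # brief spike: reset the timer
        elif gpu <= self.gpu_cutoff - self.release_gap:
            self.dethrottled = True                # cooled off enough: de-throttle
        return self.dethrottled
```

The gap between the cutoff and the release point is there so the system doesn't flip back and forth every time the temperature wobbles around one value.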

At this moment though, all I am suggesting is a system that de-activates the MSR de-throttle when the GPU reaches XX and any CPU core is also at YY - then we can use that as a basis from which to test, and refine the system as we use it if testing turns up improvements that are worth implementing.

Peter

PJPeter, Mar 1, 2010

#237

unclewebb ThrottleStop Author

Reputations:: 7,810

Messages:: 6,414

Likes Received:: 6,726

Trophy Points:: 681

Toggle1FC - Version 2.0
http://www.sendspace.com/file/07d762

Here's a nice simple program to let users toggle this magic bit on and off.
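For anyone curious what toggling a single bit means in practice, here is a minimal sketch of the arithmetic only. It is not the tool's source code: the bit index is left as a parameter because the exact bit isn't spelled out in this post, and actually reading or writing an MSR needs a kernel-level driver, which is not shown.

```python
MSR_MASK = (1 << 64) - 1  # MSR values are 64 bits wide


def bit_is_set(value: int, bit: int) -> bool:
    """Report whether the given bit (0 to 63) is set in a 64-bit register value."""
    return bool((value >> bit) & 1)


def toggle_bit(value: int, bit: int) -> int:
    """Return the register value with one bit flipped and all other bits unchanged."""
    if not 0 <= bit < 64:
        raise ValueError("bit index must be between 0 and 63")
    return (value ^ (1 << bit)) & MSR_MASK
```

Toggling the same bit twice gives back the original value, which is why one button is enough to switch it both on and off.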

I might build this into ThrottleStop but I'll wait and see how feedback goes first and how many people blow up their power adapters.

Furmark is a beast but for regular gaming I don't think you will have to worry about your power adapter shutting down.

This new tool only works on Core i CPUs so let me know if it works OK.

unclewebb, Mar 1, 2010

#238
