I've been having an ongoing issue with the GTX 980 in this thing overheating. As soon as it hits 70c it's basically done and the screen goes black. I've tested a wide array of things and am at my wits' end.
I've reached out to XoticPC (whom I purchased it through) about this, and they've been less than helpful until recently, and only then because of my public display of frustration over their customer support.
I'm hoping some of the professionals here have experience with this or can suggest something more to test. I've tried underclocking, and that works for a while as long as my game settings are low enough to never push it to maximum load, which generates too much heat. Low settings with VSYNC is the only way I can manage to keep it from overheating so quickly.
Below is my XoticPC topic with everything I've tried. There are also HWINFO dumps in my topic, with my most recent reply containing 2 temperature dumps showing it crashing at 70c. Even with fn + 1 maxing the fans out it still overheats.
http://xoticpcforums.com/showthread.php?18528-NP9870-S-Nvidia-980-Overheating-nvlddmkm
Getting the feeling this GPU just flat out does not belong in a laptop.
-
Was the heat issue resolved by repasting?
Also, your black screen sounds like a driver problem. Does this thread help, especially the part about DDU? http://forum.notebookreview.com/thr...g-drivers-mirror.786094/page-10#post-10214806
The only other thing that comes to mind is a loose connection in a cable to the panel, where heat causes the display to lose its signal. Do you have an external monitor, so you can check whether the card still produces a signal when hooked up to an external display?
Last edited: Mar 10, 2016 -
I've tried about 5 different drivers, starting with the ones that came with the system. I now have the latest beta drivers installed, but the issue persists. It consistently crashes at around 70c. I didn't use DDU, but I did use the clean install option on every install. I had no issues with the actual install of the drivers. I'll try with DDU this time and see if that helps.
I think I can manage to dig up an old monitor to try, but would that cause the event log to throw errors? Any idea where I can check this cable? (The literal entire bottom of this laptop comes completely off... quite nice actually.) I remember having a similar issue with my previous laptop, which I had to RMA, but they never told me what was wrong with it. It's just unfortunately a bad time for me to be able to RMA right now. It seems like the GPU fan just isn't working very well at all. -
Support.1@XOTIC PC Company Representative
Are you getting any errors in the event log when it crashes like that? Does one fan seem like it is working harder than the other fan as well? I don't think that would be the error for the crash, but would be something that we would probably want to look at as well if it isn't working.
I know the service manager is checking into some of this for you. But if there is anything I can do to help out as well, feel free to contact me here or email me. We want to make sure we can find a way to get this taken care of for you. -
One fan is absolutely working harder than the other. More specifically, the CPU fan is working fine, but the GPU fan barely seems to move air. It's possible there's a bad connection with the fan or this fan is just no good.
On a side note, as updated in my XoticPC topic, it's extremely weird to me that the copper heat pipes have all been painted black. Does anyone else with this model have this issue? When you open the bottom, see if the heat pipes are painted black or plain copper. Should be plain copper. -
Is this a common problem with this laptop? Why is it overheating at 70c? That isn't that hot for a GPU. I have seen people who run their cards A LOT hotter than that and have no problems with overheating at all... Even if it is a fan problem, 70c isn't that hot and it shouldn't be overheating at that temp...
*edit* I am almost positive all the new Sager/Clevo heatpipes are black. I think they did that because of people calling them "batman" lol, but my laptop is "older" and my heatpipes are just plain ol' copper and they look awesome. -
Support.1@XOTIC PC Company Representative
Yeah, the models I've seen are all that way with the black heat pipes. I don't think they would have painted them if it made a significant difference in heat performance. I'll let our service manager know about the fan issue as well, so he can check into that.
-
Ah, ok thanks for the black heat pipe explanation, lol. Didn't realize they were doing this on purpose.. to components no one sees unless they open it, lol.
-
Can you check the GPU fan cable to the header? Is it loose? If it is the header itself, or a temp sensor or something on the mobo has gone kaput, then you're most likely going to have to find some way to get it in to be looked at.
Another thought: with the P870DM, there was an EC problem that needed correcting in the BIOS. @Phoenix or @Prema may have additional information. Since the EC has some decisions to make regarding fans, is there a way to check to see if you need to update the EC?
Also, perhaps @pat@XOTICPC or @Meaker@Sager may be able to provide some assistance as well.
----
p.s. - Can't get to any crash/dump reports due to Xotic's forum policies for unregistered users. If anyone on these forums asks for them, you'll most likely need to repost them here. -
-
Agreed. To me, that part still sounds like driver related or perhaps something not quite physically connected with the GPU.
-
EC problem? Do you have a link to that discussion by chance? Do you mean the system BIOS or vBIOS? Current BIOS appears to be 1.05.03LS1. The file below contains my HWINFO system report and 2 sensor reports (both up until BSOD).
https://drive.google.com/file/d/0B6YuIA-KqT-dUkJlOXd6YkFDalE/view?usp=sharing
Sorry for the Google Drive link, but these forums won't let me upload attachments. Button is there, select a file, and it does nothing from there.
Last edited: Mar 10, 2016 -
No. Not VBIOS, but regular BIOS. The Clevo BIOS consists of the main BIOS for the system and the EC portion. EC stands for Embedded Controller.
IIRC, what was originally discussed was a throttling issue in the CPU, but there may have been some implications for the GPU benchmarks. In any case, I believe this was fixed by Prema, and I believe Clevo was made aware of the problem. @Mr. Fox might remember where this was, but here's the resolution:
http://forum.notebookreview.com/thr...ers-lounge-phoenix-has-arisen.781814/page-157
Now, I don't necessarily think this is related to the problem you're describing. And I have no idea if the GPU fan is controlled by the EC and limits/zones set there or perhaps by some other device. Regardless, I wanted to point it out just in case. -
Hmmm.... Interesting post!!!
@Mr. Fox, could you take a look at what @Krileon is reporting? This seems very, very familiar - http://forum.notebookreview.com/thr...enix-has-arisen.781814/page-153#post-10150708
Take a look at some of the hints he has (or PM @Mr. Fox). Things like Control Center, NVIDIA GPU Scaling, etc. Maybe something there will help.
So it doesn't look like it is driver related after all.
Last edited: Mar 10, 2016 -
Support.1@XOTIC PC Company Representative
Also, I did ask to see if we can get the EC and vBIOS link sent over to you from Sager. Not sure if it would help, but it couldn't hurt to try.
-
My 2 cents, for what it's worth...
@Krileon - I think you have a bad GPU. The log files show no problems that I can identify. Temps are fine. The problems I had with the screen going black with 980M on two different laptops were caused by a GPU defect. In both cases one of the GPUs (in SLI) was messed up and causing black screens. In both cases this eventually, over time, led to complete failure where the machines refused to POST or the screen was black all the time until I removed the offending GPU and ran only the good one by itself.
If this is happening with temps well below 80°C and no overclocking, it's time to get that GTX 980 swapped out for one that works correctly. The fact that there are not a bunch of other people with the same machine having the same problems you are experiencing is further indication that a repair needs to take place. -
Also, regarding your comment in the above linked topic ("Also, look in the BIOS and if the option is present confirm NVIDIA GPU Scaling is set to the "Disabled" option."): in my BIOS, GPU Scaling is indeed enabled. Should I disable this?
Edit: Ok, my BIOS more specifically says "GPU Performance Scaling".
Last edited: Mar 10, 2016 -
-
Ok, I used DDU, installed 364.51, disabled GPU Performance Scaling, uninstalled Sager Control Center (Hotkey), and enabled VSYNC. Lasted about 20mins before BSOD. HWINFO dump shows it crashed at 69c. Keeps doing it around 70c. Could there be something wrong with the BIOS thinking it's overheating? Attached is the HWINFO report from this test.
Last edited: Mar 10, 2016 -
-
-
-
Prostar Computer Company Representative
Although you've contacted Xotic for help/RMA, do you have any minidump files you (or we) can analyze that might shed a little more light on things? -
@Krileon - another question is what GeForce drivers are you using? The driver provided by Clevo or something else? A fair number of people are having issues with newer GeForce drivers being super buggy if not downright unusable. If you have not already done so, try using an older driver. Go to NVIDIA.com and select your GPU and OS, then search for drivers. In the list of drivers that appears, choose the oldest you can find with support for the GTX 980 notebook GPU and see if the situation changes. Try several drivers from the oldest forward. If you have the same behavior regardless of what driver is used, that would be useful information. To @Prostar Computer's point, the minidump files might be useful. The bugcheck code may help pinpoint if something other than the GPU may be causing behavioral issues.
-
I just now tried lowering the game settings, then watched the temperature with HWINFO. I can repeatedly confirm that as soon as it hits 70c it begins to fail. I kept tabbing out to drop the GPU usage and was able to play for nearly an hour by forcing the GPU to keep scaling down at the desktop to keep the temperature down. Something is wrong with a sensor or something, it seems. There's no reason for it to fail at 70c as if it thinks it's overheating. -
tanzmeister Notebook Evangelist
The GPU needs to be replaced, 99.9%. RMA it and save yourself the frustration.
-
-
-
This would be my second laptop in a row from XoticPC where the GPU is basically DOA. It's a bit discouraging to say the least. Already passed my old one down to my wife and get to watch her play The Division while I sit here with an expensive paperweight, lol. -
Prostar Computer Company Representative
-
Ok, some more findings. I installed the Clevo VGA drivers below (NVIDIA 354.35).
http://www.clevo.com/en/e-services/download/ftpOut.asp?Lmodel=P870DM&ltype=9&submit=+GO+
I no longer get a black screen, but the video driver does fail and Windows does manage to recover from it. The game obviously crashes, but the computer doesn't get stuck in a non-recoverable state. This all started to happen around 70c (which it hit, then quickly dropped back down). Below is the event logged.
Code:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Display" />
    <EventID Qualifiers="0">4101</EventID>
    <Level>3</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2016-03-10T23:41:09.753703800Z" />
    <EventRecordID>9983</EventRecordID>
    <Channel>System</Channel>
    <Computer>Kyle-PC</Computer>
    <Security />
  </System>
  <EventData>
    <Data>nvlddmkm</Data>
    <Data />
  </EventData>
</Event>
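For anyone comparing several of these events across driver versions, here is a rough Python sketch that pulls the useful fields out of an event exported as XML. It assumes the export uses the same namespace and layout as the event above; the function name `summarize_event` is just an illustrative helper, not part of any tool. As far as I know, Display event 4101 is the one Windows logs when the display driver stops responding and gets recovered, which would match what happened here.

```python
import xml.etree.ElementTree as ET

# Event XML as exported from Event Viewer (trimmed to the fields used below).
EVENT_XML = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Display" />
    <EventID Qualifiers="0">4101</EventID>
    <Level>3</Level>
    <TimeCreated SystemTime="2016-03-10T23:41:09.753703800Z" />
    <Channel>System</Channel>
  </System>
  <EventData>
    <Data>nvlddmkm</Data>
    <Data />
  </EventData>
</Event>"""

# Namespace used by Windows event XML; "e" is an arbitrary local prefix.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def summarize_event(xml_text):
    """Extract provider, event ID, level, timestamp and driver name."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    # Skip empty <Data /> elements, which have no text.
    data = [d.text for d in root.findall("e:EventData/e:Data", NS) if d.text]
    return {
        "provider": system.find("e:Provider", NS).get("Name"),
        "event_id": int(system.find("e:EventID", NS).text),
        "level": int(system.find("e:Level", NS).text),
        "time": system.find("e:TimeCreated", NS).get("SystemTime"),
        "driver": data[0] if data else None,
    }


summary = summarize_event(EVENT_XML)
print(summary)
```

Running the same function over the events from each driver test would show at a glance whether every failure names nvlddmkm or whether something else turns up.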
Update: Retried Sager's (354.35, same as Clevo's; odd?) below for the heck of it and it gives a black screen. So far Clevo's has been the most stable (Windows was able to recover). Going to try a few more I guess.
http://www.sagernotebook.com/drivers.php?cat=627
Update: Tried the oldest available NVIDIA driver (358.50) and got the same black screen issue. I don't think it's a temperature issue, but rather entirely due to being at maximum load. I compared the GPU core load on each of these and they were at 99% when it failed. I think this basically confirms a bad GPU for sure.
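To check the temperature-versus-load question across runs, something like the following could pull the last readings out of each HWINFO sensor log before it stops. This is a rough Python sketch, assuming the log was saved as CSV and that the columns are named "GPU Temperature [°C]" and "GPU Core Load [%]"; both names are assumptions, so check the header row of your own file. The sample rows are made up purely to exercise the functions and are not values from the attached dumps.

```python
import csv
import io

# Column names are assumptions; adjust them to match your log's header row.
TEMP_COL = "GPU Temperature [°C]"
LOAD_COL = "GPU Core Load [%]"


def last_samples(csv_text, n=3):
    """Return the last n (temperature, load) readings in the log."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return [(float(r[TEMP_COL]), float(r[LOAD_COL])) for r in rows[-n:]]


def summarize_runs(runs):
    """Map each run name to its final (temperature, load) reading."""
    return {name: last_samples(text, n=1)[0] for name, text in runs.items()}


# Made-up placeholder rows, NOT values from the attached dumps.
SAMPLE = """Time,GPU Temperature [°C],GPU Core Load [%]
00:00:01,55.0,40.0
00:00:02,66.0,97.0
00:00:03,70.0,99.0
"""

result = summarize_runs({"sample_run": SAMPLE})
print(result)
```

If the final load is pinned near 99% in every run while the final temperature varies, that would point at load rather than heat; if the temperature is always the same and the load varies, the opposite.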
Last edited: Mar 10, 2016 -
-
-
-
Well, as a last-ditch effort I underclocked the core by 400MHz and it hasn't been able to hit 70c. The highest it hit is 66c. No crash yet, lol. 70c must be a magic number. I know this is just delaying the inevitable, but if it can last until the new GPU comes in then that'd be great, lol. Anyway, thank you everyone for all your help. Hopefully I'll have the RMA process going tomorrow.
Last edited: Mar 10, 2016 -
Some other component on the card must be overheating. Maybe there's a thermal pad missing somewhere. You should take a pic of the heatsink thermal pad layout to show XoticPC if it's right.
-
Support.1@XOTIC PC Company Representative
I know we are checking in to it and would be able to RMA it. We are working with Sager to see if we can cross ship the GPU to save some down time. I would agree, I think you have provided plenty of testing info at this point as well and can relax on that point until you hear back on the RMA. Hopefully you can keep it under 70 for the time being to enjoy this math game you are talking about.
But I'm sure you'll hear back from the service manager as soon as he gets answers. Feel free to reach out to me as always if I can help. -
Well, took the photos as @ssj92 suggested. They're attached as follows. You'll notice in one of them I took a closeup of a component that looks not so right. I'm not sure what it's covered in, but looks like paste. I didn't downscale them so they can be zoomed in for further detail as needed.
-
-
Meaker@Sager Company Representative
That looks to be the bios chip and thermal paste to me. I assume you recovered the core?
-
-
Support.1@XOTIC PC Company Representative
Any change in performance once you did that?
-
-
Support.1@XOTIC PC Company Representative
Yeah, I'd agree, it isn't something we like to see at all either.
-
Good news! RMA is on its way. Hopefully next week things should be running good. Will report back my findings and tests after the new card is in. Also ordered a new tube of MX-4 instead of using the paste that came with the computer.
-
-
i_pk_pjers_i Even the ppl who never frown eventually break down
http://3.bp.blogspot.com/-Hd2g3UaTweM/VLU5ibAbN4I/AAAAAAAAKgY/zagwpzmXC2I/s1600/temp+diff.jpg
http://media.bestofmicro.com/6/F/396807/original/01-Water-Cooling-High-Pressure.png
I feel MX-4 gets a lot more hate on these threads than it deserves. I never had any issues with pump-out or anything else.
Last edited: Mar 13, 2016 -
Look at one of the pictures you posted with the test of Gelid Extreme. Gelid Extreme is torn apart by AS5. This is just nonsense... All tests of thermal paste show different results. Some tests show that Liquid Ultra only gives 2 degrees better temperature than the best standard paste. This is also nonsense. The real difference is 10 degrees or more on laptop processors (OC'd) in benchmarks. This is well proven by me and a lot of others. It's the reality that counts, not thermal paste reviews done on test bench/desktop processors and with different cooling.
Last edited: Mar 13, 2016 -
Meaker@Sager Company Representative
Depending on the heatsink, MX-4 is a little more prone to pumping out. Usually IC Diamond and the thicker pastes get along better in a mobile environment.
-
In so-called test reviews done with the same pastes on a desktop processor, the difference in temperature was only 1-2 degrees... So you cannot say that the difference is at most 2 degrees between different pastes on laptop hardware, which has more fragile cooling... -
-
Just look at how fraudulent such reviews of paste are (see images of two tests). Gelid Extreme gets a totally different result against AS5. This difference in temperature results is not surprising. It depends on who performs the tests and the use of different hardware. This is like the lottery. You have to try different pastes yourself to see the difference.
And look below ↓↓↓. Gelid Extreme is nearly 5 degrees colder than AS5.
Last edited: Mar 13, 2016 -
-
I don't think they are fraudulent or fake like you are claiming; it's just that results vary...
NP9870-S (P870DM-G) - Nvidia GTX 980 - Overheating
Discussion in 'Sager and Clevo' started by Krileon, Mar 9, 2016.