My previous topic http://forum.notebookreview.com/threads/np9870-s-p870dm-g-nvidia-gtx-980-overheating.789202/ found faulty hardware. It's been quite a few months and I'm having issues again, but it doesn't appear to be the same problem. I'm now getting the following errors (in the order listed below).
ERROR LOG 1
Faulting application name: dwm.exe, version: 10.0.10586.0, time stamp: 0x5632d756
Faulting module name: dwmcore.dll, version: 10.0.10586.494, time stamp: 0x5775e327
Exception code: 0xc0000602
Fault offset: 0x00000000000d04ff
Faulting process id: 0x19a4
Faulting application start time: 0x01d1df8c695fd20c
Faulting application path: C:\Windows\system32\dwm.exe
Faulting module path: C:\Windows\system32\dwmcore.dll
Report Id: 9c917736-d3e0-48a9-be77-9f5d14dd10f4
Faulting package full name:
Faulting package-relative application ID:
ERROR LOG 2
Faulting application name: dwm.exe, version: 10.0.10586.0, time stamp: 0x5632d756
Faulting module name: dwmcore.dll, version: 10.0.10586.494, time stamp: 0x5775e327
Exception code: 0xc0000602
Fault offset: 0x00000000000d04ff
Faulting process id: 0xde4
Faulting application start time: 0x01d1df8c69b44ebd
Faulting application path: C:\Windows\system32\dwm.exe
Faulting module path: C:\Windows\system32\dwmcore.dll
Report Id: 01ff835f-bed6-4a80-95b5-c65ffc563e8d
Faulting package full name:
Faulting package-relative application ID:
ERROR LOG 3
Display driver nvlddmkm stopped responding and has successfully recovered.
ERROR LOG 4
Faulting application name: dwm.exe, version: 10.0.10586.0, time stamp: 0x5632d756
Faulting module name: dwmcore.dll, version: 10.0.10586.494, time stamp: 0x5775e327
Exception code: 0xc0000602
Fault offset: 0x00000000000d04ff
Faulting process id: 0x1d8c
Faulting application start time: 0x01d1df8c69d7e307
Faulting application path: C:\Windows\system32\dwm.exe
Faulting module path: C:\Windows\system32\dwmcore.dll
Report Id: 776f8a23-3cd0-4abe-8059-5a145b981e92
Faulting package full name:
Faulting package-relative application ID:
ERROR LOG 5
Faulting application name: ShellExperienceHost.exe, version: 10.0.10586.494, time stamp: 0x5775e94c
Faulting module name: Windows.UI.Xaml.dll, version: 10.0.10586.494, time stamp: 0x5775e900
Exception code: 0xc000027b
Fault offset: 0x0000000000517ad4
Faulting process id: 0x10dc
Faulting application start time: 0x01d1df89e40a95d0
Faulting application path: C:\Windows\SystemApps\ShellExperienceHost_cw5n1h2txyewy\ShellExperienceHost.exe
Faulting module path: C:\Windows\System32\Windows.UI.Xaml.dll
Report Id: 71603bc4-502f-4a00-8e85-062ca0469284
Faulting package full name: Microsoft.Windows.ShellExperienceHost_10.0.10586.0_neutral_neutral_cw5n1h2txyewy
Faulting package-relative application ID: App
ERROR LOG 6
Faulting application name: dwm.exe, version: 10.0.10586.0, time stamp: 0x5632d756
Faulting module name: dwmcore.dll, version: 10.0.10586.494, time stamp: 0x5775e327
Exception code: 0xc0000602
Fault offset: 0x00000000000d04ff
Faulting process id: 0x19b8
Faulting application start time: 0x01d1df8c69fbf3a5
Faulting application path: C:\Windows\system32\dwm.exe
Faulting module path: C:\Windows\system32\dwmcore.dll
Report Id: 20c13e02-6968-46e7-80cc-4a19930f894e
Faulting package full name:
Faulting package-relative application ID:
This happens while playing Elder Scrolls Online in windowed fullscreen. It only started happening after installing Windows update KB3172985.
The screen flickers black, then crashes, and the driver never recovers. I suspect the driver crash may be just a symptom of whatever problem this is, but I'm not sure. There's not much information about it. Any ideas what to look for? Revert the Windows update?
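For anyone reading the logs above: the "Faulting application start time" values look like Windows FILETIME timestamps (100-nanosecond ticks since 1601-01-01 UTC). Here is a minimal Python sketch, assuming that interpretation, that converts them and shows how closely spaced the four dwm.exe launches were. The helper name `filetime_to_utc` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks from this epoch (assumed format of the log field).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_utc(hex_value: str) -> datetime:
    """Convert a hex FILETIME string such as '0x01d1df8c695fd20c' to a UTC datetime."""
    ticks = int(hex_value, 16)
    # 10 ticks per microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# Start times copied from the four dwm.exe entries (error logs 1, 2, 4 and 6).
DWM_START_TIMES = [
    "0x01d1df8c695fd20c",
    "0x01d1df8c69b44ebd",
    "0x01d1df8c69d7e307",
    "0x01d1df8c69fbf3a5",
]

starts = [filetime_to_utc(value) for value in DWM_START_TIMES]
# Seconds between each dwm.exe launch and the next one.
gaps = [(later - earlier).total_seconds() for earlier, later in zip(starts, starts[1:])]

for start in starts:
    print(start.isoformat())
print("seconds between launches:", gaps)
```

If the gaps come out well under a second, dwm.exe was being relaunched and crashing again almost immediately, which would fit a driver that is already gone rather than four separate problems.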
-
Uninstalled KB3172985 and issue persists.
-
Have you tried using DDU to clean the drivers in Safe Mode and then reinstall one of the recent ones?
-
No, but will give that a try now.
-
There was no change. It's just like my previous issue. Last time it consistently failed at 70c. Now it's consistently failing at 78c. As soon as it hits 78c it crashes. Attached are my HWiNFO logs.
-
Alright, which drivers are you using currently? Will you be able to DDU clean once more and install 368.25? Only install the drivers + PhysX and nothing else (unless you need HDMI audio).
There is a chance that some component on the GPU is overheating (MOSFET/VRM).
Also does it happen at random intervals? Or about the same duration into the game? -
I installed 368.81 after having the issue. Then I did a DDU clean and installed it again, but the issue persists. The issue started before updating anything other than Windows. Nothing else changed on the system before it began failing. It seems to happen at random intervals and to be completely related to the GPU temperature. Each log shows it consistently failing at 78c, just as it did before at 70c.
In log_2.csv it fails on line 84. The temp is 78 at that point, and the voltage drops from 1.143 (stable the entire time up to this point) to 1.106. The GPU clock does the same, going from a stable 1227.7 to 1189.3. Maybe the thermal throttling is never kicking in and it's failing at that point? -
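A minimal Python sketch of how to find that kind of drop in an HWiNFO CSV log without eyeballing it. The column names (`GPU Temperature [°C]`, `GPU Core Voltage [V]`, `GPU Clock [MHz]`), the helper name `find_first_drop`, and the 2% threshold are all assumptions; check the header row of the actual log and adjust. The sample data is made up to mirror the values quoted above and is not the real log.

```python
import csv
import io

# Assumed column names; HWiNFO headers vary by version and sensor, so check the real log.
TEMP_COL = "GPU Temperature [°C]"
VOLT_COL = "GPU Core Voltage [V]"
CLOCK_COL = "GPU Clock [MHz]"


def find_first_drop(csv_file, column, min_drop_fraction=0.02):
    """Return (line_number, previous_value, value, row) for the first sample where
    `column` falls by more than `min_drop_fraction` versus the previous sample.
    Returns None if no such drop is found."""
    reader = csv.DictReader(csv_file)
    previous = None
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            value = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue  # skip footer rows or missing readings
        if previous is not None and value < previous * (1 - min_drop_fraction):
            return line_number, previous, value, row
        previous = value
    return None


# Tiny made-up sample shaped like the readings described in the post.
SAMPLE = io.StringIO(
    "Time,GPU Temperature [°C],GPU Core Voltage [V],GPU Clock [MHz]\n"
    "12:00:01,76,1.143,1227.7\n"
    "12:00:03,77,1.143,1227.7\n"
    "12:00:05,78,1.106,1189.3\n"
)

hit = find_first_drop(SAMPLE, CLOCK_COL)
if hit is not None:
    line_number, before, after, row = hit
    print(f"line {line_number}: {CLOCK_COL} {before} -> {after} at {row[TEMP_COL]} C")
```

For a real log, pass an open file instead of `SAMPLE`, e.g. `open("log_2.csv", newline="")`; the file encoding may need to be set explicitly depending on how the log was written.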
And the thermal throttling only affects the GPU core and not the other components. So if a VRM starts overheating, you will not know until the component either completely fails or starts crashing the card.
I would definitely suggest trying an old driver version, preferably 368.25, and if that also doesn't help, then taking the heat sink off the GPU and re-pasting. And while you're at it, make sure that the thermal pads are making proper contact. -
If you press FN+1 and max out the fans, does the crashing stop? Remember, the only GPU temperature sensor we have is the core. The core can be cold and still have something else getting too hot.
-
Just to make sure, you can play other games consistently stable for hours?
-
Ok, just tried again with Fn + 1 to max the fans. It didn't fail at 78c, but it still failed. It lasted a lot longer though (around 30 minutes). The video failed, and this time the system began beeping with a single beep and a short pause. Attached the log for this one as well. The D3D usage hit 100% then dropped to 0% at this failure. D3D memory did the same. At the desktop they both have some minor usage. So D3D is entirely failing.
I do not see 368.25 available on Nvidia's website. I also don't understand why it would have worked fine on my previous drivers for something like 3-4 months and then suddenly do this today.
Below is the event log for this test.
EVENT LOG 1
The speed of processor 7 in group 0 is being limited by system firmware. The processor has been in this reduced performance state for 65 seconds since the last report.
EVENT LOG 2
The description for Event ID 13 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
\Device\Video3
NVRM: Graphics TEX Exception on (GPC 2, TPC 0): TEX NACK / Page Fault
EVENT LOG 3
The description for Event ID 13 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
\Device\Video3
Graphics Exception: ESR 0x514224=0x80000041 0x514228=0x180004 0x51422c=0x3f154c 0x514234=0x0
EVENT LOG 4
The description for Event ID 13 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
\Device\Video3
Graphics Exception: ESR 0x514224=0x80000000 0x514228=0x0 0x51422c=0x0 0x514234=0x0
EVENT LOG 5
Display driver nvlddmkm stopped responding and has successfully recovered.
No, I have not tried other games. As I said, I was playing this game yesterday and in the weeks before perfectly fine. The ONLY change this morning before this started happening was the Windows update, which I reverted, and the issue persists. The game also did not update.
-
-
So some component other than the core is heating up. Take the heat sink off and check if the thermal pads are making proper contact and re-paste the core while you are at it.
-
I've repasted and will give this a try, but I'm pretty sure it won't fix anything. Anyway, attached are some photos of the pads and heatsink. The green pads were kind of mushy and felt like they had an oily residue on them. I cleaned all of that off, as perhaps it was acting almost like a water barrier?
-
But the bottom half of your core barely had any thermal paste on it. Either it was making great contact or there wasn't enough paste. But the re-paste should fix that. -
Well, I posted 41 minutes ago. It still seems to be doing fine since then; that's longer than all my previous tests. Yeah, one of the components, I dare say, was completely covered in an oily residue from one of the thermal pads. I cleaned all of them well and repasted the GPU. Everything seems fine thus far, but I'm not calling it fixed until it's stable for a few days.
I need to get in contact with support and see about new thermal pads, as I've no clue which to buy. I'd like to replace the thermal pads entirely as they seem like garbage. -
I can explain the thicknesses later on tonight. Please remind me. -
I've no issue paying for quality thermal pads, but thickness would be my issue. Not sure where to find that info. I absolutely want to replace them, as they seem to be extremely poor quality and falling apart. Going on almost 2 hours now of non-stop game time and it's still working fine. So that surely seems like the issue, but I've no idea how long cleaning that oily crap off will last, since those pads seem to be rapidly degrading.
-
Thermal pad-wise, Phobya XT 7W/mK pads are pretty easy to get and cheap. They're pretty good tbh. You want to get 0.5mm/1mm ones and stack them.
-
Thanks for the pointers about the thermal pads. Will look into it a little more and get some ordered. -
Ok, I ordered 1.0mm and 0.5mm ARCTIC thermal pads below.
https://www.amazon.com/gp/product/B00UYTTXSM -
Meaker@Sager Company Representative
Also see if it happens in full screen.
-
-
Meaker@Sager Company Representative
Also those thermal pads look fine and the residue they leave (which is normal) is perfectly safe.
-
http://forum.notebookreview.com/attachments/wp_20160716_18_14_30_pro_2-jpg.136915/ -
And the green ones are 2mm. -
-
Meaker@Sager Company Representative
It is an oil that is part of the pad's thermal conductivity and keeps it soft. Using an insulator in a pad would make no sense.
-
Sorry, it's my first time experiencing thermal pads leaking pools of oil. Maybe it's due to it being a desktop GPU and generating more heat. I don't have a clue. Seems fine now though, so I may hold off putting on the replacement pads.
It would appear you're with Sager. What do you recommend I do? -
Meaker@Sager Company Representative
It was likely a seating issue with the heatsink, which you have resolved. With it working, I would say leave it.
-
Issue is back today. Games crashing with purple square artifacts.
-
-
-
So should I move forward with trying to get another GPU RMA'd? Should I consider having them step me down to a GTX 980m? -
Ask them if you can pull your drives (SSDs/HDDs) and ship it to them without drives. Then you don't need to waste time cleaning it and reloading when you get it back.
Explain the problem, they might get the heatsink replaced and can use the existing GPU.
Or, it might be another flaky GPU, but a 980 works for many without problems, and a 980m is a lot slower.
A 980m would require a different cooling solution, so it might require sending it in for replacement. The same goes for changing to the 980m SLI cooling solution.
I would go with 980m SLI, but that's me; I know that about 90% of what I run is SLI compatible, so for me it's best, and it's only another $200 for SLI 980m.
Either way, do it now, don't let it drag out longer. -
Nah, don't step down, it's not worth it. -
Pulled it apart again. There's either a design flaw with this laptop or they sent a bent GPU. The entire bottom row of thermal pads was not making contact, and 3 of the green ones were not making contact either. I fixed the bottom row by stacking on 0.5mm pads and replaced the green ones entirely. Looked at the GPU from the side and sure enough it arches downward. Will see if this helps.
Regardless I think I'll RMA the GPU and heatsink.
Sending in the entire thing isn't the best option. It puts me out of work. If they're too slow it costs me business. I'll have to spend the next few days preparing a secondary system. Will discuss with them sending the entire system in with the HDs pulled. -
It sucks that you have to take time now, but you will be prepared for the next time too.
If you can, take photos and annotate them to send to the vendor so they know exactly what you are talking about.
This kind of thing can go poorly if there is a miscommunication, and can result in a couple of trips if it's not clear from the start.
If you have any other niggles about the cosmetics, fit, finish, or features you want tweaked, include those too.
Do you need the Prema BIOS/vBIOS updated? This would be the time for them to do that too.
Good luck, and please come back and let us know how it works out. -
I normally have a backup system. I just haven't set it up yet. I buy a new laptop, pass the old one to the wife, then make her old one the backup. The old backup is wiped and given to one of my kids. In this case I've simply yet to wipe my wife's old one and turn it into the backup. Regardless, it's frustrating as hell no matter the situation, as I still have data to transfer and programs to install, etc. The last 3 laptops I've bought from XoticPC are Sagers and all 3 have had issues. The only 2 I didn't have issues with were my first Sager and the Asus.
I've already got photos in this thread and sent all of this to them. -
Ok, with the thermal pads replaced I've been playing for a good solid 4 hours with no issues. Fingers crossed that was all the issue was.
-
NP9870-S (P870DM-G) - Nvidia GTX 980 - Crashing
Discussion in 'Sager and Clevo' started by Krileon, Jul 16, 2016.