***FIXED - Replacement 780m GPU fitted, no more crashes***
Hey guys, long time since I've posted here.
I have an Alienware 17 R1 (4700MQ, GTX 780M, 16GB RAM, 750GB WD Black + 64GB Plextor M5M mSATA cache, 1080p FHD anti-glare, 240W power pack, latest Nvidia drivers and A14 BIOS) that has all of a sudden started to power off when running anything graphics intensive. It started a couple of days ago whilst playing PlayerUnknown's Battlegrounds (great game btw!). I played a few rounds one after the other with no problem, then about 10 minutes into another round the machine turned off completely: no reboot, nothing. I tried powering back on and a couple of seconds later it died. I tried again and it fired up and booted into Windows fine.
It didn't seem to have any problems, but I ran a couple of stress tests on the CPU, RAM and finally GPU using Dell SupportAssist. The CPU tests were fine but the machine powered off again about 30 seconds into the GPU stress test. It seems that now, whenever the GPU is under load, it dies. Temps were my first thought but all are fine (65-75C during stress tests on the GPU and 75-80C in game). It will still launch games but dies after a couple of minutes, once it hits max load I'm guessing.
Both CPU and GPU are at stock speeds and have never been overclocked.
Here's what I've tried so far:
- Removed Nvidia drivers using DDU in safe mode and reinstalled latest driver
- Removed driver again as above and installed an older driver version
- Ran stress tests including Heaven Benchmark & DellSupportAssist
- Ran BIOS hardware diagnostic (all came back clear)
- Repasted GPU & CPU
- Removed RAM and tested (all 4 sticks)
Things I've not tried:
- New power pack
- Windows rebuild (am gonna try this later when I get home)
- Reseating GPU
It all points to GPU failure but I'm hoping one of you can give me some ideas/solutions to try before I go cry a little in the corner on my own.
Thanks in advance guys and apologies for the long post!
TLDR - Laptop dies when GPU is under load. Machine runs fine otherwise. Bunch of stuff tested but still have the same problem. Pretty sure it's not temps.
-
Zoltan@zTecpc Company Representative
I would recommend to test the laptop with a different power supply.
-
MickyD1234 Notebook Prophet
Hi, it does sound like temps but that usually causes it to beep a few times before shutdown. To make sure it's not temp, start with a cold machine. Run Heaven 4 ( https://unigine.com/en/products/benchmarks/heaven/ ) and leave it running watching the temp and see what it reaches before it shuts down.
The PSU could well be triggering this, but in my experience when a PSU is overloaded it will not work again until you disconnect it from the mains to reset it.
I wouldn't bother with a win re-install, it's some hardware issue I'm sure (but at least it makes you sure if you feel so inclined)...
Good luck. -
Hi, thanks for the reply guys.
I've ordered a genuine Dell PSU to eliminate that as a possibility and I'll try Heaven again once I'm home. Temps were not getting silly when running it last time but it's worth looking back into. In fact, thinking about it now, Heaven didn't crash the system at all. The SupportAssist stress test did (as well as some games: Division, PUBG) but Heaven ran without any crashes. -
MickyD1234 Notebook Prophet
Sounds like the GPU could be on its way out. The 780m was the 680m with more cores opened up and a higher clock. I have seen a couple of failures here on NBR, and they are said to run hot (mid-high 70s at stock).
Do you have to unplug the PSU after a shutdown?
With Heaven running fine it could even be the CPU. Check that with HWInfo and turn on logging (it can be tricky not to get too much info). That way you may see something in trouble before a shutdown.
I take it the only item logged in event manager is Unexpected Shutdown?
The GPU test might actually be testing the on-board GPU, that can get quite hot when gaming even if the NV is active.
If you want to go the whole hog, use MSI Afterburner, set up the on-screen display and select each item you want to see when gaming. If it works in a game that crashes, you might get a clue from what is happening as it crashes.
Afterthought: remove the NV completely and run one of the games that trigger the shutdown. Hopefully it will run on the Intel, letting you know the on-board is fine. -
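As a rough illustration of the logging idea above: if the monitoring tool can export its readings to CSV, the rows written just before the log stops are the interesting ones, since a hard power-off leaves no other trace. The sketch below is Python with made-up column names (`gpu_temp_c`, `gpu_mem_mhz`) and made-up sample values; a real export will use different headers, so treat every name here as a placeholder.

```python
import csv
import io

def load_log(text):
    """Parse CSV text into a list of dict rows (one dict per logged sample)."""
    return list(csv.DictReader(io.StringIO(text)))

def last_readings(rows, column, n=5):
    """Return the last n numeric values of a column.

    The log simply stops when the machine powers off, so the tail of
    the file is the closest view of the state right before the shutdown.
    """
    values = []
    for row in rows:
        try:
            values.append(float(row[column]))
        except (KeyError, ValueError, TypeError):
            continue  # skip blank, truncated or malformed rows
    return values[-n:]

# Hypothetical sample data, not taken from any real log.
SAMPLE = """time,gpu_temp_c,gpu_mem_mhz
0,61,2497
1,68,2497
2,72,2497
3,73,2497
"""

rows = load_log(SAMPLE)
print("last temps:", last_readings(rows, "gpu_temp_c", 2))
print("last memory clocks:", last_readings(rows, "gpu_mem_mhz", 2))
```

Comparing the tail of a log from a crashing run against one from a stable run (for example on battery) may show whether a clock or power-state change lines up with the cut-out.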
I've not had to unplug the PSU to get it to fire back up again, and yeah, the event log only shows it as an unexpected shutdown unfortunately.
The GPU used to run hot, mid 80s to 90, but was always stable. A repaste lowered that significantly though.
I ran a test on the onboard (Intel) GPU last night and it worked fine with no crashes so I think the CPU is OK, but I'll run a bunch of tests when I get home and see if I can eliminate things one by one. I'll try your suggestions and see what I can find out. -
MickyD1234 Notebook Prophet
Looks like you are on the ball with all this stuff. It could well be that those temps shortened the life of the GPU. You should check if a power state on the GPU triggers it, like a throttle that reduces the voltage. The OSD while in-game will hopefully show that before it crashes.
Just FYI, but I have not seen a single person get a failure in the on-board diags, even if the GPU is clearly bad. Assuming the machine passes the test with the NV removed, you can be fairly sure the NV has failed somewhere. -
Tried a few things last night. It seems the onboard is fine, no problems there as far as I can tell. I did notice there were later drivers available, which I installed (clean install), and the machine died midway through the driver install. I powered back on and the drivers installed fine second time round. Weird, as there was no draw on the GPU at the time. I also found it was actually crashing during Heaven as well, so I decided to try running it again but off battery only. The machine auto-limits the GPU memory to 800MHz and it ran perfectly fine, but once I plug the PSU back in and it kicks up to 2500MHz it cuts out and dies.
I found a way this morning to control the power state so I'm gonna try that later once I'm home from work.
Tried to monitor as much info as I could but it's tough to see a change as the machine cuts out all of a sudden. It's becoming a giant pain in the backside!
As a side note, I got annoyed last night and bought an 880m to replace it, but I'm still gonna keep working on trying to figure out/rectify the issue. At least throwing in the 880m will confirm whether or not it's the GPU that's the problem. -
MickyD1234 Notebook Prophet
Good stuff. Just FYI but the GPU clock is also dropped on battery. I assume you're using NVInspector to set power states?
Sure seems we are working with a dodgy NV GPU
Good luck... -
Just a few questions. What are the clocks for your card in game? What you are describing doesn't sound like a bad GPU, it sounds like a bad overclock. Is there a chance that a program you are running has overclocked or severely undervolted your card? The GPU temps you are experiencing are well under the safe limit for your card. In fact Nvidia makes your laptop shut down around 94C, and even at those temps there is no permanent damage. I have had several cards go bad in R1s that I have owned and I have never seen one go partially bad. They almost always die completely, with a black screen and no signal from the display. It can happen, but the telltale signs are screen artifacting and hanging in game or during normal use. Since it is only causing trouble under load, it's far more likely to be an overclock setting, a power draw issue or a temperature issue. Considering the card has been repasted and you monitor it, I would say it is unlikely to be heat related, but an OSD would be best to diagnose. Also make sure to run with Optimus disabled to confirm your card is the problem.
-
MickyD1234 Notebook Prophet
I replaced the 675m with a 680m and that card is solid, overvolted and overclocked it never exceeds 74c. -
I have seen so many 780m's die in the last 6 months to a year... Mine went last October, followed by my work colleague's shortly after, and multiple 780m's here on NBR and on Reddit... it's like they hit the end of their life expectancy or something and are dying off in droves.
I would remove the GPU from the system, power up the laptop and make sure the problem goes away. If it does, then you can put the GPU back in and see if the problem returns. At least that way you can narrow it down to GPU and/or power. -
MickyD1234 Notebook Prophet
@MogRules
Yeah, I saw the same with the 580m/675m. It seems they 'burn-in' and then a driver update triggers all sorts of problems. Of course, you get all the conspiracy freaks saying it's designed in!
Usually by the time a sufferer posts they will have gone through a nightmare chasing a software issue.
For this one it only happens in-game on the NV, and the PSU is not tripping out so I'm looking hard at the GPU. -
Sorry for the late replies guys, work has been mental so I've not had a chance to try any more troubleshooting on the 780. However the new GPU (which I got cheap, the only reason I went for the 880 over the 780) and the PSU are turning up today, so I'll be able to try a bunch of stuff this weekend to figure out what's going on.
Thanks again for all the replies, I'll give all suggestions a shot and post my results. -
Hi guys, so it's been a mega busy couple of weeks but I've finally got around to trying a few more things. First off, the 880m was a bust, DOA when it turned up, so that got fired straight back. Second, I've purchased 2 PSUs so far, neither of which has been the correct part even though the listings showed them as such (it's a nightmare to find a PSU in the UK!!!) so I'm still stuck at square one.
I've checked the clock speeds whilst in-game/running Heaven: core clock 849MHz and memory clock 2497MHz. Temps are holding firm at between 70 and 75. The laptop will still cut out randomly and there is no consistency. It can be within a minute or after 5-10 minutes. Temps are also inconsistent when it powers off, as it's cut out many times at 60-65 and other times in the low 70s. When it works for those few precious minutes of testing it runs exactly as normal, with no frame drops, stutters or slowdowns, then it will just power off.
If I run the machine off the battery it obviously drops the core and memory clocks to slow down power drain. Running a game or Heaven in this state causes no crashes whatsoever. Because of this I've also tried underclocking the card using MSI Afterburner, but it still results in a crash, and the stock BIOS only allows for a slight under/overclock.
I welcome any fresh ideas... would a modded VBIOS be worth a try? I also still want to try a new PSU but finding the right one is becoming a nightmare. -
Did you try what @MogRules suggested?
Assuming you have the 60Hz display you could remove the 780m and run using the integrated GPU. -
Sorry, yeah, I should have mentioned that. It works fine without the GPU installed, running on integrated graphics. The problem comes back when the GPU is refitted.
-
MickyD1234 Notebook Prophet
Yeah that does not help since the power draw without the 780m will be so low.
I've bought a couple of genuine PSUs from eBay, just check for OEM parts. -
MahmoudDewy Gaming Laptops Master Race!
As for PSU, there is a supplier here in Germany called mtXtec, whom I bought a new PSU from for 60 euros at the time. You can see if they deliver to the UK.
There's an Amazon listing from someone selling their PSUs:
https://www.amazon.de/gp/aw/d/B01CMF76NE/ref=ya_aw_od_pi?ie=UTF8&psc=1 -
Another quick update. I eventually got a new working PSU, which didn't rectify the problem. I spent some time at the weekend tearing down the laptop and trying various things with no luck. Tried removing the original RAM and replacing it with a brand new 8GB stick. I checked the placement of the thermal pads as suggested by @MahmoudDewy, all of which look to have good contact. Also tried a powered cooling pad, which reduced the GPU temp by a further 10-15 degrees, but I'm still having the same crashes at intermittent points while under load... It even managed to run Heaven for over 15 minutes at one point before cutting out, but most other times it can die within 10-20 seconds from a cold start.
Sooooo what to do next... well, I figure the only things it can be are the GPU or the mainboard. However, I doubt it's going to be the mainboard, simply because there are absolutely no other issues with the machine and no cut-outs with the GPU removed. Do you think it's worth contacting Dell? Its 3 year warranty ran out last July so I very much doubt they will help, and if they do it will prove to be very costly.
Can anyone recommend a UK based seller? The only ones I can currently find are on eBay and I've had very little luck on there recently.
Thanks again guys -
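The elimination so far can be written down as a small sketch. This is only an illustration in Python: the suspect labels and the pairing of each test with what it rules out are a reading of the posts above, not a diagnostic tool, and "ruled out" really means "made unlikely".

```python
# Suspects at the start of troubleshooting (labels are illustrative).
suspects = {"drivers", "temps", "RAM", "CPU/iGPU", "PSU", "GPU", "mainboard"}

# Each test reported in the thread, paired with the suspect it makes unlikely.
tests = [
    ("DDU clean install of newer and older drivers, still crashes", "drivers"),
    ("Repaste plus cooling pad, crashes even at 60-65", "temps"),
    ("Brand new 8GB stick fitted, still crashes", "RAM"),
    ("Runs fine on integrated graphics with the 780M removed", "CPU/iGPU"),
    ("New working PSU fitted, still crashes", "PSU"),
]

def narrow(suspects, tests):
    """Return the suspects left after discarding everything a test has ruled out."""
    remaining = set(suspects)
    for _description, ruled_out in tests:
        remaining.discard(ruled_out)
    return remaining

remaining = narrow(suspects, tests)
print(sorted(remaining))
```

With the tests listed, only the GPU and the mainboard are left, which is why swapping in a known-good card (or trying this card in another machine) is the remaining test that can separate the two.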
This is interesting because I'm having EXACTLY the same issue, but with a 970m instead. Everything you've done I've done too, short of just replacing the card. I still had my old 180W (I think) PSU from when the laptop came with an 860m, and it kept the 970m in the 2nd highest power state (P1 instead of P0) when running anything graphics intensive, but it still shut down a couple of times using that instead. It also only started about 2-3 weeks ago. I'm looking at getting hold of my old 860m and testing with that installed.
-
That's the only thing I've not tried: my GPU in another machine or another 100% working GPU in my machine. It's become super frustrating because I have no easy way of figuring it out without buying a whole new card. I even resorted to calling Dell earlier in the week and the guy dialed in and figured it's most likely a dead GPU as well. Out of sheer curiosity I asked him how much a replacement 780m from them would cost. I almost fell off my chair when he said £720.
It would be great to hear your results if you are able to test your old card in your machine. -
If you're going to replace the GPU there are much better options, such as a 970m or 980m, or even a 1060. -
I wouldn't dream of paying so much. I'd upgrade to a 980m from a reseller before going anywhere near Dell for a replacement
-
After all the testing, including new PSUs, a DOA 880m and many, many more things, I finally have my AW17 R1 back up and running. The fix was a replacement 780m GPU. Luckily I was able to find a Clevo one in good condition at a very reasonable price on eBay. Cleaned and fitted this morning and so far so good. I've run it through a bunch of stress tests along with a day of gaming and no crashes. So touch wood it will last me a little while.
Thanks to everyone who helped out in this thread, great bunch of guys. -
I wasn't able to get a hold of my old 860, but I have been able to use my 970 in my laptop by modding the GPU fan to be 100% on at all times and keeping my GPU throttled to the P1 powerstate instead of P0 powerstate until I can get another 970 or go back to the 860.
-
Hi, I'm having the same problem. My question is: if I have to change the graphics card, can I upgrade it to the 780M, or should I buy a more current, compatible one? I have an Alienware 17 with a GTX 770M.
-
Hi everyone. I'm facing a similar problem with my Alienware 18 (2014 version). When I reinstall the game it works fine for a day or two and then starts giving trouble again. Either it turns off and back on again, or the screen stays blank until we start it again.
-
You guys should really make new threads.
Powers Off Mid Game/Under Load. Possible Dead GPU? (AW 17 R1, GTX780m)
Discussion in 'Alienware 17 and M17x' started by SteveMonk, Apr 4, 2017.