Hey guys,
The other night my screen turned grey (with faint darker grey lines) and the computer was unresponsive. I had to hard reset the computer. It worked fine all night. Then in the morning I tried to turn it on and it would not boot past the Windows 7 logo OR the Windows 8 logo (dual booting on different partitions). It just froze at a black screen with no mouse or anything visible.
I booted into safe mode fine and uninstalled my GPU drivers, and used the Driver Sweeper program to ensure they were completely gone and it managed to reboot into Windows normally. I then downloaded the latest drivers for my 6990m and installed them, it rebooted fine.
The temperatures have been quite horrible lately so I finally decided to buy some compressed air and some new thermal paste (Antec Formula 7 Nano Diamond). I opened it up and followed the service manual. I found huge lint mats between the fans and the heatsinks for both the GPU and CPU. I repasted only the CPU as I didn't have time to remove the GPU heatsink; I just moved the fan to clean the dust.
I rebooted to find the temps A LOT better. I was just monitoring them at idle and all of a sudden boom! Another grey screen. Had to reboot and the same problem occurred, wouldn't go past the boot screen. Repeated the uninstalling in safe mode process, but this time when I reinstalled the drivers and rebooted, it wouldn't boot again. So there are currently no GPU drivers installed as I cannot use it in any other way than safe mode.
I believe this (especially the grey screen) is a sign of early GPU failure? My temperatures are idling at ~37-40C GPU, and 42C CPU (I didn't repaste the CPU very well, will do it again later).
Can you shed some light on this? Would I need to RMA the GPU? I live in NZ so shipping it to LogicalBlueOne in Australia (where I purchased the laptop) would be a little annoying to sort out, but would do it if it was the last resort. Any other tips would be greatly appreciated.
**edit:
Also forgot to add: I have tried disabling ALL startup items in MSCONFIG, as well as all running services (non Microsoft). Still problems.
Haven't tried older versions of the GPU driver though. However the first grey screen was on a quite old version as I did not update it for a while. Any stable drivers I could try?
Cheers all,
Luke.
-
yeah i think that sounds like the GPU is fried
-
How come it only does it when the catalyst drivers are installed?
working perfectly fine on the default microsoft vga drivers, minus the ugly 1440x900 res -
Also, before I cleaned out the dust my temperatures were pretty high and I was getting terrible performance in games, way worse than I used to... I always thought it was just a dust issue, and I haven't been able to test it with the dust removed because I can't install any drivers :/
-
i think it's running on the integrated chip, the HD 3000, as opposed to the 6990m when the drivers are not installed
also what temperatures was it at before you cleaned and repasted? if the performance went down when it got hot, it means it got throttled, so the temperature must have been way over 90C for that to happen. though unlikely, if that temperature lasts for a period of time, it might burn out the GPU -
The very fact that you allowed it to continue to run with extreme heat would have certainly fried the memory or, worse yet, the GPU. Always monitor your temps, especially on a laptop/notebook. I guess this is a reminder to all to be vigilant and clean the grilles once a year, or more often if living in dusty conditions.
PS- the HM series do not have an integrated GPU, only the EM series do. -
Right now his AMD 6990m is running in VGA mode when no other drivers are present.
I think you should definitely contact your reseller about this, maybe they can help fix it for a small fee. -
Temps when under load got very high on the GPU, around 94C (when propped up).
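For anyone logging their temps, here is a minimal sketch of flagging readings like these. The 90C figure is just the throttling guess mentioned earlier in the thread, not an AMD spec, and the 80C warning level and the function name are my own assumptions for illustration.

```python
# Hypothetical helper: thresholds are assumptions, not manufacturer limits.
THROTTLE_SUSPECT_C = 90  # throttling guess from the thread, not an AMD spec
WARN_C = 80              # assumed "getting warm" level, purely illustrative


def classify_gpu_temp(temp_c, warn_c=WARN_C, throttle_c=THROTTLE_SUSPECT_C):
    """Label a GPU temperature reading in degrees Celsius."""
    if temp_c >= throttle_c:
        return "critical"
    if temp_c >= warn_c:
        return "warm"
    return "ok"


# Readings quoted in this thread: idle after cleaning, load before cleaning.
readings = {"idle after cleaning": 40, "load before cleaning": 94}
report = {label: classify_gpu_temp(temp) for label, temp in readings.items()}

for label, status in report.items():
    print(f"{label}: {readings[label]}C -> {status}")
```

With the thresholds above, the 94C load reading lands in the "critical" bucket and the 40C idle reading in "ok". Adjust the levels to whatever your own card's documentation says.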
Yes, I was always checking my temps. I just never had a compressed air can to clean it out with until I decided to buy one today.
I would say clean out dust every 6 months at MAX. Mine isn't even a year old yet and there was so much dust.
Does the warranty not cover the GPU? -
if its still under warranty just rma it, even if its a hassle
main thing is that u dont have to pay that much (just for shipping).
btw, u dont really need a compressed air can to clean out dust. u can just take off the heatsinks and manually remove all the crap while doing a repaste of the cpu and gpu.
also, its always a wise thing to take out the two fans and take them apart for a thorough cleaning. u can just remove the small screws on the metal covers, take out the fans by pulling them up and clean them under running water (just the actual plastic fans of course, NOT the fan casing, cuz thats got the electronics in it) after washin i use a hair dryer to completely dry the fans off before reinstalling them again. makes a world of difference in fan noise
cheers -
alrighty, I've emailed my reseller, now awaiting their reply.
this sucks. today was my last exam for my first year of Uni, and I was looking forward to playing games these holidays. looks like it will be on my average desktop now -
You are lucky you even have a spare desktop to play on. And it actually looks pretty decent! Source games will run maxed out on it.
I dont have a desktop and if my laptop gives out on me, I will be stranded without any spare solution.
We have some old desktops at home, but they are taboo to play on - they have UniChrome Pro video adapters, which can run only half-life 1 and brood war, lol!!
And google loads slower than my laptop's booting time. -
hahah wow yeah I guess I am pretty lucky!
except for one thing, I go to university on the other side of the country to where I used to live.. not going home until the end of this month and my desktop is in my old room at home. so looks like another couple of weeks with nothing :/ -
Hang in there, these couple of weeks will be over and gaming will be restored soon
-
Just an update:
LBO emailed me and are willing to repair it under warranty. They have always been such an easy shop to talk with, I recommend them to ANYONE buying a Clevo in the AUS/NZ/Pacific region.
Will update once I have sent it and received it back! Thx for everyone's comments -
Meaker@Sager Company Representative
Would have been all the excuse I needed to pick up a 7970M! But best of luck getting it back in good shape
-
I got offered a discounted price to trade in my 6990m for a 7970m when I asked. However it was just out of my price range, because I am buying an SSD for my laptop :/
Maybe in a year or so I might upgrade! -
Meaker@Sager Company Representative
Well hopefully the next gen card won't need a factory bios mod to make it compatible.
But I suppose the 7970M will be a cheaper option at that point. -
Just an update:
It took a while but finally got the graphics card back! Turns out it was faulty and they thoroughly tested a brand new 6990m for me. They sent it back and also let me get free shipping for my ODD conversion kit
Purchased a 250GB Samsung 840 SSD and put that in. Along with the new graphics card and a fresh install of Windows, I once again have a beast. Thank you LogicalBlueOne! They were really helpful and easy to deal with. -
This is the classic case of solder joints coming loose due to thermal stress. Most likely on the memory chips, according to your description of the symptoms.
Chips get hot => expand, then cool => contract. This eventually fractures their connection to the board. The same problem has happened many times before (the most notable case with an AMD/ATI GPU being the XBox 360).
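A rough way to see why the size of the temperature swing matters is the Coffin-Manson style rule of thumb for thermal fatigue, where the number of cycles to failure scales roughly as the swing raised to a negative power. The sketch below only compares two swings against each other. The exponent of 2 and the 80C "better cooled" load temperature are assumptions for illustration, not measured values for this card; the 40C idle and 94C load figures are the ones quoted in this thread.

```python
def relative_fatigue_life(swing_ref_c, swing_new_c, exponent=2.0):
    """Return N_new / N_ref assuming cycles-to-failure N ~ swing ** -exponent.

    The exponent is an assumed rule-of-thumb value, not a property
    measured on any specific solder joint.
    """
    if swing_ref_c <= 0 or swing_new_c <= 0:
        raise ValueError("temperature swings must be positive")
    return (swing_ref_c / swing_new_c) ** exponent


idle_c = 40          # idle temperature quoted in the thread
hot_load_c = 94      # load temperature quoted in the thread (dusty heatsink)
cooler_load_c = 80   # hypothetical load temperature with better cooling

ratio = relative_fatigue_life(hot_load_c - idle_c, cooler_load_c - idle_c)
print(f"Estimated life with the smaller swing: {ratio:.2f}x")  # prints 1.82x
```

Under those assumptions, shaving the swing from 54C to 40C would buy a bit under twice as many heat/cool cycles. Treat it as an order-of-magnitude illustration only.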
Usually fixable by "bake card in the oven" (reflow solder) trick.
Be careful to monitor your temps from now on, especially for the memory chips. If your cooling system is not up to the job, the card could fail again (especially if the "new" card is in fact one repaired from the same issue). -
However I am not going to let it happen again, bought a cooling pad with a fan that I use whenever the laptop is on my lap or a blanket. Going to open it up every month (possibly every few weeks) and spray any dust out with the can of air I purchased. Also bought some Antec Formula 7 Nano Diamond thermal paste and repasted, it's working great!
It's summer in NZ and has been very humid lately, especially in my bedroom with my PC on as well. My temps at idle, sitting on the cooling pad without the pad's fan plugged in, are 41C GPU and 40C CPU!
Used the pasting in a line method for the CPU because of the rectangular shape, worked much better than the pea sized dot in the middle. -
Hello everyone,
This solder-joint problem has hit me for the third time, resulting in RMAs for my P150HM over the past 3 months, and I can't afford for it to happen again in the next 1-2 months. What can I do to prevent it? So far I've ordered a Zalman aluminium laptop cooler, I'm keeping FN+1 always on until the stand arrives, and I've raised the back of the chassis a little. Can undervolting help with this issue? Since I only have things to do on my computer like Photoshop, 3ds Max (not too complex scenes) etc. (my studies depend on this), I don't really need the 6990M's full power. Any help will be appreciated, thanks! -
Meaker@Sager Company Representative
Solder joint problems should not be that common, so I'm surprised.
Technically if you want to avoid them you want to minimise rapid changes in temperature. -
Solder joint problems have actually appeared very commonly in the past with certain thermal designs.
The XBox 360 GPU is the most prominent example. Everybody has heard of the Red Ring of Death.
The 8600M GT also had a similar problem (or the same one, in Apple MacBooks, resulting in a class action lawsuit).
I'd say there aren't yet enough reports about this 6990M board (or maybe a certain batch?) to form an equally prominent pattern, but there definitely appears to be some link. -
Meaker@Sager Company Representative
I meant for this system or AMD cards in general to be honest.
-
The P150HM is not that old, and the problem seems to take a long time (about a year) to manifest, hence it seems to be appearing only now. -
Meaker@Sager Company Representative
AMD don't create the packaging for Microsoft; they do for AMD MXM and desktop graphics cards. Nothing has changed from a regulatory standpoint, so I would be surprised to suddenly see issues, especially only on Clevo machines, considering all AMD chips are, like I said, packaged at the same point.
-
I don't get what you're trying to say.
There is a specific heatsink/cooler setup for the Clevo 6990M, and there is a specific board assembly. I don't see why this particular combination can't have some thermal flaw which manifests after about a year of average use (sooner if used intensively; note that several months ago we had a person with the same problem here, who overclocked his 6990M and used it to do some 24/7 processing). -
Meaker@Sager Company Representative
Sure let me explain, now I don't know your level of expertise so if this comes across as condescending it really is not supposed to.
So all graphics cards, MXM or desktop have the same setup:
Here is an image I generated to explain Nvidia's solder cracking issues. The solder I have marked in the picture is the one that was causing the issues, since it's in direct contact with the core it by far sees the greatest thermal stress.
Here is an image to show the product that clevo/MSI/Asus or any other big OEM will buy:
Since the equipment to attach the laser cut cores is so expensive AMD and Nvidia will do this themselves (well the fabs which produce the chips for them) which means they are responsible for the soldering that is under most stress (why Nvidia were sued directly).
We saw this when Nvidia's entire lineup of chips across multiple brands was faulty (the same machines and solder supplies are used for multiple products since, as I mentioned, they are expensive). While notebook chips are under more stress and so fail more often, there were also issues with the desktop chips since they are sold the same way.
This is not an issue I have come across with the AMD 6xxx series in any segment, and replacement cards are likely to still be from new stock, so seeing 3 cards die in a row would be very surprising to me. While not impossible, it would lead me to think a more likely issue may be handling, motherboard grounding or socket problems (leading to a short).
If I have left anything unclear feel free to ask -
The problem is not that the soldering is flawed (like it was with the 8600).
The problem is that the cooling is flawed (like with the X360). Bad cooling can cause even perfect soldering to fail over time, because temperature swings (and the resulting expand/contract cycles) go outside the range that the solder can handle.
BTW with this card it appears to fail not on the GPU, but on the memory chips. Hence it's generally not detected before the board fails (the memory chips lack a temp sensor on these boards). -
Meaker@Sager Company Representative
That's very odd because this level of GDDR is not even that hot and the power circuitry is much more liable to fail since it's attached to the same heatsink.
Since the HM and EM share the same heatsink, it had no problem clocking the memory on the 7970M to 1.5-1.6GHz and maintaining stability. -
Stability has nothing to do with it. It may never get hot enough to become unstable, but still get hot enough to fracture the solder over time. It's exacerbated by the fact that mobile boards generally have lower VRAM clocks, so they are more stable under high temperatures.
Again, it's exactly the same with overheating XBoxes: they work perfectly (as it seems), until they "suddenly" start glitching.
The VRAM heatsink is not thermally isolated in this design, so its temperature is tightly coupled with the GPU's. So if the GPU goes under load, it literally starts heating the VRAM. As far as I've measured it, this starts to happen after about 30 mins under load, when the entire assembly heats up.
The 7970M is a different board, using different chips and assembly, so it may behave differently. Or, again, they may just not have had enough time in the wild for problems to manifest. Again, it seems to take about a year of usual runtime with the 6990M for problems to start appearing. -
Meaker@Sager Company Representative
The VRAM heatsink is actually a different segment of metal in the HM series, and you can remove it separately from the GPU heatsink. While they sit together they are not actually connected, so they are not thermally linked.
This has also not struggled when cooling the power circuitry on the 680M; it fares just as well clocking as other brands, so it would certainly not be a cooling issue.
-
If it's not a cooling issue, why does the solder fail then?
It's a different heatsink alright, but it still somehow heats up along with the GPU. I don't know why, I am not a thermal design expert. It happens pretty slowly, but under constant heavy load the temperature of the VRAM heatsink eventually reaches the temperature of the GPU heatsink, and I suspect that the VRAM chips can't handle this. Or the VRAM chips are probably even hotter (if the thermal pads are not up to the job). I couldn't measure their temperature, only that of the heatsink.
I see no point in comparing with different cards. Other cards use different chips, and they most likely have a different heat flux. Maybe this 6990M has one that is particularly high for whatever reason. -
these idle temps look totally fine to me
nothing to worry about...
-
My 6990M failed recently too, in a similar way to how the OP's did. I'm currently waiting to save up enough money to buy a 7970M upgrade, but in the meantime I'm stuck with a busted 6990M. It works with Standard VGA Drivers, but the screen looks weird now with these vertical white lines all over the screen. Weird thing is though that my card never really overheated or anything at all (I use HWMonitor constantly, and the card never ran over the low 80's under load, and even then it rarely got that hot), it just suddenly did all this to me. The card failed on me just a couple weeks after my 1 year warranty expired...
-
this sucks, all these reports of hardware failing JUST outside the warranty period.... thats gonna be the case for my machine in about two months, oh good lord! O.O
-
Meaker@Sager Company Representative
You dont have the same card though...
-
i know, i was just panicking a lil bit in general
aside from my RAM, all components are actually pretty "new" in my system...
-
Just an update:
So the GPU kept on failing over and over again. Every time I baked it in the oven and it worked for about 1 - 2 weeks. It got annoying but I got so used to cooking my card haha I had nothing else to lose anyway.
Halfway through all this I ordered some new thermal pads from ebay. They took a while to get here.
They finally arrived when I was down to the very last bit of my thermal paste tube. I replaced all the pads (even the ones that didn't look worn out) and did one last bake and dust clean, and it worked. It's been going strong for over a month now! I hope this post doesn't jinx my card and make it die again tomorrow, but I'm pretty happy with how its turned out.
The pads cost me about $3 USD, sent from China or somewhere. Very cheap fix indeed -
So the moral of the story is....
Check your thermal pads on the VRAM and make sure they are doing their job correctly. I made sure of this when I bought my P170HM (it will be 2 years old in October '13).
Also, make sure you use good quality thermal paste. IC7 Diamond is my recommendation (buy from here: IC Diamond 7 Carat CPU Thermal Paste Grease Compound | eBay). I pasted it in October '11 and, touch wood, haven't had a single issue with the GPU or CPU, and I game with all the latest titles.
Ekulz, you're like me, we both bought from LBO around the same time. -
Yup!
I don't think the rumours about 6990Ms going defective after a year are true. My friend has an Alienware M17x that he purchased at the same time I bought my Clevo, and his 6990M has always been running fine. I think it's just the P150/170HM stock pads that get worn down and/or squished because of tight heatsink screws. Even if they don't look worn they may be very thin, so they don't absorb much heat, and it may pay to replace them. I think there were a few different sized thermal pads on eBay; I opted for the thicker ones. -
Meaker@Sager Company Representative
Well they should actually work better the thinner they are since they just need to pass the heat through to the metal rather than absorbing it into their own mass.
-
My 6990m died right before one year of ownership. Pretty much the exact same symptoms as the OP and others in this thread. Assuming I buy another Sager, I'll be getting the 3-year warranty.
-
Did you try any of the steps I did? Mine's working perfectly now!
Yeah, no matter what laptop I buy next, I see a $100 3 year warranty as one of the best purchases you can make with a laptop -
-
Meaker@Sager Company Representative
-
I think the stock ones are 0.5mm so I got the 1mm ones
Blue 100mmx100mmx1mm GPU CPU Heatsink Cooling Thermal Conductive Silicone Pad | eBay -
Meaker@Sager Company Representative
Well thermal transmittance goes down the thicker the pad is.
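To put rough numbers on that, here is a minimal sketch using plain one-dimensional conduction, where the temperature drop across the pad is power times thickness divided by (conductivity times area). The pad conductivity, chip footprint and per-chip power below are all assumed values for illustration, not specs for the 6990M or the eBay pads.

```python
def pad_temperature_drop(power_w, thickness_m, conductivity_w_mk, area_m2):
    """Temperature drop (K) across a pad for steady 1-D heat conduction."""
    return power_w * thickness_m / (conductivity_w_mk * area_m2)


# All values below are assumptions chosen only to show the scaling.
CHIP_POWER_W = 2.0           # assumed heat from one memory chip
PAD_CONDUCTIVITY = 3.0       # assumed pad conductivity in W/(m*K)
CHIP_AREA_M2 = 0.012 * 0.014 # assumed 12 mm x 14 mm chip footprint

drop_05mm = pad_temperature_drop(CHIP_POWER_W, 0.0005, PAD_CONDUCTIVITY, CHIP_AREA_M2)
drop_1mm = pad_temperature_drop(CHIP_POWER_W, 0.001, PAD_CONDUCTIVITY, CHIP_AREA_M2)

print(f"0.5 mm pad: {drop_05mm:.2f} K drop")
print(f"1.0 mm pad: {drop_1mm:.2f} K drop")
```

Doubling the thickness doubles the drop across the pad, all else being equal. This only holds if the pad actually bridges the gap between chip and heatsink; a pad too thin to make contact leaves air in between, which conducts far worse than any pad.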
-
Awesome. My 6990m died moments ago, same symptoms as before, while I was just surfing the internet. Third card to go belly-up in 8 months
GPU problems, 6990M Clevo P150HM
Discussion in 'Sager and Clevo' started by Ekulz, Nov 10, 2012.