Who knows about this topic? I've heard explanations that range from too much heat causing it to the combination of lead-free solder and too many thermal cycles being the problem.
The lead-free solder blamers advise having the chip 'reballed' with leaded solder, while the other group claims simply improving the cooling is the right solution. Perhaps it's some of both?
If I have to replace the motherboard again I am better off buying another laptop, but is there any way to be sure another laptop will be any better?
Ripping a laptop apart every so often to heat the IGP/NB or GPU seems like a bad solution to me...
-
Tsunade_Hime:
Are you referring to the Nvidia defective chips?
All G84M/G86M/G72M based Nvidia chips use cheap substrate material, so constant heating and cooling of the GPU causes it to expand and contract until the solder joints fail and the GPU gets disconnected from the motherboard, resulting in loss of video/artifacts. There is no permanent fix, and I would advise against trying to fix it. Rebaking the motherboard to reflow the solder joints could buy you as little as 2 days or as long as 2+ years. You can reball the GPU all you want but it will fail eventually down the road. I would invest in a laptop without those cores. AFAIK all Nvidia chips after the G84M/G86M/G72M shouldn't have any further issues. -
I've heard of the Nvidia defect. The flip chip (glassy die part) actually disconnects from the package (the part that connects to the motherboard). But I'm talking strictly about the connection between a chip's package and the motherboard.
If only the connection between the die and the package is failing, and that is only a specific Nvidia problem, why do AMD chips exhibit the same problems, and why do the same solutions work?
This happens with consoles too... but for some reason I've never heard of it happening to desktop video cards. -
It was a move to a different type of solder prompted by new EU regulations. In the name of simplifying design (I won't get too far into it, but different solders for the underfill require slightly different die designs), the new solder was pushed in without being fully tested. The most infamous example of this was the Xbox 360, which was fully developed in a 1.5 year span, with final dies only coming out less than 2 months before launch.
ATi chips from this era are less known to have such problems because:
at that time, ATi was taking a beating vs nVidia in performance, so their chips were rarely seen in laptops.
laptops with ATi chips normally used the lower end variants, which didn't put out as much heat as their larger cousins. -
I've heard about the solder change being the culprit before, but that doesn't explain why desktop graphics cards are unaffected. And what about CPUs?
My understanding of the Nvidia suit is that only a small percentage of their chips were affected because of mismatched materials, not because they chose the wrong one. Supposedly that was a small, isolated instance attributed only to Nvidia mobile chips.
Are the arguments about the ball grid array failing a myth, and should failure always be attributed to the die-to-package connection? If this is true, why are there still so many failures?
To put my frustration more simply: I don't want to buy a laptop that dies just past its one-year warranty and then have to weigh buying another one that might fail in another year, putting it in the oven in hopes that 'fixes' it for another year, or paying some guy who just bought a machine to bake it himself for some unreasonable fee.
What's further annoying is that no one really knows why. It happens to consoles old and new, it happens to laptops old and new, but it doesn't happen to CPUs or desktop GPUs. Why these specific chips?
I guess the best solution is to pay the extra money for a really long warranty and forget about it. -
-
In general I mean, i.e. out of all the chips Nvidia has produced.
-
Tsunade_Hime:
Unfortunately there really is no way to know for future reference, though I sincerely hope that Nvidia learned its lesson from GeForce 7/8/9 with those defective cores. AFAIK there have been no more issues with chips desoldering themselves since, but you may want to invest in a laptop with a discrete card that is removable.
-
Desktop cards only have to cool the GPU, and the larger cards often have an IHS to keep the die in place, or have custom cooling units (imagine GTX480/GTX580 - those coolers have to dissipate 300W of energy... all on their own!!) tailored specifically for the purpose. Not to mention, desktops custom built by enthusiasts often have decent cooling systems/airflow of their own.
Laptops, on the other hand.... even the massive Clevo & Alienware systems have teensy cooling systems that are barely adequate for the purpose - a limitation of laptop sizes, unfortunately. -
Desktop-side GF 7300, 7600, 8400, and 8600 series cards break quite often. The Nforce 430 motherboard chipset is also from the "same bad batch". All of those are close relatives of the laptop variants, and all of them suffer from the same problem. 8800 series GPUs are newer; I haven't seen/heard as many problems with them.
Frequent hot-cold-hot-cold changes contribute to a quick death; desktops don't see those as much since the heatsinks are bigger and heat up/cool down more slowly than laptops. -
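To put a rough sketch behind that heating/cooling-rate point: a tiny lumped thermal model (all the R/C numbers below are made-up ballpark values, not specs for any real card or heatsink) shows how the same on/off load pattern produces much bigger die-temperature swings when there is less thermal mass behind the chip.

Code:
# Lumped-capacitance thermal model: dT/dt = (P - (T - T_amb)/R_th) / C_th
# Illustrative numbers only -- R_th and C_th are guesses, not real GPU specs.

def settled_swing(power_w, r_th, c_th, on_s=600.0, off_s=600.0, cycles=20, dt=1.0, t_amb=25.0):
    """Simulate repeated load/idle cycles; return (min, max) die temperature
    once the pattern has settled into a steady rhythm."""
    t = t_amb
    lo, hi = float("inf"), float("-inf")
    for cycle in range(cycles):
        for phase_s, p in ((on_s, power_w), (off_s, 0.0)):
            for _ in range(int(phase_s / dt)):
                t += dt * (p - (t - t_amb) / r_th) / c_th
                if cycle >= cycles - 2:      # only record the last, settled cycles
                    lo, hi = min(lo, t), max(hi, t)
    return lo, hi

# Same 30 W load and the same thermal resistance; only the thermal mass differs.
small = settled_swing(30.0, r_th=1.5, c_th=60.0)    # laptop-ish: little copper
large = settled_swing(30.0, r_th=1.5, c_th=600.0)   # desktop-ish: big heatsink

print("small heatsink: %.0f C -> %.0f C each cycle" % small)
print("large heatsink: %.0f C -> %.0f C each cycle" % large)

With those made-up numbers the small-heatsink case swings roughly 45 C every cycle while the big-heatsink case only swings roughly 15 C; smaller, slower swings mean the solder joints get worked a lot less.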
Tsunade_Hime:
Good thing my Vostro 1500's 8600M GT is on a separate card. -
This whole topic shows you what happens when agenda-driven political animals dictate technical details.
-
Where it relates to what we are talking about is basically no more lead in solder. This has the unfortunate side effect (at least initially) of making solder more brittle. Actually, "cold solder joints" were always a problem in electronics. I remember BITD when computer CRT monitors were worth enough to necessitate repair. The majority of failures, especially with cheaper models, were cold solder joints. This was merely from the heating/cooling that a monitor goes through in normal use.
With computer chips (GPUs/chipsets especially) this was compounded by RoHS and the highly localized high temperatures. And yes, it affected more than just NVIDIA, but of course Charlie Demerjian felt obliged to create the whole "NVIDIA bumpgate" affair because of his personal hatred towards NVIDIA because NVIDIA cut him off after he broke NDA. -
I read this thread thinking it might be about Optimus. That would have been interesting. I thought that, yes, the hot/cold cycles could maybe create the issue we all discovered with the 8xxx series.
Instead I get this?
You all make good points but much is old and what is the point? If you still have an 8xxx GPU you have bigger problems.
Was this thread about thermal fluctuations GPU/IGP/Optimus from the start or no? -
-
The problem with lead-free solder (which I hate) is that it requires much higher temps to flow (roughly 217 °C and up for the common lead-free alloys versus 183 °C for eutectic tin-lead), and the fluxes used are not as effective at wetting it as they are with leaded solder. The problem first came about when many, many electronics manufacturing houses didn't adjust temps or fluxes and just threw lead-free solder into their PCB assembly machines. This led to cold solder joints where the solder didn't have the flow characteristics or the temperature to etch the copper traces, so the bond was merely a mechanical adhesion instead of being molecularly bonded.
A cold solder joint is one where there was insufficient heat & wetting action to melt a thin layer of the copper trace/lead and create a super-thin molecular layer of copper/solder alloy that interfaces the joint. Over time and heat cycling, a cold solder joint just breaks free, because the bond was purely mechanical and the two metals were never molecularly joined. Lead-free solder is also harder than leaded solder, so even when things are molecularly joined there is less flex and give, and the connection can shear in fewer cycles, especially where the expansion coefficients of the joined parts differ.
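To hang some rough numbers on that shear/expansion-coefficient point: below is a back-of-the-envelope sketch using the generic distance-from-neutral-point strain estimate plus a Coffin-Manson power law. Every value in it is an assumed ballpark figure for illustration, not data for any actual GPU package, and the absolute cycle counts mean nothing; only the scaling matters.

Code:
# Back-of-the-envelope thermal-fatigue estimate for a corner BGA ball.
# All inputs are assumed ballpark values, not measurements of any real part.

def cycles_to_failure(delta_t_c, dnp_mm=17.5, h_mm=0.4,
                      delta_alpha_ppm=8.0, eps_f=0.325, c=-0.5):
    # Shear strain per thermal cycle in the corner joint:
    #   gamma ~= DNP * (effective CTE mismatch) * (temperature swing) / (joint height)
    gamma = dnp_mm * delta_alpha_ppm * 1e-6 * delta_t_c / h_mm
    # Coffin-Manson low-cycle fatigue: gamma/2 = eps_f * (2*Nf)^c, solved for Nf
    n_f = 0.5 * (gamma / (2.0 * eps_f)) ** (1.0 / c)
    return gamma, n_f

for swing in (15.0, 45.0):   # smaller desktop-style swing vs bigger laptop-style swing
    gamma, n_f = cycles_to_failure(swing)
    print("dT = %2.0f C: strain ~ %.4f, cycles to failure ~ %.0f" % (swing, gamma, n_f))

With c = -0.5 the cycles-to-failure scale roughly with the inverse square of the temperature swing, so tripling the swing (and a laptop can easily see several big swings a day) cuts the joint's life by nearly an order of magnitude. That's the mechanism in miniature: mismatched expansion plus a stiffer, poorly wetted joint equals fewer cycles before it lets go.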
In OLD electronics with failed lead-solder joints that were once good joints, the heat expansion and contraction from excessive current or improper heatsinking through the joint causes the crystalline structure of the solder to gradually break down, causing higher resistance and more heat dissipation through the joint; it then turns brown from heat & oxidation and eventually shears from the connection to the component lead. The funny thing is that the joint never fails on the PCB trace side of the connection (unless it's the solder pad ripping away from the connecting trace). The failure of the actual solder joint is always on the component lead side, because the expansion coefficient is much lower in the component lead and the bonding surface area is also smaller. Also, the PCB trace copper melts more and creates a better molecular bond to the solder than the component lead, which is heatsinked through the component and thus stays cooler, so its copper/solder alloy layer is thinner and more abrupt in its boundaries. A larger copper/solder alloy boundary kind of "bridges" the two different expansion coefficients of the solder and whatever it is attached to: instead of a hard boundary between differing expansion coefficients, there is a slightly more gradual transition due to the copper/solder alloy layer.
I can't tell you how many times I've had to deal with lead-free solder BS in my work since 2006. With BGA, if the manufacturer doesn't x-ray each part to make sure there is a good connection, there are gonna be LOTS of failures if their line isn't kept to 110% perfect working order and tolerance. I've been to China many times, and close manufacturing tolerance on their lines with respect to temps is not happening unless you sit there and babysit them.