Hi, this is my first post...
I have an XoticPc Sager 9262 laptop with dual 8800m gtx video cards (SLI) that I purchased in April 08. I am now experiencing problems that I believe point to a hardware failure of one or both cards. I would therefore appreciate a tech-savvy opinion as to:
1) how do I identify whether the problem is hardware or software (or both) related?
2) if hardware, which card is at fault (if not both) and how do I go about replacing/repairing them?
3) if software, what should I do?
The only way to get the laptop to work is to disable both cards via device manager. After doing that, all other functions, e.g. internet, work perfectly. The issue of course is no 3D capabilities.
For background, I have uninstalled drivers for both cards and reinstalled drivers that have worked in the past. When I re-enable the cards and restart the pc, I get dot-like artifacts on screen from the BIOS password prompt onward, and the pc fails to load Vista at all - I then have to pull the plug, restart in safe mode and disable the cards again to get the thing working. I am running Vista 64 bit. Help will be appreciated.
-
Welcome to the NBR forums.
your issue is quite common.... and the first thing we always ask is:
"When was the last time you opened up your system to thoroughly clean out the fans and vents from dust clogs?"
- If your answer is never, then you have caused overheating damage to your components (namely the videocards)
..... you are supposed to do a regular cleaning of the vents almost monthly.
As for my questions:
1) does the system boot up fine with both videocards?
2) what are the temps of the videocards when they started to fail (which you hopefully were good about monitoring)?
3) have you opened the system and physically checked the videocard modules?
4) have you taken out the videocard modules and re-secured them back into their MXM slots?
5) have you also re-secured the SLI bridge cable?
you should always try to be as comfortable with your hardware as possible.
go grab a Clevo D901C (aka. Sager 9262) Service Manual from theriko's sig.
-
Hi Gophn,
First of all thanks for the quick reply! In answer to your questions:
1) does the system boot up fine with both videocards? - yes, albeit both need to be disabled via device manager first (via a safe mode load)
2) what are the temps of the videocards when they started to fail (which you hopefully were good about monitoring)? yeah, good point. No idea. Though I did have Nvidia System Monitor, I wasn't looking at the time.
3) have you opened the system and physically checked the videocard modules? Yes. Nothing looks burned or otherwise physically damaged. Nor were they clogged with dust. I regularly open up the back and clear dust from the fans and heat sinks. I also run fn+1 when playing games to max airflow from the fans.
4) have you taken out the videocard modules and re-secured them back into their MXM slots? yep, done that, though it is possible I didn't do a thorough job. That said, the problem hasn't changed.
5) have you also re-secured the SLI bridge cable?
yes again, though only on one side - given the other is underneath some packaging.
By the way, when I run GPU-Z, one of the cards shows zero memory (and DDR at that), while the other shows 512 MB and DDR3. -
My guess is that it's your second card. I have the same issue and one of my 8800m gtx cards appeared to be broken.
Uninstall both cards then reinstall the second card in slot one and leave the first one out.
Now see if you can boot up with the drivers installed.
In case you don't have the drivers installed during this test, boot up and install them.
My guess is that it won't be able to finish the install or won't boot during restart. In case everything works fine, try the same test with the other card.
I hope this helps and hopefully you have a card under warranty. -
As said, physically remove one card and just run the system on the other card to see if it's stable or not.
repeat the process with the other card to single out which one might be the issue.
if both cards run fine singly, then make sure to check the SLI bridge cable.
also, take apart the videocard's heatsink assembly to re-apply thermal compound to the GPU itself (leave the memory thermal pads alone). -
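The one-card-at-a-time test suggested above boils down to a small decision table. Here is a minimal Python sketch of that logic; the function name `diagnose` and its parameters are made up for illustration, and the conclusions are only the likely culprit based on the advice in this thread, not a definitive diagnosis.

```python
from typing import Optional


def diagnose(card_a_ok: bool, card_b_ok: bool, sli_ok: Optional[bool] = None) -> str:
    """Map the results of the isolation tests to a likely culprit.

    card_a_ok / card_b_ok: was the system stable with only that card fitted?
    sli_ok: was the system stable with both cards and the bridge fitted?
            None means that test has not been run yet.
    (Hypothetical helper; names are assumptions, not from any real tool.)
    """
    if not card_a_ok and not card_b_ok:
        return "both cards fail alone: suspect both cards, the slot, or drivers"
    if not card_a_ok:
        return "card A is the likely fault"
    if not card_b_ok:
        return "card B is the likely fault"
    if sli_ok is None:
        return "both cards pass alone: test them together next"
    if sli_ok:
        return "no fault found"
    return "both cards pass alone but SLI fails: check the SLI bridge cable"


if __name__ == "__main__":
    # Example: only card B misbehaves when tested on its own.
    print(diagnose(card_a_ok=True, card_b_ok=False))
```

Testing each card in the same slot keeps the slot itself out of the picture, so a failure points at the card rather than the motherboard.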
Well, good news - your suggestions enabled me to quickly identify the problem card and I now have a fully operational system, albeit with one, not two, 8800m gtx's. Speaking of which, can the defective card be repaired or is it totally toast? And I'm assuming I have to get another 8800m gtx to match rather than upgrading one of the cards? As is always the case, I'm just out of warranty on the card - them's the breaks...
Thanks again Gophn and The Voyager. Owe you one.
Lexicon1 -
Many of the Nvidia 8xxx cards have been known to fail due to a poor soldering issue. One way to fix it yourself is through baking the card, as exemplified through this thread ( http://forum.notebookreview.com/showthread.php?t=437683). There are also repair services that do essentially the same thing, although (hopefully!) using actual soldering gear to remelt and reflow the solder. That's probably the only way to fix your old card, unless the problem is something else, at which point, that would depend on exactly what else the problem might be.
-
don't want to hijack your thread Lexicon, but since I'm in the same situation I might.
I wonder if this is caused by a solder issue (see pictures)
Note the purple colored characters, also it results in boot failure when the drivers are installed.
If uninstalled I can boot up but get the dots on the windows boot screen.
The first was taken during several attempts to boot and the second during a disk error check.
http://img692.imageshack.us/img692/7155/8800mgtxbug1.jpg
http://img40.imageshack.us/img40/2383/8800mgtxbug2.jpg
If so I might give the oven a try. I do know lots of soldered gear fails these days because of the new RoHS regulations (meaning there's no more lead used in the solder),
resulting in solder cracks. This solder also doesn't flow as well and takes more time to heat than leaded solder. But it's better for the health of course.
cheers -
ahhh look...hijack away. I'm not precious.
Actually, I think it was the slot A card that was defective, but I wasn't really paying that much attention.
All I know is that I tried two cards separately per your suggestion and only one worked. Logically, this tells me it's the card, not the motherboard. But as you can tell, I'm not fully up to speed with this hardware game. Baking is not an option at this point given things are working. -
The 'error' I was getting with the faulty card was dots across the screen from the BIOS onward, followed by a crash.
-
Why isn't baking an option? You're not supposed to bake your whole motherboard, mind, since you can actually remove your graphics card. All you have to do is remove the heatsink from the faulty card, and bake the faulty card only. This way, you can hopefully have a functioning system with 2 working cards again, instead of just one. Of course, if you've gotten rid of the faulty card already, then yes, baking wouldn't be an option.
-
erm...they say the truth is stranger than fiction. The baking thing worked, kind of...
Now both cards work in slot A, whereas previously, only one did. However, when I hook both together via the SLI cable, it only recognises the existence of one card and that doesn't run properly. However, I no longer get artifacts...
So perhaps the SLI cable or slot B is the problem?
Thanks Judicator for the encouragement! -
You're welcome. Now, of course, you just need to get a known working SLI cable to see if you can get SLI working again...
-
Hmm...getting SLI working again is proving somewhat problematic. Before I go and order a new SLI bridge can someone suggest other alternatives? To recap:
- Both 8800m gtx cards are definitely working - I've physically removed each card (then run the system with only one card at a time) and can run Crysis (for example) perfectly using either card.
- I have then re-attached the SLI bridge and reinstalled 186.81 drivers. Thoroughly cleaning the system of drivers first seemed to be the answer to my earlier issue of the system not recognising both cards properly. After re-installing the drivers, the system allows me to enable SLI mode. Note I'm confirming what Device Manager shows using GPU-Z, which identifies both cards and reports that SLI is enabled.
- HOWEVER, when I run games (and presumably other 3D apps?) in SLI mode, I get a blank screen or blocks/static patterns across the screen followed by a system crash. As above, these same games will run perfectly when I disable SLI via the NVIDIA control panel.
So is it likely to be the bridge or should I look for a solution elsewhere first? -
It sounds more like the baking either didn't fix everything, or else it did something unhealthy to the SLi functionality. It sounds like when SLi is activated, the cards are getting corrupted data across the bridge, which is causing them to crash.
I recall that in some systems it was possible to go SLi without the additional cable bridge, albeit somewhat slower than with the SLi-specific bridge, as the cards communicated with each other over the bus; have you tried that? -
...and your recollection is correct. No SLi bridge=no issues!
Games etc now work perfectly in SLi as confirmed via the SLi indicator (moving green bars). I note that Split Rendering is no longer an available option in the Nvidia control panel - does that make sense given the bridge has been removed? I also received a notification that I no longer had a bridge and that it would not be running optimally as you indicated - is it worth getting a new bridge or not at this point? While you consider that, I'll run some benchmarks... -
Mutant_Tractor Notebook Evangelist
The bridge allows full 16x SLI, I think?
Well, in desktops, cards can apparently run in SLI without a bridge, although the performance is nowhere near as good as with one. Bridges only cost about £4 delivered from what I remember -
Well, at least it's good to see that my memory hasn't become totally corrupted with age (and disuse). It makes sense that the performance in SLi without the bridge would drop a bit, because the cards no longer have a dedicated communication bus between themselves that they don't have to share with any other component. By forcing the cards to use the main FSB (or its Nehalem-based replacement), the cards will have to take their turns communicating and will on occasion be delayed a bit.
I wouldn't think that just getting a new bridge cable would solve the problem, however, because my speculation is that the damage that was causing the SLi problems when you had the bridge connected is probably on the cards themselves, at the point where the bridge connects to the cards, so just replacing the bridge cable itself is unlikely to help at all. -
Mutant_Tractor Notebook Evangelist
I think the Nehalem-based replacement is QPI (QuickPath Interconnect), as the FSB runs at 133MHz on all chips
-
QPI is only on the 1366 Nehalem chips, though. 1156 and mobile both use DMI, with whatever FSB replacement they have in the memory controller on-die.
-
How does QPI set up with the GPUs? I've been out of the saddle for a while, and haven't really paid much attention to how system communications actually take place using the QPI architecture. In particular, how does it integrate with SLi?
-
As far as I know, SLI works just fine. QPI largely just replaces the northbridge and FSB, so memory is now basically connected directly to the processor (which handles memory management on-die), and GPUs go through a chipset like the X58 which connects their PCI-E ports to the CPU through QPI. The Intel X58 page on Wikipedia ( http://en.wikipedia.org/wiki/Intel_X58) actually has a pretty nice diagram explaining things.
-
I ran 3DMark and am getting 12360 in SLi (no bridge) and 10247 on a single card (no bridge) on standard settings. Might get hold of a bridge if they are cheap to see what I should be getting for my rig - unless someone else knows?
Sager 9262 - q6700 quad core @ 2.66 GHz, 4gb memory, Vista 64 bit, 2x8800m gtx in SLi
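For anyone wanting to put a number on the bridgeless SLi gain, here is a rough Python sketch that works out the percentage improvement from the two scores quoted above (12360 in SLi, 10247 single). The helper name `sli_scaling` is just an illustrative choice, and the result says nothing about what a working bridge would add.

```python
def sli_scaling(single_score: float, sli_score: float) -> float:
    """Return the percentage gain of the SLI score over the single-card score."""
    if single_score <= 0:
        raise ValueError("single_score must be positive")
    return (sli_score / single_score - 1.0) * 100.0


if __name__ == "__main__":
    # Scores as reported in the post: 10247 single card, 12360 SLi without a bridge.
    gain = sli_scaling(10247, 12360)
    print(f"SLI gain over a single card: {gain:.1f}%")  # prints 20.6%
```

So the second card is adding roughly a fifth on top of the single-card score in this run.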
Is my video card broken?
Discussion in 'Sager and Clevo' started by Lexicon1, Dec 7, 2009.