Beware! nVidia SLI Agonies via SW GPU Killer!

This topic is closed.

  • Beware! nVidia SLI Agonies via SW GPU Killer!

    Brad... Please forgive the length of this post...but due to so many technical problems involved in the explanation... I just had to fill in all the details:

    I'm not into online FPS gaming. For the most part, my job as Dad and Hubby in the house (besides fixing and maintaining anything that moves on wheels) is to keep all of the many computers in our house up and running for four people (one of whom is away at Vet School in Missouri right now). My son has a specific interest in the gaming side of computing and is very touchy about having whatever gives him the leading edge while online. So when I build something for him... I always try to use the best and most reliable combination of components that I can grab from eBay, Amazon, or the local CompUSA and occasionally Best Buy stores... and always for the least amount of Moolah that I can spare.

    But during my latest foray into a complete gaming system upgrade, I stumbled over something I did not anticipate, and it wound up costing me a bundle to fix because of my ignorance of the problems involved. I feel the need to share this experience with those who might spend their time and cash this weekend on the same idea, and hopefully save them the same grief.

    Basically this all started with his wanting dual GPUs to handle the above-the-mobo graphics processing, so I had my eyes peeled for a mobo that had either Crossfire or SLI capability. Lo and Behold... I found a refurbished XFX 780i nForce 3-Way SLI board listed for only $99.00 on Amazon (it turned out to have never been opened and was brand spanking new!). That eventually led to my hunt for a card to match the XFX Nvidia 9800 GT he was already using, and it came in the form of a BFG 9800 GT 1GB card for around the same sale price at CompUSA within the last week. While at CompUSA, I saw their nVidia CoolerMaster Gaming Case on sale, so I picked one up. I thought “So far... So Good!” and figured I might as well go Whole Hog while the getting was good.

    When I had all the parts gathered, I simply copped the Intel Core 2 Duo (45nm) Penryn from his old board, the two 2-Gig sticks of Corsair 800 MHz SLI-Ready gaming memory he was using, the 1000W PSU, his Seagate 1TB HD and of course your basic SATA DVD Burner/Player out of the old machine. The only thing left to consider in the upgrade equation was the operating system. I poked around and figured that with DirectX 10 and 11 working in Windows 7, and with that OS being the latest from Microsoft, it would, unlike their MS-Vista FUBAR, NOT have more Bugs in it than a Bait Shop. The Bench Techs advised getting an OEM Windows 7 Professional flavor because it would let the Core 2 Duo breathe easier at 64 bits... while still being able to run the legacy 32 bit apps, along with some other "goodies". So I sprung a leak in my wallet and picked one up while standing there.

    Okay... You pretty much know the rest... How all the HW went together nice and sweet in a case that is graveyard quiet and roomier than a Rich Man's Marble Crypt. The WIN7 OS booted up and loaded or downloaded all of the necessary LATEST system component drivers... and THAT is where the problems began. I should say right now that there was a significant design difference between my son's XFX 9800 GT (which takes power from the PSU via a dedicated PCIE 6 Pin Connector) and the BFG 9800 GT, which is of a more EPA friendly design and gets its power from the mobo slot. In order of operations... I installed the newer BFG 1GB card first, and because the SLI Bridge Connector was a hard and inflexible version vs. the usual flexible copper and plastic cabling variety, it forced me to put the XFX 9800 GT card into the third PCIE 16X slot instead of right next door in the second PCIE 16X slot. After an initial re-boot and some shenanigans of trying to get the nVidia Systems Monitor to recognize and set up the card... I got Bupkus...Nada...Nunca...Nyet... Nothing. Period. I spent the next day and a half trying every possible combination imaginable of HW and SW and BIOS tweaks to no avail.

    Finally I stumbled across tons of threads on the theme of “NVIDIA'S LATEST DRIVERS ... GRAPHICS CARD KILLER” and I started to worry. Sure enough... nVidia let out the 19X.75...something, something driver that causes the on-board heat sink cooling fans of the GPU units to TURN OFF! They tried to put out a newsletter about “...pulling the 19X.75... drivers” off of their site, but judging by the volume of unhappy nVidia SLI upgraders... the damage was already rampant... I am just one of its latest victims!

    Meanwhile... the bloggers were intermittently blaming either nVidia or Micro$oft or both while lamenting the loss of their cards and wondering what to do about it. So I too must join this sad brigade of those who, in their ignorance of the “power saving protocols” being foisted on all of us "electricity wasters", used at least one nVidia card that silently loses its cooling fan on boot up if the card is powered via the PCIE six pin connector, unlike the newer versions that take their power from the main board.

    Now I think there might be a solution to this problem on legacy cards... and that is to just mount an external cooling fan that is powered via a standard MOLEX 12V power source, neither metered nor monitored by any on-board HW from nVidia and/or their Effed Up Drivers... and that maintains a constant “high” enough velocity to keep the GPU cool at all times... regardless of the duty cycle demands being placed upon it. I hope this unfortunately epic long post serves to warn those interested in taking the Red Pill of SLI upgrades instead of The Blue One. If you insist on keeping the HW in the stock factory settings and attempt any driver upgrades in Windows 7, please investigate your options to manually install the earlier, functioning drivers from nVidia (19X.21.......) before installing the PCIE cable powered cards. I hope this long read saves you the time and trouble of finding this all out the hard way. I am interested in other ideas on how to stop this from occurring any more. Thanks in advance for your help.
    Last edited by 60dgrzbelow0; 03-21-2010, 11:55 AM.

  • #2
    Umm....

    I don't think the drivers killed your card. The driver only kills cards under load. It doesn't shut off fans; it sets the speed very low, i.e., 30% of the max speed, as opposed to a more proper rate of speed. 30% is more than enough to keep an idle card cool. (Or at least under 80°C, at which point damage may occur.)

    The damage occurs under load, while playing games. And these problems don't involve the card not being recognized. For example, the problem that I ran into when I installed these drivers was this: after noticing that the temps were at 90°C in-game, under load, I turned the computer off. Upon rebooting, I discovered that particular pixels were miscolored or flashing, a result of damage to the memory chips or the lines that connect to them. However, after leaving the computer for a day, those problems were gone.

    Furthermore, these drivers have been recalled. Simply roll back the drivers.
    Last edited by CorporalAris; 03-21-2010, 01:03 PM.

    • #3
      i'm a little bit of a computer buff, so this certainly does interest me...
      1995 Monte Carlo LS 3100, 4T60E...for now, future plans include driving it until the wheels fall off!
      Latest nAst1 files here!
      Need a wiring diagram for any GM car or truck from 82-06(and 07-08 cars)? PM me!

      • #4
        There was news about this in the pc hardware market a few weeks ago. Someone at nVidia fudged the bit of programming that adjusts the cooling fan vs gpu temp in a driver release. As a result, certain videocards were overheating themselves to death. New drivers that correct the issue were released on the 17th.
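
For anyone wondering what "the bit of programming that adjusts the cooling fan vs gpu temp" might look like, here is a minimal Python sketch of a temperature-to-fan-duty curve. It is not NVIDIA's code: the function name `fan_duty_percent` and both curves, including every number in them, are made up for illustration. The second curve mimics the failure mode described in this thread, where the fan duty stays pinned near 30% however hot the GPU gets.

```python
def fan_duty_percent(temp_c, curve):
    """Linearly interpolate a fan duty cycle (%) for a GPU temperature in Celsius.

    `curve` is a list of (temperature_c, duty_percent) points. Temperatures
    outside the curve are clamped to the first or last point.
    """
    points = sorted(curve)
    if temp_c <= points[0][0]:
        return points[0][1]
    if temp_c >= points[-1][0]:
        return points[-1][1]
    for (t_lo, d_lo), (t_hi, d_hi) in zip(points, points[1:]):
        if t_lo <= temp_c <= t_hi:
            frac = (temp_c - t_lo) / (t_hi - t_lo)
            return d_lo + frac * (d_hi - d_lo)


# Hypothetical curves: illustrative numbers only, not taken from any real driver.
HEALTHY_CURVE = [(40, 30), (60, 50), (80, 85), (90, 100)]
STUCK_CURVE = [(40, 30), (90, 30)]  # duty never rises with temperature

for temp in (45, 70, 90):
    print(temp, fan_duty_percent(temp, HEALTHY_CURVE), fan_duty_percent(temp, STUCK_CURVE))
```

With the healthy curve the fan ramps up as the card heats; with the stuck curve it sits at the idle setting even at 90 C, which is the kind of behavior that would let a loaded card cook.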

        Last edited by Azrael; 03-21-2010, 04:37 PM.
        1995 Grand Am SE

        • #5
          Originally posted by Azrael View Post
          There was news about this in the pc hardware market a few weeks ago. Someone at nVidia fudged the bit of programming that adjusts the cooling fan vs gpu temp in a driver release. As a result, certain videocards were overheating themselves to death. New drivers that correct the issue were released on the 17th.

          http://www.nvidia.com/object/196.75_...r_support.html
          Thanks for backing up my reality Azrael...

          Whew...for a moment there... I thought that my own empirical experience with the card dying was just a mirage and I can resurrect it by "simply rolling back the drivers". Corp..... I doubt that those who might benefit from this warning are willing to wait for a raft of geeky software engineers to de-differentiate the code of this f*cked up driver. No amount of discussion can convince me and possibly thousands of other users that this is NOT what caused the problem... no other explanation will do. My lengthy text was not meant to stir up a debate about whether or not the driver caused this problem. It did. That is it. And so the purpose of writing this message is to make certain that others follow this cautionary tale and not wind up with something "Old and Busted"... while trying to build some "New Hotness" into their computers. 'Nuff Said...
          Last edited by 60dgrzbelow0; 03-21-2010, 09:02 PM.

          • #6
            Uh... I didn't mean that it DIDN'T screw something up, but I had a personal experience with the drivers that I thought people could benefit from. For me, simply rolling back the drivers worked.

            I'm just curious because it sounds like your card is dead. And if it's dead, why was it so hot for so long?

            • #7
              Originally posted by CorporalAris View Post
              Uh... I didn't mean that it DIDN'T screw something up, but I had a personal experience with the drivers that I thought people could benefit from. For me, simply rolling back the drivers worked.

              I'm just curious because it sounds like your card is dead. And if it's dead, why was it so hot for so long?
              How long do you think it takes to kill one of these things, absent proper thermal protection? Let's look at the facts in my situation. As with close to one hundred other assemblies, I followed all the proven experiential protocols and orders of operations on this build... which includes the oft overlooked part about "RTFMF,S" (Read The F*cking Manual First, Stupid). Those members in here who know me will tell you... I will invariably follow such instructions to the letter.

              So, it never occurred to me that the cooling fan would NOT be running, what with the card having been fully operational on another board and with a Bright Red Six Pin PCI-E Power Connector thoroughly and solidly plugged in... or that during the initial set up period of 30 to 45 minutes those busy little electrons would be churning and burning their way through all those little gold wires that are less than a thousandth of the thickness of a human hair. Without the means to vacate the heat from what is arguably one of the better, real copper heat sinks found on performance cards... the heat sink becomes just that... a "Sink" into which all the damaging heat collects in the absence of a constant flow of air to transfer that thermal energy off the GPU and out of the case via so many other air moving fans. It's very simple... Having no fan(s) running = Thermal Runaway... and Bye Bye GPU. Consider looking at this video so you can get an idea as to just how long it takes your average CPU to go "Tits UP". The GPUs on most up-to-date cards are about the size and power (if not more so) of a PIII on an AMD sized Ceramic Space.... Just watch...

              From Tom's Hardware, around 2005; this video is very old. What happens when a CPU heatsink is removed. http://churchofbsd.blogspot.com

              So the point is...with the way the PCI-E slots are situated and the fact that the GPU cooling apparatus are on the DOWN side of the case and therefore not in plain view to see...and with no other warning to indicate a problem... using the wrong nVidia Driver with Windows 7 in any SLI configuration spells death to the graphics card....So what that leaves you with is something telling the graphics card fan unit via SW or HW instructional code to either run within a certain range or thermal threshold under a given set of stressing conditions...or not... and in the absence of a warning that this is what in fact is happening... the problem will continue.... Kapeesh?

              Somehow...it never occurs to the designers that they could employ the same technology used to burn DVDs and CDs with laser technology to set up a fail safe GPU thermal sensor that shuts off power to the GPU circuit the instant the temperature "rises to the occasion" and post a log listing for later diagnosis after the computer shuts down and the owner/user/tech realizes the obvious failure has to be with the video card. (Of course... I'm not suggesting that they use a heavy binary metal thermocouple like those in an O2 Sensor... but Hell... even a simple piezoelectric device would do the trick!)
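
To make the fail-safe idea above concrete, here is a rough Python sketch of a thermal watchdog that cuts power at the first over-limit reading and leaves a log entry for later diagnosis. Everything in it is an assumption for illustration: the name `thermal_watchdog`, the `cut_power` callback, the 105 C trip point, and the sample readings are all hypothetical and do not describe how any real card or driver works.

```python
def thermal_watchdog(samples_c, cut_power, limit_c=105.0):
    """Scan GPU temperature readings and trip once at the first one >= limit_c.

    `samples_c` is any iterable of temperatures in Celsius (a stand-in for a
    real sensor), and `cut_power` is a hypothetical callback that would remove
    power from the GPU circuit. Returns the log entries written.
    """
    log = []
    for i, temp in enumerate(samples_c):
        if temp >= limit_c:
            cut_power()  # shut the GPU down before it cooks
            log.append(f"sample {i}: {temp:.1f} C >= {limit_c:.1f} C, power cut")
            break  # nothing more to monitor once power is off
    return log


# Made-up readings from a card whose fan has stopped.
tripped = []
log = thermal_watchdog([55, 78, 96, 110, 120], cut_power=lambda: tripped.append(True))
print(log)
```

The log list stands in for the "log listing for later diagnosis"; a real implementation would have to persist it somewhere that survives the shutdown.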
              Last edited by 60dgrzbelow0; 04-02-2010, 04:52 PM.

              • #8
                it amazes me that something so small and normally runs perfectly fine with a heat transfer area of roughly a square inch or less can burn up almost instantly when that small area that transfers all of the heat just disappears... one of the AMDs hit 1400*F? that's scary. right now i'm sitting at roughly 112*F with my laptop... making me rethink finding my thermal transfer paste after i took it apart and cleaned out all the dust yesterday...

                • #9
                  One more reason why computer gaming is a waste of money...
                  -Brad-
                  89 Mustang : Future 60V6 Power
                  Follow the build -> http://www.3x00swap.com/index.php?page=mustang-blog

                  • #10
                    Originally posted by robertisaar View Post
                    it amazes me that something so small and normally runs perfectly fine with a heat transfer area of roughly a square inch or less can burn up almost instantly when that small area that transfers all of the heat just disappears... one of the AMDs hit 1400*F? that's scary. right now I'm sitting at roughly 112*F with my laptop... making me rethink finding my thermal transfer paste after i took it apart and cleaned out all the dust yesterday...
                    The sheer physics of the thing is the key... In a real world example... this concentration of energy is like the reason a 98 Lb woman wearing stiletto high heels is capable of driving that heel through the top of a man's instep like a hot knife through butter... it's just the physics of the concentration of the electrons, once flowing almost freely through fat copper wire highways and then suddenly trying to converge through a complicated "city of wires" in an area less than the size of a match head. No cooling? Nitey Night Microprocessor... Damned Quick!

                    • #11
                      I can watch my CPU temp rise degree by degree C when I load it. Then again, it's overclocked from 2GHz to 2.7GHz- stable on air (older XP X2 dual core socket 939 AMD). I also invested in a good heatsink and fan combo to make sure it doesn't cook itself. If they really wanted to see something neat happen in those videos, they should have used a Duron. My laptop that used to have a Duron would keep me nice and warm when I used it, lol.
                      -60v6's 2nd Jon M.
                      91 Black Lumina Z34-5 speed
                      92 Black Lumina Z34 5 speed (getting there, slowly... follow the progress here)
                      94 Red Ford Ranger 2WD-5 speed
                      Originally posted by Jay Leno
                      Tires are cheap clutches...

                      • #12
                        Originally posted by bszopi View Post
                        One more reason why computer gaming is a waste of money...
                        I know ... I know... But Brad... My Son is the very Beat of my Heart... and since he is such a Good Kid... sometimes... I must tolerate the folly in him that makes him happy.

                        • #13
                          In the defense of computer gaming, once you build a nice gaming rig, it will basically tear anything else up that you put on it to run. The computer peripherals industry has stated that gamers pushing the envelope of computer technology is why we have computers and hardware as good and fast as we have it today.

                          • #14
                            Originally posted by pocket-rocket View Post
                            I can watch my CPU temp rise degree by degree C when I load it. Then again, it's overclocked from 2GHz to 2.7GHz- stable on air (older XP X2 dual core socket 939 AMD). I also invested in a good heat sink and fan combo to make sure it doesn't cook itself. If they really wanted to see something neat happen in those videos, they should have used a Duron. My laptop that used to have a Duron would keep me nice and warm when I used it, lol.
                            P-R... I am finding out that with the advent of higher and higher CPU performances (and heat) that even surpass and defy "Moore's Law"... even the best of the thermal pastes... even Arctic Silver... become drier than a Pop Corn Fart and deserve a fairly regular cleaning and renewal... along with removing the Dust Bunnies the size of Buicks from in and around all the cooling fins and fans... LOL

                            Moore's Law is the reason modern computers pack so much power into such tiny form factors -- Scientific American community editor and 60 Second Psych podcast...

                            Last edited by 60dgrzbelow0; 03-22-2010, 02:21 PM.

                            • #15
                              I won't argue that. That just means that the thermal paste companies need to get on the ball as well, especially as long as there are people like me that don't mind pushing their hardware from time to time and will pay the extra buck for a better solution. I mean, I even used a Silver based thermal compound on my Xbox 360 when I fixed it- just not Arctic silver because I couldn't find it stocked locally here. I guess I should find out when I need to replace the Arctic Silver on my CPU, lol. Thanks for the reminder
                              Last edited by pocket-rocket; 03-22-2010, 02:18 PM.
