The Notebook Review forums were hosted by TechTarget, who shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, to make sure the valuable technical information posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.
Problems? See this thread at archive.org.

    GT4xxM vs GT3xxM Series: Poor Performance?

    Discussion in 'Gaming (Software and Graphics Cards)' started by SimoxTav, Dec 13, 2010.

  1. SimoxTav

    SimoxTav Notebook Evangelist

    Reputations:
    273
    Messages:
    442
    Likes Received:
    0
    Trophy Points:
    30
    I recently bought an XPS 15 with the GT 420M (96 CUDA cores @ 500/1000/800). By overclocking, I raised it to the level of the upcoming GT 540M (670/1340/900) while keeping the highest temperature below 75°C (so no temperature issue at all). Comparing many titles I own against a GT 335M (72 CUDA cores, 450/1080/1066, in an Alienware M11x) at the same resolution, the GT 420M @ GT 540M clocks always performs around 20% slower than the GT 335M in DX9 titles, and sometimes even in DX10 ones.
    I know the GPU architecture is different and should be more efficient at GPGPU, but I thought the overall results would be better in gaming too, given the raw horsepower of the GPU.

    Now, the drivers I installed are the ones provided directly by Dell (259.51), and so far they are the only ones available and working (even with Optimus) for my card (I tried the modified *.inf from laptopvideo2go, but my hardware ID is missing). I would like to know if anyone with a similar configuration has seen the same performance issues, and whether this "lack of fps" is a driver-maturity issue or the architecture really is supposed to work this way (improving only GPGPU, which IMHO is useless for regular customers).

    Thanks in advance!
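    As a rough sanity check on the "raw horsepower" argument, here is a minimal Python sketch of the peak shader throughput implied by the clocks quoted in this post. It assumes each CUDA core retires one MAD (2 FLOPs) per shader clock and ignores TMUs, ROPs, memory and drivers, so it is only an on-paper comparison, not a performance model:

        # Peak shader throughput = cores * shader clock * 2 FLOPs per MAD.
        # Core counts and clocks are the ones quoted in this post; the 2 FLOPs per
        # core per clock figure is a simplifying assumption for illustration only.
        def shader_gflops(cores, shader_mhz):
            return cores * shader_mhz * 2 / 1000.0

        cards = {
            "GT 420M stock (96 cores @ 1000 MHz shader)": (96, 1000),
            "GT 420M @ GT 540M clocks (96 @ 1340 MHz)":   (96, 1340),
            "GT 335M (72 cores @ 1080 MHz shader)":       (72, 1080),
        }

        for name, (cores, mhz) in cards.items():
            print(f"{name}: ~{shader_gflops(cores, mhz):.0f} GFLOPS peak")

    On paper the overclocked GT 420M comes out roughly 65% ahead of the GT 335M (~257 vs ~156 GFLOPS), which is exactly why the observed ~20% deficit in DX9 points at something other than raw shader horsepower.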
     
  2. RainMotorsports

    RainMotorsports Formerly ClutchX2

    Reputations:
    565
    Messages:
    2,382
    Likes Received:
    2
    Trophy Points:
    56
    DX9 performance will sometimes go down with newer cards, and even more so with newer drivers. Not always, but I have seen it before with DX7/8 a long time ago. I haven't actually read enough about it, but some of the guys I have been talking to told me the 400 series was a pretty bad mess.

    As far as CUDA and GPGPU performance goes, I haven't seen you post anything in that department. DX10 physics tests will show a bit in that arena, but that has nothing to do with DX9 performance.
     
  3. Pk77

    Pk77 Notebook Enthusiast

    Reputations:
    0
    Messages:
    15
    Likes Received:
    0
    Trophy Points:
    5
    23 vs 39 fps in StarCraft 2 at the same details and resolution is a big gap (420M overclocked to 540M clocks vs. a stock 335M).
     
  4. Trottel

    Trottel Notebook Virtuoso

    Reputations:
    828
    Messages:
    2,303
    Likes Received:
    0
    Trophy Points:
    0
    I think your problem is the memory. The GT420M is only available with DDR3. The GT335M on the other hand can use GDDR3. That would probably go a long way to explaining the discrepancy. Also don't discount the loss of 1/3 of the texture mapping units, and 1/2 the render output units on gaming performance.

    As an aside, the GT540M can use GDDR3 or GDDR5.

    I think what you are referring to is a card getting better framerates on DX10 than DX9. This is normal. DX10 alone improves performance over DX9.
     
  5. RainMotorsports

    RainMotorsports Formerly ClutchX2

    Reputations:
    565
    Messages:
    2,382
    Likes Received:
    2
    Trophy Points:
    56
    Nice info, Trottel. Nah, I am actually talking about DX9 on a newer card/driver versus an older one. I have mixed experiences with DX10 versus DX9 performance in the same game. I usually run DX9 for, say, Far Cry 2, but DX10 isn't too much worse off. Benchmark scores go way up, but that doesn't always mean in-game performance does, nor compatibility.

    Unrelated anyway. If this is a trend then I am glad I intended to skip the 400 series. Allyourgroceries was saying he is unimpressed with the 400 series altogether. I haven't had a Radeon since my X700; think I am ready for the 6000 series.
     
  6. SimoxTav

    SimoxTav Notebook Evangelist

    Reputations:
    273
    Messages:
    442
    Likes Received:
    0
    Trophy Points:
    30
    Am I wrong, or is it only the voltages and the physical layout of the chips that differ between DDR3 and GDDR3? (This is what I found when researching a few weeks ago.) The bandwidth should remain the same, and I don't really know whether running higher voltages to keep the timings a bit lower could affect the framerate that much.
    As far as I can tell, the GT 540M will be a simple rebrand of the GT 420/425/435, so GDDR5 will probably not be implemented except in a few "more than unique" notebooks.

    Yeah, I guess that's due to the lack of mirroring of video memory into system memory (which instead affects DX9).
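    On the bandwidth question: peak memory bandwidth only depends on bus width and effective transfer rate, so at the same bus width and effective clock, DDR3 and GDDR3 have the same paper bandwidth; the practical differences are latency, voltage and the clocks each type can reach. A minimal sketch (the 128-bit bus width below is an assumption for illustration, not something stated in the thread):

        # Peak memory bandwidth = (bus width in bytes) * (effective transfer rate),
        # independent of whether the chips are DDR3 or GDDR3.
        def peak_bandwidth_gbs(bus_bits, effective_mts):
            # bus_bits: bus width in bits; effective_mts: transfers per second in MT/s
            return bus_bits / 8 * effective_mts / 1000.0

        # Example: an assumed 128-bit bus at 1600 MT/s effective
        print(peak_bandwidth_gbs(128, 1600))   # 25.6 GB/s, DDR3 or GDDR3 alike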
     
  7. RainMotorsports

    RainMotorsports Formerly ClutchX2

    Reputations:
    565
    Messages:
    2,382
    Likes Received:
    2
    Trophy Points:
    56
    I had not kept track of the library changes, but that would sure as heck help. I read a lot about the changes in DX11; DX10 has been a while. I am surprised I hadn't seen this highlighted in the DX10 SDK documentation. Missed it for sure.

    I have to go look, but I thought GDDR3 was actually related to DDR2 more than DDR3.
     
  8. unlogic

    unlogic Notebook Evangelist

    Reputations:
    24
    Messages:
    310
    Likes Received:
    56
    Trophy Points:
    41
    I heard that Fermi CUDA cores are weaker than the old GT2xx/G9x/G8x CUDA cores.
    I know Fermi is a new architecture, but is it also less efficient?

    For ATI shaders, they work in groups of 5, so as a rough rule I divide the ATI shader count by 5 if I want to compare it to NVIDIA.

    So how do the new Fermi CUDA cores compare to the old CUDA cores?
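    That divide-by-five rule of thumb as a tiny sketch; it is a very rough heuristic, not a real performance model, and the 800-SP example below is just an illustration:

        # Very rough heuristic from this thread: ATI's shaders work in groups of 5,
        # so divide the advertised stream-processor count by 5 before comparing it
        # with an NVIDIA CUDA-core count. Ignores clocks and architecture entirely.
        def ati_sp_to_cuda_equivalent(ati_sps, group_size=5):
            return ati_sps / group_size

        print(ati_sp_to_cuda_equivalent(800))   # e.g. an 800-SP ATI part -> roughly 160 "cores"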
     
  9. Trottel

    Trottel Notebook Virtuoso

    Reputations:
    828
    Messages:
    2,303
    Likes Received:
    0
    Trophy Points:
    0
    Actually GDDR3 is very different from DDR3. GDDR3 is on the same technological level as DDR2, but highly optimized for graphics cards. On paper the bandwidth of DDR3 and GDDR3 might be the same, but the GDDR3 outperforms it. Graphics memory is a lot costlier than desktop memory though. The performance and cost reasons are why graphics memory is used on higher end cards and desktop memory is used on lower end cards.

    It very likely uses the same GF108 core, but it is slated to support GDDR3 and GDDR5. The desktop GF108 cards support GDDR3, but not GDDR5. The GDDR5 support is probably latent in the core but disabled by NVIDIA. The interfaces for each memory type (DDR3, GDDR3, and GDDR5) are completely different, and the chip's memory controller is designed to handle one specific type or several.
     
  10. SimoxTav

    SimoxTav Notebook Evangelist

    Reputations:
    273
    Messages:
    442
    Likes Received:
    0
    Trophy Points:
    30
    That would confirm why performance sucks. BTW, I'm a noob on this subject and I don't even know how a CUDA core is built, so I don't know how it could be "faster or slower" :(

    Got it :)

    In the end, that is the question!
     
  11. SimoxTav

    SimoxTav Notebook Evangelist

    Reputations:
    273
    Messages:
    442
    Likes Received:
    0
    Trophy Points:
    30
    The GT 335M in the Alienware M11x should have DDR3 as well, so any performance gap due to memory would have to come from the different working frequency (960 MHz vs 1066 MHz). One less problem on the route to the truth! :D
     
  12. Phinagle

    Phinagle Notebook Prophet

    Reputations:
    2,521
    Messages:
    4,392
    Likes Received:
    1
    Trophy Points:
    106
    Superscalar schedulers would also be contributing to the lower performance you're seeing.

    GF108 may have 96 cores on die but it's only got the schedulers of a 64 core GPU.
     
  13. Ruckus

    Ruckus Notebook Deity

    Reputations:
    363
    Messages:
    832
    Likes Received:
    1
    Trophy Points:
    0
    I've heard something similar about the new CUDA cores as well, but I've never read anything that definitively proves it.

    And don't compare AMD stream processors to CUDA cores; it's hard to make direct comparisons like that.
     
  14. unlogic

    unlogic Notebook Evangelist

    Reputations:
    24
    Messages:
    310
    Likes Received:
    56
    Trophy Points:
    41
    I know they are totally different. It's just my "rough" guideline, because ATI has tons of stream processors :D

    Each of the 5 SPs in an ATI group has a different task. NVIDIA's old CUDA cores are simpler and more flexible (?)

    Maybe current drivers and games are not well optimized for Fermi's processing power?
     
  15. masterchef341

    masterchef341 The guy from The Notebook

    Reputations:
    3,047
    Messages:
    8,636
    Likes Received:
    4
    Trophy Points:
    206
    I think something in the ballpark of 5:1 is a decent approximation. I think I heard some newer ATI chips were using groups of 4 SPs... the number of SPs in a cluster is definitely subject to change with any architecture, so that ratio might change at any time.

    But, yes, they are not the same. Just check the benchmarks.
     
  16. Panther214

    Panther214 Notebook Evangelist

    Reputations:
    110
    Messages:
    435
    Likes Received:
    0
    Trophy Points:
    0
    lol don't whine so much... they're mid-range GPUs... if you want to cry like me, which I did, go and get a GTX 460M or ATI 5870M... no complaints about that :D

    Panther214
     
  17. Phinagle

    Phinagle Notebook Prophet

    Reputations:
    2,521
    Messages:
    4,392
    Likes Received:
    1
    Trophy Points:
    106
    Desktop Cayman/6900 will use a 4 ALU block that should match most of the performance of the current 4+1 block.

    I anxiously await the cries of "OMG Why did ATI drop the shader counts?"
     
  18. bennyg

    bennyg Notebook Virtuoso

    Reputations:
    1,567
    Messages:
    2,370
    Likes Received:
    2,375
    Trophy Points:
    181
    Already heard them. Even from sites like Fudzilla... some faud knob thought the 6970 die was "harvested" because it had fewer "shaders" than the 5870 :rolleyes:
     
  19. essense

    essense Notebook Evangelist

    Reputations:
    14
    Messages:
    352
    Likes Received:
    0
    Trophy Points:
    30
    Have you guys tried COD: Black Ops on a GT 425M?
    I need to lower everything to avoid lag... but I still lag in the movies X_x
     
  20. SimoxTav

    SimoxTav Notebook Evangelist

    Reputations:
    273
    Messages:
    442
    Likes Received:
    0
    Trophy Points:
    30
    Sorry, no; I only tried MW2 (with the card @ GT 540M clocks) and got everything on high at around 40-50 fps @ 1366x768.
     
  21. SimoxTav

    SimoxTav Notebook Evangelist

    Reputations:
    273
    Messages:
    442
    Likes Received:
    0
    Trophy Points:
    30
    After 2 days I actually got what you meant :p

    Thanks to Pk77 too. I noticed that while the GT 335M has 72 cores, 8 ROPs and 24 TMUs, the GT 420M has 96 cores, 16 ROPs but only 16 TMUs. So the "improvement" needed to match the old generation cards (with a TMU/ROP ratio of 3:1) could only come from improving these superscalar schedulers, right? (on the principle that "fewer components lead to better optimization to keep the same performance"). Furthermore, from what I know, DX9 games (or "older games" in general) are designed around a TMU/ROP ratio of 2:1 (so the 3:1 of older cards is better, while 1:1 seems to suck a bit performance-wise :p), while DX10/11 should be more shader-oriented, so the gap between generations should shrink with newer games.
    Actually we could say that, without driver improvements (which I hope to see starting from the next release), GT4xx cores can be matched to older ones at a ratio of 2/3 (so 96 GT4xx cores match 64 GT3xx/GT2xx cores @ the same frequencies) when playing "unoptimized" games.

    (This rule should hold except for GF100.)

    Am I right?
     
  22. Cakefish

    Cakefish ¯\_(ツ)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    So you're saying 540m = 425m = 335m?
     
  23. RainMotorsports

    RainMotorsports Formerly ClutchX2

    Reputations:
    565
    Messages:
    2,382
    Likes Received:
    2
    Trophy Points:
    56
    If he were, it would be wrong unless you were comparing theoretical performance, and this thread suggests even that doesn't hold up.

    The 400/500 series are unrelated to the 300 series in design.
     
  24. Cakefish

    Cakefish ¯\_(ツ)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    Performance-wise are they similar? That's the impression I was getting from this thread.
     
  25. askwas

    askwas Notebook Enthusiast

    Reputations:
    11
    Messages:
    14
    Likes Received:
    0
    Trophy Points:
    5
    The GF108's 96 CUDA cores only equal the GT216's 48 CUDA cores in DX9 performance at the same CUDA frequency, with no AA at 1366x768.
    A GT 425M vs a GT 330M in 3DMark06 gets nearly the same score.
     
  26. SimoxTav

    SimoxTav Notebook Evangelist

    Reputations:
    273
    Messages:
    442
    Likes Received:
    0
    Trophy Points:
    30
    "GT420m @ GT540m" = "GT335m" talking about PERFORMANCE in DX9 games (sometimes even worse). GT420 has lower Texture Fillrate than GT335m but about double Pixel Fillrate. So in game who widely use shaders tecniques (DX10/11 games) the performance should be better on GT4xx instead of GT3xx. Otherwise the situation is the one described above. Tha main problem is that with console games constantly ported to PC we are going to have many other games designed for DX9.
     
  27. Phinagle

    Phinagle Notebook Prophet

    Reputations:
    2,521
    Messages:
    4,392
    Likes Received:
    1
    Trophy Points:
    106
    Lower TMU counts can affect performance but the bolded part is closer to my point.

    In previous GPU generations you had one scheduler solely responsible for assigning a workload to a specific group of cores. GF100, for example, has two warp schedulers per block of 32 cores...with each scheduler responsible for assigning a workload to 16 cores.

    In GF104, 106, and 108, two warp schedulers now handle a block of 48 cores, but each scheduler can still only assign a workload to 16 cores at a time. Even though they are superscalar, two warp schedulers can't always assign tasks to three groups of 16 cores.

    On average two superscalar schedulers handling three groups of 16 cores will offer more performance than two regular schedulers handling two groups of 16 cores but less performance than three regular schedulers handling three groups of 16 cores.

    So the 96 cores in the GT 420M are in reality comparable to somewhere between 64 and 96 cores of a GF100 or GT 200/300 part.
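    A back-of-the-envelope model of that point: treat the third group of 16 cores in each 48-core block as useful only when a superscalar scheduler manages to dual-issue. The dual-issue rate below is a made-up parameter for illustration, not a measured figure:

        # GF104/106/108 block: 48 cores, 2 warp schedulers, each issuing to 16 cores.
        # The third group of 16 only does work when a scheduler can dual-issue an
        # independent second instruction, so the effective core count per block sits
        # somewhere between 2*16 and 3*16.
        def effective_cores(blocks, dual_issue_rate):
            per_block = 2 * 16 + 16 * dual_issue_rate   # dual_issue_rate in 0.0 .. 1.0
            return blocks * per_block

        # GF108 (GT 420M): 2 blocks of 48 cores = 96 cores on paper
        for rate in (0.0, 0.5, 1.0):
            print(f"dual-issue {rate:.0%}: ~{effective_cores(2, rate):.0f} effective cores")
        # Prints 64, 80 and 96 -- the "somewhere between 64 and 96 cores" estimate above.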
     
  28. Pk77

    Pk77 Notebook Enthusiast

    Reputations:
    0
    Messages:
    15
    Likes Received:
    0
    Trophy Points:
    5
    I think it's more the sum of various factors: the 4xx architecture is really new, the mobile version is only a month old, the drivers can't be optimized yet, and the 420M uses DDR3 instead of GDDR3; it also has fewer ROPs and a 1:1 ratio never seen before in NVIDIA architectures. It's clear that right now, with the current software and drivers, there is more than one problem.
    I saw a review about a month ago with a 310M vs a stock 420M, 16 SPs vs 96 SPs, but the results were not that good: only about 100% more, and the 310M is a low-end card with very limited shader capability. That review was the first alarm about 4xx performance.
    For sure we can't do a direct shader-vs-shader comparison with the previous architecture; it's clear that G92 and its derivatives are faster. Maybe future DX10 and DX10.1 games will be focused on Fermi.
     
  29. RainMotorsports

    RainMotorsports Formerly ClutchX2

    Reputations:
    565
    Messages:
    2,382
    Likes Received:
    2
    Trophy Points:
    56

    Only if they're Xbox 360 based (as in originally developed there and then ported). The 360 uses a modified DX9 library, which was in keeping with Microsoft's plan to increase PC game sales with the original Xbox.

    I actually only know that because I have done DX9 development and it's written in several parts of the SDK documentation lol.

    Mind you, other studios might be porting with DX9 in mind as well; it's just their choice at that point. I remember something a while back about it being easier to write for the 360 and then port to the PS3, which is why there are a lot of games originally coded for the 360.
     
  30. mobius1aic

    mobius1aic Notebook Deity NBR Reviewer

    Reputations:
    240
    Messages:
    957
    Likes Received:
    0
    Trophy Points:
    30
    Judging by benchmarks, much weaker. I also noticed how weak the Fermi-type cores were compared to the previous generation. It doesn't help that NVIDIA is running the shader clocks much slower now, at a 2:1 (shader:core) ratio as opposed to the roughly 5:2 that previous cards were mostly clocked at.

    A desktop GTX 460 1 GB in many ways barely beats a GTX 285 (it varies from game to game, of course), but obviously the Fermi design traded performance for compatibility in many respects, and at least GF104 got us GTX 285 performance at a much lower TDP.

    As a basic rule for a ballpark estimate when comparing cards (not taking clock speeds or memory size/bandwidth into account), I would put Fermi cores at 2/3 the speed of the previous-gen DX10 and DX10.1 cores. The desktop GT 430 1 GB shows this trend vs. the desktop GT 240.
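    The clock-ratio point as a quick worked example, using the stock clocks from the first post; the comparison below ignores everything except the shader clock:

        # Shader clock = core clock * ratio. Fermi mobile parts use 2:1, while the
        # previous generation sat around 5:2, so at a similar core clock each Fermi
        # core also gets fewer shader cycles per second.
        def shader_clock_mhz(core_mhz, ratio):
            return core_mhz * ratio

        print(shader_clock_mhz(500, 2.0))   # GT 420M: 500 MHz core -> 1000 MHz shader (the 500/1000 quoted above)
        print(shader_clock_mhz(450, 2.4))   # GT 335M: 450 MHz core -> 1080 MHz shader (the 450/1080 quoted above)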
     
  31. Ruckus

    Ruckus Notebook Deity

    Reputations:
    363
    Messages:
    832
    Likes Received:
    1
    Trophy Points:
    0
    I wouldn't think of them as weak so much as doing a lot more now than the previous generation did. Mostly the tessellation.
     
  32. RainMotorsports

    RainMotorsports Formerly ClutchX2

    Reputations:
    565
    Messages:
    2,382
    Likes Received:
    2
    Trophy Points:
    56
    Sadly, DirectX 10 has tessellation; I have run NVIDIA examples of it on my DX10 card.

    ATI also had a proprietary tessellator implementation; everything has since been abandoned in favor of DX11's methods.
     
  33. Pk77

    Pk77 Notebook Enthusiast

    Reputations:
    0
    Messages:
    15
    Likes Received:
    0
    Trophy Points:
    5
    In a card like the 420M, it's a feature that will never show an advantage:

    DX9 games now run worse.
    DX10 games I don't know, but I presume the situation is the same as DX9 unless they are heavily optimized for Fermi.
    DX11 games with tessellation will run, but at very low frame rates.
     
  34. stevenxowens792

    stevenxowens792 Notebook Virtuoso

    Reputations:
    952
    Messages:
    2,040
    Likes Received:
    0
    Trophy Points:
    0
    Tessellation is not a problem, it's a new FEATURE that will revolutionize your gaming experience. Bleh. I was hoping to see the next-gen Fermi series in the M11x. Now, I hope for Alienware's sake they don't put the 420 in it. What a step backwards.

    No thanks...

    StevenX
     
  35. Cakefish

    Cakefish ¯\_(ツ)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    Maybe they'll put in a better GPU like the 435M or 445M?

    Or switch to ATI? I'm sure a 6570M would fit nicely in the M11x R3.
     
  36. SimoxTav

    SimoxTav Notebook Evangelist

    Reputations:
    273
    Messages:
    442
    Likes Received:
    0
    Trophy Points:
    30
    OK, DX10 definitely runs BETTER on Fermi than DX9 (and not just by a couple of frames :eek: )

    Check the attachment below:

    Far Cry 2 on DX9 - 27fps average
    Far Cry 2 on DX10 - 34fps average (+25%)

    This puts me in a good mood :D
     

    Attached Files:

  37. Cakefish

    Cakefish ¯\_(ツ)_/¯

    Reputations:
    1,643
    Messages:
    3,205
    Likes Received:
    1,469
    Trophy Points:
    231
    Weird, for once DX10 is the better performer!