The Notebook Review forums were hosted by TechTarget, who shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information that had been posted on the forums was preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.

    GPGPU in the near future

    Discussion in 'Hardware Components and Aftermarket Upgrades' started by VZX, Feb 5, 2010.

  1. VZX

    VZX Notebook Evangelist

    Reputations:
    14
    Messages:
    350
    Likes Received:
    0
    Trophy Points:
    30
    I've been comparing nVidia and ATI cards lately for my laptop. Right now ATI holds more raw computing power and supports DirectX 11 (more future-proof gaming cards, IMO), while nVidia has a lot of games supporting its PhysX technology, but their newest cards only support DirectX 10.1 and have fewer shaders.

    So:
    - I'm wondering, once DirectX 11 is in general use in games, will more developers hop over to DirectCompute and gradually leave nVidia's PhysX technology behind?
    - CUDA vs Stream. I've seen more shown of CUDA's capabilities than of ATI/AMD's Stream, like how it helps perform rotation/scaling faster in Photoshop CS4. Is CUDA really a better GPGPU library than Stream, or does Stream simply not get as much exposure as CUDA?

    Opinion/comments?

    I think this is in the correct section, but if a mod feels this thread doesn't belong here, feel free to move it.
     
  2. sgogeta4

    sgogeta4 Notebook Nobel Laureate

    Reputations:
    2,389
    Messages:
    10,552
    Likes Received:
    7
    Trophy Points:
    456
    nVidia's shaders operate differently than ATI's, so you can't directly compare them. The older conversion approximation was 5 ATI shaders to 1 nVidia shader, but you'd really have to look at in-game comparisons to differentiate between close models.
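    (As a rough worked example of that old rule of thumb, and purely an approximation: a part with 800 ATI stream processors would map to roughly 800 / 5 = 160 nVidia-style shaders, which is why raw shader counts across the two vendors can't be compared directly.)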
     
  3. Pitabred

    Pitabred Linux geek con rat flail!

    Reputations:
    3,300
    Messages:
    7,115
    Likes Received:
    3
    Trophy Points:
    206
    Being as we're talking laptops, you may also want to take power into account... an ATI card uses MUCH less power for the performance than an Nvidia card.
     
  4. sgogeta4

    sgogeta4 Notebook Nobel Laureate

    Reputations:
    2,389
    Messages:
    10,552
    Likes Received:
    7
    Trophy Points:
    456
    That's because nVidia slacked and used larger manufacturing processes without changing their design. nVidia's newer 40nm parts are much more power efficient than the previous generations.
     
  5. thinkpad knows best

    thinkpad knows best Notebook Deity

    Reputations:
    108
    Messages:
    1,140
    Likes Received:
    0
    Trophy Points:
    55
    From what I've heard, given the amount of raw power it takes to actually run DX11 fully and still be able to turn up the settings, DX9 and 10 will be here for a while, at least as long as the X360 stays around, which it will for a couple more years, since all of its rendering is DX9. DX11 has minimal benefits over 10 right now IMO; it just introduces better liquid physics and better physics in general for things that are hard to simulate well on GPUs, like flags, water, and trees swaying.
     
  6. H.A.L. 9000

    H.A.L. 9000 Occam's Chainsaw

    Reputations:
    6,415
    Messages:
    5,296
    Likes Received:
    552
    Trophy Points:
    281
    I think the boon to DX11 will be tessellation. But I do agree, DX9 is going to be here for quite some time into the foreseeable future.
     
  7. VZX

    VZX Notebook Evangelist

    Reputations:
    14
    Messages:
    350
    Likes Received:
    0
    Trophy Points:
    30
    So I guess for now DirectX 11 features won't be all that useful, given that mobile graphics cards have less power than desktop ones.

    How about other non-game applications? I don't really see much hype around ATI's GPGPU capability, but I see a lot of CUDA stuff (at least on nVidia's site).
     
  8. Pitabred

    Pitabred Linux geek con rat flail!

    Reputations:
    3,300
    Messages:
    7,115
    Likes Received:
    3
    Trophy Points:
    206
    Not just manufacturing process. The 360M is essentially a rebadged 260M, which is a rebadged 160M, which is... you get the picture. Nvidia hasn't had anything new in their chips since the 9xxx range. Fermi may change that, but early reports say it's a massive power hog, which means it won't hit the mobile chips any time soon.
     
  9. H.A.L. 9000

    H.A.L. 9000 Occam's Chainsaw

    Reputations:
    6,415
    Messages:
    5,296
    Likes Received:
    552
    Trophy Points:
    281
    Didn't they change from the G98 (?) core in the 160m to the GT200 core in the 260/360m?
     
  10. Pitabred

    Pitabred Linux geek con rat flail!

    Reputations:
    3,300
    Messages:
    7,115
    Likes Received:
    3
    Trophy Points:
    206
    Technically yes. Technologically, no. It's still essentially the same core with some other things added on, modified slightly for DX10.1
     
  11. H.A.L. 9000

    H.A.L. 9000 Occam's Chainsaw

    Reputations:
    6,415
    Messages:
    5,296
    Likes Received:
    552
    Trophy Points:
    281
    Interesting. I still can't wrap my head around the NVIDIA naming scheme between the cores and the card designations. :rolleyes: Oh, NVIDIA and their re-badging!
     
  12. Deks

    Deks Notebook Prophet

    Reputations:
    1,272
    Messages:
    5,201
    Likes Received:
    2,073
    Trophy Points:
    331
    Nvidia is in the same predicament with their GPUs as AMD is with their CPUs.
    :D
     
  13. notyou

    notyou Notebook Deity

    Reputations:
    652
    Messages:
    1,562
    Likes Received:
    0
    Trophy Points:
    55
    Just figured I'd chip in here since I'm doing a research project comparing different methods (OpenMP, CUDA, and OpenCL) to see which best exploits the available parallelism. Right now I've only been working on the OpenMP part of it, but once I'm finished, I'll port the code over to CUDA and OpenCL to see which is easiest to program and which takes best advantage of the available hardware.
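    For anyone curious what that kind of comparison looks like in code, below is a minimal OpenMP sketch of an embarrassingly parallel loop (a simple scale-and-add; names and sizes are purely illustrative, not taken from the actual project):

        // Minimal OpenMP sketch: scale-and-add over two arrays (illustrative only).
        // Build with something like: g++ -fopenmp -O2 saxpy_omp.cpp
        #include <cstdio>
        #include <vector>

        int main() {
            const int n = 1 << 20;
            std::vector<float> x(n, 1.0f), y(n, 2.0f);
            const float a = 3.0f;

            // Every iteration is independent, so OpenMP can split the loop across threads.
            #pragma omp parallel for
            for (int i = 0; i < n; ++i)
                y[i] = a * x[i] + y[i];

            printf("y[0] = %.1f\n", y[0]);   // expect 5.0
            return 0;
        }

    The same loop body maps almost one-to-one onto a CUDA or OpenCL kernel, with the loop index replaced by a thread/work-item ID, which is exactly the kind of port described above.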
     
  14. VZX

    VZX Notebook Evangelist

    Reputations:
    14
    Messages:
    350
    Likes Received:
    0
    Trophy Points:
    30
    That's cool.
    What's the research about, if I may know?
     
  15. H.A.L. 9000

    H.A.L. 9000 Occam's Chainsaw

    Reputations:
    6,415
    Messages:
    5,296
    Likes Received:
    552
    Trophy Points:
    281
    From what I've read in multiple articles, CUDA seems to be the easiest. Honestly, I have no idea, as I don't know how to use any of those, but CUDA seriously has a beastly name..lol
     
  16. f4ding

    f4ding Laptop Owner

    Reputations:
    261
    Messages:
    2,085
    Likes Received:
    0
    Trophy Points:
    55
    While Nvidia's GT200 doesn't have much improvement over the G98 core in terms of regular gaming features, it does include improvements to its CUDA-capable compute units. Nvidia probably saw the potential money to be made from CUDA and neglected the gaming features a little bit.

    OpenCL is included in both the CUDA and Stream libraries too. But in terms of capability, CUDA is simpler compared to Stream, at least according to CUDA people (with a huge possibility of bias). Either way, it's a fact that CUDA is ahead of ATI/AMD's Stream right now. More software takes advantage of CUDA compared to Stream. The capability is there for Stream, but it seems it just isn't marketed properly.
     
  17. notyou

    notyou Notebook Deity

    Reputations:
    652
    Messages:
    1,562
    Likes Received:
    0
    Trophy Points:
    55
    The research is as stated earlier, to compare the different ways of performing computation in parallel to achieve the best possible performance. I'm analyzing three different types of algorithms (easy, medium, hard) to make parallel based on their dependencies and memory access patterns.

    I've done a little bit of research into OpenCL, and from what I've seen, it's mostly a cut and paste (for names at least) to go from a CUDA function to an OpenCL function. See http://developer.amd.com/documentation/articles/pages/OpenCL-and-the-ATI-Stream-v2.0-Beta.aspx for more details.
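    To give a feel for that "cut and paste" point, here is a rough name-for-name correspondence between the common host-side calls (an illustrative mapping only; OpenCL also needs extra one-time setup for platforms, contexts and command queues):

        //   CUDA runtime                     OpenCL
        //   -----------------------------    -------------------------------------------
        //   cudaMalloc()                     clCreateBuffer()
        //   cudaMemcpy(...HostToDevice)      clEnqueueWriteBuffer()
        //   kernel<<<grid, block>>>(args)    clSetKernelArg() + clEnqueueNDRangeKernel()
        //   cudaMemcpy(...DeviceToHost)      clEnqueueReadBuffer()
        //   cudaFree()                       clReleaseMemObject()
        //   __global__ void k(...)           __kernel void k(...)
        //   blockIdx.x*blockDim.x            get_global_id(0)
        //     + threadIdx.x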

    CUDA had the advantage of getting out of the gate earlier. I believe that once OpenCL gains some more ground (how it will do that, I don't know), CUDA will be pushed back (unless for some reason it gives much better performance), since there is no point in developing code that can only work on one architecture.

    Any other GPGPU questions you guys have, fire at me since I may be able to answer them, or my supervising professor will.
     
  18. Pitabred

    Pitabred Linux geek con rat flail!

    Reputations:
    3,300
    Messages:
    7,115
    Likes Received:
    3
    Trophy Points:
    206
    Even if CUDA is out earlier, it locks you into Nvidia. OpenCL is agnostic, and should be much more forward-compatible with hardware. You can put whatever is the fastest hardware at the time into your system and it'll run. With CUDA, you're going to be stuck with whatever is the fastest Nvidia hardware.
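    A small illustration of that vendor-agnostic point: the same few lines of OpenCL host code will pick up whatever GPUs the installed drivers expose, whether they're from NVIDIA or ATI (minimal sketch, most error handling omitted):

        // Minimal OpenCL sketch: list every GPU the installed drivers expose,
        // regardless of vendor. Link against the OpenCL library (e.g. -lOpenCL).
        #include <CL/cl.h>
        #include <cstdio>

        int main() {
            cl_platform_id platforms[8];
            cl_uint num_platforms = 0;
            clGetPlatformIDs(8, platforms, &num_platforms);
            if (num_platforms > 8) num_platforms = 8;   // we only reserved space for 8

            for (cl_uint p = 0; p < num_platforms; ++p) {
                cl_device_id devices[8];
                cl_uint num_devices = 0;
                if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_GPU,
                                   8, devices, &num_devices) != CL_SUCCESS)
                    continue;   // this platform exposes no GPU devices
                if (num_devices > 8) num_devices = 8;
                for (cl_uint d = 0; d < num_devices; ++d) {
                    char name[256] = {0};
                    clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof(name), name, NULL);
                    printf("Platform %u GPU: %s\n", p, name);
                }
            }
            return 0;
        }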
     
  19. notyou

    notyou Notebook Deity

    Reputations:
    652
    Messages:
    1,562
    Likes Received:
    0
    Trophy Points:
    55
    That's one of the things I'll be testing: whether CUDA or OpenCL takes better advantage of the available hardware. One point I'm really interested in is seeing how CUDA does vs OpenCL on Nvidia hardware, and then matching the theoretical FLOPS to an OpenCL ATI card (if the G73 ever stops being delayed...).
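    For reference, the theoretical peak being matched there is just shader count x clock x FLOPs per shader per clock. A quick sketch with the commonly quoted desktop figures (approximate, single precision, and real GPGPU throughput always lands well below these peaks):

        // Theoretical peak (single precision) = shaders * clock (GHz) * FLOPs per clock.
        // Figures are the commonly quoted desktop specs and are approximate.
        #include <cstdio>

        double peak_gflops(int shaders, double clock_ghz, int flops_per_clock) {
            return shaders * clock_ghz * flops_per_clock;
        }

        int main() {
            // ATI HD 5870: 1600 stream processors at 850 MHz, 2 FLOPs (MAD) per clock.
            printf("HD 5870: ~%.0f GFLOPS\n", peak_gflops(1600, 0.850, 2));  // ~2720
            // NVIDIA GTX 285: 240 shaders at 1476 MHz, 3 FLOPs (MAD + MUL) per clock.
            printf("GTX 285: ~%.0f GFLOPS\n", peak_gflops(240, 1.476, 3));   // ~1063
            return 0;
        }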
     
  20. Pitabred

    Pitabred Linux geek con rat flail!

    Reputations:
    3,300
    Messages:
    7,115
    Likes Received:
    3
    Trophy Points:
    206
    It's entirely possible that CUDA will be faster. My point is that it may not be the best bet to put all your eggs in the Nvidia basket, even at the expense of a bit of performance, because doing so severely limits your bargaining power on hardware selection and the reuse of your code going forward.
     
  21. jasperjones

    jasperjones Notebook Evangelist

    Reputations:
    293
    Messages:
    427
    Likes Received:
    4
    Trophy Points:
    31
    Technological advances in software typically lag technological advances in hardware. I believe GPGPU will be limited to specialized applications until the mainstream user adopts more powerful GPUs. Even on a mainstream dedicated *desktop* graphics card, GPGPU performance isn't too impressive, particularly if your application requires double precision. Not denying that GPGPU is becoming more common. The point is just that your average Joe's graphics hardware will remain so weak (for the next 2, 3, or 4 years) that GPGPU will not be critical to his computing experience.

    The spreading of platform-independent interfaces such as OpenCL (or DirectCompute if we restrict ourselves to Windows) is inevitable. In a few years, everyone will have GPUs that are capable of OpenCL or DirectCompute. At that point, I see little reason for most developers to target vendor-specific APIs.
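    As a rough sketch of what "capable of DirectCompute" means in practice on Windows: DX11-class hardware guarantees compute shaders, while DX10-class hardware only optionally exposes the CS 4.x profile, which an application can query roughly like this (illustrative only):

        // Rough sketch: check whether the default adapter offers any DirectCompute support.
        // MSVC-style build; links d3d11.lib via the pragma below.
        #include <d3d11.h>
        #include <cstdio>
        #pragma comment(lib, "d3d11.lib")

        int main() {
            ID3D11Device* dev = NULL;
            D3D_FEATURE_LEVEL fl;
            // Create a default hardware device; the driver reports its highest feature level.
            if (FAILED(D3D11CreateDevice(NULL, D3D_DRIVER_TYPE_HARDWARE, NULL, 0,
                                         NULL, 0, D3D11_SDK_VERSION, &dev, &fl, NULL)))
                return 1;

            bool compute = (fl >= D3D_FEATURE_LEVEL_11_0);   // DX11 hardware: CS 5.0 guaranteed
            if (!compute) {
                // DX10-class hardware may still expose CS 4.x as an optional feature.
                D3D11_FEATURE_DATA_D3D10_X_HARDWARE_OPTIONS opts = {};
                if (SUCCEEDED(dev->CheckFeatureSupport(D3D11_FEATURE_D3D10_X_HARDWARE_OPTIONS,
                                                       &opts, sizeof(opts))))
                    compute = opts.ComputeShaders_Plus_RawAndStructuredBuffers_Via_Shader_4_x != 0;
            }
            printf("DirectCompute available: %s\n", compute ? "yes" : "no");
            dev->Release();
            return 0;
        }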
     
  22. H.A.L. 9000

    H.A.L. 9000 Occam's Chainsaw

    Reputations:
    6,415
    Messages:
    5,296
    Likes Received:
    552
    Trophy Points:
    281
    Also, I believe if Apple is as serious as I think they are about implementing GPGPU via OpenCL in Snow Leopard and beyond, it could be a very good thing for OpenCL. All it really needs is for Apple to show the public a good way to apply OpenCL to the mainstream for video encoding/decoding or accelerating various functions of Apple code, and it's in the bag for them. I think that if they get a successful app that actually uses OpenCL and people see how much faster it really is at certain tasks... somebody's bottom line is going to look very rosy. And everyone knows.... like it or not, that a LOT of companies copy Apple's every move.
     
  23. Pitabred

    Pitabred Linux geek con rat flail!

    Reputations:
    3,300
    Messages:
    7,115
    Likes Received:
    3
    Trophy Points:
    206
    Only in the consumer space. OpenCL is definitely not a consumer-space technology in the near term.
     
  24. H.A.L. 9000

    H.A.L. 9000 Occam's Chainsaw

    Reputations:
    6,415
    Messages:
    5,296
    Likes Received:
    552
    Trophy Points:
    281
    But it is getting pushed more and more into that space...
     
  25. Phinagle

    Phinagle Notebook Prophet

    Reputations:
    2,521
    Messages:
    4,392
    Likes Received:
    1
    Trophy Points:
    106
    This Fudzilla article from last week got me thinking about the rumors of ION 2 being a form of discrete GPU using a PCIe (mini?) connection, and the potential to use it as a dedicated PhysX card for notebooks.

    We need more facts about ION 2 before we know whether it's possible, but if it works, it would deal with the problem of being locked into buying the fastest Nvidia hardware.
     
  26. Bullit

    Bullit Notebook Deity

    Reputations:
    122
    Messages:
    864
    Likes Received:
    9
    Trophy Points:
    31
    CUDA was stable before OpenCL, so Nvidia has a head start there, but stuff is now starting to come out for OpenCL too. See for example LuxRender:

    http://www.luxrender.net/forum/viewtopic.php?f=13&t=3439

    An Nvidia GT240 can do 28k samples/sec and an HD5870 can do 145k samples/sec, which is a big difference, while an i7 can do 57k samples/sec and a Core 2 Quad Q6600 can do 31k samples/sec.

    The GPU difference is due mainly to the shader counts, from what I've learned, and it will be one of the most important values to check in future purchases.

    It will be.
    You want to render your AVCHD holiday movie and it will need some sort of rendering acceleration. For example, the simple $39 video-correction tool http://www.vreveal.com/ uses CUDA. PowerDirector already uses CUDA, and for AMD it needs Avivo. The next Adobe Premiere will apparently have acceleration only for CUDA, but OpenCL support will come too.