nVidia admits Maxwell can't handle async compute well in developer do's and don'ts | Page 1 | NotebookReview

Ethrem Notebook Prophet

Reputations:: 1,404

Messages:: 6,706

Likes Received:: 4,735

Trophy Points:: 431

I can't say that I'm surprised because Maxwell wasn't engineered for DirectX 12 while GCN was but...

http://wccftech.com/nvidia-devs-computegraphics-toggle-heavyweight-switch/

Ethrem, Oct 13, 2015

#1

jaybee83 and i_pk_pjers_i like this.

i_pk_pjers_i Even the ppl who never frown eventually break down

Reputations:: 205

Messages:: 1,033

Likes Received:: 598

Trophy Points:: 131

I know I shouldn't be happy about this because I have so many NVIDIA GPUs but honestly this does make me happy. I want NVIDIA to crash and burn a little bit so AMD can stay in business and then we won't have to suffer with a monopoly.

i_pk_pjers_i, Oct 13, 2015

#2

TomJGX likes this.

HTWingNut Potato

Reputations:: 21,580

Messages:: 35,370

Likes Received:: 9,877

Trophy Points:: 931

i_pk_pjers_i said: ↑

I know I shouldn't be happy about this because I have so many NVIDIA GPUs but honestly this does make me happy. I want NVIDIA to crash and burn a little bit so AMD can stay in business and then we won't have to suffer with a monopoly.

Click to expand...

Well, I'd rather not go backwards in performance, but forwards. If Nvidia were to "crash and burn" we'd be stuck with lesser performance just so AMD could catch up. What we really need is another company to buy the graphics unit of AMD/Radeon whose single focus is the GPU market and live or die by success instead of just being a subset of a larger corporation where any failures can be offset by other division's profit. AMD doesn't care much about gaming graphics any more, except for consoles. And they like the server market too. Otherwise desktop and mobile GPU's are a hindsight.

HTWingNut, Oct 13, 2015

#3

octiceps Nimrod

Reputations:: 3,147

Messages:: 9,944

Likes Received:: 4,194

Trophy Points:: 431

HTWingNut said: ↑

What we really need is another company to buy the graphics unit of AMD/Radeon whose single focus is the GPU market and live or die by success instead of just being a subset of a larger corporation where any failures can be offset by other division's profit.

Click to expand...

If anything, AMD's graphics division has been one of its few profitable areas in recent years, and it's AMD's failures on the CPU side that have affected their investment in graphics and hamstrung progress, particularly mobile graphics where historically they've always had much lower market share anyway.

octiceps, Oct 13, 2015

#4

Kent T likes this.

n=1 YEAH SCIENCE!

Reputations:: 2,544

Messages:: 4,346

Likes Received:: 2,600

Trophy Points:: 231

^this. The GPU division pretty much single-handedly kept the company afloat after the Faildozer disaster.

Also nothing to be happy about. If there's a lack of competition it just means things will stagnate. If you ever need a reminder why a monopoly is bad, just look at what happened with CPUs after Sandy Bridge. Although it won't be 5% annual improvement bad. One because Jen-Hsun's ego won't allow it, and two because people will just stop buying GPUs if nVidia pulled something like that. Regardless, if AMD folded nVidia will definitely slow down, and charge more for less. Remember how every single SKU in their lineup pretty much doubled overnight with the release of 680? I have to applaud nVidia though, charging big die price for medium die but still have people gobbling them up like crazy. The trick? Just call it x80 instead of x60 Ti, most suckers buyers won't know the difference anyway.

n=1, Oct 13, 2015

#5

Apollo13, D2 Ultima, TomJGX and 1 other person like this.

i_pk_pjers_i Even the ppl who never frown eventually break down

Reputations:: 205

Messages:: 1,033

Likes Received:: 598

Trophy Points:: 131

HTWingNut said: ↑

Well, I'd rather not go backwards in performance, but forwards. If Nvidia were to "crash and burn" we'd be stuck with lesser performance just so AMD could catch up. What we really need is another company to buy the graphics unit of AMD/Radeon whose single focus is the GPU market and live or die by success instead of just being a subset of a larger corporation where any failures can be offset by other division's profit. AMD doesn't care much about gaming graphics any more, except for consoles. And they like the server market too. Otherwise desktop and mobile GPU's are a hindsight.

Click to expand...

Well I obviously wouldn't rather performance go backwards, but I don't want performance to increase to the point where AMD simply cannot keep up. I would rather performance just kind of taper off so AMD can catch up. Right now, AMD is just being killed, and I REALLY don't want them to go under. I would absolutely love for there to be a company that buys AMD but I just don't see that happening.

i_pk_pjers_i, Oct 14, 2015

#6

PrimeTimeAction Notebook Evangelist

Reputations:: 250

Messages:: 542

Likes Received:: 1,138

Trophy Points:: 156

i_pk_pjers_i said: ↑

Well I obviously wouldn't rather performance go backwards, but I don't want performance to increase to the point where AMD simply cannot keep up. I would rather performance just kind of taper off so AMD can catch up. Right now, AMD is just being killed, and I REALLY don't want them to go under. I would absolutely love for there to be a company that buys AMD but I just don't see that happening.

Click to expand...

I would not count AMD out yet in GPU market. I have seen alot of "Budget Gaming Desktops" recommending AMD GPUS due to performance to price ratio. And going by their track record, they usually dont have any idea what their hardware can or cannot do until it is released. It is quite possible that they are working on something fantastic currently and they dont have a clue about it. But unfortunately its equally possible that the next big thing from them is a complete flop in real world. And yes Pascal will be a huge challenge for them.

Last edited: Oct 14, 2015

PrimeTimeAction, Oct 14, 2015

#7

i_pk_pjers_i likes this.

thegreatsquare Notebook Deity

Reputations:: 135

Messages:: 1,068

Likes Received:: 425

Trophy Points:: 101

PrimeTimeAction said: ↑

And going by their track record, they usually dont have any idea what their hardware can or cannot do until it is released. It is quite possible that they are working on something fantastic currently and they dont have a clue about it

Click to expand...

That's a scary thought. That it's like waiting around for monkeys to produce Shakespeare.

thegreatsquare, Oct 15, 2015

#8

TomJGX and TBoneSan like this.

octiceps Nimrod

Reputations:: 3,147

Messages:: 9,944

Likes Received:: 4,194

Trophy Points:: 431

thegreatsquare said: ↑

That's a scary thought. That it's like waiting around for monkeys to produce Shakespeare.

Click to expand...

A monkey could write more intelligible English than Shakespeare

octiceps, Oct 15, 2015

#9

Player2 likes this.

D2 Ultima Livestreaming Master

Reputations:: 4,335

Messages:: 11,803

Likes Received:: 9,751

Trophy Points:: 931

i_pk_pjers_i said: ↑

Right now, AMD is just being killed, and I REALLY don't want them to go under.

Click to expand...

There's limits to how much someone can defend or praise a company, and AMD is failing because they aren't making anything new. Let's see what AMD's created in the GPU market since late 2011 to early 2012 with its original 7000 series launch:
- Hawaii (R9 290, 290x, 390, 390x, 295x2)
- Tonga (R9 285, 380, soon-to-launch 380x; is a flat downgrade in power, vRAM and memory bandwidth from Tahiti)
- Fiji (R9 Fury, Fury X)
- Various APUs still based on IPC of Piledriver chips

Repurposing stuff and dropping prices only goes so far.

nVidrosoft's line is actually in a far worse shape than AMD's current line all things considered but at the least their technologies are fully improving, and heat has gone down as OCability has risen. Even though power consumption has (without voltage adjustment tricks) gone UP, we're still not at the level of AMD's "midrange" 250W GPU that can barely hit 1100MHz over its 1000MHz base.

I mean, I recommend their GPUs most of the time now because most of their lineup is generally a better card for an average user, but middle of 2014 I was telling people on forums that nVidia should actually be considered instead of constant "AMD AMD AMD" the whole time. nVidrosoft however has decided to:
- make unstable cards for their current line
- charge $1000 for a plain gaming GPU because people don't know better
- stagger release their cards, again (unlike apparently with the Titan card where they needed to power-revise it like what they did going from 400 series to 500 series.. can't guarantee the truth there either)
- remove features from SLI and make it less stable
- allow their drivers to go to absolute crap and start pushing out a bunch of WHQL drivers that crash and interfere with and all sorts of crap to systems, with no actual WHQL license protection going around
- sweep as many problems under the table as possible

As I said in another post: AMD needs to get their act together, BAD, and fast. AMD is SLOWLY climbing a hard mountain, but they're only being considered because nVidrosoft has jumped off the top and waved at them as they headed for rock bottom. And instead of healthy competition, we're left with a choice between:
- A company with broken, terrible drivers, awful anti-consumer business practices, unstable, broken, badly designed specifications, overpriced cards, where just about two cards in the entire lineup are worth the $$ (980Ti and 750Ti), that seems to be going backwards with their multi-GPU features and support.
- A company with also broken, slowly-updated, DX11 CPU-heavy drivers (that happen to be more stable), hot, power hungry, tessellation-crippled cards that can barely overclock as they're designed near their limits already, with multi-GPU configurations that can't work in any title that's not fullscreen after over 10 years.

This is a terrible time to be a consumer.

D2 Ultima, Oct 21, 2015

#10

triturbo, TomJGX and i_pk_pjers_i like this.

Talon Notebook Virtuoso

Reputations:: 1,482

Messages:: 3,519

Likes Received:: 4,695

Trophy Points:: 331

SLI problems aside, my GTX 970 was a great card. I don't remember ever having driver issues. That card boosted, and overclocked great at very low temps. When I decided to SLI it though, it was rarely supported it seemed or had terrible utilization. BF4 was the exception not the rule.

My 980 Ti is an absolute champ. Its rock stable, boosts a crazy amount on stock (1404mhz out of the box) and hasn't crashed or had any driver issues.

I think the driver issues you're referring to are more related to laptops and older Nvidia GPUs. That is some shady practice on Nvidia's part if they are purposely reducing performance of older cards to sell more current gens. For that reason I would love to see AMD make a huge comeback. I think Nvidia makes some great GPUs, but they need to be kept in check.

Talon, Oct 21, 2015

#11

ryzeki Super Moderator Super Moderator

Reputations:: 6,547

Messages:: 6,410

Likes Received:: 4,085

Trophy Points:: 431

I'm kinda sad about the SLI situation to be honest. I used to be single GPU precisely to avoid issues, but I was tempted to try high end SLI, and it seems like it was not the best way to go hahaha

ryzeki, Oct 21, 2015

#12

n=1 YEAH SCIENCE!

Reputations:: 2,544

Messages:: 4,346

Likes Received:: 2,600

Trophy Points:: 231

D2 Ultima said: ↑

-WORDS-

This is a terrible time to be a consumer computer enthusiast.

Click to expand...

FTFY

Like I said, it's almost as if the industry wants to push us towards consoles.

n=1, Oct 21, 2015

#13

TBoneSan, TomJGX and D2 Ultima like this.

Raidriar ლ(ಠ益ಠლ)

Reputations:: 1,708

Messages:: 5,820

Likes Received:: 4,311

Trophy Points:: 431

I really do think (and hope) that AMD is waiting for the 14nm node transition to roll out any major revamping of their architecture. They dragged things along with TeraScale, they are dragging things out now with GCN. Maybe Intel should purchase AMD's graphics division and make it something great.

Raidriar, Oct 21, 2015

#14

TomJGX I HATE BGA!

Reputations:: 1,456

Messages:: 8,707

Likes Received:: 3,315

Trophy Points:: 431

Raidriar said: ↑

I really do think (and hope) that AMD is waiting for the 14nm node transition to roll out any major revamping of their architecture. They dragged things along with TeraScale, they are dragging things out now with GCN. Maybe Intel should purchase AMD's graphics division and make it something great.

Click to expand...

Lol that would be the end of AMD... For an AMD fan, that's a pretty dumb comment...

TomJGX, Oct 22, 2015

#15

J.Dre Notebook Nobel Laureate

Reputations:: 3,700

Messages:: 8,323

Likes Received:: 3,820

Trophy Points:: 431

AMD exists because of Intel. They wouldn't be legally allowed to purchase them. If they merged, Intel would control more than 67% (legal maximum) of the processor market, making it a monopoly, even though it pretty much already is. That's what us business folk think of AMD. It's a bit different on the other side of the coin.

J.Dre, Oct 22, 2015

#16

TomJGX likes this.

Raidriar ლ(ಠ益ಠლ)

Reputations:: 1,708

Messages:: 5,820

Likes Received:: 4,311

Trophy Points:: 431

TomJGX said: ↑

Lol that would be the end of AMD... For an AMD fan, that's a pretty dumb comment...

Click to expand...

I'm not an AMD fan lol. I'm just a neutral observer. nVidia has its pros and cons, as does AMD. I just think in the shape AMD is in right now, they can't afford to hire the right engineers to get themselves back on their feet. Intel could remedy that in a heartbeat and further both dedicated and integrated graphics departments. I do see AMD ending up on the chopping block with different companies snatching up different portions.

Raidriar, Oct 22, 2015

#17

TBoneSan Laptop Fiend

Reputations:: 4,460

Messages:: 5,558

Likes Received:: 5,798

Trophy Points:: 681

I've been reading around that AMD could still sell off their CPU division as they see fit and the Intel x86 agreement not mean squat since it still results in them having a monopoly if enforced. Thus Intel can't do much about it. Wendel on Tek Syndicate goes into detail about it.

TBoneSan, Oct 22, 2015

#18

Zymphad Zymphad

Reputations:: 2,321

Messages:: 4,165

Likes Received:: 355

Trophy Points:: 151

It's not a huge issue since hardcore gamers who care about DirectX 12 will upgrade their GPU once games that actually use DX12 are released. If Pascal also proves to not be optimized to fully take advantage of DX12.1 in all it's glory, then that will be devastating.

But we also have to see how the performance difference will be. I will be curious for example how well say Star Citizen or Deus Ex in DX12 vs DX11.

But it is disappointing to read NVidia didn't do their homework with Maxwell. I'm assuming they assumed that DX12 games won't be ready for Maxwell and hoped folks wouldn't notice as they upgraded to Pascal.

Zymphad, Oct 22, 2015

#19

hmscott likes this.

D2 Ultima Livestreaming Master

Reputations:: 4,335

Messages:: 11,803

Likes Received:: 9,751

Trophy Points:: 931

Zymphad said: ↑

But it is disappointing to read NVidia didn't do their homework with Maxwell. I'm assuming they assumed that DX12 games won't be ready for Maxwell and hoped folks wouldn't notice as they upgraded to Pascal.

Click to expand...

That's pretty much it. nVidrosoft makes cards to suit the times. If you look at the functionality of GPUs, Fermi is the best. Kepler removed double precision from all but two cards. Maxwell doesn't have it AT ALL. Cuda performance went down since Fermi except using double-precision on Titans. Using double-precision on Titans reduces gaming performance. The cards became all-in for current-gen gaming at the time of their release, and disregarded anything else. AMD kept everything in, that's all.

D2 Ultima, Oct 22, 2015

#20

TomJGX likes this.

TBoneSan Laptop Fiend

Reputations:: 4,460

Messages:: 5,558

Likes Received:: 5,798

Trophy Points:: 681

I was counting on Star Citizen to be leading the pack with DX12 but have since been guttered by their lack of enthusiasm. It seems like DX12 isn't exactly on the cards.

TBoneSan, Oct 22, 2015

#21

sniffin Notebook Evangelist

Reputations:: 68

Messages:: 429

Likes Received:: 256

Trophy Points:: 76

D2 Ultima said: ↑

That's pretty much it. nVidrosoft makes cards to suit the times. If you look at the functionality of GPUs, Fermi is the best. Kepler removed double precision from all but two cards. Maxwell doesn't have it AT ALL. Cuda performance went down since Fermi except using double-precision on Titans. Using double-precision on Titans reduces gaming performance. The cards became all-in for current-gen gaming at the time of their release, and disregarded anything else. AMD kept everything in, that's all.

Click to expand...

Well they didn't gut DP, it's there but there are less FP64 capable units than there were on Fermi. Making GPUs to suit the times has worked pretty well for Nvidia so you can hardly fault them for doing it. Nobody who buys Radeon/Geforce cares about DP so all it ends up doing for AMD is wasting die space. Honestly everybody will buy Pascal anyway so why should Nvidia care about Maxwell's DX12 capabilities? People whinge and moan but bend over anyway.

And Fermi was an abomination in all honesty. It was one of those moments where AMD actually had somebody by the balls, shame it didn't last.

sniffin, Oct 23, 2015

#22

D2 Ultima Livestreaming Master

Reputations:: 4,335

Messages:: 11,803

Likes Received:: 9,751

Trophy Points:: 931

sniffin said: ↑

Nobody who buys Radeon/Geforce cares about DP so all it ends up doing for AMD is wasting die space.

Click to expand...

Bitcoin miners

D2 Ultima, Oct 23, 2015

#23

TomJGX likes this.

sniffin Notebook Evangelist

Reputations:: 68

Messages:: 429

Likes Received:: 256

Trophy Points:: 76

D2 Ultima said: ↑

Bitcoin miners

Click to expand...

Sorry I should have specified that people don't care

sniffin, Oct 23, 2015

#24

D2 Ultima Livestreaming Master

Reputations:: 4,335

Messages:: 11,803

Likes Received:: 9,751

Trophy Points:: 931

sniffin said: ↑

Sorry I should have specified that people don't care

Click to expand...

"Most gamers" is the term you're looking for =D

D2 Ultima, Oct 23, 2015

#25

n=1 YEAH SCIENCE!

Reputations:: 2,544

Messages:: 4,346

Likes Received:: 2,600

Trophy Points:: 231

D2 Ultima said: ↑

That's pretty much it. nVidrosoft makes cards to suit the times. If you look at the functionality of GPUs, Fermi is the best. Kepler removed double precision from all but two cards. Maxwell doesn't have it AT ALL. Cuda performance went down since Fermi except using double-precision on Titans. Using double-precision on Titans reduces gaming performance. The cards became all-in for current-gen gaming at the time of their release, and disregarded anything else. AMD kept everything in, that's all.

Click to expand...

It's not so much AMD kept everything in as they designed an architecture that would allow them to expand into the professional/HPC segment with the highest margins. VLIW was very good at graphics but didn't handle compute too well. So AMD's philosophy with GCN was to make it a "flexible architecture" to be good at both graphics and compute. Basically you could say AMD tried to make Fermi 2.0, and on a raw computation power/TFLOPS level the Fury X does absolutely demolish the Titan X. AnandTech has a great writeup on GCN, and there's also a tl;dr version as well.

n=1, Oct 23, 2015

#26

TomJGX and D2 Ultima like this.

Zymphad Zymphad

Reputations:: 2,321

Messages:: 4,165

Likes Received:: 355

Trophy Points:: 151

n=1 said: ↑

It's not so much AMD kept everything in as they designed an architecture that would allow them to expand into the professional/HPC segment with the highest margins. VLIW was very good at graphics but didn't handle compute too well. So AMD's philosophy with GCN was to make it a "flexible architecture" to be good at both graphics and compute. Basically you could say AMD tried to make Fermi 2.0, and on a raw computation power/TFLOPS level the Fury X does absolutely demolish the Titan X. AnandTech has a great writeup on GCN, and there's also a tl;dr version as well.

Click to expand...

But does it crush Quadro? I haven't been reading about AMD making big gains and sales in professional market with Radeon GPUs.

Zymphad, Oct 23, 2015

#27

D2 Ultima Livestreaming Master

Reputations:: 4,335

Messages:: 11,803

Likes Received:: 9,751

Trophy Points:: 931

Zymphad said: ↑

But does it crush Quadro? I haven't been reading about AMD making big gains and sales in professional market with Radeon GPUs.

Click to expand...

Quadro and GeForce have been mostly the same cards since Kepler.

AMD cards technically destroy quadro with OpenCL etc.

D2 Ultima, Oct 24, 2015

#28

n=1 YEAH SCIENCE!

Reputations:: 2,544

Messages:: 4,346

Likes Received:: 2,600

Trophy Points:: 231

FirePros are to Quadro as Radeons are to GeForce. So no Radeon would not be crushing Quadro in the professional segment.

n=1, Oct 24, 2015

#29

D2 Ultima Livestreaming Master

Reputations:: 4,335

Messages:: 11,803

Likes Received:: 9,751

Trophy Points:: 931

Yeah, mainly because Quadro/FirePro is ALLLLLLLLLLLL about drivers. Since the cards are almost exactly the same, you're literally paying 4 figures+ for drivers and nothing else.
And let's be honest. nVidrosoft's drivers are so far beyond AMD's it's a joke. For the Quadros. They gave up having better drivers in the GeForce cards for no reason.

I find it disgusting though that paying so many thousands extra for drivers when your card can barely do anything but FP32 compute is stupid though. But there's really no choice for professionals at this point. You grab nVidrosoft's crappy cards or you deal with artifacting and various crashing issues, not to mention heat in the last few years (and this is coming from people I know who work in places that have basically render farms who dumped AMD because they were too unstable).

D2 Ultima, Oct 24, 2015

#30

Apollo13 and TomJGX like this.

nipsen Notebook Ditty

Reputations:: 694

Messages:: 1,686

Likes Received:: 131

Trophy Points:: 81

..pretty sure those general guidelines hold true for all cards where you don't have unlimited amounts of separate compute cores And that being conscious of whether a separate "compute" command will stall "graphics" or shaders that aren't completed or not is another extremely general and very obvious and good piece of advice for any architecture.

But since Maxwell collapses the number of "smx" units to fewer "devices", so to speak, it's obvious that you have fewer options to randomly add compute routines without having to rely on some internal scheduling, that then results in context shifts. So likely the results from running compute by pre-emption (that in theory is really a way to reduce the number of context shifts) might have adverse results, simply because it will cause the internal scheduler to create a context shift, since it needs to reassign for example one "smx" that are already in use.

And.. anyone who programs compute and feels this is a tremendous surprise probably isn't really paying attention. And note that you get similar problems if you pre-empt with any shader code on any number of cards, and need to rely on the internal scheduler. I mean, it's very basic stuff that holds true for any computer core (that you might create thread starvation and multiplying context shifts for a very long time by splitting tasks into many concurrent tasks), even if it's common to teach people to disregard overhead since single operations are "so quick on modern architectures we can generally ignore overhead", etc.

But sure - context switching on nvidia cards in general is slow. And that's not really what they're optimized or made for either. Even Quadro cards aren't, even if they do have more discrete cores that agree better with compute tasks.

nipsen, Oct 25, 2015

#31

Ethrem Notebook Prophet

Reputations:: 1,404

Messages:: 6,706

Likes Received:: 4,735

Trophy Points:: 431

The problem is that up until Maxwell 2 it was like a one lane highway... Maxwell 2 has a whopping 2 lanes while AMD has what, 16 in GCN? It quickly becomes clear why nVidia takes such a performance hit. It's like trying to shove all the traffic on the interstate into two lanes... I live on the I-25 corridor and I can tell you what rush hour is like when you're going up north of Denver.

Ethrem, Oct 25, 2015

#32

Apollo13, TomJGX, n=1 and 1 other person like this.

sniffin Notebook Evangelist

Reputations:: 68

Messages:: 429

Likes Received:: 256

Trophy Points:: 76

nipsen said: ↑

..pretty sure those general guidelines hold true for all cards where you don't have unlimited amounts of separate compute cores And that being conscious of whether a separate "compute" command will stall "graphics" or shaders that aren't completed or not is another extremely general and very obvious and good piece of advice for any architecture.

But since Maxwell collapses the number of "smx" units to fewer "devices", so to speak, it's obvious that you have fewer options to randomly add compute routines without having to rely on some internal scheduling, that then results in context shifts. So likely the results from running compute by pre-emption (that in theory is really a way to reduce the number of context shifts) might have adverse results, simply because it will cause the internal scheduler to create a context shift, since it needs to reassign for example one "smx" that are already in use.

And.. anyone who programs compute and feels this is a tremendous surprise probably isn't really paying attention. And note that you get similar problems if you pre-empt with any shader code on any number of cards, and need to rely on the internal scheduler. I mean, it's very basic stuff that holds true for any computer core (that you might create thread starvation and multiplying context shifts for a very long time by splitting tasks into many concurrent tasks), even if it's common to teach people to disregard overhead since single operations are "so quick on modern architectures we can generally ignore overhead", etc.

But sure - context switching on nvidia cards in general is slow. And that's not really what they're optimized or made for either. Even Quadro cards aren't, even if they do have more discrete cores that agree better with compute tasks.

Click to expand...

Maxwell 2 only has a single queue called a Work Distributor and all commands are stuffed into this single queue. The problem is that queuing more than 31 compute commands can actually completely block graphics commands. It's fine up to a point, then it falls over. GCN does not have this problem. Nvidia will probably manage this by encouraging developers to minimize the amount of compute commands issued. They'll say something like use it sparingly.

sniffin, Oct 26, 2015

#33

Ethrem Notebook Prophet

Reputations:: 1,404

Messages:: 6,706

Likes Received:: 4,735

Trophy Points:: 431

sniffin said: ↑

Maxwell 2 only has a single queue called a Work Distributor and all commands are stuffed into this single queue. The problem is that queuing more than 31 compute commands can actually completely block graphics commands. It's fine up to a point, then it falls over. GCN does not have this problem. Nvidia will probably manage this by encouraging developers to minimize the amount of compute commands issued. They'll say something like use it sparingly.

Click to expand...

That's what they did lol. What annoys me is that nVidia knew this which is why GM2* has two lanes instead of one but they didn't bother to actually address it and won't until Pascal. Planned obsolescence...

Ethrem, Oct 26, 2015

#34

TomJGX, TBoneSan and D2 Ultima like this.

Ramzay Notebook Connoisseur

Reputations:: 476

Messages:: 3,185

Likes Received:: 1,065

Trophy Points:: 231

HTWingNut said: ↑

...live or die by success instead of just being a subset of a larger corporation where any failures can be offset by other division's profit.

Click to expand...

cough...Micro$oft...cough

Ramzay, Oct 26, 2015

#35

nipsen Notebook Ditty

Reputations:: 694

Messages:: 1,686

Likes Received:: 131

Trophy Points:: 81

sniffin said: ↑

Maxwell 2 only has a single queue called a Work Distributor and all commands are stuffed into this single queue. The problem is that queuing more than 31 compute commands can actually completely block graphics commands. It's fine up to a point, then it falls over. GCN does not have this problem. Nvidia will probably manage this by encouraging developers to minimize the amount of compute commands issued. They'll say something like use it sparingly.

Click to expand...

Right, but still - it's not a huge surprise that nvidia wouldn't care about creating an internal scheduling system for a very high amount of concurrent tasks. As in, allowing a single thread to feed the graphics card a million tasks at once - you're supposed to schedule the runs externally and take advantage of limited simd capability for the tasks on the graphics card that can be automatically allocated in the immediate context... that's how a peripheral card on an external bus works..

I mean, it's the exact same thing on AMD. The only difference is that the proximity to the cpu cores means that a context switch and a revert is much, much faster.

Just saying that even if Nvidia wrote in fifty "lanes" for access towards the work queue - there's really no bus-architecture or transport back and forth off the card that would take advantage of it in any way. Imagine having to wait for IO on the bus, just in case it's going to be possible to assign a different set of concurrent tasks later, that would allow better smx utilisation - potentially, in 100ms, etc. Not going to be much point.

It's the same proposition as adding cpu-calculations into a graphics context manipulation each frame, for example. It's not going to happen because of the external bus IO waits. So you might want to get around that by using compute - but, no surprise, compute on non-programmable simd is really, really slow and resource hungry.

Brilliant that people seem interested in compute all of a sudden. But this isn't exactly news, is it? That Nvidia cards are not optimized for deep concurrent parallelism of infinite amounts of complex tasks? Over being super-fast for limited simd execution for immediate graphics context tasks/simple pixel operations, etc., that always exist, and where slow and cheap ram is fast enough to still be useful. That's... practically what the business was made from.

Hell, I've been told for a decade that no one cares about asynchronous parallelism, and that it's all utterly idiotic and a waste of time.

nipsen, Oct 26, 2015

#36

jaybee83 likes this.

Zymphad Zymphad

Reputations:: 2,321

Messages:: 4,165

Likes Received:: 355

Trophy Points:: 151

NVidia's Quadro is still the most powerful compute GPU. AMD's top tier FirePro is nowhere close to being as fast. AMD can spout all the BS they want about how much powerful their GPU is in synthetic benchmarks and double precision BS. When it counts, when professionals use REAL tools to do their work, Quadro crushes AMD.

Don't even bother comparing AMD FirePro to NVidia's Tesla. AMD doesn't have an answer at all for Tesla.

AMD had to create GCN to compete with Quadro and they still haven't succeeded. FirePro still consistently has more issues. OpenCL still hasn't become an industry standard and it's still not as well developed as CUDA.

To say NVidia is not known for compute is nonsense. NVidia set the standard with CUDA and Fermi well before AMD began spouting their BS about teraflops of power in synthetic benchmarks.

Also the reason why AMD financially is in the dumps despite their healthy consumer sales, it frankly is not profitable. NVidia nearly has a monopoly in the professional market and that's where the money is. NVidia does have the monopoly with US Defense research and supercomputers.

Top 100 SuperComputers are dominated by Intel E5. And 15 of them use NVidia Tesla. The top three most powerful SuperComputers that are in development in US all will use NVidia Teslas. NVidia not known for compute parallelism. AMD wish they had Tesla.

Last edited: Oct 26, 2015

Zymphad, Oct 26, 2015

#37

n=1 YEAH SCIENCE!

Reputations:: 2,544

Messages:: 4,346

Likes Received:: 2,600

Trophy Points:: 231

Tesla is the dedicated compute GPU as it doesn't even have any display outputs, is completely passively cooled so needs to be mounted in a rack with tremendous amounts of airflow in order to not burn up.

Quadro for the most part is just GeForce with ECC memory and certified drivers. Except for a few top of the line Quadro parts, all of them are FP64 gimped just like the non-Titan GeForce parts (excluding the entire Maxwell lineup obviously since they can't FP64).

Last edited: Oct 26, 2015

n=1, Oct 26, 2015

#38

D2 Ultima likes this.

D2 Ultima Livestreaming Master

Reputations:: 4,335

Messages:: 11,803

Likes Received:: 9,751

Trophy Points:: 931

Zymphad said: ↑

NVidia's Quadro is still the most powerful compute GPU.

AMD's top tier FirePro is nowhere close to being as fast.

AMD can spout all the BS they want about how much powerful their GPU is in synthetic benchmarks and double precision BS. When it counts, when professionals use REAL tools to do their work, Quadro crushes AMD.

Don't even bother comparing AMD FirePro to NVidia's Tesla. AMD doesn't have an answer at all for Tesla.

AMD had to create GCN to compete with Quadro and they still haven't succeeded. FirePro still consistently has more issues. OpenCL still hasn't become an industry standard and it's still not as well developed as CUDA.

To say NVidia is not known for compute is nonsense. NVidia set the standard with CUDA and Fermi well before AMD began spouting their BS about teraflops of power in synthetic benchmarks.

Click to expand...

It really isn't.

It really is. When people were mining bitcoin like crazy using OpenCL on AMD, there were dudes who wrote a CUDA miner app for nVidia cards. All the best Kepler cards were still multiple times slower than AMD, while using their beloved CUDA.

This is because of drivers. AMD cannot driver. Quadro/Firepro class cards, you are LITERALLY paying for the drivers. Calling companies like adobe for support will hang up on you without a Quadro or Firepro card in your rig, because consumer drivers are not guaranteed. Grab a GTX 680, pop off a resistor and have the card show up as a Quadro K5000 (which allows drivers to install) and BAM! Instant help from those people. For a small percentage of the cost. It's all drivers, excepting maybe one or two cards that are slightly different in say... amount of memory, etc (but the architecture is the same).

Tesla is an entirely different ballgame to Quadro/Firepro.

They still have issues because drivers and programs. It doesn't matter if OpenCL is used or not, the fact is that OpenCL on AMD cards is faster than any current or last generation CUDA-capable GeForce or Quadro card could ever HOPE to compute using either CUDA OR OpenCL.

nVidia has been long known for CUDA. But as everyone has noticed, since Kepler CUDA performance has gotten worse, and to an extent they've even removed capabilities for it from drivers which require modified drivers to work. If Maxwell could compute as well as Fermi could, the raw benefit of Maxwell's increased core counts and clockspeeds and architecture should theoretically make compute much faster than it currently is.

D2 Ultima, Oct 26, 2015

#39

n=1 YEAH SCIENCE!

Reputations:: 2,544

Messages:: 4,346

Likes Received:: 2,600

Trophy Points:: 231

Yeah GeForce/Quadro/Tesla all run on the same silicon, the differences are mainly driver-side, and of course GeForce also has some extra hardware to ensure it stays a GeForce.

That said, I swear I remember reading a publication from HP that talks about the differences between Tesla and GeForce cards, and one thing they noted was the Tesla cards were unbelievably oversoldered at every joint in order to cope with the stress of running full tilt 24/7. This is why cards used for mining often die prematurely, not because the GPU starts to degrade (as long as it's kept cool), but because the components on the PCB very quickly wear out since gaming cards were never designed with 24/7 100% load operation in mind.

n=1, Oct 26, 2015

#40

nipsen Notebook Ditty

Reputations:: 694

Messages:: 1,686

Likes Received:: 131

Trophy Points:: 81

Still.. typically the advantage quadro and tesla cards have is an increased number of smx'es.. Streaming Mollusc X-treme processor..? something like that? Along with a grid management unit for mapping an increased incoming number of concurrent jobs, to execute the code as efficiently as possible.

And my point was just that it doesn't make sense to have multiple pipelines to the graphics card (hardware differences), or a semi-intelligent way to schedule incoming jobs (in firmware/software on the chip), when you don't have that increased number of cores. Because the ones you have can and.. should.. really only be programmed with a few tasks at the same time.

In the same way, it makes very much sense to focus on fast and limited simd on gtx type cards, to perform typical and very simple shader and pixel-operations on higher clocks, at burst speed - rather than cram these cards full of smxes that often won't be utilized. That's just cost-efficient when thinking about the tasks they're supposed to perform.

Meanwhile, some of us would dearly like to see more OpenCL programming in graphics. But on current bus-technology, the response after each operation is simply too slow. And the area of use for an increased number of SMs, stream-processors on peripheral cards becomes high-latency jobs such as.. Folding@home, bitmining, etc. That agree with a distributed model in the first place.

I mean, we're talking about just a series of extremely simple processors with limited capability put in an array here. And for the tasks you're typically going to have in a game, and so on, you're going to favor fewer but faster cores. For economic reasons, and for practical reasons as well (a huge array of processors draws a lot of power for one). Meanwhile, you actually do have examples where compute code does execute faster on fewer but faster cores, in the way that for example on the laptop market, you have fairly decent performance in practice on a gtx card compared to a quadro card in a very large amount of typical usage-examples.. that's.. another thing.

Of course, ideally we would have a million cores that could be put in a graphics card and for example clocked invidually, or disabled when they're not in use, that sort of thing. So we could have massive "compute" performance on demand, with some sort of fairly cheap power-budget whenever it's going to execute. And I'm guessing that better and more compact chip manufacture and ever cheaper hardware, along with better scheduling and control over active cores (like turned up with kepler and specially maxwell) is an approach towards that. To get more compute performance in a small watt-budget. Since apparently compute is all the rage now, and I didn't even notice, but never mind.

Meanwhile, again the difference with AMDs approach with the apus is the bus proximity. And how they take care of the scheduling for that, or make it possible to use some convenient api for allocating compute tasks, that's really a secondary concern over the actual hardware capability. Where an automatic or completely plain round-robin allocator would just instantly perform, as it's been shown already, will have comical compute performance compared to your average peripheral card, from amd or nvidia. Even if there were huge pitfalls and massive amounts of locked threads and wasted time in the process-diagram.

Just pointing that out - that code specifically written for a limited number of compute capable cores, that has no demand for instant response or completion, where most of the task can be run asynchronously and it is easily parallelizable. Measuring performance differences between AMD and Nvidia cards in that case can be interesting in some ways. Discussions about how wise it is to use more general purpose cores over having specific shader-units and specific pixel-operation units, is also perhaps interesting. From a cost-efficiency standpoint and a practical standpoint, when talking about performance for specialized tasks.

And it sure could be pointed out that keeping "gpu-cores" and "cpu-cores" as two separate devices when hardware is as cheap as it is nowadays is a stupendously idiotic thing to do, and a complete anachronism only kept in place by industry conventions, purely for marketing reasons. In the same way that having that design limits the potential of bus-proximity, for all the performance increase it will give compute performance tasks that typically are written now.

But regardless of that - you wouldn't benefit from a solid scheduler, multiple pipes, etc., if you didn't have a very large number of cores. And you only need it then, and benefit from increased numbers of cores, if you have relatively high latency tasks to complete towards system ram. Or if you wanted to perform somewhat simple math on graphics card memory, without having to return IO first (and this is where the entire compute pre-emption and VR dimension comes in - limited "compute" for dealing with occlusion detection and deformation is more common though. And very likely the biggest use of the bus-proximity on an apu will be to simply perform standard shader-operations faster, or at a similar speed as before. Rather than anything interesting).

So saying that the only difference going on is drivers, and which overall api you use, and how efficient dx12 or whatever is going to be, and things like that. It's not untrue as such. After all, like people point out, specialized drivers or very specifically programmed tasks can make certain hardware very fast for those specialized tasks.

But it skirts the actually interesting part, about the bus-proximity of compute-capable cores and cpu-cores, towards system ram. This is important.

nipsen, Oct 26, 2015

#41

D2 Ultima Livestreaming Master

Reputations:: 4,335

Messages:: 11,803

Likes Received:: 9,751

Trophy Points:: 931

n=1 said: ↑

This is why cards used for mining often die prematurely, not because the GPU starts to degrade (as long as it's kept cool), but because the components on the PCB very quickly wear out since gaming cards were never designed with 24/7 100% load operation in mind.

Click to expand...

Because gaming doesn't use near 100% of a GPU, as most of us don't know.

nipsen said: ↑

Still.. typically the advantage quadro and tesla cards have is an increased number of smx'es. Along with a grid management unit for mapping an increased incoming number of concurrent jobs, to execute the code as efficiently as possible.

Click to expand...

Well, for Quadros that's not true. They are, quite literally, almost exactly the GeForce cards. It's why as I said before, popping off a resistor in a certain location in some GeForce cards change them into the Quadro cards and they perform exactly the same. Teslas... I don't claim to know a whole lot about, really. My understanding is that they're in another class entirely, even though the same basic architecture is present (maxwell is still maxwell, of course).

D2 Ultima, Oct 26, 2015

#42

nipsen Notebook Ditty

Reputations:: 694

Messages:: 1,686

Likes Received:: 131

Trophy Points:: 81

D2 Ultima said: ↑

Well, for Quadros that's not true. They are, quite literally, almost exactly the GeForce cards. It's why as I said before, popping off a resistor in a certain location in some GeForce cards change them into the Quadro cards and they perform exactly the same.

Click to expand...

Well, then there goes that. It really is just driver differences? Certain functions just are slower.. or implemented without the grid management unit activated, something like that, so the run time for certain functions are basically multiplied by the queue depth, in the worst case..? I thought they had the same chip, but at least had different config options on the internal bus and the ram, and so on.

So.. any qualified guesses on whether or not compute pre-emption being so expensive on Maxwell is also because of driver capability...? That compact allocation of smx devices could be a problem, and so on. But that the substantial performance hit really comes when the internal scheduler croaks? And that you would have a similar problem on quadro and tesla cards if their internal scheduler enhancement wasn't there?

nipsen, Oct 27, 2015

#43

D2 Ultima Livestreaming Master

Reputations:: 4,335

Messages:: 11,803

Likes Received:: 9,751

Trophy Points:: 931

nipsen said: ↑

Well, then there goes that. It really is just driver differences? Certain functions just are slower.. or implemented without the grid management unit activated, something like that, so the run time for certain functions are basically multiplied by the queue depth, in the worst case..? I thought they had the same chip, but at least had different config options on the internal bus and the ram, and so on.

So.. any qualified guesses on whether or not compute pre-emption being so expensive on Maxwell is also because of driver capability...? That compact allocation of smx devices could be a problem, and so on. But that the substantial performance hit really comes when the internal scheduler croaks? And that you would have a similar problem on quadro and tesla cards if their internal scheduler enhancement wasn't there?

Click to expand...

For about 95% of Quadros and GeForce cards, it's JUST driver differences. Teslas I am certain are a different beast, but I don't know enough about them. Prema once told me that SOME of the quadros weren't the same for Maxwell, but not all are different, and any sort of hardware limitation Maxwell has is present in the Quadros. If the Quadro can do something the GeForce cannot (due to as you said, grid management unit activated or something) then it's artificially turned off via drivers in the GeForce card and turned on via drivers in the Quadro card.

I have no guesses as to whether DX12's async compute fails are driver-resultant or not. I can say however that if CUDA simply never uses the parallel processing methods that supersede the card's hardware designs (as CUDA is handled by the driver after all; so it's no surprise that it would calculate in a card-friendly manner) it would never show up in compute-related functions. Again I don't know about Tesla cards.

But honestly, if the quadro cards could do things the GeForce couldn't, then all we'd need is a maxwell quadro user to run the Ashes of the Singularity benchmark and compare to the GeForce equivalent of the card. If the quadro does better on a quadro driver, then that means the cards are physically capable of better and drivers are the limiter. If it does not do better, then it means the cards are flat out incapable, and CUDA simply works around the cards' downsides.

D2 Ultima, Oct 27, 2015

#44

i_pk_pjers_i likes this.

sniffin Notebook Evangelist

Reputations:: 68

Messages:: 429

Likes Received:: 256

Trophy Points:: 76

Zymphad said: ↑

NVidia's Quadro is still the most powerful compute GPU. AMD's top tier FirePro is nowhere close to being as fast. AMD can spout all the BS they want about how much powerful their GPU is in synthetic benchmarks and double precision BS. When it counts, when professionals use REAL tools to do their work, Quadro crushes AMD.

Don't even bother comparing AMD FirePro to NVidia's Tesla. AMD doesn't have an answer at all for Tesla.

AMD had to create GCN to compete with Quadro and they still haven't succeeded. FirePro still consistently has more issues. OpenCL still hasn't become an industry standard and it's still not as well developed as CUDA.

To say NVidia is not known for compute is nonsense. NVidia set the standard with CUDA and Fermi well before AMD began spouting their BS about teraflops of power in synthetic benchmarks.

Also the reason why AMD financially is in the dumps despite their healthy consumer sales, it frankly is not profitable. NVidia nearly has a monopoly in the professional market and that's where the money is. NVidia does have the monopoly with US Defense research and supercomputers.

Top 100 SuperComputers are dominated by Intel E5. And 15 of them use NVidia Tesla. The top three most powerful SuperComputers that are in development in US all will use NVidia Teslas. NVidia not known for compute parallelism. AMD wish they had Tesla.

Click to expand...

Honestly why would you come into a technical discussion waving your arms around and rambling about Nvidia's professional market success? This discussion is based on facts and reality, not Jen-Hsun's dreams.

D2 Ultima said: ↑

For about 95% of Quadros and GeForce cards, it's JUST driver differences.

Click to expand...

This is true. The whole point of Nvidia's professional market strategy is that you are already making GPUs in volume to sell to consumers.You take some of these GPUs, put them through additional validation, and rebadge them and sell them for 10 times as much, and lock features to drivers specific to them.

If Tesla and Quadro were based on different GPUs it would defeat the purpose. GK210 was the first case of professional cards using a different GPU. Whether this is a blip or the start of a trend I guess we'll found out.

Last edited: Oct 27, 2015

sniffin, Oct 27, 2015

#45

n=1 YEAH SCIENCE!

Reputations:: 2,544

Messages:: 4,346

Likes Received:: 2,600

Trophy Points:: 231

D2 Ultima said: ↑

Because gaming doesn't use near 100% of a GPU, as most of us don't know.

Click to expand...

lol I was simply pointing out it's probably the solder joints that are the first to fail instead of the GPU itself crapping out.

As far as Quadro/Tesla go, this is my understanding:

Quadro is basically just a GeForce with ECC memory. They still have display outputs, and are actively cooled by a fan, and in a pinch can still be used for gaming.
Tesla is a dedicated compute GPU. It comes with ECC memory of course, but has NO display outputs, so you can't even hook them up to a display. Some Tesla cards come in both active/passive cooling variants, but some (like the GK210 based K80) only comes in passive form, meaning they're most certainly intended to be mounted in racks with lots of airflow, and not to be used as a standalone card.

n=1, Oct 27, 2015

#46

octiceps Nimrod

Reputations:: 3,147

Messages:: 9,944

Likes Received:: 4,194

Trophy Points:: 431

D2 Ultima said: ↑

Because gaming doesn't use near 100% of a GPU, as most of us don't know.

Click to expand...

Because gaming workloads have traditionally tilted more toward graphics than compute. Up until 9 years ago we didn't even have the hardware to do GPU compute, or widespread API support until 6 years ago. Although I'm still not sure why compute has suddenly become such a hot and divisive topic recently, considering it's already been used in popular games for a number of years now (since the inception of DX11) for everything from deferred lighting to AO to DoF to realistic hair/fur/particle physics.

octiceps, Oct 27, 2015

#47

nipsen Notebook Ditty

Reputations:: 694

Messages:: 1,686

Likes Received:: 131

Trophy Points:: 81

octiceps said: ↑

Although I'm still not sure why compute has suddenly become such a hot and divisive topic recently, considering it's already been used in popular games for a number of years now (since the inception of DX11) for everything from deferred lighting to AO to DoF to realistic hair/fur/particle physics.

Click to expand...

Mm. I'm guessing that when you run into situations where one piece of hairy code in Directcompute runs perfectly fine on one platform, and somehow has immense penalties on another, you get these "this platform is ****" articles.

By the way, 3dfx had Glide Collections of routines with semi-complex math that completed fairly fast on immediate memory. Few titles ever used it in a way that put lighting effects on moving objects controlled by core engine logic.. can't really think of anything except Lander by Psygnosis.. (because it was pretty complicated to do it - I've experimented a tiny bit on a hobby-basis, but I can see why people would give up. It's.. still simpler than trying to gracefully incorporate graphics code with access to system ram on mobile phones, though..), but that sort of compute has been turning up once in a while. Memory hacks, more or less.

What's turning up (again now?) is that demand for general compute performance for unoptimized high-level code. Maybe more people tire of having to be restricted to one specific proprietary "tech" to create effects. That you get developers of games wanting to keep their code between projects, that they want their artists to have some predictability about what they can create, and people who write UI backends and so on want to have an easier standard way of dealing with graphics code. That even the super-gurus who would sit on compact shader-code that did extremely specific effects are getting old and tired, that sort of thing.

But I mean, the "asynch compute performance crash" thing is from the Oculus Rift guys, no? I'm (randomly) guessing they want to put in per-frame correction of some sort that has to complete before rendering. And that pre-emption compute seemed like a good way to do it (in spite of the context shifts that will happen on all platforms. That then unfortunately are crashing on Nvidia gtx cards.. because of the way the internal scheduler works..?).

nipsen, Oct 27, 2015

#48

Zymphad Zymphad

Reputations:: 2,321

Messages:: 4,165

Likes Received:: 355

Trophy Points:: 151

As indicated by Oxide statement that alluded NVidia does support it at hardware level, but not fully implemented in their drivers.
- The queue process is software unlike AMD that has a queue compute engine. Will be interesting the results when NVidia actually implements DX12.

What is ludicrous is NVidia published papers discussing Async/Parallelism years ago, when FERMI was being developed. Also NVidia hardware has been used in every MS DX12 presentations. Why NVidia hasn't fully developed this feature in their drivers or emphasized it is curious.

My guess why AMD emphasized it is not because of DX12 or PC gaming, but for consoles where async seems to shine.

BTW AMD doesn't support Conservative Rasterization and Raster Order Views DX12 features, only NVidia does. It could be that NVidia actually has more complete DX12 feature than AMD and it is just coincidence AMD has hardware queue compute engine because of consoles, not because of DX12.

Just rambling thoughts.

Last edited: Oct 29, 2015

Zymphad, Oct 28, 2015

#49

jaybee83 likes this.

Apollo13 100% 16:10 Screens

Reputations:: 1,432

Messages:: 2,578

Likes Received:: 210

Trophy Points:: 81

D2 Ultima said: ↑

- A company with broken, terrible drivers, awful anti-consumer business practices, unstable, broken, badly designed specifications, overpriced cards, where just about two cards in the entire lineup are worth the $$ (980Ti and 750Ti), that seems to be going backwards with their multi-GPU features and support.

Click to expand...

I thought you were referring to AMD at the start of that, until the anti-consumer business practices. Are nVIDIA's drivers really that bad these days? I've been using ATI graphics for years now (most recent nVIDIA is from 2007), and back when I last researched driver comparisons, nVIDIA's were generally recommended. I know they did have the overclocking fiasco early this year, but the quality is down, too?

I also remember when you could install Quadro drivers on GeForce cards... did that on my 8600M GT a couple times to play around and see the differences. Though it's been long enough that I'm not entirely sure those weren't modded drivers.

At any rate Ethrem made the key point of the thread in post #32. Pretty much sense the first in-depth analysis of why Maxwell was doing so poorly compared to Radeons on the Ashes of Singularity benchmark, it's been known why and that it's in the hardware. Probably before that for those who were better-versed on the technology. I'm sure nVIDIA will be addressing that with next year's cards, but for the sake of competition I'm glad AMD does have a slight head start here.

Apollo13, Oct 29, 2015

#50

TomJGX likes this.