The Notebook Review forums were hosted by TechTarget, which shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to preserve the valuable technical information that had been posted on the forums. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.

    CPU isn't improving fast enough

    Discussion in 'Hardware Components and Aftermarket Upgrades' started by jsteng, Jan 26, 2010.

  1. IntelUser

    IntelUser Notebook Deity

    Reputations:
    364
    Messages:
    1,642
    Likes Received:
    75
    Trophy Points:
    66
    LOL, davepermen indirectly citing the superiority of his SSDs again (not saying you are wrong, just funny).

    Cache sizes ARE computation related on a CPU though. :)

    It's why GPUs and CPUs can NEVER replace each other. The workloads of the two are fundamentally different. Core 2 increased performance not by adding computation resources like ALUs, but by improving memory-related things like caches, memory disambiguation and prefetchers.

    Sandy Bridge will focus mainly on cache and memory functions and will be the next big improvement. Media enhancements like AVX are really the icing on the cake, not the main feature.
     
  2. davepermen

    davepermen Notebook Nobel Laureate

    Reputations:
    2,972
    Messages:
    7,788
    Likes Received:
    0
    Trophy Points:
    205
    no, gpus and cpus can never replace each other as long as they have a different instruction set. larrabee could theoretically replace a cpu completely (not being that efficient at it, though).

    and no, caches are not computational workload. they are memory-workload.

    if i want to sum two registers 10 billion times, all i can do is call add reg0, reg1 10 billion times. no cache or anything like it is used for this, just a computational part of the cpu: the adder unit.
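
    something like this, as a rough c sketch (hypothetical, assuming the compiler keeps a and b in registers):

        /* purely compute-bound loop: the working set is two values that should
           live in registers, so the loop itself never touches cache or ram.
           an optimizing compiler may fold the loop away, so build with
           optimizations off if you want to see the adds actually happen. */
        #include <stdint.h>
        #include <stdio.h>

        int main(void) {
            uint64_t a = 1, b = 3;
            for (uint64_t i = 0; i < 10000000000ULL; i++)
                a += b;                              /* one integer add per iteration: pure alu work */
            printf("%llu\n", (unsigned long long)a); /* keep the result live */
            return 0;
        }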

    and yes, i'm nitpicking. but i want accuracy from someone called intel user :)
     
  3. thinkpad knows best

    thinkpad knows best Notebook Deity

    Reputations:
    108
    Messages:
    1,140
    Likes Received:
    0
    Trophy Points:
    55
    Ahem, wasn't ARM already dominating that market?
     
  4. davepermen

    davepermen Notebook Nobel Laureate

    Reputations:
    2,972
    Messages:
    7,788
    Likes Received:
    0
    Trophy Points:
    205
    and? that's actually a reason FOR trying to build atom.

    arm isn't intel. atom is. they want a bit of the growing market.
     
  5. Pitabred

    Pitabred Linux geek con rat flail!

    Reputations:
    3,300
    Messages:
    7,115
    Likes Received:
    3
    Trophy Points:
    206
    A CPU can ALWAYS replace a GPU. That's the whole idea... a CPU is a completely general-purpose processing device. It can do ANYTHING. The reason we have GPUs is that for any given task, purpose-built hardware will always be faster than general-purpose. The GPU is designed to make certain operations very fast, and it will always be faster at those than the CPU. But that's at the expense of generality... it only does a few things.
     
  6. thinkpad knows best

    thinkpad knows best Notebook Deity

    Reputations:
    108
    Messages:
    1,140
    Likes Received:
    0
    Trophy Points:
    55
    I know ARM isn't Intel, for blank's sake, but the Atom still puts out too much heat to be inside a smartphone; at 11nm maybe it will be. The GPU is just a high-performance processor that exclusively handles graphics-oriented operations. I wonder when we'll start having 128-bit CPUs :p
     
  7. IntelUser

    IntelUser Notebook Deity

    Reputations:
    364
    Messages:
    1,642
    Likes Received:
    75
    Trophy Points:
    66
    That's why I said it's related. True, it never really does computation. But in modern CPUs, adding an extra computation unit like an extra ALU does nothing. Those ALUs are limited by the memory ops, and better caches open them up. So it's sorta true, and sorta not. :)

    It won't in the sense that a GPU will always perform better than a CPU at graphics, but you already knew that.

    The Pineview/Diamondville/Silverthorne Atom in current Netbooks/UMPCs/MIDs might put out too much heat. The Lincroft Atom coming in the Moorestown platform will be low-power enough to fit in larger smartphones. The Z500 CPU running at 800MHz is already at only 0.6W.

    In a smartphone, the CPU, memory controller, and I/O controller aren't the only big contributors to power consumption. The display and the type of interface, operating system, BIOS and components on the motherboard also use a hefty amount of power. If you can reduce those significantly, you'll get that much closer to smartphones.

    Moorestown won't just integrate the graphics and memory controller like Pineview and be done with it; the rest of the components I mentioned above will also be significantly improved.

    First 2 Moorestown (smart)phones:
    http://www.youtube.com/watch?v=5m79buEJQQY
    http://www.youtube.com/watch?v=WfkzpdB97fg

    The LG device is rumored to have 5 hours of browsing time on 3G with its 1850mAh battery. That battery is 50% larger than the one in the iPhone 3GS, but considering the bigger screen and much better performance, it's not that far off. 45nm and Moorestown are enough to make Atom relevant for smartphones; 32nm with Medfield will reach power parity.
     
  8. davepermen

    davepermen Notebook Nobel Laureate

    Reputations:
    2,972
    Messages:
    7,788
    Likes Received:
    0
    Trophy Points:
    205
    hm no. give me another alu, and my app runs faster.

    only the moment you have to wait for memory data (and no other hyperthread has something ready to work on) does memory start to be relevant.

    that is, by default, quite often, yes. but it doesn't change the fact that numerical power on its own is 100% defined without any caches in mind, and can be used as such.

    but in most real world apps, one has to consider the data part as well, not just the computing part. and there, memory bandwidth and latency get important, of course.

    that's why gpus by now are sort of suuuperhyperthreaded (they call it differently). they mostly have tons of jobs pending for their data, and still have another one in the actual core to compute while the others wait for the data. that way, they can hide the latencies in their main job (doing the very same thing billions of times, again and again).
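
    a very rough software analogue of that, just as a hypothetical c sketch (real gpus schedule this in hardware; this only keeps several independent memory walks in flight at once):

        /* several independent pointer walks interleaved, so while one walk waits
           on a cache miss the loads for the other walks can already be in flight.
           in a real test you'd also scatter the nodes so the hardware prefetcher
           can't predict them. */
        #include <stdio.h>
        #include <stdlib.h>

        struct node { struct node *next; long pad[7]; };   /* roughly one cache line per node */

        #define NODES   (1 << 20)
        #define STREAMS 4                                   /* independent "jobs" in flight */

        int main(void) {
            struct node *pool = calloc(NODES, sizeof *pool);
            struct node *p[STREAMS];
            if (!pool) return 1;

            /* build STREAMS separate circular lists over disjoint parts of the pool */
            for (int s = 0; s < STREAMS; s++) {
                size_t base = (size_t)s * (NODES / STREAMS);
                for (size_t i = 0; i + 1 < NODES / STREAMS; i++)
                    pool[base + i].next = &pool[base + i + 1];
                pool[base + NODES / STREAMS - 1].next = &pool[base];
                p[s] = &pool[base];
            }

            /* advance every stream each step instead of finishing one walk first */
            for (long step = 0; step < 10 * 1000 * 1000; step++)
                for (int s = 0; s < STREAMS; s++)
                    p[s] = p[s]->next;

            printf("%p\n", (void *)p[0]);   /* keep the walks live */
            free(pool);
            return 0;
        }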
     
  9. Meaker@Sager

    Meaker@Sager Company Representative

    Reputations:
    9,431
    Messages:
    58,189
    Likes Received:
    17,900
    Trophy Points:
    931
    You HAVE to include more cores; it has several benefits:

    1. Larger contact area for heatsinks
    2. Having 1156 pins requires a certain area around a die to actually make the connections; that's the minimum die area you can have. If you only have one core, you have wasted space anyway.

    Oh, and if memory were such a bottleneck, why does increasing the clock speed / decreasing the latencies have such a small effect on real-world apps?
     
  10. davepermen

    davepermen Notebook Nobel Laureate

    Reputations:
    2,972
    Messages:
    7,788
    Likes Received:
    0
    Trophy Points:
    205
    it WAS a huge bottleneck for the first quadcores.

    you could tell because scaling was about linear from 1 to 2 to 3 cores, but the 4th core normally never gained the same. there, increasing the memory performance helped (e.g. by reducing the amount of memory that needed to be accessed, etc.)

    but nowadays, not so much anymore. still obviously dependent on the workload.

    for big scenes one renders (as an example), programmers have to take much care to prepare the data in a way that all cores get fed with data. it's not "just working well". programmers have to manage memory manually quite a bit. so there is a bottleneck, which so far can be fixed by programmers.
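
    for example, something in this direction (purely hypothetical c/pthreads sketch, not from any real renderer): give every core its own contiguous chunk of the data so each one streams through its own region of memory.

        /* one contiguous chunk of the data per thread, so each core does a
           prefetch-friendly sequential walk over its own region instead of
           competing for the same cache lines. build with -pthread. */
        #include <pthread.h>
        #include <stdio.h>
        #include <stdlib.h>

        #define N        (1 << 24)               /* 16M doubles, 128 MB */
        #define NTHREADS 4

        static double *data;
        static double  partial[NTHREADS * 8];    /* padded to avoid false sharing */

        static void *sum_chunk(void *arg) {
            long   t     = (long)arg;
            size_t begin = (size_t)t * (N / NTHREADS);
            size_t end   = begin + (N / NTHREADS);
            double s     = 0.0;
            for (size_t i = begin; i < end; i++)
                s += data[i];                    /* sequential walk over this core's chunk */
            partial[t * 8] = s;
            return NULL;
        }

        int main(void) {
            pthread_t th[NTHREADS];
            data = malloc((size_t)N * sizeof *data);
            if (!data) return 1;
            for (size_t i = 0; i < N; i++) data[i] = 1.0;

            for (long t = 0; t < NTHREADS; t++)
                pthread_create(&th[t], NULL, sum_chunk, (void *)t);

            double total = 0.0;
            for (long t = 0; t < NTHREADS; t++) {
                pthread_join(th[t], NULL);
                total += partial[t * 8];
            }
            printf("%.0f\n", total);
            free(data);
            return 0;
        }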

    but that's work. work that could be spent better.
     
  11. IntelUser

    IntelUser Notebook Deity

    Reputations:
    364
    Messages:
    1,642
    Likes Received:
    75
    Trophy Points:
    66
    It's that or basically nothing. Doubling caches might give you a 5-7% increase, and the lower latencies on Nehalem might have helped 10%, but double your ALUs or your issue width and you gain only 2-3%.

    BTW, the Athlon 64 gained 20% from the lower-latency memory controller alone on PC apps, and way more on server apps.

    Pentium M gained 30-40% over Tualatin with no better FPU or more ALUs.
     
  12. Undertaxxx

    Undertaxxx Notebook Consultant

    Reputations:
    33
    Messages:
    264
    Likes Received:
    0
    Trophy Points:
    30
    Can't agree more
    +1
     
  13. thinkpad knows best

    thinkpad knows best Notebook Deity

    Reputations:
    108
    Messages:
    1,140
    Likes Received:
    0
    Trophy Points:
    55
    Yeah, they increase everything within a time period of, let's say, 10 years: minuscule improvements with each release (with a few exceptions), but then you look back and there's a 40-50% performance gain with processors. It's also marketing influence. Most of us would be fine with advanced Core 2s for the next 5 years; then Intel would release the i series, improving performance by double-digit percentages. But they released it now because otherwise they wouldn't make as much money right now, instead of releasing the technology when it would have been a true need among power users. It's sort of like GTA 4: Rockstar probably should have released the PC version in August '09, or even December '09, and pushed the console version up a little too.
     