The Notebook Review forums were hosted by TechTarget, which shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information that had been posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.
Problems? See this thread at archive.org.

    Intel Core 2 Duo vs dual-core i5 or i7

    Discussion in 'Hardware Components and Aftermarket Upgrades' started by The Fire Snake, Dec 31, 2014.

  1. Qing Dao

    Qing Dao Notebook Deity

    Reputations:
    1,600
    Messages:
    1,771
    Likes Received:
    304
    Trophy Points:
    101
    No. The point is not to say that in general, Haswell far outclasses Penryn. We all know this. We also know that for the same performance, Haswell is going to use less power. You don't need to post benchmarks showing an i7 demolishing a C2D. That doesn't help anything. What does help a lot is to see where the C2D fits, performance-wise, into the current Haswell lineup. What we can see is that the Penryn lines up performance-wise with Haswell ultra-low and low-voltage dual core i3, i5, and i7 processors. I think that is a much more valid comparison than just saying Haswell > Penryn.

    Also, these low-voltage processors are a lot more common than they used to be. Today they are less the exception and more the rule in mainstream laptops. For many home users, that is all the performance they need, and even a T9900 is perfectly adequate, at least performance-wise. The explosive growth in CPU performance of years past has slowed immensely, and performance requirements for mainstream non-gaming computer use have plateaued.

    Your pseudo-scientific arguments have no merit if all you can do is try to discredit the evidence against them with even more techno-babble and you are completely incapable of offering any evidence to support your theories.

    You can't actually be serious in saying that your words are your proof? Are you sure this isn't Tilleroftheearth's second account?
     
    Starlight5 likes this.
  2. tilleroftheearth

    tilleroftheearth Wisdom listens quietly...

    Reputations:
    5,398
    Messages:
    12,692
    Likes Received:
    2,717
    Trophy Points:
    631
    Yeah; that is exactly the point. Duh.

    Did we take a stupid pill today? Lol...


     
  3. Krane

    Krane Notebook Prophet

    Reputations:
    706
    Messages:
    4,653
    Likes Received:
    108
    Trophy Points:
    131
  4. ComradeQuestion

    ComradeQuestion Notebook Consultant

    Reputations:
    204
    Messages:
    120
    Likes Received:
    9
    Trophy Points:
    31
    Fun.

    That was the conversation I was having, and it seemed to be the question. See my first post, where I ask for clarification on the question. In fact, it appears that benchmarks *did* have to be posted, so I'm left wondering what thread you're reading.

    How is "Haswell > Penryn" not simply shorthand for "high-end Penryns line up with low-end Haswells that focus on low voltage over performance"? Seems to be the exact same statement to me.

    Techno-babble, eh? Interesting, lol. I didn't know that mentioning cache locality was "techno-babble".

    How about this: you go Wikipedia all of the big words that you didn't understand from my post, and after that you get back to me?

    Well he focused on a very specific part of my post, and the evidence for my assumptions was given in the rest of my post.

    Again, if you want to Google/Wikipedia how cache works, how latency affects performance, and everything else I mention, be my guest. I provided the information; it's not hard to find out more.

    Herb Sutter discusses these things a lot:
    https://www.youtube.com/watch?v=L7zSU9HI-6I

    I don't know your technical background, but generally any compiled language experience will be enough to understand these concepts.

    https://en.wikipedia.org/wiki/Locality_of_reference

    If you have questions about any of this I can answer them. But you seem to want to frame the conversation in a different light, even though it appears that you want to have the same exact conversation.
     
  5. ajkula66

    ajkula66 Courage and Consequence

    Reputations:
    3,018
    Messages:
    3,198
    Likes Received:
    2,318
    Trophy Points:
    231
    Well, the OP would have to answer that one for you...:)

    Having said that, an 8770W will run circles around a T500 no matter how one wants to look at it...as it should, in all fairness...

     
  6. Qing Dao

    Qing Dao Notebook Deity

    Reputations:
    1,600
    Messages:
    1,771
    Likes Received:
    304
    Trophy Points:
    101
    If a single Penryn can perform better than a single Haswell, then that point is clearly either wrong or over-simplified.

    If relating performance of different processors is too intellectually challenging for you, let's just finish off by saying that all i7's are better than all i5's and call it a day..... :rolleyes:

    Not even close. Haswell > Ivy Bridge, Haswell > Netburst, Haswell > 4004. These sorts of comparisons leave a lot to be desired.

    Sorry, that was rude. But please, show some real world examples of what you are talking about. You are just having a theoretical monologue. Your logic is correct, but you vastly overestimate the performance differences. Nothing shows the kinds of leaps and bounds of cache prediction that you are talking about.
     
    Starlight5 likes this.
  7. ComradeQuestion

    ComradeQuestion Notebook Consultant

    Reputations:
    204
    Messages:
    120
    Likes Received:
    9
    Trophy Points:
    31
    It's not really theoretical, and the talk I linked discusses, in depth, how these features of modern CPUs can have massive performance effects. I'm not sure if that particular talk references prefetching, but Herb Sutter has dozens of talks on these subjects, and it becomes clear that changes in these technologies can have very significant effects.

    I'm trying to think of a more direct example. Like, Bjarne Stroustrup at one point shows that O(log n) data structures end up being slower than O(n) data structures (an exponential difference) purely because of cache locality, which proves quite definitively that cache is incredibly important to performance.

    I found some of his slides here:
    c++ : Locality, Locality, Locality

    Unfortunately, to fully appreciate how this difference is exponential requires at least a cursory knowledge of data structures and algorithms. I can't really "source" something like that, but it is a demonstrable effect being shown by the creator of C++ himself, so that should be a hint.

    As cache sizes increase we can fit larger objects into the cache. This is very important.

    As prefetching technology increases, we can predict what to put into the cache. Also incredibly important - I wish I had the Herb Sutter talk on this, but again, we see a difference between logarithmic speed and linear, all because a prefetcher was improved on the CPU.

    These are the ways CPUs increase performance these days. Clock speeds stopped climbing drastically years ago, and the focus has been on increasing cache, increasing throughput with multithreading (and Intel's shared cache for hyperthreading), branch prediction, etc.

    I don't really look at benchmarks. They don't matter to me, because they rarely explain the exact technical implementation. But very simple demonstrations like the one in the link above will show how drastic it can be. These features are just not there, or not as refined, in older CPUs.

    That is why I think modern CPUs are significantly better than older ones.

    Not to mention instruction sets, but those aren't really worth going into and aren't as important for performance.
     
  8. Qing Dao

    Qing Dao Notebook Deity

    Reputations:
    1,600
    Messages:
    1,771
    Likes Received:
    304
    Trophy Points:
    101
    Cache sizes are not entirely relevant. Cache sizes of Penryn and Haswell are very consistent. Also, comparing versions of the same processor with different cache sizes shows that cache size hardly makes any difference at all, except in some exceptional circumstances where it is cut dramatically. (65nm Core 2-based Celerons that only had 512KB of L2 cache are the best example of this.)

    Another thing is that Penryn is a modern CPU. We aren't comparing Haswell to a 486DX here. Each successive generation between Penryn and Haswell has offered only slight incremental improvements. On a per-clock basis, the best possible performance increase Haswell offers over Penryn is nearly 50%. This works out perfectly when comparing high-speed Penryns to low-speed Haswells. You can talk about how cache prediction on Haswell is a million times better than on Penryn, but even if that is so, it doesn't seem to translate very well to real-world performance.

    Using new instruction sets is the only way one will ever see absolutely dramatic clock for clock performance differences with Haswell compared to Penryn. For example, take a look at AES encryption.
     
    Starlight5 likes this.
  9. nipsen

    nipsen Notebook Ditty

    Reputations:
    694
    Messages:
    1,686
    Likes Received:
    131
    Trophy Points:
    81
    Other than that last part, we have a winner! :p

    But yeah, that's one way of explaining it. The problem is that what you've really explained is that all the Intel offerings are practically identical. And that to get the huge payoffs - outside of special cases where you literally gate the compiler and feed the processor manufactured junk data that never occurs in a practical example (at least not one that produces anything meaningful) - we need a completely different architecture. And along with it, new programming languages and compiler techniques.
     
    ajkula66 likes this.
  10. ComradeQuestion

    ComradeQuestion Notebook Consultant

    Reputations:
    204
    Messages:
    120
    Likes Received:
    9
    Trophy Points:
    31
    I agree that cache sizes are not entirely relevant. It's less common for a working set to need to be larger than your cache, outside of more specific contexts.

    But cache has certainly grown. Architecturally, the cache hierarchy has been split into per-core caches (for hyperthreading, among other things). This means that the entire cache can't be invalidated by a single thread, only a small portion.

    And yes, Penryn is a modern Core 2 Duo, certainly. It was part of the first 'Core' lineup, which started a much larger focus on cache and parallelism.

    A 50% clock-for-clock difference, to me, is quite drastic when the newer chips also use considerably less energy. I guess that's maybe where I differ here, but if you use 30% of the energy to get 50% better results, that's really quite significant.

    As for cache prediction and real-world performance, I suppose it really just depends on the case. Most CPUs at this point are so fast that even latency issues won't be as noticeable. But I spend a lot of time benchmarking and profiling my own applications to optimize for throughput and latency, and it makes a big difference there.

    For example, between two otherwise identical pieces of code, there can be a 50x performance gain from making one cache-friendly. Of course, both pieces of code run in milliseconds, so you won't ever notice it.

    Why I think it's more important is that the gap between CPU and RAM speed has only grown, and it continues to grow. So I think that latency is only going to get *worse*, and therefore any time you can actually avoid it will matter all the more.

    Perhaps real world applications haven't gotten to that point yet.

    I disregarded instruction sets because they are very rarely used, and the program has to be built with support for them. Naturally, instruction sets can make a significant difference. AES-NI is basically the best case, though - AES was built to be implemented in hardware, and encryption always benefits from dedicated instructions. Most programs won't be doing that sort of thing.

    nipsen,
    The real issue here is that programs are rarely CPU bound. If they were, we'd see quite large speedups.
     
    Last edited: Jan 15, 2015
    nipsen likes this.
  11. nipsen

    nipsen Notebook Ditty

    Reputations:
    694
    Messages:
    1,686
    Likes Received:
    131
    Trophy Points:
    81
    Mm. Well, in computer science classes, you're usually taught that if you can exchange a short algorithm with exponential execution time for one with linear execution time, then you should do so, almost regardless of the size of the constant. The reason for that is that we always work with limited data sets. So it's actually so rare that you run into a real-world example where, say, a search has to return in a short amount of time, and there's a difference between a brute-force algorithm of some sort and a heap implementation, for example, that it's considered a special case.

    This also fits perfectly with the general industry narratives, of course. Because if you design linear algorithms - all of them completely CPU-bound - then you will always encourage short development time and low-cost implementations (i.e., outsourcing the implementations to India and China). And the company that uses the solution will then, when marginally faster technology appears, simply need to buy the latest-generation offering. That increases the performance - or trawls more data in a shorter amount of time - without having to change any implementation of the expensively developed software. And in practice, this is typically cheaper than developing completely new software with completely new ideas. If that is even an option.

    So it's not that most tasks running on a PC aren't CPU-bound. It's just that they're designed in such a way that they're only /occasionally/ CPU-bound. And you don't see that as an issue even in real-time applications, because the return time is usually fast enough anyway. And when it isn't, as in the few examples you mention, then we're moving on to special cases where we need "optimization".. right? ;D

    I mean, I agree that in the short term, it would be possible to speed things up considerably if the transport layer were improved. We'd be able to design UIs with better response, and we'd easily smuggle in some reduction and occlusion-detection algorithms for any 3D contexts. Switching contexts would be faster, you could rely on either core logic or external sources mostly interchangeably, that sort of thing. That would be useful.

    But it's not giving you an integrated bus with several elements that have common access to working memory. To, say, design a 3d imaging program that could increase and decrease the complexity of the model from a selection of data - a data store that would be too big to crunch with brute force, but where the selection you'd fetch would still be representative and sufficient for the visual representation at all times. VR, basically. We just can't design something like that on current architecture. And we can't really do it with the tools we have now either, because they're stuck in a paradigm where "when in doubt, increase the constant of the algorithm" is a perfectly valid rule, both practically and in theory as well..
     
  12. jedisurfer1

    jedisurfer1 Notebook Deity

    Reputations:
    39
    Messages:
    785
    Likes Received:
    50
    Trophy Points:
    41
    I'll have to say I still use a C2D T9300 in my T61p every day, and yes, it still runs 3-4 virtual machines concurrently very well. The only thing holding it back is that it maxes out at 8GB of RAM, and that limits my VM work. I'd say it's faster than my i5-4300U ultrabook. I still adore that machine; it is just gimped by the max RAM. I also run a decked-out W520, W530, and W540, and the T9300, even in non-dual-IDA mode, is not slow to me. Then again, I can't even tell much of a difference between SATA II and SATA III in everyday use.

    I don't play games, so I can't compare it in that department. But for VT-x VMware stuff, it surprisingly still runs perfectly fine.

    It's very comparable to all the current dual-core CPUs. I'm wondering, for anyone who runs a quad-core C2D in a laptop, if it's on par with current quad-core CPUs.
     
    Last edited: Jan 22, 2015
  13. hirobo2

    hirobo2 Notebook Consultant

    Reputations:
    32
    Messages:
    119
    Likes Received:
    13
    Trophy Points:
    31
    I'm still using Merom (1st-gen C2D) as my main PC. The only thing I can't do with it is play 1080p files larger than 8.0GB (I get audio-syncing issues after 30 mins). CPUs have only advanced since then in terms of using less power and having faster RAM on board. I also own a current i5. I can attest that, were the i5 on the same RAM as my C2D, it would have performed only marginally faster...
     