The Notebook Review forums were hosted by TechTarget, who shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information that had been posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.

    Question about CPU's

    Discussion in 'Hardware Components and Aftermarket Upgrades' started by VaultBoy!, Oct 7, 2012.

  1. Qing Dao

    Qing Dao Notebook Deity

    Reputations:
    1,600
    Messages:
    1,771
    Likes Received:
    304
    Trophy Points:
    101
    RAID 6 is like RAID 5, but with two sets of parity instead of one. It can withstand any two drives failing. There are many, many ways to make redundant arrays of disks besides just the standard RAID levels, but I think RAID 6 is the most common and easiest that can withstand any two disks failing.

    RAID 10 can withstand any single disk failing, but if the wrong two disks fail (both copies of the same mirror), the whole array is toast. If we are talking about an array of just 4 disks, RAID 10 is fine: read and write performance is good, and the processing power needed to run it is minimal. But if you want more storage space and more data security, RAID 10 falls on its face.
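
    To make the parity idea concrete, here is a toy sketch (purely illustrative, not taken from any real RAID implementation) of the XOR parity that RAID 5 relies on. RAID 6 adds a second, independently computed parity block per stripe, which is what lets it survive any two failures; that extra math is not shown here.

        /* Toy XOR-parity stripe: three data "disks" plus one parity "disk".
           Build with: gcc -o parity parity.c */
        #include <stdio.h>
        #include <stdint.h>

        #define BLOCK 8   /* toy block size in bytes */

        int main(void) {
            uint8_t d0[BLOCK] = "disk-0.", d1[BLOCK] = "disk-1.", d2[BLOCK] = "disk-2.";
            uint8_t parity[BLOCK], rebuilt[BLOCK];

            /* RAID 5 style parity: XOR of every data block in the stripe. */
            for (int i = 0; i < BLOCK; i++)
                parity[i] = d0[i] ^ d1[i] ^ d2[i];

            /* Simulate losing d1: rebuild it from the survivors plus parity. */
            for (int i = 0; i < BLOCK; i++)
                rebuilt[i] = d0[i] ^ d2[i] ^ parity[i];

            printf("rebuilt: %s\n", rebuilt);   /* prints "rebuilt: disk-1." */
            return 0;
        }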
     
  2. nipsen

    nipsen Notebook Ditty

    Reputations:
    694
    Messages:
    1,686
    Likes Received:
    131
    Trophy Points:
    81
    Yeah, for me at least it was always about getting the most out of the cheap-o kits I bought with my pocket money. No point after a while, and you don't get those insane boosts nowadays. ..I was underclocking my Llano setup :)
    Yeah, but they cost too much. I had a Celeron, and then one of the boxed Pentium 3s.. with a small cache? I think. The budget alternative when I upgraded later had both the multiplier and the bus speed locked down. I know there were other models that were unlocked, but on the ones I had I could only adjust the bus speed (inside a set range).
    Mm, you're right. Could it have been a much earlier processor I had a 120% overclock on, on air..? Had a lot of luck running the bus speed up, FSB:DRAM at 1:2 or 1:3. But I just switched it around to get the highest CPU speed I could, reasonably stable. Had better performance with a higher bus speed and a lower CPU speed, though. So.. it must have been SDRAM and before the Athlon XP..

    Actually, no. It was an Athlon XP 1600. That was the low end of the Thoroughbred..? line. And if you were lucky, you could get that up extremely high. It was a very specific processor that had amazing overclocks. Not sure how I found that out..
     
  3. nipsen

    nipsen Notebook Ditty

    Reputations:
    694
    Messages:
    1,686
    Likes Received:
    131
    Trophy Points:
    81
    Yes, good explanation.

    It also adds some power draw and extra circuits to have the functionality there. ..Not entirely sure exactly which components they're duplicating down the pipeline, though.
     
  4. Karamazovmm

    Karamazovmm Overthinking? Always!

    Reputations:
    2,365
    Messages:
    9,422
    Likes Received:
    200
    Trophy Points:
    231
    Well, you are comparing 2 different archs there, not counting of course the Foster Xeon.

    But the "added" hardware was there to improve the pipeline and the performance of HT.

    Thing is, HT is basically a way to use the resources that are already available to a given core by emulating the presence of another core. It kicks in when the core is being underutilized or given a task that would benefit from more cores. The FP units are the same and are shared between the two threads.

    Getting programs to actually use HT is another can of worms: you have to have multithreaded code, and that isn't simple to do. The logic behind it is far more advanced and takes more coding hours, which is why a lot of software today still isn't multithreaded. And sometimes that's not a problem, since OS scheduling keeps getting better and can spread work around so that you can multitask without a major compute bottleneck.
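
    As a rough illustration of what "multithreaded code" means here (a minimal sketch, with made-up names and sizes, using pthreads): the work has to be split into independent chunks before the scheduler has a second thread it can park on the extra logical core that HT exposes.

        /* Parallel array sum split across two threads.
           Build with: gcc -pthread -o sum sum.c */
        #include <pthread.h>
        #include <stdio.h>

        #define N 1000000

        static long data[N];

        struct chunk { long start, end, sum; };

        /* Each thread sums its own independent slice of the array. */
        static void *partial_sum(void *arg) {
            struct chunk *c = arg;
            c->sum = 0;
            for (long i = c->start; i < c->end; i++)
                c->sum += data[i];
            return NULL;
        }

        int main(void) {
            for (long i = 0; i < N; i++)
                data[i] = i;

            /* Two independent halves -> two threads the OS can spread
               across physical cores or HT logical cores. */
            struct chunk a = {0, N / 2, 0}, b = {N / 2, N, 0};
            pthread_t ta, tb;
            pthread_create(&ta, NULL, partial_sum, &a);
            pthread_create(&tb, NULL, partial_sum, &b);
            pthread_join(ta, NULL);
            pthread_join(tb, NULL);

            printf("sum = %ld\n", a.sum + b.sum);   /* same result as a serial loop */
            return 0;
        }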
     
  5. Marksman30k

    Marksman30k Notebook Deity

    Reputations:
    2,080
    Messages:
    1,068
    Likes Received:
    180
    Trophy Points:
    81
    I believe they duplicated the fetch and decode logic so the processor can tap into the TLP of the code. It works under the assumption that no single thread has enough ILP to utilize all the FPUs and ALUs available to the CPU core over a given execution period. Therefore, HT is a way to keep the cores busier at little or no extra energy cost, since it's likely that the adjacent thread has tappable ILP if it is run in parallel instead of waiting for the first to finish.
    All CPUs have been trying to extract ILP from code to improve IPC, but this is very difficult and inconsistent, which results in some of the execution logic sitting around idle when there isn't enough ILP to go around. Core 2, in fact, has extremely good IPC, partly due to its superb ILP extraction.
    The Pentium 4 wasn't as wide in decoding and was very serial in its execution, so I believe HT didn't result in as big a gain simply because there were fewer idle execution units.
    Nehalem took the Core 2 front end and duplicated more downstream decode logic, so it resulted in beastly HT performance. Sandy Bridge then took Nehalem and included a uOps cache to cut down on branch prediction penalties, amongst other improvements, and thus dramatically improved IPC. Ivy Bridge then took Sandy's hardware and included logic that lets the duplicated decode resources either serve their own threads or gang together when only one thread is active on a core. Haswell supposedly has a beefier execution backend with Ivy's front end.
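
    A quick made-up example of why ILP depends so much on the code itself (a sketch of the idea, nothing from a real workload): in the first loop every multiply has to wait for the previous one, so the core's spare execution units sit idle; in the second loop there are four independent chains the out-of-order hardware can keep in flight at once. A second HT thread is just another source of that kind of independent work when one thread stalls.

        #include <stdio.h>

        #define N 100000000L

        int main(void) {
            /* One long dependency chain: each multiply depends on the last,
               so extra ALUs/FPUs in the core have nothing to do (poor ILP). */
            double serial = 1.0;
            for (long i = 0; i < N; i++)
                serial *= 1.0000001;

            /* Four independent accumulators doing the same total work: the
               hardware can overlap these multiplies (much more ILP). */
            double a = 1.0, b = 1.0, c = 1.0, d = 1.0;
            for (long i = 0; i < N; i += 4) {
                a *= 1.0000001;
                b *= 1.0000001;
                c *= 1.0000001;
                d *= 1.0000001;
            }

            printf("%f %f\n", serial, a * b * c * d);   /* both print the same value */
            return 0;
        }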
     
  6. nipsen

    nipsen Notebook Ditty

    Reputations:
    694
    Messages:
    1,686
    Likes Received:
    131
    Trophy Points:
    81
    ^If people wonder: TLP - thread-level parallelism. ILP - instruction-level parallelism. IPC - instructions per cycle.

    So.. the goal is better instruction-level parallelism on the existing processing elements. And adding IPC.. more MHz.. won't do that. Neither does adding cores.. So adding larger caches and an automatic/simple symmetric multiprocessing scheduler opens up some of the potential gain from the increased speed and number of processing elements..?

    Right. Because that was where the first Core 2 Quad processors didn't perform. ...?
     
  7. Qing Dao

    Qing Dao Notebook Deity

    Reputations:
    1,600
    Messages:
    1,771
    Likes Received:
    304
    Trophy Points:
    101
    Hyperthreading took up something like 5% of the die on Nehalem, but in the best-case scenarios it increased performance by around 30%. Like almost every other change or tweak in the architecture, it is designed to boost the performance of the processor per unit of real estate on the silicon wafer.
     
  8. Marksman30k

    Marksman30k Notebook Deity

    Reputations:
    2,080
    Messages:
    1,068
    Likes Received:
    180
    Trophy Points:
    81
    It's actually pseudo-TLP exploitation.
    The goal was to keep as much ILP in flight as possible on the existing elements; TLP was a means to that end. The cores therefore utilize their peak IPC capability more efficiently. The speedup is not the same magnitude as true TLP with a separate core, but it is greater than serial execution.

    Designing a CPU is full of tradeoffs.
    Adding cores is a particularly power- and space-expensive way to improve TLP performance, and IPC improvements are also very difficult to extract because you have to re-engineer things like the branch predictors or even the ALUs/FPUs in the core. ILP depends very much on the code, so it's impossible to have perfect ILP unless you are repetitively using the same instruction over and over and your instructions are completely independent of each other (e.g. cryptography).
    ILP is tricky to extract; AMD found this out the hard way with their VLIW architecture for GPUs. Graphics workloads are crazy parallel (I think only cryptography is more so), so AMD actually had 5 (later 4) ALUs that could execute in parallel to extract ILP from the thread. Even then, only about 3.4 units were used on average.

    You are spot on with the cache and scheduler :)

    The Core 2 architecture actually would've benefited a lot from HT, but Intel did not see fit to include it due to the lack of competition from AMD.
    It was a very wide design, but I suspect a lot of the hardware sat idle because it's tough to have lots of ILP with typical CPU workloads.

    It's a delicate balance between the front end and execution, and this is probably why IPC alone has only crawled along with every refresh, with the most significant speed improvements coming from the higher clock speeds afforded by better manufacturing.
     
  9. Qing Dao

    Qing Dao Notebook Deity

    Reputations:
    1,600
    Messages:
    1,771
    Likes Received:
    304
    Trophy Points:
    101
    This doesn't make much sense. When Intel was designing the Core architecture, which first shipped with the Core 2 Duo in 2006, AMD's K8 processors had just come out and were killing Intel's flagship Pentium processors. Also, if you have something that will make the processor better, you include it; purposely crippling your chips is stupid. What Intel has done in the past few years, since the introduction of Nehalem, is simply slow down its release schedule. I had always read that adding hyperthreading to the Core microarchitecture was a tradeoff that Intel did not see paying off until smaller manufacturing processes, and thus more transistors, became available to play with.
     
  10. Marksman30k

    Marksman30k Notebook Deity

    Reputations:
    2,080
    Messages:
    1,068
    Likes Received:
    180
    Trophy Points:
    81
    I stand corrected :)
     