RAID 6 is like RAID 5, but with two sets of parity instead of one. It can withstand any two drives failing. There are many, many ways to build redundant arrays of disks besides the standard RAID levels, but I think RAID 6 is the most common and easiest one that can withstand any two disks failing.
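For anyone curious about the math, here's a minimal sketch of how the two parities can be computed, assuming the usual scheme (P is plain XOR like RAID 5; Q weights each data byte by powers of a generator in GF(2^8)). The disk count and data bytes here are made up:

```c
#include <stdint.h>
#include <stdio.h>

/* Multiply two bytes in GF(2^8) with the polynomial x^8+x^4+x^3+x^2+1 (0x11D),
 * the field RAID 6 implementations normally use. */
static uint8_t gf_mul(uint8_t a, uint8_t b) {
    uint8_t p = 0;
    while (b) {
        if (b & 1)
            p ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return p;
}

int main(void) {
    /* One byte from each of four hypothetical data disks (values made up). */
    uint8_t d[4] = {0x12, 0x34, 0x56, 0x78};

    uint8_t p = 0, q = 0, g = 1;   /* g steps through powers of the generator 2 */
    for (int i = 0; i < 4; i++) {
        p ^= d[i];                 /* P parity: plain XOR, same as RAID 5       */
        q ^= gf_mul(g, d[i]);      /* Q parity: data weighted by generator powers */
        g = gf_mul(g, 2);
    }
    /* With P and Q stored on two extra disks, any two of the six disks can be
     * lost and the two parity equations solved for the missing data. */
    printf("P = %02x, Q = %02x\n", p, q);
    return 0;
}
```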
RAID 10 can withstand any single disk failing, but if two disks fail, the whole array can be toast. If we are talking about an array of just 4 disks, RAID 10 is fine. Read and write performance is good, and the processing power to run it is minimal. But if you want more storage space and more data security, RAID 10 falls on its face.
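To put a number on that: in a 4-disk RAID 10, only the failure combinations that take out both halves of one mirror kill the array. A quick enumeration (mirror layout assumed in the comments):

```c
#include <stdio.h>

int main(void) {
    /* Assumed layout: disks 0 and 1 are one mirror pair, 2 and 3 the other,
     * with the two pairs striped together (RAID 1+0). */
    int mirror_of[4] = {0, 0, 1, 1};    /* mirror-pair id for each disk */

    int fatal = 0, total = 0;
    for (int a = 0; a < 4; a++) {
        for (int b = a + 1; b < 4; b++) {
            total++;
            /* The array dies only if both copies of one mirror are gone. */
            if (mirror_of[a] == mirror_of[b])
                fatal++;
        }
    }
    printf("%d of %d possible two-disk failures destroy the array\n",
           fatal, total);               /* prints: 2 of 6 */
    return 0;
}
```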
-
Actually, no. It was an Athlon XP 1600. This was the low end of the Thoroughbred(?) line, and if you were lucky, you could clock it up extremely high. It was a very specific processor that had amazing OCs. Not sure how I found that out.. -
It also adds some power draw and extra circuitry to have the functionality there. ..Not entirely sure exactly which components they're duplicating down the pipeline, though. -
Well, you are comparing two different architectures, not counting of course the Xeon from Foster.
But the "added" hardware was there to improve the pipeline and the performance of HT.
The thing is that HT basically uses the resources already available to a given core, emulating the presence of another core. It engages when the core is underutilized or given a task that would benefit from more cores. The FP units are the same and are shared across the pipeline.
Getting programs to use HT is another can of worms: you have to have multithreaded code, and that isn't simple to write. The logic behind it is far more advanced and entails more coding hours, and that's why a lot of software today isn't multithreaded. And sometimes that's not a problem, since the OS is getting better coded; it can juggle more things so that you can multitask without a major compute bottleneck -
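As a rough illustration of the extra work involved, here's a toy parallel sum in C with pthreads (sizes and thread count are arbitrary). The programmer has to split the loop into independent chunks and merge the partial results; the compiler doesn't do any of it for you:

```c
#include <pthread.h>
#include <stdio.h>

#define N       1000000
#define THREADS 4

static long data[N];

struct chunk { int lo, hi; long sum; };

/* Each thread sums its own slice; no shared mutable state, so no locks. */
static void *partial_sum(void *arg) {
    struct chunk *c = arg;
    c->sum = 0;
    for (int i = c->lo; i < c->hi; i++)
        c->sum += data[i];
    return NULL;
}

int main(void) {
    for (int i = 0; i < N; i++)
        data[i] = 1;

    pthread_t tid[THREADS];
    struct chunk c[THREADS];
    for (int t = 0; t < THREADS; t++) {
        c[t].lo = t * (N / THREADS);        /* split the work by hand... */
        c[t].hi = (t + 1) * (N / THREADS);
        pthread_create(&tid[t], NULL, partial_sum, &c[t]);
    }

    long total = 0;
    for (int t = 0; t < THREADS; t++) {     /* ...then merge the results by hand */
        pthread_join(tid[t], NULL);
        total += c[t].sum;
    }
    printf("total = %ld\n", total);         /* prints: total = 1000000 */
    return 0;
}
```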
All CPUs have been trying to extract ILP from code to improve IPC, but this is very difficult and inconsistent, which results in some of the execution logic sitting around idle when there isn't enough ILP to go around. Core 2, in fact, has extremely good IPC partly due to its superb ILP extraction.
Pentium 4 wasn't as wide in decoding and was very serial in its execution, so I believe HT didn't result in as big a gain simply because there were fewer idle execution units.
Nehalem took the Core 2 front end and duplicated more downstream decode logic, so it resulted in beastly HT performance. Sandy Bridge then took Nehalem and included a uop cache to cut down on branch prediction penalties, among other improvements, and thus dramatically improved IPC. Ivy Bridge then took Sandy's hardware and included logic to allow the duplicated decode logic to dynamically either serve its own thread or gang together when only one thread is active per core. Haswell supposedly has a beefier execution backend with Ivy's front end. -
^ If people wonder: TLP - thread-level parallelism. ILP - instruction-level parallelism. IPC - instructions per cycle.
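And for a concrete feel of what ILP means: the two loops below compute the same sum, but the second gives the core four independent dependency chains to overlap (a toy C example; compile with optimization low enough that the compiler doesn't vectorize it for you):

```c
#include <stdio.h>

#define N (1 << 20)
static float a[N];

int main(void) {
    for (int i = 0; i < N; i++)
        a[i] = 1.0f;

    /* Serial chain: every add depends on the previous result, so the core
     * can only retire roughly one add per FP-add latency. */
    float s = 0.0f;
    for (int i = 0; i < N; i++)
        s += a[i];

    /* Four independent accumulators: the adds don't depend on each other,
     * so an out-of-order core can keep several in flight at once. */
    float s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    for (int i = 0; i < N; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    printf("%.0f %.0f\n", s, s0 + s1 + s2 + s3);  /* both print 1048576 */
    return 0;
}
```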
So.. the goal is better instruction-level parallelism on existing processing elements. And adding IPC.. more MHz.. won't do that. Neither does adding cores.. So adding larger caches and an automatic/simple symmetric multiprocessing scheduler opens up some of the potential gain from the increased speed and number of processing elements..?
-
Hyper-Threading took up something like 5% of the die on Nehalem, but in the best-case scenarios it increased performance by 30%. Like almost every other change or tweak in the architecture, it is designed to boost the performance the processor gets out of a given amount of real estate on a silicon wafer.
-
The goal was to keep as much ILP in flight as possible on existing elements; TLP was a means to an end. That way the cores utilize their peak IPC capability more efficiently. The speedup is not the same magnitude as true TLP with a separate core, but it is greater than serial execution.
Designing a CPU is full of tradeoffs.
Adding cores is a particularly power- and space-expensive way to improve TLP performance. IPC improvements are also very difficult to extract, because you have to re-engineer things like branch predictors or even the ALUs/FPUs in the core. ILP depends very much on the code, so it's impossible to have perfect ILP unless you are using the same instruction over and over and your instructions are completely independent of each other (e.g. cryptography).
ILP is tricky to extract; AMD found this out the hard way with their VLIW architecture for GPUs. Graphics workloads are crazy parallel (I think only cryptography is more so), so AMD actually had 5 (later 4) ALUs that could execute in parallel to extract ILP from each thread. Even then, only about 3.4 units were used on average.
You are spot on with the cache and scheduler.
The Core 2 architecture actually would've benefited a lot from HT, but Intel did not see fit to include it, due to lack of competition from AMD.
It was a very wide design, but I suspect a lot of the hardware sat idle because it's tough to have lots of ILP with typical CPU workloads.
It's a delicate balance between front end and execution; this is probably why IPC alone has only crawled along with every refresh, with the most significant speed improvements coming from pairing those gains with the higher clock speeds afforded by better manufacturing. -