The Notebook Review forums were hosted by TechTarget, which shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information posted on the forums is preserved. For current discussions, many NBR forum users moved to NotebookTalk.net after the shutdown.
Problems? See this thread at archive.org.

    Software for recording history of thread full load

    Discussion in 'Hardware Components and Aftermarket Upgrades' started by Dxxx, Apr 30, 2011.

  1. Dxxx

    Dxxx Notebook Enthusiast

    Reputations:
    0
    Messages:
    10
    Likes Received:
    0
    Trophy Points:
    5
    Does anyone know of a solution to monitor the 100% load of threads during usage?

    I mean a reading showing minutes at full load for 1 thread / 2 threads / 3 threads... (and so on, depending on the CPU) versus total monitored time.

    Why do I ask for such an exotic thing?
    My impression is that even on a dual-core, both threads are seldom fully used; so would 4 threads, or even 8, bring benefits?
    If such software were available, it would be interesting to see records from various users in various environments on this forum.
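    No off-the-shelf tool is named in this thread for exactly this, but a minimal sketch of such a logger could look like the following. It tallies how much monitored time was spent with exactly N threads at (near) full load; the sampling part assumes the third-party psutil library, which is not mentioned in the thread.

```python
# Hypothetical sketch of the logger the OP describes: sample per-core load,
# then tally how long exactly N cores were at (near) full load.
from collections import Counter

FULL_LOAD = 95.0  # percent; treat anything above this as "full load"

def busy_cores(per_core_loads, threshold=FULL_LOAD):
    """Return how many cores are at full load in one sample."""
    return sum(1 for load in per_core_loads if load >= threshold)

def tally(samples, threshold=FULL_LOAD):
    """Histogram: {N cores busy: number of samples}."""
    counts = Counter()
    for sample in samples:
        counts[busy_cores(sample, threshold)] += 1
    return counts

def monitor(interval=1.0, threshold=FULL_LOAD):
    """Sample forever (Ctrl-C to stop and print). Needs psutil (third-party)."""
    import psutil  # assumption: pip install psutil
    counts = Counter()
    try:
        while True:
            loads = psutil.cpu_percent(interval=interval, percpu=True)
            counts[busy_cores(loads, threshold)] += 1
    except KeyboardInterrupt:
        total = sum(counts.values()) or 1
        for n in sorted(counts):
            print(f"{n} thread(s) at full load: {counts[n] * interval:.0f}s "
                  f"({100 * counts[n] / total:.1f}% of monitored time)")
```

    Over a long enough run, the printed histogram is exactly the "1 thread / 2 threads / 3 threads versus total monitored time" reading asked for above.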
     
  2. newsposter

    newsposter Notebook Virtuoso

    Reputations:
    801
    Messages:
    3,881
    Likes Received:
    0
    Trophy Points:
    105
    Finally, someone who 'gets it' in terms of process and core affinity.....

    Anyway, Process Monitor will do this, but you've got to set the filters pretty narrowly to limit the kind of data you collect.

    Also, OProfile is a Linux-specific tool, but I've seen some threads about porting it to run on 64-bit Windows.
     
  3. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    Before the flame throwers are lit I'll apologize, but the below is meant as a simplistic explanation and is very generalized. There are exceptions to every rule...

    Actually, threads are not always set to an affinity. If the application doesn't assign affinity, scheduling goes to the OS for resource pooling. There is a simple way to see this.

    If you run SuperPi on a C2D without setting affinity, the system will split the thread's load between the two cores and you will see 50% or better CPU load on each core. Load may go up a bit as other threads cache in and out, and efficiency will dictate slightly higher loads. The same goes for a C2Q, where, since there are 4 cores, we see 25% load per core, with added load from efficiency and other threads.

    Programs optimized for multiple cores usually set affinity on their threads and run/use more than one core at a time. Without that, resource pooling interlaces the thread's load between the available cores, which forces the internal cache to constantly reload for the newly scheduled thread. Where affinity is set, this cache flushing occurs much less, if at all.

    Now with the i-core CPUs this changes a bit, as the cores are NUMA-capable with Vista & Windows 7. Also, the large L3 cache is shared, whereas the C2Q uses 2 large independent L2 caches and is not NUMA. When a thread is not affinity-optimized, it also forces the 2 large caches of the C2Q to duplicate everything, meaning I/O to main memory is doubled. This added I/O takes a pooled thread that runs at the same clock as on a C2D and makes it run about 7% slower. Again, this can be simulated in SuperPi by setting affinity to cores 0 & 1 and then comparing the times against all four cores. It also takes what you think is the 12MB combined cache of the C2Q and effectively makes it only 6MB on programs not optimized for multithreading.

    This is primarily why Intel never pushed the C2Q design further. They badly needed NUMA, a shared (SmartCache) L3, and arguably Hyper-Threading as well, in order to push the "more cores is faster and better" idea. The C2Q design with SMP instead of NUMA was just not going to cut it...
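    The affinity experiment described above can be reproduced in a few lines. A minimal sketch, Linux-only since it relies on the standard library's os.sched_setaffinity (on Windows, psutil's Process.cpu_affinity() is a cross-platform alternative):

```python
# Sketch: pin the current process to core 0, as in the "SuperPi with
# affinity set to one core" comparison above. Linux-only stdlib call.
import os

def pin_to_cores(cores):
    """Restrict this process to the given set of CPU core indices."""
    os.sched_setaffinity(0, set(cores))  # 0 = the calling process
    return os.sched_getaffinity(0)

# Example: pin to core 0, then restore the original core set.
all_cores = os.sched_getaffinity(0)
print("pinned to:", pin_to_cores({0}))
print("restored:", pin_to_cores(all_cores))
```

    Running a CPU-bound benchmark once pinned and once unpinned makes the cache-flushing cost of unpinned scheduling directly measurable.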
     
  4. newsposter

    newsposter Notebook Virtuoso

    Reputations:
    801
    Messages:
    3,881
    Likes Received:
    0
    Trophy Points:
    105
    Don't forget compilers...... without a set of compilers that can generate SMP/Intel-core-friendly code, the hardware design is wasted on legacy code.

    Which is pretty much the situation we're in today.
     
  5. Dxxx

    Dxxx Notebook Enthusiast

    Reputations:
    0
    Messages:
    10
    Likes Received:
    0
    Trophy Points:
    5
    Gee guys, thank you!

    Your approach is more educated than I expected; I shall need to do some reading...

    Regardless of the intrinsic processes inside a multicore processor, my approach was much more straightforward:
    - during normal use I almost never see a 100% load on any thread, except when I run stability tests like Orthos;
    - in real-life situations I see no improvement when using a 4-thread-capable CPU (Core i3/i5) versus a 2-thread one (Core Duo). For example, jumping to an SSD makes a clear difference - so should one always go ULV, since computing power is in excess?

    I would like to know if it is only me, or whether others have similar experiences.
    Subsequently, a widely accepted piece of software to record peak thread usage could make comparisons easier.
    The closest thing I have found is Process Explorer 14.01 with a lower sample time in order to show longer records, but it is far from ideal, and I have some doubts about its readings, since Task Manager shows something else.

    Furthermore (maybe not directly related), my tests suggest that within the same family (Arrandale), the hyperthreading-enabled CPU (Core i3) is badly beaten by the 2-thread one (the modest P6000) when accepting only 1- or 2-thread loads.
    Of course, 4 threads would mean twice the work done, which brings us once again to the question: "do I really use the 4 threads?"
     
  6. ViciousXUSMC

    ViciousXUSMC Master Viking NBR Reviewer

    Reputations:
    11,461
    Messages:
    16,824
    Likes Received:
    76
    Trophy Points:
    466
    I want to say SpeedFan does this for some reason, or perhaps Everest?

    I forget, but I did have a program with this option, and best yet, it logged live to a file. So if you crashed or something (overclock testing), the log file is there and up to date to the second it crashed.

    Also, considering the nature of the log file, it would be pretty easy to import the data into Excel to make pretty graphs.
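    The crash-proof live log described here comes down to flushing after every row. A minimal sketch using only the standard library (the per-core load values are placeholders; a real logger would sample them, e.g. with psutil):

```python
# Sketch of a "live log file": append one timestamped row per sample and
# flush immediately, so the log is intact up to the second of a crash.
# The CSV output imports straight into Excel for graphing.
import csv
import time

def open_log(path, core_count):
    f = open(path, "a", newline="")
    w = csv.writer(f)
    if f.tell() == 0:  # fresh file: write a header row
        w.writerow(["timestamp"] + [f"core{i}_load" for i in range(core_count)])
    return f, w

def log_sample(f, w, per_core_loads):
    w.writerow([time.strftime("%Y-%m-%d %H:%M:%S")] + list(per_core_loads))
    f.flush()  # push the row to disk now, not when the buffer fills

# Usage with made-up readings (real code would sample the CPU here):
f, w = open_log("cpu_log.csv", core_count=2)
log_sample(f, w, [87.5, 12.0])
f.close()
```

    Opening in append mode means restarting the logger continues the same file rather than overwriting the history.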
     
  7. Dxxx

    Dxxx Notebook Enthusiast

    Reputations:
    0
    Messages:
    10
    Likes Received:
    0
    Trophy Points:
    5
    Great - just try to remember which one it was - I do not recall SpeedFan doing anything like that!
     
  8. Dxxx

    Dxxx Notebook Enthusiast

    Reputations:
    0
    Messages:
    10
    Likes Received:
    0
    Trophy Points:
    5
    I have received the following from Naton over PM:

    That said, Windows Task Manager does a good job of showing the processes and loads on a CPU.

    As to whether hyper-threading is good or bad, the answer is that it depends on the type of application. I have been doing quite a lot of video encoding lately with a Celeron dual-core CPU. When I encode with x264, my CPU cores run at full load (i.e. 100%). When I encode with XviD, the load on the CPU cores is about 70 to 75%. With x264, having hyperthreading won't help reduce the encoding time. With XviD, hyperthreading will reduce the processing time, since the remaining 25 to 30% of the unused cores can be used to emulate two additional cores.


    Task Manager has a limited period of recorded time - it covers minutes.
    I would like long periods of time, maybe the whole lifetime of the laptop. If one used 2 threads at full capacity less than 1% of the time, then it would make no sense to upgrade to 4 threads - and this is what I suppose is the truth.

    I have researched hyper-threading on Core i laptop processors.
    It works, to the extent that it can do double the calculations versus a dual-core.
    The problem is the price for that: a laptop with the P6100 peaks at 40W, while a similar laptop with the Core i3 M330 peaks at 70W (they are nearly the same processor, except the first has no hyper-threading).
    Not only does it drain the battery, but at 70W we are talking about 5+ amps, which can destroy it.
    Furthermore, when the processor works on 4 threads, speed drops to 70% on each thread.

    Video editing, as well as large photo editing or stress software like Prime95, can push all threads to 100%, but a normal person will seldom do so - so do we recommend against hyper-threading?
     
  9. newsposter

    newsposter Notebook Virtuoso

    Reputations:
    801
    Messages:
    3,881
    Likes Received:
    0
    Trophy Points:
    105
    Intel even recommends against hyperthreading, and they are the ones putting it into more and more chips.

    Intel doesn't actually say hyperthreading=bad. They do, however, say that hyperthreading=processor overhead and well-written code=good.

    HT is meant to give a temporary boost to short code segments that can be executed out of order or in parallel, as identified by the CPU itself.

    Properly written and compiled (!!) code will run faster on non-HT processors than 'average' code will on HT-enabled processors. This is one reason why Unix machines, with very few exceptions, consistently outperform Windows machines both on the same CPUs and on RISC vs. CISC CPUs. It's not so much the hardware anymore as the software and compilers.
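    A toy illustration of what "code written for the cores" means at the source level (this is explicit parallelism, not the compiler-level optimization discussed above, but it is the same principle): the same CPU-bound job, split across worker processes so every core gets real work instead of the scheduler guessing.

```python
# Count primes below a limit, splitting the range across worker processes.
from multiprocessing import Pool

def count_primes(bounds):
    lo, hi = bounds
    def is_prime(n):
        if n < 2:
            return False
        return all(n % d for d in range(2, int(n ** 0.5) + 1))
    return sum(1 for n in range(lo, hi) if is_prime(n))

def parallel_count(limit, workers=4):
    step = limit // workers
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]
    chunks[-1] = (chunks[-1][0], limit)  # cover any remainder
    with Pool(workers) as pool:
        return sum(pool.map(count_primes, chunks))

if __name__ == "__main__":
    print(parallel_count(100_000))  # -> 9592, same answer as a serial count
```

    On a dual-core this roughly halves the wall-clock time; code not structured this way leaves the extra cores (and HT logical cores) idle, which is exactly the situation described above for legacy code.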

    There are multiple Intel whitepapers on the subject.
     
  10. Dxxx

    Dxxx Notebook Enthusiast

    Reputations:
    0
    Messages:
    10
    Likes Received:
    0
    Trophy Points:
    5
    Yep!
    Just what I thought!

    If Intel has been candid about that, it must have been well hidden from the public.

    Hyperthreading really pushes power drain:

    T4300 UV (soft, pinmod) = 28W
    P8600 UV (soft, pinmod) = 35W
    P6100 = 35W
    core i3 330M = 50W (4thrds)
    core i5 460M = 48W (4thrds)

    These are on laptops (Lenovo, Acer, HP) with Intel video.
    My power measurements are done under load with Prime95/Orthos, using a self-made watt-meter on the 19V DC line.
    They do not include major loads on the video (peaks would be even higher with video loaded), and they are for the whole laptop.

    There is more to say about efficiency (joules, i.e. Ws per task), where an undervolted P8600 beats the Core i5-460M astonishingly!!!
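    The efficiency comparison comes down to simple arithmetic: energy per task = average power (W) × time to finish (s), in joules. A sketch with illustrative placeholder numbers (not measurements from this thread):

```python
# Energy per task in joules. The figures are hypothetical: a slower but
# lower-power CPU can still win on joules per task if the power gap is
# larger than the speed gap.
def energy_per_task(power_watts, seconds):
    return power_watts * seconds

p8600_uv = energy_per_task(35, 120)  # 35 W for 120 s -> 4200 J
i5_460m  = energy_per_task(48, 100)  # 48 W for 100 s -> 4800 J
print(p8600_uv, i5_460m)             # the undervolted chip uses less energy
```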
     
  11. naton

    naton Notebook Virtuoso

    Reputations:
    806
    Messages:
    2,044
    Likes Received:
    5
    Trophy Points:
    56
    To sum up what newsposter is saying, you need one of the following two things to maximize the advantage of hyperthreading:
    1- Write code in assembly
    I would say impossible because of the complexity of modern code. Writing 10 lines of code in assembly is already difficult, so imagine having to write an OS with millions of lines of code in assembly :rolleyes:

    2- Have an optimised compiler
    Also nearly impossible, because that would mean limiting the usability of the software to Intel-based platforms. Also, a compiler optimised for the Core i3, for instance, might not generate code that runs fast on a Core 2 Duo platform.
    It all has to do with ensuring that programs run on the vast majority of platforms at optimum speed, which means that even if the current technology is the Core i series, programs are compiled using only the instruction set available on the Pentium 4, and possibly even the Pentium 3.

    Did you ever wonder why games run faster on a game console than on a computer, despite the fact that computers have more processing power? Why a £300 laptop cannot run games as smoothly as a £200 console?