The Notebook Review forums were hosted by TechTarget, which shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.
Problems? See this thread at archive.org.

    Ryzen vs i7 (Mainstream); Threadripper vs i9 (HEDT); X299 vs X399/TRX40; Xeon vs Epyc

    Discussion in 'Hardware Components and Aftermarket Upgrades' started by ajc9988, Jun 7, 2017.

  1. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
Intel also said it is keeping certain lines on 14nm, 10nm was planned for the 2019 holiday season and they reaffirmed that, EUV 7nm is planned for 2021 according to an August article, and all they seem to be doing here is an active interposer, except for the one slide saying the I/O SoC components will also be on the interposer. It looks rather boring. But I take it you haven't read AMD's research from 2014 and 2015, or the 2017 cost analysis showing the only reason they haven't done it yet is the cost of active interposers, meaning Intel isn't ahead.

The industry has been working on active interposers for a while, and packaging cost is primarily why they haven't been adopted, not technical know-how as Intel claims. Lots of marketing BS, little new info that I'm seeing.

You know that little platform under the die on the 7980XE, where you see two PCBs? That is a passive mesh interposer. Switch that for an active interposer. Now does that really sound that exciting?

    Sent from my SM-G900P using Tapatalk
     
  2. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    7,110
    Messages:
    20,384
    Likes Received:
    25,139
    Trophy Points:
    931
Intel hasn't made a single reliable statement about when 10nm will release for many years now; I don't see why that will change.

Considering that Intel is now throwing 10nm under the bus while bragging about 7nm EUV, I don't think there is any love lost for the debacle that is 10nm; Intel would like to downplay 10nm now in favor of 7nm EUV.

The big problem is that the 2021 date isn't what they are saying now. They are saying 7nm trails 10nm by four more years until production, which is 2023 if Intel makes its 2019 date for 10nm.
     
    Last edited: Dec 12, 2018
    ajc9988 likes this.
  3. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    Here is the article I am relying on for Intel's 7nm timeline:
    https://www.eetimes.com/document.asp?doc_id=1333657

    Sent from my SM-G900P using Tapatalk
     
    hmscott likes this.
  4. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    7,110
    Messages:
    20,384
    Likes Received:
    25,139
    Trophy Points:
    931
    5 month old promises from Intel are many revisions ago. :D

    I'm talking about the last couple of days, in print soon.

EUV was to be a phase of 10nm at one point. Now it's co-opted for 7nm, made believable because Samsung is getting there now.

Intel is just whipping out whatever quiets down the hordes, and keeps selling 14nm.

IDK if AMD's Zen 2 rumors are in any way true, but if they hit Intel hard enough, Intel has wasted billions on needless capex for 14nm production expansion.
     
    Last edited: Dec 12, 2018
  5. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    @Talon -
So, as to Samsung EUV: they are the furthest along in the use of EUV, even further than TSMC. Samsung started with EUV on the lowest layers, but will be the first to use EUV on all layers, with TSMC using it on all layers shortly thereafter next year. So Intel's use in 2021 comes two years after TSMC and Samsung will have been using it in production (which they technically already are, but I mean full-chip EUV use). That is a huge deficit in manufacturing process.

    Meanwhile, here are some articles over Intel's Architecture day:
    https://www.anandtech.com/show/13699/intel-architecture-day-2018-core-future-hybrid-x86/8
    https://www.pcmag.com/news/365434/intels-3d-chip-stacking-tech-aims-for-smaller-power-effici
    https://www.forbes.com/sites/antony...d-trouble-for-amd-ryzen-in-2019/#51a1c3b37269
    https://wccftech.com/intel-unveils-...-to-3d-stack-chips-with-an-active-interposer/

    Key points:
    1) Intel is stealing the idea of bigLITTLE from ARM, marrying mainstream x86 cores with Atom cores. (See Anand coverage; bigLITTLE has been used by ARM since 2012 or earlier)
    2) Intel is using an active interposer and putting chiplet dies on top of it. To put this in context, Intel's mesh-based HEDT and server chips use passive interposers (that lifted PCB on those chips), which Intel has used since Xeon Phi in 2014. https://en.wikichip.org/wiki/intel/mesh_interconnect_architecture
    But Intel isn't ahead in this regard. AMD has not only done research on the topic, they also did a cost analysis in late 2017 which showed that below 40nm (meaning the active interposer constructed on smaller nodes), the use of an active interposer negates the cost benefit versus doing a monolithic chip.
    http://www.eecg.toronto.edu/~enright/micro14-interposer.pdf
    http://www.eecg.toronto.edu/~enright/Kannan_MICRO48.pdf


    https://seal.ece.ucsb.edu/sites/seal.ece.ucsb.edu/files/publications/2017-iccad-stow-activepassiveinterposers.pdf


    upload_2018-12-13_8-6-35.png
    A couple points on this: 1) Intel never got a taste of margins with anything except monolithic dies, so Intel wouldn't mind spending as much as a monolithic chip to be able to use the active interposer. 2) Due to Intel's 10nm woes, there was a constraint on 14nm availability and chipsets were pushed back to the 22nm node, and some of those fabs had been set to be shut down. Now they cannot be. You need to justify keeping the lights on, as those fabs burn money when not at capacity. What is an easy way to fill capacity? Create a product that requires the node. In comes the 22nm active interposer. It fills out 22nm fab capacity, is built on a very mature node, meaning low defect rates and high yields, and is quickly subsidized into new products. So, by saving money in other ways by filling out fab capacity, and with the cost being about the same as doing a single monolithic die, Intel's decision makes a lot of sense. But, from the papers cited above, Intel isn't ahead on this; they are just making lemonade from lemons.
    3) Intel will finally hit 1 TFLOP on their iGPUs. If I recall correctly, this is still behind AMD's integrated graphics, but it does show that Raja is turning around their GPU department, which is a good thing.


    The unknown is what Intel will put on the active interposer, if anything, beyond the logic routers. That wasn't disclosed.
     
    Last edited: Dec 13, 2018
    hmscott likes this.
  6. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
  7. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
  8. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    Data Dump incoming:
    What I am saying with those links is that AMD has already evaluated the use of active interposers and is the company articulating a data protocol for active-interposer routing, so that it doesn't matter what each chiplet does for its own internal data routing: any chiplet could be attached to an active interposer and work. Further, the 2017 white paper directly shows that the reason AMD hasn't yet adopted the active interposer is cost. Passive interposers can be used, but don't offer the benefits of active interposers. But the cost of an active interposer is about the same as a monolithic chip once the interposer is produced on a 32nm-or-smaller node. As such, adoption did not yet make sense until costs come down.
    https://spectrum.ieee.org/tech-talk...iplet-revolution-with-new-chip-network-scheme

    The earlier ones show AMD did research into the optimal topology for active interposers in 2014 and 2015. It shows the optimal core count for the benefit, latencies, etc.

    Now, if you look at packaging solutions, say from GF, you find that 2.5D and 3D integration is only available on 32nm and 28nm processes, with 14nm coming soon. https://www.globalfoundries.com/sites/default/files/product-briefs/pb-aps-10-web.pdf
    I bring up GF doing the packaging because, due to the WSA, even if GF costs more than competitors, being able to count the wafers used for active interposers against the WSA would reduce the fines paid under that contract for doing fabrication at TSMC, making it potentially cheaper on net.

    Now, there is an article today from Anand saying Intel is releasing the B365 chipset on 22nm. ( https://www.anandtech.com/show/13714/intel-adds-b365-chipset-to-lineup-the-return-of-22nm ) Intel was forced to move chipsets back to 22nm because the 10nm process not being ready for mass deployment caused a shortage of 14nm capacity. This means Intel could not shut down their 22nm fabs to cut that capacity, and a fab needs to stay as close to capacity as possible or else it bleeds money (hence why AMD eventually went fabless). So Intel using 22nm fabs for the 22nm active interposer is just Intel making lemonade with tech that few others in the industry have adopted yet, on the basis of cost. If you go back to the cost study AMD did, an active interposer at 20nm is around the cost of doing a monolithic die, which is what Intel has done to date. So it isn't really costing them more to add in the active interposer at all, while they save expenditures by keeping the 22nm fabs full of jobs and get awesome yields on the active interposers, it being a very mature node with low defect rates, etc. If you examine the amount of area AMD estimated is needed for the logic routers, only 1%-10% of the area on the active interposer is needed to achieve the goal, meaning the chance a critical defect hits the active interposer is very low.
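    That defect-area argument can be sketched with the classic Poisson yield model. The interposer size and defect density below are my own illustrative assumptions, not figures from the papers:

```python
import math

def poisson_yield(defects_per_cm2, critical_area_mm2):
    # Classic Poisson yield model: Y = exp(-D0 * A_critical)
    return math.exp(-defects_per_cm2 * critical_area_mm2 / 100.0)

# Assumed numbers for illustration only:
interposer_mm2 = 800.0   # a large interposer spanning all chiplets
defect_density = 0.05    # defects/cm^2, plausible for a mature 22nm node

# If the entire interposer area were defect-sensitive logic:
print(f"full-area yield:        {poisson_yield(defect_density, interposer_mm2):.1%}")         # → 67.0%
# With only ~5% of the interposer carrying active routing logic,
# only that fraction counts as critical area:
print(f"5% critical-area yield: {poisson_yield(defect_density, interposer_mm2 * 0.05):.1%}")  # → 98.0%
```

    The point of the sketch: a mature node keeps D0 low, and a tiny critical-area fraction shrinks the exponent further, so a huge interposer can still yield very well.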

    But, as to it making AMD do it, that is the wrong thinking. AMD already plans to adopt it, just not until the costs of doing so are lower. They will check the IC Knowledge lists to see when costs make sense.

    But, if you look at all that data I provided, AMD has solutions to all the problems encountered with using an active interposer. All they are waiting for is for it to be cost effective. Intel isn't leading anything here; they are doing it because they need to justify and subsidize keeping certain 22nm fabs open for their chipsets due to the delays in 10nm. Doesn't take a genius to figure it out, just someone paying attention to the tea leaves.

    Also, I forgot to mention that 2.5D integration of HBM PHY onto a stitched interposer was accomplished last year. This suggests that HBM could be added to an active interposer when AMD eventually adopts one. Meanwhile, with only 1-10% of the active interposer being used according to their papers, it leaves room in the future for elements of the I/O chip to be moved onto the active interposer, an additional way things could easily develop. The question is what benefit there would be to producing what on the 32nm or 28nm nodes versus keeping their disintegrated I/O die on 14nm. But I would bet AMD has an idea of what would be better where and will consider it when eventually adopting, considering the detailed paper on routing topology of active interposers in that group of links.

    Forgive me; many think that Intel's use of it shows they are significantly ahead in the field of active interposers and 2.5D and 3D chiplet integration, meaning it would take years for other chip designers to catch up, which isn't the case. So I do apologize for making that assumption in regards to you.

    What those do show is that AMD has plans to do so in the future; it is just a matter of timing. On the socket part, there is a chance they are introducing a new socket in order to support PCIe 4.0 on the upcoming Zen 2 chips, which comes from analyzing their wording at the Next Horizon event on Nov. 6th. Meanwhile, we know that PCIe 5.0 will potentially be finalized in the first half of next year, and AMD did mention DDR5 potentially being available for Epyc 3 chips based on Zen 3, but that mainstream chips will not support DDR5 in 2020 (leaving open whether TR HEDT platforms will get DDR5 support at that time). Intel has not provided information on when PCIe 4.0, 5.0, or DDR5 will be supported. As such, AMD may keep backwards compatibility on the CPUs for socket compatibility, but may require a new socket for the new boards containing the new features, which I think is understandable to many in the server and workstation spheres. It is also the reason I may wait until 2020 to upgrade my 1950X rather than next year (if you are going to buy a new board, and there is a chance that board won't contain the new feature sets releasing that year, waiting one more year is fine IF your workloads won't suffer for the wait).

    But I read somewhere that Intel hinted at 1GB of RAM being integrated with the active-interposer processors, acting as an L4 cache. It isn't novel (those AMD whitepapers from 2014 and 2015 specifically dealt with on-package integration of memory and the latencies involved, suggesting we could see some type of 3D memory solution integrated when AMD does incorporate an active interposer), but seeing the latencies involved with Crystalwell, the eDRAM on Broadwell, suggests that Intel will get a significant uplift in certain workloads, as well as keeping the chip primed and going off-chip less often for memory calls, which is fantastic. Intel also kept the power delivery part under wraps, which is something that does excite me, but we were given no information about it (possibly bringing FIVR back, which was rumored for Ice and Tiger Lake anyway).

    Also, on compatibility, part of the reason I gave the article discussing data protocols for active interposers is that the routing is chiplet agnostic, meaning you can integrate parts that have their own internal routing and not affect it. Then it just comes down to appropriate socket wiring; as I mentioned, they may need new sockets for these features, while maintaining drop-in compatibility on the consumer side.

    Here is some information on other packaging types out there in the market.
    http://s3.amazonaws.com/sdieee/1817-SanDiegoCPMTDL_Lau_advancedpackaging.pdf (look through this one!!!)
    "DIGITIMES reports that the new TSMC plant in Chunan will be dedicated to offering the foundry’s in-house developed advanced packaging technologies: CoWoS (chip-on-wafer-on-substrate) and integrated fan-out (InFO) wafer-level packaging, and its newly-unveiled system-on-integrated-chips (SoIC) and wafer-on-wafer (WoW) packaging services."
    https://criticalmaterials.org/tsmc-to-set-up-new-fab-for-advanced-packaging/
    https://fudzilla.com/news/pc-hardware/47265-tsmc-to-set-up-new-fab
    https://electroiq.com/2018/10/synop...ti-die-3d-ic-advanced-packaging-technologies/
    Older article for foundational work from 2014:
    https://semiengineering.com/time-to-revisit-2-5d-and-3d/
    2016 follow up: https://semiengineering.com/2-5d-becomes-real/

    Packaging Market from February 2018:
    http://worldherald24.com/2018/02/19...samsung-electronics-toshiba-amkor-technology/

    Paper and slide show discussing the implementation and challenges of 3D packaging from 2016:
    https://smtnet.com/library/files/upload/25d-3d-semiconductor-packaging.pdf

    Then, somewhere in this thread, unless deleted, I already posted the articles discussing Intel handing over AIB patents (an interconnect similar to EMIB) to DARPA to try to undercut AMD's work to date on 2.5D and 3D technologies related to their processors and what gets adopted for the exascale project. AMD countered with a proposal of routing protocols for active interposers to standardize them so that it didn't matter which chips were put on the interposers or what their internal routing was; they could all work without an issue, which is the IEEE article I mentioned. Either way, leaving for a while. Peace.
     
  9. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    With all this, we need actual silicon we can test and feel, from both camps. So the big question is: what do we get, and when can we have it, from whomever?
     
    hmscott likes this.
  10. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    That is already disclosed. Intel is starting with small 10nm mobile CPUs for the 3D chips, moving I/O and other features onto the lower stacked wafer/active interposer, then will move it up the stack over the coming years (read: 2020 or 2021) until it is in all products. AMD's 2.5D solution with the I/O moved off-chip will deploy over 2019, including adding support for PCIe 4.0, which Intel has not said whether it will support on Sunny Cove/Ice Lake. Then, we don't know if the cost of active interposers will drop enough by 2020 to make it cost effective, but moving the disintegrated I/O components onto the 90% empty space of their active interposer, although reducing yields, should be relatively simple, while TSMC has excess capacity and is the largest 2.5D/3D integrator in the world, as shown by one of my links.

    The main question surrounding Zen 2 for consumers is whether mainstream will get a separate 7nm die. This is because no one has a lead on a different I/O chip being made for mainstream, and no one knows whether the monster I/O die for Epyc can be shrunk. A 7nm chip design costs 3x that of a 14nm/16nm design, so financially that makes less sense, though that's not to say it cannot happen.

    AMD should have their products out on roughly the normal cadence, meaning Zen 2 products should land ahead of the 10nm designs. Intel is also planning a backup in case 10nm production is still problematic, hence the 14nm 10-core rumor and Intel's preface about doing mixed silicon and 14nm staying an important part of their designs, with the execs suggesting it has more iterative refinements to be had.
     
    hmscott likes this.
  11. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    I do not want to hear "starting", I want the silicon! We cannot test what does not exist!
     
    ajc9988, hmscott and Papusan like this.
  12. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    Well, invent the forward-moving time machine and we can.
     
    TANWare and hmscott like this.
  13. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    7,110
    Messages:
    20,384
    Likes Received:
    25,139
    Trophy Points:
    931
    It's called the "waiting patiently" machine. We all have one. ;)
     
    ajc9988 likes this.
  14. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    This is all good except when vaporware becomes neverhappenedware :)
     
    hmscott likes this.
  15. bennyg

    bennyg Notebook Virtuoso

    Reputations:
    1,567
    Messages:
    2,370
    Likes Received:
    2,375
    Trophy Points:
    181
  16. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    I found this funny so decided to share:
    upload_2019-1-2_9-12-0.png
     
    ole!!! and hmscott like this.
  17. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    The issue may lie in what Intel has planned to try and steal AMD's thunder. So far this has been the pattern.
     
    hmscott and ajc9988 like this.
  18. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    Yeah, Intel has largely failed in that regard as of late. Now, I just wanted to be cheeky about Intel's process woes, but their architecture is still good, and Ice Lake/Sunny Cove will likely have an IPC uplift of around 11%, so even if the process causes it to run slower, it will be fine and about the same as AMD, which should have a 7-11% IPC gain over the Skylake family, excluding outliers like floating-point IPC changes.

    But that product is October on the early side, December on the late side, or pushed to 2020 if things go horribly.

    Intel has a 10C 14nm mainstream chip for the second half as a backup, then the overpriced 28C HEDT in no man's land, Cascade/Cooper 14nm server and HEDT chips on the way (basically a Coffee Lake-style refresh for servers and HEDT), and the glued-together 48-core AP chip.

    This is just to give context, along with pushing 10nm off since 2016 and refreshing skylake again and again and again (literally).

    Intel tried the marketing bluster. All AMD has to do is deliver. If performance on mainstream matches, then AMD wins.

    The rumored pricing suggests AMD is bringing to mainstream what they did on HEDT. AMD did 16C at $1K the first year, about half the price of Intel's top two chips. Then they did 32 cores for a little less than Intel's 18C with Zen+. If they price their 8C at $220-270, the 12-core at $330, and the 16C at $450-500, then Intel is screwed on pricing. At 14nm, Intel cannot cut prices that deep, so it is a matter of repeating what they did to Intel on HEDT, but with better clocks and IPC. It could also mean $1K for a 32-core over 4GHz for TR.
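    For rough context, the dollars-per-core arithmetic behind that rumored lineup looks like this (prices are the rumored ranges quoted above, using midpoints; nothing here is confirmed):

```python
def usd_per_core(cores, price_usd):
    # Simple price-per-core metric for comparing SKUs.
    return price_usd / cores

# Rumored prices from the post (midpoints of the quoted ranges):
rumored = {"8C": (8, 245), "12C": (12, 330), "16C": (16, 475)}

for name, (cores, price) in rumored.items():
    print(f"{name}: ${price} -> ${usd_per_core(cores, price):.2f}/core")

# For comparison, a 16C HEDT part at its $999 launch price:
print(f"16C @ $999: ${usd_per_core(16, 999):.2f}/core")  # → $62.44/core
```

    If those rumors held, mainstream would land around $27-31/core, roughly half the per-core price of first-generation HEDT.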

    Sent from my SM-G900P using Tapatalk
     
    hmscott likes this.
  19. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    I have some doubts about the performance of the 16C; I think dual-channel memory will hamper the 3800, let alone the 3850. But we shall see soon enough.
     
    hmscott and ajc9988 like this.
  20. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    Agreed on those fears, but I think part of the 32C 2990WX's problem was pre-fetch and retiring of data, not just bandwidth. This also would explain why the I/O chip holds the memory controller and chip-to-chip comms all go through the I/O chip now, which standardizes latencies. That means the data will be less stale, even though it affects average latencies.

    But we will find out pretty quickly after the chip's release whether memory bandwidth on mainstream tanks performance.

    Sent from my SM-G900P using Tapatalk
     
    hmscott likes this.
  21. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681


    More on the problem of performance regression in Windows. Ian Cutress is also going to update his article on it soon. This is pretty cool!

    This shows Windows is the problem, not mainly the memory bandwidth.

    @ole!!! @hmscott @TANWare @jclausius @jaybee83

    I know I'm missing some people here.

    Sent from my SM-G900P using Tapatalk
     
    jaybee83, Robbo99999, Papusan and 6 others like this.
  22. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    This is too funny. When I pointed out an unknown performance bug with the original 1950X, I was scoffed at by all. I showed that a system image restore was the only way to get performance back to reasonable, but it still seemed slow. Now it is finally being looked into.
     
    Last edited: Jan 3, 2019
    Mr. Fox, ajc9988, jclausius and 2 others like this.
  23. Robbo99999

    Robbo99999 Notebook Prophet

    Reputations:
    4,346
    Messages:
    6,824
    Likes Received:
    6,112
    Trophy Points:
    681
    Arghh, he did my head in for the first 10 minutes repeating the same thing like 10 times, but it was good work he did to figure out the issues! I can't see Microsoft leaving the bug for too long; the fact that 'the community' was able to provide a workaround suggests that Microsoft should be able to fix it fairly easily.
     
    ajc9988 and jclausius like this.
  24. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    7,110
    Messages:
    20,384
    Likes Received:
    25,139
    Trophy Points:
    931
    2990WX Threadripper Performance Regression FIXED on Windows
    https://www.reddit.com/r/hardware/comments/ac03yx/2990wx_threadripper_performance_regression_fixed/

    2990WX Threadripper Performance Regression FIXED on Windows
    https://www.reddit.com/r/Amd/comments/abzmxg/2990wx_threadripper_performance_regression_fixed/
     
    Last edited: Jan 3, 2019
    jclausius and ajc9988 like this.
  25. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    A theory: it seems to me that it starts with all 64 (or 62) threads fully loaded; when it needs to switch two threads, since they are all active at the time, it has to slow execution to physically move the thread. Once a core is given back to the OS, the OS then has an inactive core to switch in and out of, or just a spare core needed to find out the switch is not needed to begin with.
     
    ajc9988 likes this.
  26. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    I did try to find it for a while, but thought it was related to timer issues in the Windows OS (although that may have been part of it; I stopped digging after that). To be fair, even this is a bit over my head for my skill to track down. It took better people than I to find it. I installed the fix and plan to test the system with it on, just to see if the NUMA overflow is active and degrading performance even with direct memory access.
    Yeah, don't count on it. I've seen remarks that a person with either a 2P or 4P Intel-based server system noted performance issues years ago. Maybe now that others have narrowed down the culprit for Microsoft, something will be done about it, but I don't have faith in that company!
    That doesn't disagree with their findings, which are summarized as such:
    1) the kernel is telling the scheduler which thread is recommended;
    2) the scheduler is taking the recommendation as gospel;
    3) when an 8-core die is full, Windows allows it to overflow the work to only one more NUMA node;
    4) as it shuffles to that one other NUMA node, between that and the thread recommendation, thread thrashing starts occurring, where the shuffle of data to other threads then starts to affect the scheduler; so the CPU is pegged, but you only get the usefulness of 2 NUMA nodes performing actual work;
    5) this program acts to see when the behavior is occurring and tests the NUMA affinity scheduler, fixing the locking to two NUMA nodes.

    That, at least, seems to be what I've gathered on the topic so far. Really looking forward to Ian Cutress of AnandTech testing this fix and doing a deeper dive.
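    The overflow behavior in points 3) and 4) can be reduced to a toy model. This is purely illustrative; the real scheduler is vastly more complex, and the node/core counts are simply the 2990WX's topology:

```python
# Toy model of the reported scheduling pathology on a 4-node,
# 8-cores-per-node part like the 2990WX. Not actual scheduler logic.
NODES, CORES_PER_NODE = 4, 8

def busy_cores_buggy(n_threads):
    # Reported behavior: work fills one NUMA node, then overflows to
    # only ONE additional node, so at most 2 nodes do useful work
    # no matter how many threads are runnable.
    return min(n_threads, 2 * CORES_PER_NODE)

def busy_cores_pinned(n_threads):
    # Workaround: explicitly set NUMA affinity so threads spread
    # across all nodes instead of following the bad recommendation.
    return min(n_threads, NODES * CORES_PER_NODE)

threads = 32
print("useful cores (buggy):  ", busy_cores_buggy(threads))   # → 16
print("useful cores (pinned): ", busy_cores_pinned(threads))  # → 32
```

    That halving of useful cores is consistent with 32-core parts benchmarking like 16-core parts under the bug.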

    Either way, this is great news as Epyc 2 details will be shared soon enough and AMD is preparing a 16-core mainstream CPU.

    Also, hardware unboxed may examine this and benchmark it as well soon.

    Sent from my SM-G900P using Tapatalk
     
    Raiderman and jclausius like this.
  27. Mr. Fox

    Mr. Fox BGA Filth-Hating Elitist

    Reputations:
    37,213
    Messages:
    39,333
    Likes Received:
    70,629
    Trophy Points:
    931
    "This strongly suggests something is broken in Windows..." LOL... really? It's about time somebody else on YouTube besides me finally realized and admitted this. I've been complaining about it since before Zen and TR processors were invented. The problems started with Windows 8, and the Redmond Retards have done nothing to fix it.

    If they are using exclusively Windows 10, this should come as no surprise because Windows 10 severely gimps Intel CPU performance as well. If they haven't already tested with Windows 7 as they have with Linux, they should do so to see if the same debilitating nonsense that the Redmond Retards use to castrate Intel CPUs is doing the same or worse to AMD.
     
    Raiderman, Papusan and ajc9988 like this.
  28. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    If it is the scheduler thrashing threads across the cores, it will to some extent affect any multi-core CPU.
     
    Last edited: Jan 3, 2019
    Papusan, Mr. Fox and ajc9988 like this.
  29. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    They do know Windows 10, on average, performs 20% slower than Linux with both Intel and AMD. There are exceptions, but that is the general rule of thumb. This issue of 50% performance regression is also due to the scheduler and kernel thread recommendations in relation to NUMA nodes, which is why we saw benchmarks, at least some, where the 32-core CPUs were performing the same as the 16-core CPUs.

    Now, I do agree that looking into whether older Windows and Windows Server OSes had this NUMA regression should be done. But it also shows why Windows as a Service is bull and needs to end, because since its implementation, WaaS has allowed Windows to degrade, as Microsoft fired the in-house QA testers, instead using the "feedback hub," along with what seems like a slowdown in addressing major issues.

    Sent from my SM-G900P using Tapatalk
     
    Raiderman, Aroc and jclausius like this.
  30. jclausius

    jclausius Notebook Virtuoso

    Reputations:
    6,160
    Messages:
    3,265
    Likes Received:
    2,573
    Trophy Points:
    231
    A little nit... I know they're not your titles, but I wish they would fix them, as they seem misleading. This isn't really fixed until MS releases a patch to Windows 10. Until then, what we have here is an end-user workaround.
     
  31. jclausius

    jclausius Notebook Virtuoso

    Reputations:
    6,160
    Messages:
    3,265
    Likes Received:
    2,573
    Trophy Points:
    231
    Personally, I'm glad to see this general performance gain. However, I think @Mr. Fox is more concerned about the performance drop seen between Windows 7 and Windows 10.
     
    hmscott, Aroc, Mr. Fox and 1 other person like this.
  32. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    Whenever I see "what we have here," I always think of this song:


    And getting Microsoft to fix much of anything feels like a war in and of itself!

    Sent from my SM-G900P using Tapatalk
     
    hmscott and jclausius like this.
  33. Mr. Fox

    Mr. Fox BGA Filth-Hating Elitist

    Reputations:
    37,213
    Messages:
    39,333
    Likes Received:
    70,629
    Trophy Points:
    931
    There is absolutely no excuse for it getting worse, being broken and staying broken. The exact opposite should be true of any product... it should get better, stronger, faster. That it has not is absolutely unacceptable. We would fire half-assed loser employees for the same crap in most other businesses. These stupid idiots also broke the RTC with the release of Windows 8 and have simply left it that way since.

    http://hwbot.org/newsflash/2684_win...ock_bug_like_windows_88.1_disallowed_for_now/

    http://hwbot.org/news/9824_breaking_windows_8_benchmark_results_no_longer_accepted_at_hwbot
     
    Last edited: Jan 3, 2019
    Raiderman, hmscott, Papusan and 4 others like this.
  34. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    7,110
    Messages:
    20,384
    Likes Received:
    25,139
    Trophy Points:
    931
    As always I try to quote the source "exactly", then it's up to you to go to the source and make your displeasure known. That's why I always include as much detail as I can from the source, so that you have everything you need to review here, then you can go there to respond if needed.
    It's a well-known phenomenon that each discipline of engineers and developers, across hardware and software products, comes up with its own "spin" or interpretation of various "states of being". The same words are quoted similarly but can mean many more nuanced things in each specific situation, rather than carrying a singular meaning of finality.

    In my world I have to be very careful to listen to the context of how and when and where the words that are spoken are interpreted, so to me his statement is valid, as I know it's nuanced in the situation it's declared.

    "Fixed" has many meanings, ranging from "wow, I found the fix" all the way to "the fix is implemented and checked in", we are "waiting for qa test results to confirm the fix works in all regression tests", to "the fix was rolled out to customers as a beta test trial", to "the fix has been confirmed to work in one client installation", to "the fix has been rolled out to a limited number of client sites for validation", and "the fix will be rolled out in the next dot release", to finally(?!!) "the fix has been shipped in the current release".

    Of course that fix might end up being only "a partial fix", or merely "a temporary fix" until that rarity, "a complete fix", is found, and then "the fix is implemented", starting its long journey as a "fix in transition" toward "deployment of the fix".

    To software developers the word "Fixed" takes on many meanings, I haven't counted them all, but it's likely as many nuanced meanings of "fixed" exist as there are meanings for "snow" - 100? 50? 40?

    So, it's likely this is a "we know what the fix is", which means it's "a high probability fix found", but not "a guaranteed 100% fix".

    This "Fixed" has merely entered the "Fixed Zone", at a very early stage in the "Fixed Continuum", a single scintilla of instantiation of "Fixed" existence.

    Which probably means, in this specific case, while a specific instance of mitigation has been identified, the problem is not actually "fixed", per se.
    You could take this to Wendell, expressing your displeasure at what appears to be a preemptive declaration of "FIXED", when in actuality the journey toward a final fix delivered to users has only just begun. But, I wouldn't recommend doing this to Wendell.

    Software developers are very touchy, and respond erratically, unpredictably, to criticism. You could imbue a sense of urgency that ends up in a quick but poorly implemented fix, or you could send Wendell off on a long journey toward a "perfect fix", from whence Wendell may never return, forever lost in the FIX.

    It's best to smile, say "Cool", and look inquisitively at the developer - encouraging them to continue in their own way, and hope that a FIX will eventually arrive.
    If you can't do that, if you've waited as long as possible - and then some, and the situation has "hit the fan" of criticality, then that's where I come in to coax the fix out of the ether with the help of all the resources available.

    I'm the guy they bring in to bring the FIX home. I'm 100% successful at delivering a FIX that works out of a situation where no FIX has been found, for far longer than can be tolerated.

    I fix problems. That's my world. When the feeling and the listening times are over, and the fix must happen, that's when I am allowed in to bring a problem to its successful conclusion.

    It's a wonderful life, with little of the BS everyone else seems to tolerate. :)
     
    Last edited: Jan 3, 2019
    jclausius likes this.
  35. Mr. Fox

    Mr. Fox BGA Filth-Hating Elitist

    Reputations:
    37,213
    Messages:
    39,333
    Likes Received:
    70,629
    Trophy Points:
    931
    Fix is simply a nice way of saying alternative solution. The only true fix is one that identifies the source of an issue and corrects that problem at the source. Everything else is a workaround. There is nothing, per se, wrong with a workaround. Sometimes that is the only solution available because the entity responsible for the fix is either too ignorant to identify a fix, or doesn't care about correcting their mistakes, or refuses to admit they made a mistake in the first place. Where Micro$lop is concerned, I think we need to be content with workaround solutions because, while they are not ignorant, they clearly do not care and also don't readily admit to making mistakes. It's always someone else's fault when their crap doesn't work right.
     
    Raiderman, TANWare and hmscott like this.
  36. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    I can tell you that since these issues started, M$ has said it is not at fault and there is no real issue. Their default answer is to just restore the OS!
     
  37. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    7,110
    Messages:
    20,384
    Likes Received:
    25,139
    Trophy Points:
    931
    Turn it OFF and ON again?

    Yup, typical distancing move, total non-engagement. "It's not our problem".

    Well, I guess we'll all just have to move to UNIX / Linux for our high core count CPUs.

    Microsoft Windows- "lost to UNIX OS's in the multi-core CPU wars of the early 21st Century, completely unknown in modern times, except for the now ubiquitous "Solitaire", "Minesweeper", and "Start Menu" included in every desktop."
     
    Raiderman and jclausius like this.
  38. jclausius

    jclausius Notebook Virtuoso

    Reputations:
    6,160
    Messages:
    3,265
    Likes Received:
    2,573
    Trophy Points:
    231
    LOL. Ain't that the truth!

    When something is 'fixed', then that is fixed and patched by the vendor. So assuming this is reported to MS, and they come out with an update where the scheduling problem disappears, then it is fixed. Until then, all that is left is a work-around or sometimes called a hack. If it was indeed fixed, then the work-around (hack) wouldn't be needed in the first place.

    Note, I said I didn't want to pick nits, but there were multiple posts regarding the subject, I needed to weigh in. I'm touchy and unpredictable in this way.. ;)
     
  39. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    7,110
    Messages:
    20,384
    Likes Received:
    25,139
    Trophy Points:
    931
    In the mind of the developer, the instant the fix is found, it's Fixed, like I said, it's a problem of communication - the "Normies" see the word fixed and assign a single level of finality.

    A developer has a much more nuanced view, and shouldn't be allowed contact with "Normies", lest such confusions and disappointments ensue.

    It's something to keep in mind when interacting with developers. :)
     
    jclausius likes this.
  40. jclausius

    jclausius Notebook Virtuoso

    Reputations:
    6,160
    Messages:
    3,265
    Likes Received:
    2,573
    Trophy Points:
    231
    Agreed. It can be a struggle every day.
     
    Papusan and hmscott like this.
  41. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    7,110
    Messages:
    20,384
    Likes Received:
    25,139
    Trophy Points:
    931
    Ahhh, I see @jclausius' issue with the posting; I didn't make it. It was @ajc9988 that posted the video, and he didn't copy over all of the posted info exactly as it is in the YouTube listing... :(

    Here's the same video as I would have posted it, notice the "*" that qualifies the "FIXED" statement:

    2990WX Threadripper Performance Regression FIXED on Windows*#threadripper
    Level1Techs
    Published on Jan 2, 2019
    *At least in cases like this one :D
    [ sry, clickbait works, but see these awesome articles below v]

    Full article, including the mentioned numa/"ideal cpu" dumps
    https://level1techs.com/article/unloc...

    Phoronix Windows & Linux comaprison
    https://www.phoronix.com/scan.php?pag...

    Ian Cutress' article on The Core0 mystery
    https://www.anandtech.com/show/13446/...

    Our Last Video on this:
    https://www.youtube.com/watch?v=WSSAF...


    Wendell did qualify the status of the "FIXED" statement external to the video, but still I think he could have qualified it further outside the video where most people will get their "take-away", that it's FIXED.

    This is a good example of why I include:

    1) The full Title of the Video or Article - in this case that's where the "*" qualifying indication starts.

    2) The channel name / link to the channel - in case the video is pulled, you can then go to the channel and find out if there is a new video uploaded in its place.

    3) The Date - is important to help find within the range of video postings for that and other channels additional videos on that subject.

    4) The Details - which I usually put in a "spoiler" that often continues the explanation for what is in the video with links to other information - and of course the video itself.

    All of that information is on Youtube for a reason, so if we are going to link to that video we should mirror that information in full.

    As @jclausius has shown, not everyone follows the link back to the source to gather all of the pertinent data before commenting. ;)
     
    Last edited: Jan 4, 2019
    jclausius likes this.
  42. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    That was clear from clicking on the video, and I really don't care enough if people don't read the additional information at the link. At least from my phone, the video clearly shows Wendell's disclaimer. Also, if you read the HardOCP, Hexus, OC3D, and other articles on the topic, along with the posts at Bitsum, the article at the Level1Techs website, the original article at AnandTech by Ian Cutress, or Wendell reporting that Ian is firing his machine up to analyze the performance change, they all show the problem is with the Windows kernel. Microsoft cannot just shirk responsibility or call this a perfect fix, as the workaround is obviously not the best solution to use.

    This also shows that the memory-bandwidth line fails as an absolute critique of AMD, and that this is not a problem of AMD's design. That is why I bashed Hardware Unboxed for trying to lay the blame for Microsoft's issues on AMD.

    Sent from my SM-G900P using Tapatalk
     
    Raiderman, jclausius and hmscott like this.
  43. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    So, here is a list of articles and forum entries on the high core count EPYC and Ryzen with coreprio:
    Blames windows:
    https://segmentnext.com/2019/01/04/ryzen-threadripper-2990wx-performance-regression/
    https://community.amd.com/thread/235352
    https://community.amd.com/thread/235304
    https://tech4gamers.com/a-bug-in-wi...nce-of-ryzen-threadripper-2990wx-by-up-to-50/
    https://forums.anandtech.com/thread...formance-regression-fixed-on-windows.2559476/
    https://www.extremetech.com/computi...an-double-amd-threadripper-2990wx-performance
    https://level1techs.com/article/unlocking-2990wx-less-numa-aware-apps
    https://bit-tech.net/news/tech/cpus/coreprio-tool-near-doubles-threadripper-2990wx-performance/1/
    https://hexus.net/tech/news/cpu/125819-amd-ryzen-threadripper-2990wx-perf-boosted-2x-coreprio-tool/ (labels windows kernel bug, as did others above)
    https://www.hardocp.com/news/2019/0...performance_regressions_linked_to_windows_bug
    https://www.guru3d.com/news-story/a...-see-up-to-2x-boost-with-coreprio-tool,2.html
    Instead, Guru3D says AMD needs to work with Microsoft to resolve it, which is assuming Microsoft would do anything without outside pressure. ("Once you watch and read the entire documentation, you wonder this: AMD needs to work with Microsoft to get a kernel level fix so that the CPU works as intended.").
    https://linustechtips.com/main/topic/1016319-windows-to-blame-for-poor-amd-threadripper-performance/
    https://translate.google.com/transl...ionali-risolti-grazie-a-coreprio/&prev=search
    https://www.overclock3d.net/news/cp...ripper_2990wx_performance_in_some_scenarios/1

    Doesn't Blame Windows:


    That seems to be a decent collection of articles to date on the matter.
     
  44. jclausius

    jclausius Notebook Virtuoso

    Reputations:
    6,160
    Messages:
    3,265
    Likes Received:
    2,573
    Trophy Points:
    231
    Yes. I watched the video from @ajc9988's post inline since I was tagged. No reason to follow any links. If a post contains the video, the developer in me is too lazy to bother opening up a new tab, going to youtube, or reading the title or any comments. I just watch the video if it seems interesting. ;)

    However, you posted an update to the video with reddit links to the same video, but the title was now outlined. I had already watched the work-around video since I just really like Wendell videos, and already tucked it away as a nice hack for TR and EPYC. But it was the 'FIXED' text in the title that caught my eye. I originally had thought that someone from reddit had generated the text, but after clicking their link, I see they just copied it from Level 1. Again, this title doesn't show up in ajc's post.

    In any case, I think we can just move along now. It's nothing more than pointing out the semantics of the title... Nothing more. Nothing less.
     
    Last edited: Jan 4, 2019
    Papusan, hmscott and ajc9988 like this.
  45. jclausius

    jclausius Notebook Virtuoso

    Reputations:
    6,160
    Messages:
    3,265
    Likes Received:
    2,573
    Trophy Points:
    231
    Here's where things get interesting.

    A long, long time ago, AMD decided to release their own x86 compatible CPU. The CPU rocked, and (if memory serves), was a new cool RISC architecture that could take an x86 CISC instruction, convert it to RISC and then back out to x86-land. (Aside - This really caught the eye of this particular grad student at the time working on virtual memory systems.) Unfortunately, also at this time, Chipzilla had a nice side-arrangement with the Redmondians that they wouldn't really optimize their dominant OS to work with this new CPU entry.

    Now, fast-forward 20-25 years. Again, a new AMD CPU contender arises. Will the newly made-over Microsoft under Satya's watch decide to make changes to address the issue? There's a catch this time, though. The once-dominant OS now has chinks in its armor due to a lack of understanding of its users, and the entire WaaS fiasco for Windows 10. With each day, it is easier and easier to pitch the Linux alternative. So, what do the Redmondians do this time around?
     
    Last edited: Jan 4, 2019
    Raiderman and ajc9988 like this.
  46. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    Probably nothing. It is so late in the cycle that a kernel patch regarding ideal-thread recommendations or a scheduler change wouldn't make it into the spring update. Maybe in a year (which would be two years after Epyc launched and six months after the four-die TR launched), when they do the next major update. I don't think even Intel pitching a fit could change this.

    What they need to do is ditch the dumb scheduler and use the cutting-edge research on NUMA-aware schedulers that can better feed the upcoming CPUs. There is a chance this will also hit Intel's 48-core Cascade Lake-AP processors, as those in 2P will contain 4 NUMA nodes, with each 24-core die having its own memory controllers. Because of this, it isn't the older issue of two nodes on an HCC or XCC chip sharing a ring bus; it will meet enough of the same criteria as what is happening with AMD (remember, EPYC has 4 NUMA nodes and 4x 2-channel memory controllers) that it will likely be hurt in performance much the same way Epyc and TR are. So, if Intel were smart, they would start looking at whether this can help them as well.

    But I sincerely hope that M$ does make the changes quickly, even if that pushes the March/April launch of the new OS build (they shouldn't have launched this fall's variant when they did, and hopefully they learned their lesson about releasing too early). If they fix this plus get the sandbox working well, I think that could help put them back on track. But they may need to push the update to May to accomplish it, which I would be OK with (their DX12 ML can wait for something as important as this).
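    Until a kernel fix ships, the mitigation tools like Coreprio apply amounts to restricting a NUMA-naive workload to the memory-attached nodes. A rough Linux-only sketch of that idea using only the standard library follows (on Windows this would require the affinity APIs via ctypes; the node-to-CPU mapping here is a hypothetical 2990WX-style layout, not read from the OS):

    ```python
    import os

    # Hypothetical 2990WX-style mapping of NUMA nodes to logical CPUs;
    # on real hardware you would read this from /sys/devices/system/node/.
    NODE_CPUS = {0: {0, 1, 2, 3}, 1: {4, 5, 6, 7},
                 2: {8, 9, 10, 11}, 3: {12, 13, 14, 15}}
    MEMORY_NODES = (0, 2)  # dies with local memory controllers

    def preferred_cpus():
        """CPUs on memory-attached nodes that this machine actually has."""
        available = os.sched_getaffinity(0)
        wanted = set().union(*(NODE_CPUS[n] for n in MEMORY_NODES))
        # Fall back to everything if the hypothetical map doesn't overlap.
        return (available & wanted) or available

    def pin_to_memory_nodes():
        """Restrict the current process to memory-attached-node CPUs."""
        cpus = preferred_cpus()
        os.sched_setaffinity(0, cpus)
        return cpus
    ```

    This is a blunt instrument compared to a real scheduler fix: it keeps threads near memory at the cost of leaving the memory-less dies idle, which is roughly the trade-off the Coreprio workaround makes on Windows.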
     
    Raiderman, hmscott and jclausius like this.
  47. Mr. Fox

    Mr. Fox BGA Filth-Hating Elitist

    Reputations:
    37,213
    Messages:
    39,333
    Likes Received:
    70,629
    Trophy Points:
    931
    I would not even limit it to an "AMD" problem. I think doing so is totally unfair to AMD and masks the identity of the real culprit. Bottom line is that it is a crappy OS issue. I think Micro$lop fixing this would also benefit Intel HEDT CPUs with more than 8C/16T. I suspect the underlying problem that impacts AMD in an obvious way also adversely affects all high core count processors to some degree, and contributes to software application optimization problems in general. Micro$lop has done a half-assed job of developing an OS that caters to chintzy tablets and trashbooks with cheap panty-waist CPUs.
     
    Raiderman, Papusan and ajc9988 like this.
  48. ajc9988

    ajc9988 Death by a thousand paper cuts

    Reputations:
    1,750
    Messages:
    6,121
    Likes Received:
    8,849
    Trophy Points:
    681
    Agreed. Fixing the issues with the outdated "dumb" scheduler and the kernel's handling of NUMA nodes (so addressing the multiple parts of the OS that need fixing, not just this NUMA/kernel issue, as the scheduler likely contributes to CPUs from both AMD and Intel being 20% slower on Windows than Linux) will benefit everyone!

    I also mentioned in the post after that that Intel's upcoming 48-core server chips could potentially suffer the same 50% problem as AMD.

    This is a good time for a hard push on Microsoft to fix their broken software!

    Sent from my SM-G900P using Tapatalk
     
    Arrrrbol, Mr. Fox, Raiderman and 2 others like this.
  49. Papusan

    Papusan Jokebook's Sucks! Dont waste your $$$ on Filthy

    Reputations:
    42,701
    Messages:
    29,840
    Likes Received:
    59,615
    Trophy Points:
    931
    They have more than enough to do fixing the flaws they created in the last version versus the previous one. Don't expect miracles. Plus, making their OS work with their own tablet hardware is probably the first priority.
     
  50. Talon

    Talon Notebook Virtuoso

    Reputations:
    1,482
    Messages:
    3,519
    Likes Received:
    4,694
    Trophy Points:
    331


    "Just to be 100% crystal clear since some people still seem to be misunderstanding (mishearing?) what we are saying and have been saying. AMD will not be unveiling/announcing any new 3rd-generation Ryzen products or new Navi GPUs during their CES 2019 keynote. They will likely talk about 3rd-gen Ryzen and Navi, they might show some demos, release a few teasers, talk roadmaps, possible launch dates. But the product announcements themselves - SKUs, names, specs (etc) - will come at a later date."

    Thoughts?
     
    Mr. Fox and ajc9988 like this.
← Previous pageNext page →