The Notebook Review forums were hosted by TechTarget, who shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information that had been posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.

    Major TRIM bug found in Samsung SSD's (Limited to Linux)

    Discussion in 'Hardware Components and Aftermarket Upgrades' started by Spartan@HIDevolution, Jun 16, 2015.

  1. Spartan@HIDevolution

    Spartan@HIDevolution Company Representative

    Reputations:
    39,584
    Messages:
    23,560
    Likes Received:
    36,855
    Trophy Points:
    931
    https://blog.algolia.com/when-solid-state-drives-are-not-that-solid/

    The complete picture

    At this moment we finally got a complete picture of what was going on. The system was issuing a TRIM to erase empty blocks, the command got misinterpreted by the drive and the controller erased blocks it was not supposed to. Therefore our files ended-up with 512 bytes of zeroes, files smaller than 512 bytes were completely zeroed. When we were lucky enough, the misbehaving TRIM hit the super-block of the filesystem and caused a corruption. After disabling the TRIM, the live big files were no longer corrupted but the small files that were once mapped to the memory and never changed since then had two states – correct content in the memory and corrupted one on the drive. Running a check on the files found nothing because they were never fetched again from the drive and just silently read from the memory. Massive reboot of servers came into play to restore the data consistency but after many weeks of hunting a ghost we came to the end.

    As a result, we informed our server provider about the affected SSDs and they informed the manufacturer. Our new deployments were switched to different SSD drives and we don’t recommend anyone to use any SSD that is anyhow mentioned in a bad way by the Linux kernel. Also be careful, even when you don’t enable the TRIM explicitly, at least since Ubuntu 14.04 the explicit FSTRIM runs in a cron once per week on all partitions – the freeze of your storage for a couple of seconds will be your smallest problem.
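A minimal sketch of how one might look for the corruption pattern described in the quoted article (sector-sized runs of zeroes inside files). This is not the tool the article's authors used; the function name `find_zeroed_sectors` and the 512-byte block size are assumptions taken from the description above. Legitimate files can contain all-zero sectors, so a hit is only a hint. As the article notes, reads may also be served from memory rather than from the drive, so a scan like this can miss on-disk damage until the cache is dropped or the machine is rebooted.

```python
SECTOR = 512  # block size mentioned in the article; assumed here


def find_zeroed_sectors(path, sector=SECTOR):
    """Return byte offsets of sector-aligned blocks that are entirely zero.

    A trailing partial block counts too, which covers the article's case of
    files smaller than one sector being completely zeroed.
    """
    zero = bytes(sector)
    hits = []
    offset = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(sector)
            if not chunk:
                break
            # Compare against a zero buffer of the same length as the chunk.
            if chunk == zero[:len(chunk)]:
                hits.append(offset)
            offset += len(chunk)
    return hits
```

Comparing the reported offsets against a known-good copy or a checksum would be needed to confirm that a zeroed block is actually damage.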

    =================================================================

    Broken SSDs:

    SAMSUNG MZ7WD480HCGM-00003
    SAMSUNG MZ7GE480HMHP-00003
    SAMSUNG MZ7GE240HMGR-00003
    Samsung SSD 840 PRO Series
    recently blacklisted for 8-series blacklist
    Samsung SSD 850 PRO 512GB
    recently blacklisted as 850 Pro and later in 8-series blacklist

    Working SSDs:

    Intel S3500
    Intel S3700
    Intel S3710
     
    Tinderbox (UK) likes this.
  3. Tinderbox (UK)

    Tinderbox (UK) BAKED BEAN KING

    Reputations:
    4,740
    Messages:
    8,513
    Likes Received:
    3,823
    Trophy Points:
    431
    Well, I have a Plextor, so I am all right. Not good for Samsung's reputation.

    And Intel uses Samsung WHAT????

    I recommended a Samsung to my nephew. He got a 256GB 850; I don't think it was a Pro, so I think he will be fine.

    So will a firmware update fix this problem? Should you turn TRIM off until a fix is found?

    John.
     
    Last edited: Jun 16, 2015
  4. Spartan@HIDevolution

    Spartan@HIDevolution Company Representative

    Reputations:
    39,584
    Messages:
    23,560
    Likes Received:
    36,855
    Trophy Points:
    931
    I can't tell you for sure but I can tell you why I sold my two Samsung 850 PRO 1TB SSDs and switched to two 960GB SanDisk Extreme PROs.

    Initially, I had a 256GB 850 PRO which had amazing performance.

    Next, I bought two 850 PRO 1TB drives as an upgrade, but they had the new firmware, which was known to be buggy and was removed from the Samsung site a few days after release. Not only did the new firmware brick many people's 850s, it also had worse performance. I waited 3 months hoping that Samsung would release a fix for their bad firmware, but nope. Sequential performance had dropped from 520/500 to 480/460, so after one firmware issue after another I lost hope in Samsung and sold them. And now this, the icing on the cake.
     
  5. Tinderbox (UK)

    Tinderbox (UK) BAKED BEAN KING

    Reputations:
    4,740
    Messages:
    8,513
    Likes Received:
    3,823
    Trophy Points:
    431
    Yeah, but now Samsung will test their SSD firmware a lot more, so you can expect slower releases once the current trim problem is fixed.

    I will still buy Samsung if I am in the market for a larger SSD.

    John.
     
  6. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    I am now in the market for a 1TB SSD. This is to replace the second 750GB HDD and eventually have Linux on it. I have a 480GB Mushkin Chronos drive with Windows 7 as the primary right now.

    I may just tear the install down, use the 1TB with Linux as the primary drive, and rebuild the Windows 7 install. Once I can move all non-essential system internet access to Linux, I will no longer have an issue with the 2020 date for production work in Windows 7.
     
  7. namaiki

    namaiki "basically rocks" Super Moderator

    Reputations:
    3,905
    Messages:
    6,116
    Likes Received:
    89
    Trophy Points:
    216
    Hmm, do you think this would affect Windows systems as well with the affected drives?
     
  8. bigspin

    bigspin My Kind Of Place

    Reputations:
    632
    Messages:
    3,952
    Likes Received:
    566
    Trophy Points:
    181
    This seems like a Linux-only issue. I have an 850 Pro 1TB OS drive and write/erase around 200-300GB per day due to work stuff. I run TRIM (Windows defrag & optimise utility) every day and have never seen any problem.
     
  9. saturnotaku

    saturnotaku Notebook Nobel Laureate

    Reputations:
    4,879
    Messages:
    8,926
    Likes Received:
    4,701
    Trophy Points:
    431
    Unless you're running Linux, probably not. This seems to be primarily an issue with Linux and only on specific kernels that don't have a certain patch.

    Molehill ---> Mountain
     
    djembe likes this.
  10. djembe

    djembe drum while you work

    Reputations:
    1,064
    Messages:
    1,455
    Likes Received:
    203
    Trophy Points:
    81
    According to the article, the error was in how Samsung SSDs interpreted some of the TRIM commands in the Linux kernel. In simple terms, they were saying these drives should have been listed on the blacklist, which would specify commands to avoid the problem, but were not listed. As noted in the list, all Samsung 830/840/850 drives were placed on the blacklist recently (presumably after the article was first published and/or after the company's experiences), which should ensure they work properly. According to the blacklist (linked in the article), both Micron/Crucial and Samsung SSDs exhibit some problems with the default implementation of queued TRIM in the Linux kernel, which is fixed by assigning specific actions to those drives.

    So essentially it's a problem that only happens in Linux and either has been or is in the process of being addressed in the Linux kernel. I agree with Saturnotaku. This is only going to affect a very small number of users and there's a fix available, so it's not anywhere near the catastrophe the original poster made it out to be.
     
  11. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    I have to ask as well: is this all based on a Linux kernel under BSD? Which kernel(s) are affected by this?
     
  12. CustomDesigned

    CustomDesigned Newbie

    Reputations:
    0
    Messages:
    2
    Likes Received:
    1
    Trophy Points:
    6
    The Linux kernel devs seem to think the problem is limited to QUEUED TRIM commands. The drives advertise QUEUED TRIM, but it is horribly (as in massive data loss) broken. Unqueued TRIM seems to be fine. Windows doesn't support QUEUED TRIM, so it is unaffected.

    It is not only Samsung drives - Crucial, and some other brands are also now blacklisted with horribly broken queued trim.
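As a rough illustration of the blacklist idea discussed in this thread: the host matches the drive's reported model string against a list of patterns and, on a match, falls back to unqueued TRIM even though the drive advertises the queued variant. The patterns below are illustrative assumptions based on the brands named in this thread, not a copy of the kernel's actual table, and `queued_trim_allowed` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

# Illustrative patterns only (assumed for this sketch); the real kernel
# table is longer and can also match on firmware revision.
NO_QUEUED_TRIM_PATTERNS = [
    "Samsung SSD 8*",
    "Crucial_CT*",
    "Micron_M5*",
]


def queued_trim_allowed(model, advertises_queued_trim):
    """Return True only if the drive advertises queued TRIM and is not listed."""
    if not advertises_queued_trim:
        return False
    # Case-sensitive glob match against every blacklist pattern.
    return not any(fnmatchcase(model, p) for p in NO_QUEUED_TRIM_PATTERNS)


print(queued_trim_allowed("Samsung SSD 850 PRO 512GB", True))
print(queued_trim_allowed("SAMSUNG MZ7WD480HCGM-00003", True))
```

With these patterns the first call reports the 850 PRO as blocked, while the second model string slips through because nothing in the list matches it, which is the kind of gap the article describes.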
     
    alexhawker likes this.
  13. namaiki

    namaiki "basically rocks" Super Moderator

    Reputations:
    3,905
    Messages:
    6,116
    Likes Received:
    89
    Trophy Points:
    216
    I see. I'll just hope that MS/Intel do some testing with a wide variety of hardware combinations before they decide to implement queued TRIM in their Windows 8/10 drivers. If they ever decide to.
     
  14. CustomDesigned

    CustomDesigned Newbie

    Reputations:
    0
    Messages:
    2
    Likes Received:
    1
    Trophy Points:
    6
    The simplest fix for Samsung, Crucial, et al. would be to STOP ADVERTISING QUEUED TRIM. That is a trivial patch. It really isn't all that important a feature performance-wise. Then they can start advertising it again on drives where it ACTUALLY WORKS. By refusing to stop advertising the broken feature, they are ensuring that their drives will be blacklisted (and have slow TRIM) for the foreseeable future.

    Unqueued TRIM means that the host must stop all writes, wait for queued writes to finish, send the TRIM, and then, when it is complete, start writing to the drive again. This means that disk I/O freezes while the TRIM is processed. For an enterprise web server, this might be unacceptable (which is why the enterprise shop with the horror story was using drives allegedly implementing QUEUED TRIM). But for a consumer drive in a single-user system, I'm going to run fstrim once a week at 2am, and I really don't care if disk I/O freezes for a few seconds.
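The ordering described above (drain the queued writes, issue the TRIM alone, then resume) can be sketched with a toy model. This is only an illustration of the sequencing, not how any real ATA driver is implemented; `ToyDrive` and its methods are hypothetical names.

```python
from collections import deque


class ToyDrive:
    """Toy command queue used to show ordering; not a real ATA model."""

    def __init__(self):
        self.pending = deque()  # writes accepted but not yet completed
        self.log = []           # order in which commands actually complete

    def queue_write(self, lba):
        self.pending.append(lba)

    def drain(self):
        # Complete every queued write in the order it was submitted.
        while self.pending:
            self.log.append(("write", self.pending.popleft()))

    def unqueued_trim(self, lba_range):
        # Unqueued TRIM: the queue must be empty before the TRIM is sent,
        # so all I/O stalls until both the drain and the TRIM finish.
        self.drain()
        self.log.append(("trim", lba_range))


drive = ToyDrive()
drive.queue_write(10)
drive.queue_write(11)
drive.unqueued_trim((100, 199))
drive.queue_write(12)
drive.drain()
print(drive.log)
```

The log shows both earlier writes completing before the TRIM and the later write only after it, which is the stall the post refers to.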

    QUEUED TRIM is a very tricky real-time task to get right. You have hundreds of queued writes in process, many of which will require erasing an erase block and copying in-use sectors - and updating the on-drive data structures that track all that. And now the queued trim is going to jump in there and update the same data structures in a different way at the same time. Multi-threaded code is notoriously hard to test and debug. So it really is no reflection on the manhood of Samsung engineers that they failed to get it working reliably in time for market. I suspect that the decision to make the drive advertise it anyway was made by a PHB.

    One way to amortize the cost of getting queued trim to work right is to have it OFF on consumer drives, and only enabled on more expensive enterprise drives (where it had better work).
     
    Last edited: Jun 30, 2015