Hmm, problem is, this is not the only mode of failure.
Go on Newegg, look at the ratings for the drives. Normally the 1 or 2 egg ratings come from people whose drives failed. Sure, there's a bias because people with good drives are less likely to report, but comparing SSDs to HDDs, it appears that there are a lot more people with dead SSDs than HDDs, proportionately. None of these guys wore out their SSD; most of the time it's something like "SSD just disappeared off the BIOS radar for no reason."
-
Tsunade_Hime such bacon. wow
DOA stuff happens, and a product shouldn't be punished because 1/10000 arrived DOA. -
Let the reader decide whether it should be punished (i.e., state clearly what is wrong). -
Tsunade_Hime such bacon. wow
The same can be said for buying a new car. You buy a brand new Honda Accord, drive it off the lot, and the transmission dies. Will you bad-mouth the entire Honda brand because you got that one-in-a-million defective transmission? You shouldn't, but people form stupid prejudices from bad experiences.
Now, if a product under manufacturer warranty gets denied for a BS reason, then I can understand an angry review and shunning the entire brand. The same thing happened to my parents: we had a 1991 Toyota Camry. The Toyota dealership botched the transmission job and asked us to pay 1400. My parents wrote to their Japanese HQ back in the day. In 2-3 days we got a phone call saying the transmission would be replaced for free in an attempt at customer service. Nevertheless, it was in vain, as we sold the car immediately after the repair and only bought Honda from that day on. -
What is the difference between brushing off all the information reported on Newegg just because you may not agree with certain 'ratings' on that site, and judging a brand based on one's own bad experience?
-
No, I didn't say that it gives a quantitative estimate.
I said, you can use similar data for SSDs and HDDs both from Newegg as a REFERENCE for their RELATIVE reliabilities. People with bad HDDs also give it 0 egg. -
On the other paw, chip/bit-level ECC for flash memory will be seen in devices shipping in the next 6-8 weeks, with complete assemblies (drives) following shortly thereafter.
This will help make this whole thread, and the suppositions behind it (sheesh, judging a single tech based on ill-informed 'early adopter' end-user rants at Newegg??), a waste of breath. -
tilleroftheearth Wisdom listens quietly...
No, ECC will not make the problem disappear, it will merely mask it - like I've stated in this thread already.
-
chip/bit level ECC does fix the problem. Just like aggressive sector reallocation 'fixed' conventional hdd tech many years ago. And just like certain mfg processes 'fixes' other problems with CPU design.
No tech stands on its own. An integrated 'system' does.
SSDs don't currently integrate much in the way of error corrective tech. They will. Soon. -
I agree with tiller. Error correction only works for a case where you've got data stored in a medium that may be corrupted, and you'd like to recover it. However, right now the issue of wearing out is NOT about errors in storage; it's about electron flow eventually wearing down the nanometer-thick insulation between the floating gate and the substrate. This is a physical phenomenon which, when it happens, physically destroys the cell. Logical operations like ECC cannot compensate for physical damage.
-
Meaker@Sager Company Representative
It's also a highly predictable process and so the controller stops before the data becomes unreadable so your data can be read off and backed up.
-
and process/material (espec insulator materials) changes will continue to take care of electron flow degradation in flash just as similar improvements were made in other chip designs.
Don't you guys have *any* imagination?
You've set up this straw man slamming SSD reliability when what you really need to do is slam the inability of users to understand the current state of the tech and read/understand the freely available documents on the subject.
Tech can and will improve. The ability of consumers to understand the limitations of tech, espec new tech, will likely remain at its current low level. -
This is ridiculous.
Not a single one of you has an end-of-life/failed SSD in hand.
At least wait for them to steal our souls and firstborn children after our SSDs prematurely fail well before the advertised MTBF and then... oh wait, nobody has one that has. DOA is DOA. As in, it's NOT an EOL drive.
Taking statistical data from Newegg reviews, that's a poor poor case for statisticians. No data for two products can be compared "relatively" on there. At least not for an SSD and HDD.
Theoretical bashing of theoretically prematurely failing drives?
On the other hand, lots of people on here can offer SMART attributes showing they've written terabytes upon terabytes of data and put their drives through X power cycles, Y power-on hours, etc., on both old and relatively new SSDs.
No one is trying to short change you. If you don't trust the tech, don't buy it. -
This entire thread reminds me of how everyone freaked out about Y2K, but nothing happened. Clever engineers behind the scenes recognized the potential problem and engineered solutions to work around it before it did any practical damage. I'll bet there were plenty of theorycraft alarmists for Y2K as well. -
I gave real world data a few pages back but some seemed too busy arguing to understand its significance. -
If I could, I'd flood an entire page of this thread with +rep posts from people who aren't theorycraft alarmists. -
I'm not sure what's wrong with the theories presented.
2.0, I have noted your posted statistics, and they seem similar to what I have in terms of cycles/GB written ratio, which is still far higher than what is expected.
Obviously, manufacturers have tested NAND cells to destruction before to come up with numbers like 10k, 5k, 3k, and so on. They have also built in monitoring tools. So, what exactly is the problem with doing some independent calculations?
Some of you also mentioned that future technology will work to address this. Sure, that may be the case, but right now I'm addressing the technology of TODAY, and how the reality isn't the same as what we assume it to be. In addition, the fact that smaller fab processes create less reliable gates is explained at the semiconductor physics level. Right now the manufacturers are only moving to smaller fab processes to save money. Unless there's some change in material or construction, I don't see how it will magically get better. -
-
NotEnoughMinerals Notebook Deity
2.0, your real world data doesn't mean much unless we all wait around for the next 8 years to see the cycles get used up.
There are all sorts of usage patterns to factor in; you seem to be using these drives primarily, if not exclusively, as boot drives. What about the people who use them for everything? Isn't the goal of these faster drives to eventually get to a place where everything is on SSDs? Pure SSD systems are going to see crazy amounts of writes, especially in laptops, where a lot of people don't have the luxury of multiple drives.
I think the main reason for this thread was to help shed some light on misconceptions of how long SSDs will really last, especially with the current trend of using smaller architecture that seems to be making things cheaper but at the cost of durability/write cycles which is true. The "I have drives with older NAND that will last forever" argument doesn't really help anyone unless we're gonna tell everyone to rush out and buy old tech. -
Also, premature exhaustion of P/E cycles should not be a concern, because it is not a data loss situation (and ECC has nothing to do with it); it just means the usable storage gradually decreases.
If that happens, it only means that one needs to upgrade to a larger model by then. Since an SSD is an electronic device, it follows Moore's law (at least for now), so capacity would double every 18 months or so. In other words, in 3 years' time (that is usually the warranty period), you get 4x the size for more or less the same price (if not lower). -
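The doubling arithmetic in the post above can be checked with a few lines of Python. This is only a sketch: the 18-month doubling period is the poster's assumption, and `capacity_multiplier` is a made-up helper name, not anything from a vendor tool.

```python
def capacity_multiplier(months: float, doubling_period_months: float = 18.0) -> float:
    """Growth in capacity per dollar, assuming a steady doubling period.

    The 18-month default is the figure claimed in the post, not a measured value.
    """
    return 2.0 ** (months / doubling_period_months)


# Print the multiplier at 18, 36 and 54 months under that assumption.
for months in (18, 36, 54):
    print(f"{months} months -> {capacity_multiplier(months):.1f}x capacity for the same price")
```

If the real doubling period is longer than 18 months, the multiplier after a 3-year warranty period falls well short of 4x, which is exactly what the later posts about price per iteration argue over.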
NotEnoughMinerals Notebook Deity
Perhaps we just play controller war? -
In other words, just use the quoted 'guaranteed write' figure (at least for Intel) and spec out based on one's particular usage pattern. Get the enterprise line if that is called for. -
NotEnoughMinerals Notebook Deity
Haha, I know it's not a good question, but we need something to talk about on these boards. We can't rant on for pages about guaranteed byte writes. We need more gray area!
-
So the information I provided can be used to extrapolate a conservative estimate of longevity. Barring, of course, controller or ROM flash failure, something which can and does happen, though rarely.
-
You must not get much shipped from online shopping. UPS and FedEx literally dropkick stuff on its way to you. Fragile is a codeword for "give it an extra kick"; I was told that with a straight face by one building's driver.
The starting scene in Ace Ventura is funny because it's so true, not because it's fake.
BTW, coming from mostly a lurker: this thread has the usual SSD FUDdites filling it up... not much worth seeing. -
Also, I major in Electrical Engineering, where I have exposure to semiconductor materials and devices. I can confirm that this is scientifically true, at least on the gate level. The transmission coefficient for tunneling is inversely proportional to the exponential function of the barrier thickness, meaning rapid increases in electron leakage as the oxide barrier shrinks.
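To put rough numbers on that exponential dependence, here is a small Python sketch of the textbook rectangular-barrier approximation T ≈ exp(-2κd), with κ = sqrt(2·m·φ)/ħ. The barrier height (3.1 eV) and effective mass ratio (0.5) are illustrative assumptions, not figures from this thread, and real flash cells are programmed through field-assisted (Fowler-Nordheim) tunneling, so treat this only as a picture of the trend.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron rest mass, kg
EV = 1.602176634e-19     # one electronvolt in joules


def tunneling_probability(thickness_nm: float,
                          barrier_ev: float = 3.1,
                          mass_ratio: float = 0.5) -> float:
    """Approximate transmission through a rectangular barrier: exp(-2*kappa*d).

    barrier_ev and mass_ratio are assumed, illustrative values.
    """
    kappa = math.sqrt(2.0 * mass_ratio * M_E * barrier_ev * EV) / HBAR  # 1/m
    return math.exp(-2.0 * kappa * thickness_nm * 1e-9)


# Each nanometre removed multiplies the leakage probability by the same large factor.
for d_nm in (8.0, 7.0, 6.0):
    print(f"oxide {d_nm:.0f} nm -> relative tunneling probability {tunneling_probability(d_nm):.3e}")
```

The absolute values mean little with these guessed parameters; the point is that thinning the oxide by a fixed amount always multiplies the leakage by the same factor, while the stored charge only shrinks with cell area.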
Now, Anandtech also mentions that new controller technology can potentially offset this issue. However, that is still the previous argument of trying to use logic layer operations to compensate for weaknesses in the physical layer.
Re: Aluminum. Given the lack of moving parts in the SSD, it should very well be able to stand a kick when it's in its packaging. They are often rated at something like 1000G acceleration. If a FedEx guy's foot can cause this kind of acceleration to an SSD, especially when it's in its foam packaging, I'd advise him to quickly join and lead the best soccer team in the world. Oh and, if you've got nothing meaningful to contribute, please keep lurking. I'm not sure what you are trying to achieve with your sentimentally loaded but logically devoid reply. If you start calling people "FUDdites" and disregard what they say, then what should they do? Respond by calling you ignorant and start a flame war? -
min2209, or ANANDTECH:
What you say makes sense. Shrink the die, shrink the barrier, shrink the protection. I am not very knowledgeable about SSDs at all, but I have some questions:
As the die shrinks and the barrier shrinks, is the use of electrons the same? For example, if SSD A is an exact copy of SSD B and the only difference is the size of the manufacturing process (lithography), would both use the same number of electrons to program/erase the same amount of information?
When the barrier fails and leaks occur, this will lead to data loss. Is this correct? If so, do SSDs have any way of detecting this leakage, preventing it, or working around it?
Is there any other material that can be or is being used to provide a barrier that stops electron leakage more efficiently and with less corrosion?
At what estimated lithography size would one expect a barrier to become futile because of its thinness? -
If you shrink the size, fewer electrons will be required, as all gate capacitance factors drop, so you need less charge to build up the same voltage. There's a problem though. Shrinking the barrier width exponentially lowers the resistance to tunneling, whereas the charge required is only proportional to the area of the capacitor. That means that a smaller size is inherently less stable.
The barrier doesn't "fall" like a wall, rather it just gets weak. It's the same thing with larger scale insulation, for example on a power cable. When the barrier is weakened because too many electrons tunneled through the oxide layer (which has a very high energy conduction band, hence forming a barrier), the structure will be weakened so that electrons will begin tunneling through at a greater rate. This is what I referred to as "leakage", because the increased level might not be part of the design considerations. Moreover, the increased rate of leakage will reduce the ability of the floating gate to hold a charge. Electrons held by the floating gate will tunnel through the weakened barrier and back into the P-type substrate, which lacks electrons in the conduction band and naturally has an affinity for electrons. This leads to data loss.
I'm not sure if the SSDs have a way of detecting this. I guess ECC could help, but once again, that's not a complete solution. That's like answering the statement "an SSD might wear out, so it's not eternally durable" with "but when it wears out, we can still recover the data".
The barrier MUST be there, otherwise the second you remove the voltage on the top layer, the floating layer will lose all the charge back to the substrate. That means the second you write the data it's already gone. I'm not sure at what size this will occur, but I guess we haven't gotten there yet. Then again, the oxide molecular structure must still be there, so I suppose there must be at least one complete layer of that material on the molecular/atomic level. -
Very informative, min2209! Greg said once in another thread not too long ago that shrinking the die process on SSDs makes them less reliable (but he's a computer engineer). Nice to hear a clear and concise explanation why, and confirmation of what Greg has said.
-
How does weaker barrier = less reliable SSDs?
-
Yes, I figured the number of electrons decreases in proportion as die size decreases. I see. So as the wall decreases in size, its "durability" decreases exponentially. In other words, it decreases faster than the electron use decreases as dies are shrunk, thus leading to fewer usable cycles.
I don't think ECC can be used well with SSDs. It would hinder performance greatly, increase electrical needs, and finally increase heat output. While heat output by itself isn't a problem, the efficiency of electrical circuitry is affected by heat, so it may behave a little differently. Is this correct?
Also
Do you know if temperature increase negatively affects the materials used to create the barrier? Perhaps we can start cooling our SSD with phase changers if that helps XD
Okay so as far as die shrinks go, we haven't hit our limit yet although durability should be decreasing, theoretically.
Sorry if I am bothering you, I can stop asking questions. :S -
Meaker@Sager Company Representative
As voltages lower so does the damage, but the barrier shrinking effect is larger, however you have more cells too.
The degradation is predictable; the SSD can count and detect the damage.
Once the damage reaches a certain point the SSD will refuse to write, but is still readable. -
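The count-then-refuse behaviour described in the post above can be pictured with a toy Python model. This is not how any real controller firmware is written; `WearTrackedBlock` and `ReadOnlyError` are hypothetical names, and a real drive tracks wear per erase block alongside wear leveling and spare area.

```python
class ReadOnlyError(Exception):
    """Raised when a worn-out block refuses further writes."""


class WearTrackedBlock:
    """Toy model: count program/erase cycles and go read-only at the rated limit."""

    def __init__(self, rated_cycles: int):
        self.rated_cycles = rated_cycles  # assumed endurance rating, e.g. 3000
        self.erase_count = 0
        self.data = b""

    @property
    def worn_out(self) -> bool:
        return self.erase_count >= self.rated_cycles

    def write(self, payload: bytes) -> None:
        # Refuse the write before the cell is pushed past its rated limit.
        if self.worn_out:
            raise ReadOnlyError("rated P/E cycles exhausted; block is read-only")
        self.erase_count += 1
        self.data = payload

    def read(self) -> bytes:
        # Reads do not consume P/E cycles, so the last data stays available.
        return self.data


block = WearTrackedBlock(rated_cycles=3)
for i in range(3):
    block.write(f"version {i}".encode())
try:
    block.write(b"one write too many")
except ReadOnlyError as err:
    print("write refused:", err)
print("still readable:", block.read())
```

The model only captures the "stop before the data becomes unreadable" idea; it says nothing about sudden controller failures, which are the other failure mode argued about earlier in the thread.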
What is the barrier shrinking effect? Still doesn't answer my question, how does weaker barrier = less reliable ssd?
Because as far as I can see, weaker barrier equals less voltage to move the electron. Not seeing how that makes it less reliable ssd. -
-
Intel's X25-M solid-state drive - The Tech Report - Page 1 -
dynkin, what's wrong with a 5-6 year hard drive lifespan? My 3-year-old WD 5400rpm hard drive still works but is slow as hell (even after formatting it a few times).
I'd be very content with a ssd if it would give me 6 years of fast and reliable data storage. -
Pretty much, Anand's article gives a good overview of the NAND operation principles. Here's his diagram:
The floating gate is insulated on all sides by "barriers". If you apply a very high positive voltage to the Control Gate, it will affect the Floating Gate such that the tunneling of electrons through the tunneling oxide layer will shift, and there will be a net gain of electrons in the floating layer. This is how you perform a "write". When you then release the voltage on the control layer, the floating layer should retain the charge. Now, even though it's insulated, it still affects the P-type substrate through static charges, and causes the path between the N+ drain and source to change its resistive characteristics. So if I pass a current through the drain to the source (conventional current), I can measure the voltage and current to tell the state of the floating gate (this is a READ operation).
Now imagine the oxide tunneling layer is made so thin it's very weak. That's not good, because once you release the Control Gate voltage, stuff starts leaking back to the P-type substrate. -
Dunno if this has been posted before:
Micron integrates ECC controller into next-gen flash memory - The Tech Report
"As this article over at AnandTech points out, flash endurance and error rates present bigger problems as the size of each NAND cell shrinks. The move from 50- to 34-nm NAND cut the write-erase cycle endurance from 10,000 cycles to just 5,000, and the 25-nm flash chips currently coming off the line are reportedly lasting only 3,000 cycles." -
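For a feel of what those cycle counts mean in practice, here is a rough Python estimate of lifetime from rated P/E cycles. It assumes perfect wear leveling and a write amplification of 1.0 unless told otherwise; the 120 GB capacity and 20 GB/day workload are made-up example inputs, and only the 10,000 / 5,000 / 3,000 cycle figures come from the quoted article.

```python
def total_write_budget_tb(capacity_gb: float, pe_cycles: int,
                          write_amplification: float = 1.0) -> float:
    """Host data (in TB) that can be written before the rated cycles run out."""
    return capacity_gb * pe_cycles / write_amplification / 1000.0


def years_of_life(capacity_gb: float, pe_cycles: int, host_writes_gb_per_day: float,
                  write_amplification: float = 1.0) -> float:
    """Years until the rated cycles are used up at a constant daily write volume."""
    budget_gb = capacity_gb * pe_cycles / write_amplification
    return budget_gb / host_writes_gb_per_day / 365.0


# Cycle ratings are from the quoted article; capacity and workload are assumptions.
for node, cycles in (("50 nm", 10_000), ("34 nm", 5_000), ("25 nm", 3_000)):
    budget = total_write_budget_tb(120, cycles)
    years = years_of_life(120, cycles, host_writes_gb_per_day=20)
    print(f"{node}: {budget:.0f} TB write budget, about {years:.0f} years at 20 GB/day")
```

Raising `write_amplification` above 1.0 shrinks every figure proportionally, which is where controller quality enters the picture.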
So what does all this mean? Why move the process then? Super fast memory does you no good if its life is cut 66%. SSDs are already reasonable with power consumption, and most of the speed comes from the controller anyhow, right?
-
But in this case, what difference does it make? You seem to keep reducing the lifespan by 1/3 to 1/2 with every iteration, so you would have to allow double the provisioning space (which they won't do), unless double the storage size comes at the same price. I highly doubt that, because SSDs haven't dropped to half the cost with each iteration, not even close.
Even if the manufacturers don't allow for sufficient provisioning space, once the general user base realizes that they need a drive twice as large as their storage needs, there will be a lot of backlash. SSDs are very expensive as it is; to think you'll need one twice as large just doesn't sit real well. And what of the next fab process? 22nm? 16nm? We'll get what, 500 writes per cell? That's just crazy stupid. You'd need a 1TB SSD for 200GB of storage.
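The "1TB for 200GB" figure can be sanity-checked with a short Python sketch. The idea is to ask how much raw flash a lower-endurance drive needs in order to keep the same total write budget as a reference drive. The 500-cycle number is the poster's speculation, the 3,000-cycle reference is the 25-nm figure quoted earlier in the thread, and `capacity_for_same_write_budget` is a hypothetical helper name.

```python
def capacity_for_same_write_budget(needed_gb: float, reference_cycles: int,
                                   new_cycles: int) -> float:
    """Raw capacity needed so that capacity * cycles matches the reference drive."""
    return needed_gb * reference_cycles / new_cycles


# 200 GB of storage, comparing a speculative 500-cycle part against a 3,000-cycle one.
needed = capacity_for_same_write_budget(200, reference_cycles=3_000, new_cycles=500)
print(f"{needed:.0f} GB of 500-cycle flash matches 200 GB of 3,000-cycle flash")
```

With these inputs the answer comes out at 1200 GB, so the 1TB estimate is in the right ballpark if, and only if, endurance really did fall to 500 cycles and nothing else about the drive improved.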
SSD Endurance - the Big Lie
Discussion in 'Hardware Components and Aftermarket Upgrades' started by min2209, Nov 30, 2010.