hello kremlin. your post about NAND flash got me interested in some functional aspects of SSDs, specifically what the whole hubbub and brouhaha was about regarding the TRIM command and write operations wearing out and possibly killing expensive SSDs within the timeframe of just a few years! what's going on here?

that’s a very good question kremlin. let’s investigate:

read this post before you read this one

back some years ago, SSDs became popular on the consumer markets. they had existed long, long before that for industrial applications, and they are nothing spectacular, in fact. they rely on NAND-type flash storage which predates most people reading this post.

for a long time, NAND storage was only used for small ROM chips & the like. later on, we started using it in sd/mmc cards & USB drives. it has somewhat poor write tolerance (we’ll get to that later), and it is slow and expensive.

to make SSDs viable optical/magnetic disk replacements for consumer markets, we had to fix some of these problems. we sort of fixed the write tolerance problem and sort of fixed the ‘expensive’ problem. they’re still somewhat expensive (much more expensive than traditional hard drives, bit-per-bit) and still have somewhat poor write tolerance (we just add lots and lots of reserve blocks).

to fix the ‘slow’ problem, we just engineered very fancy controllers that use all kinds of witch’s magic to read & write to them very quickly. these controllers communicate over the SATA protocol, much faster than USB/sd/mmc/otherwise. they turned out to be way faster than spinning rust

a big problem with the very first SSDs had to do with write tolerance. i explained to you how when blocks go bad, they are replaced by reserve blocks until those reserve blocks run out, at which point your disk becomes read-only & effectively useless. writing new data on a disk causes blocks to go bad quicker. this is what i mean by the term ‘write tolerance’
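here is a tiny toy model of that reserve block business, just to make it concrete. the class name, the method and the numbers are all made up for illustration; real controllers are far more complicated than this.

```python
# toy model of bad-block replacement. ToyDrive and its numbers are invented
# for illustration and do not describe any real SSD controller.

class ToyDrive:
    def __init__(self, reserve_blocks):
        self.reserve_blocks = reserve_blocks  # spares left to swap in
        self.read_only = False

    def block_went_bad(self):
        """swap a spare in for a failed block, or give up if none are left."""
        if self.reserve_blocks > 0:
            self.reserve_blocks -= 1
        else:
            self.read_only = True  # out of spares: no more writes accepted


drive = ToyDrive(reserve_blocks=2)
for _ in range(3):
    drive.block_went_bad()
print(drive.reserve_blocks, drive.read_only)  # 0 True
```

two spares cover the first two failures. the third failure has nothing left to swap in, so the toy drive flips to read-only.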

people got real dumb & thought their precious expensive SSDs were going to die 2 years into use. this turned out to be way overblown and only partially true for a small period of time. the SSD i bought last year will never hit its write limit, practically speaking

part of why that is has to do with a special piece of engineering we retroactively applied to accommodate write-sensitive SSDs. it is a single new command, TRIM, sent over the SATA protocol. we’ll get to that in a bit, but first some more information:

optical/magnetic disks, “hard drives”, “spinning rust” all work under the same basic paradigm. a very accurate actuator moves an arm holding a very accurate & sensitive “read write head” at its tip; extremely similar to how a record player has an arm with a needle at the tip. the magnetic/optical disk spins underneath, leveling itself out with its rotational inertia. as rings of magnetized data pass underneath the read/write head, one at a time, the magnet on the read/write head “feels” this data and translates it to a 1 or 0 a computer understands. it can also move closer to the disk and fire an electromagnet in order to encode a 1 or 0

as one ‘bit’ passes under the read/write head at a time, each bit is individually accessible. if i have a 64 MB long list of ‘0’s on my drive, i can change a random ‘0’ in that list to a ‘1’ by simply navigating to the area corresponding with the bit i want to change, and writing a ‘1’. an extremely ‘cheap’ operation that only involves a single ‘bit’, or surface on a disk

SSDs, or NAND storage, do not work this way. if you would like to change the bits in any block of NAND storage that hasn’t been entirely cleared (on real NAND a cleared block reads as all ‘1’s, not ‘0’s, but the idea is the same), you must wipe the whole block before writing any new data. you cannot write to individual bits one at a time like you can with traditional hard drives. any kind of overwrite operation involves reading the affected blocks into a cache, clearing them, then writing the modified information back all over again. this exacerbates the aforementioned ‘write tolerance’ problem, as frequently changing information saved to disk incurs lots of block clears, which degrade the storage cells
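the read, clear, write-back dance can be sketched like this. it is a toy: PAGES_PER_BLOCK, ToyBlock and the idea of counting erases are assumptions made up for the example, not how any particular drive is built.

```python
# toy model of overwriting one page inside a NAND block. PAGES_PER_BLOCK and
# ToyBlock are invented for illustration, not taken from any real drive.

PAGES_PER_BLOCK = 4
ERASED = None  # marker for a page that has been cleared and can be written


class ToyBlock:
    def __init__(self):
        self.pages = [ERASED] * PAGES_PER_BLOCK
        self.erase_count = 0  # each erase wears the cells a little

    def erase(self):
        self.pages = [ERASED] * PAGES_PER_BLOCK
        self.erase_count += 1

    def write_page(self, index, data):
        if self.pages[index] is ERASED:
            self.pages[index] = data  # cheap path: page was already clear
            return
        # expensive path: read the block into a cache, clear it, write it back
        cache = list(self.pages)
        cache[index] = data
        self.erase()
        self.pages = cache


block = ToyBlock()
block.write_page(0, "a")  # first write, no erase needed
block.write_page(0, "b")  # overwrite, forces a whole-block erase
block.write_page(0, "c")  # and again
print(block.pages[0], block.erase_count)  # c 2
```

three writes to the same spot cost two full block erases, which is exactly the wear the paragraph above is complaining about.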

part of this has to do with the fact that up until very recently, drives themselves had no idea which blocks were being used by the operating system. let me explain:

a drive presents itself to the kernel/operating system as a huge list of 1’s and 0’s. a 128 GB drive would present itself as a list of exactly 1099511627776 bits. these bits can either be read or written to. 1099511627776 is a big crazy number, so we subdivide into (for example) 8192 blocks of 16MB each

8192 x 16MB = 128GB
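the bit count above only works out if you use binary units (1 GB = 2**30 bytes). here is a quick check of the arithmetic under that assumption, with the decimal-units block count next to it for comparison. the variable names are just mine.

```python
# sanity check of the drive size arithmetic
drive_bytes = 128 * 2**30   # 128 GB in binary units
block_bytes = 16 * 2**20    # 16 MB in binary units

drive_bits = drive_bytes * 8
binary_blocks = drive_bytes // block_bytes

# same question with decimal units (1 GB = 10**9 bytes, 1 MB = 10**6 bytes)
decimal_blocks = (128 * 10**9) // (16 * 10**6)

print(drive_bits)      # 1099511627776
print(binary_blocks)   # 8192
print(decimal_blocks)  # 8000
```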

these 1’s and 0’s can be set to anything. the drive doesn’t care. it just takes ‘read’ or ‘write’ commands and performs them. which blocks are actively being used is something only the operating system knows. at any given time that is only a fraction of the blocks the OS has written to, as operating systems tend to scrap a lot of blocks. what do i mean by this?

whenever you “delete” a file on a computer nothing much happens physically. your OS/kernel marks the space formerly containing that file as “free” (only in the OS/kernel’s brain) and that is it. it doesn’t bother writing 0s to that space because that is a lot of unnecessary work. if your OS wants to use that block again, it simply overwrites it
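a rough sketch of what “delete” amounts to, assuming a pretend filesystem where the drive is a plain list and the OS keeps a set of in-use block numbers. every name here is hypothetical.

```python
# toy filesystem: the "drive" is just a list of blocks, and only the OS keeps
# track of which ones are in use. all names here are made up for illustration.

drive = ["cat.jpg part 1", "cat.jpg part 2", "notes.txt"]
in_use = {0, 1, 2}  # lives in the OS/kernel's brain, not on the drive


def delete(blocks):
    """'delete' a file: forget its blocks, touch nothing on the drive."""
    in_use.difference_update(blocks)


delete({0, 1})
print(sorted(in_use))  # [2]
print(drive[0])        # cat.jpg part 1
```

the OS now considers blocks 0 and 1 free, but the drive was never told anything and the old data is still sitting there.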

but overwriting is a problem with SSDs!

so finally we reach the TRIM command. like i said, it is a SATA command sent over the SATA protocol. this happens via a SATA cable between your motherboard and your SSD

TRIM lets the SSD know which blocks are actively in use & which blocks the OS no longer cares about. this is beneficial knowledge to the SSD for two reasons. first, unused blocks can be cleared on the SSD’s own time when it isn’t super busy, which increases performance. second, it allows strategic clearing/mapping of blocks to achieve wear leveling, which is the idea that you try to use all blocks equally so the drive lasts as long as possible
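to see how those two benefits fit together, here is a minimal sketch of a controller that accepts a trim hint. ToySSD and its methods are invented for this post and say nothing about how the real command is encoded on the wire or how real firmware picks blocks.

```python
# toy SSD controller that accepts a trim() hint. the class, its methods and
# the block counts are all hypothetical; this only illustrates the idea.

class ToySSD:
    def __init__(self, n_blocks):
        self.erase_counts = [0] * n_blocks
        self.clean = set(range(n_blocks))  # erased, ready to be written
        self.stale = set()                 # trimmed, waiting to be erased

    def write(self):
        """write to the least-worn clean block and return its number."""
        block = min(self.clean, key=lambda b: (self.erase_counts[b], b))
        self.clean.remove(block)
        return block

    def trim(self, blocks):
        """the OS says it no longer cares about these blocks."""
        self.stale.update(blocks)

    def idle_cleanup(self):
        """erase trimmed blocks ahead of time, while nothing else is going on."""
        for block in self.stale:
            self.erase_counts[block] += 1
            self.clean.add(block)
        self.stale.clear()


ssd = ToySSD(n_blocks=3)
first = ssd.write()    # lands on block 0
ssd.trim({first})      # the file in block 0 got deleted
ssd.idle_cleanup()     # block 0 is erased in the background
second = ssd.write()   # wear leveling skips block 0, it has been erased once
print(first, second, ssd.erase_counts)  # 0 1 [1, 0, 0]
```

the erase happens during idle time instead of in the middle of a write, and the next write goes to a block with less wear on it.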

this effectively solved a huge problem in computing, albeit only eventually. for a while the dumb people mentioned at the top of this post went apeshit over implementing TRIM support and we ended up with kernels that checked/implemented TRIM on a software level, checked/implemented TRIM on a driver/kernel level and checked/implemented TRIM on the disk controller on the SSD itself

this made it so that, for a while, your whole system was viciously fighting and lying to itself and the SSD in a hail-mary attempt at TRIM working correctly. this led to horrible SSD abuse which said morons misinterpreted as more SSD write tolerance hardware issues

it was a particularly irritating few years