
Ask Ars: “My SSD does garbage collection, so I don’t need TRIM… right?”

Modern SSD controllers are super-smart, but using TRIM is still a good idea.

SSDs—how do they work? Not with magnets.

Last week, we ran a syndicated piece from The Wirecutter on what the best consumer-grade SSD is for most people to buy. In it, guest author Nathan Edwards issued this caveat about adding third-party solid-state disks to Macs:

You should know that Apple doesn’t support TRIM (an operating-system-level garbage collection command) on third-party SSDs. It was possible to work around this limitation in previous versions of OS X, but you can’t do it in Yosemite without opening up a security hole. If your SSD’s controller has good onboard garbage collection algorithms, you should be fine even without TRIM.

This sparked a pretty big discussion in the comments on the ATA TRIM command, SSD garbage collection, and whether TRIM is even needed if your SSD has garbage collection.

Here’s the skinny: TRIM is never a requirement, but it always helps, and you’re always better off with it than without it. In this (relatively) short article, I'm going to explain why.

Now, this isn’t going to be an exhaustive look into how SSDs work. We have you covered if that’s what you’re looking for, though: just hit up this 10,000-word monster I wrote a couple of years ago if you want every single detail of what goes on inside those magical little devices. Today we’re going to go through why you still benefit from TRIM even on a modern SSD with super-efficient garbage collection. It comes down to the fact that even though SSD controllers are brilliant and getting better with every iteration, they still have no way of knowing what the operating system is doing.

Blocks and files

Any modern storage device, be it an SSD or a hard disk drive or a flash-based USB stick, has two different "layers" of internal organization. The first is the hardware layer, or the "block" layer. Blocks are usually defined during manufacture and are hard divisions—a hard disk drive’s 512-byte or 4Kbyte sector size is a block-level construction, set by the factory and programmed into the drive’s controller. On an SSD, a block is a physical construct made up of some number of pages, which are in turn made up of some number of individual NAND cells. For SSDs, a page is usually 8KB and is the smallest structure an SSD can read and write.

But Windows, OS X, and Linux can’t see into the physical structures of their disks—this has been true for decades, ever since logical block addressing became a thing. Abstracting away the physical characteristics of the disk lets the disk’s controller use lots of tricks for more efficiently organizing data behind the scenes, and this is especially true of SSDs, which have a bag of tricks that would make Penn and Teller jealous.

Instead, operating systems have their own organizational schemes—file systems. Most computer-savvy people know at least a bit about the file system they’re using—for modern Windows, it’s NTFS; OS X uses HFS+; and most Linux distros these days use ext4. Modern file systems impose their own sets of order on top of the disk’s block structures—each file system will have its own cluster size and its own way of laying out files and storing its tables and journals.

There is a barrier between the world of files and the world of blocks. The operating system doesn’t have any say in how an SSD’s controller does its job or which blocks the SSD controller uses to write data—the OS knows all about its file system, but nothing about the blocks underneath. Similarly, the SSD’s controller knows everything about what blocks are in use and what blocks are free, but it has no way of knowing which blocks correspond to which files.

And when the operating system deletes files, things get complicated.

Garbage collection—it might not mean what you think

SSDs can read and write at the page level—which is usually 8KB for modern drives—but they have a peculiar failing in that they cannot erase at the page level. SSDs can only erase entire blocks, which are usually made up of hundreds of pages. The reason for this (explained in much greater detail in our big SSD guide) is that erasing a page’s contents requires zapping that page with a not-insignificant amount of voltage, and the NAND-style layout of all modern SSDs makes it prohibitively difficult to isolate that voltage to only the pages that need erasing.
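To make the page-versus-block asymmetry concrete, here is a minimal Python sketch of a single NAND block. It is a toy model, not how any real controller is written: the `Block` class and its methods are made up for illustration, and the 256 pages per block is an assumption (the only firm numbers above are "usually 8KB" pages and "hundreds" of pages per block).

```python
PAGE_SIZE = 8 * 1024      # 8KB page, the typical figure quoted above
PAGES_PER_BLOCK = 256     # assumed value; the text only says "hundreds"


class Block:
    """Toy NAND block: pages are programmed one at a time, erased all at once."""

    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK  # None means the page is erased

    def program(self, index, data):
        # A page can only be written while it is still in the erased state.
        if self.pages[index] is not None:
            raise ValueError("page already programmed; erase the whole block first")
        self.pages[index] = data

    def erase(self):
        # There is no per-page erase: every page in the block is wiped together.
        self.pages = [None] * PAGES_PER_BLOCK


block = Block()
block.program(0, b"hello")
try:
    block.program(0, b"world")   # an in-place overwrite is rejected
    overwrite_allowed = True
except ValueError:
    overwrite_allowed = False

block_bytes = PAGE_SIZE * PAGES_PER_BLOCK
print(overwrite_allowed, block_bytes)  # False 2097152
```

Under these assumed numbers, changing a single 8KB page in place would mean wiping 2MB of flash, which is why controllers write the new version somewhere else instead.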

SSDs can only be erased one whole block at a time.

This is one reason why earlier SSDs seemed to slow down as they aged. They were fast to write and fast to read at first, but once the disk’s initial store of free pages ran out, the disk had to erase old blocks before it could write new ones.

To prevent this undesirable situation from happening, modern SSDs run complex routines called garbage collection in order to always keep as large a reserve as possible of pristine empty blocks ready for writing. Garbage collection involves having the controller search through its inventory of written pages for pages that have been marked as "stale." A page goes stale when it has been written to and the OS later needs to modify the data it contains: because changing the page’s state is impossible without first erasing it, the changes are always written to new pages and the old pages are marked stale. Garbage collection looks for blocks that contain a mix of good and stale pages, copies all the good pages into new blocks so that only stale pages are left behind in the old block, and then erases the old block and marks it ready for use.
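The copy-then-erase cycle can be sketched in a few lines of Python. This is an illustration only: the `collect` function and the list-of-tuples layout are invented for the example, and real controllers choose victim blocks and track page state in far more elaborate ways.

```python
VALID, STALE, ERASED = "valid", "stale", "erased"


def collect(victim, spare):
    """Copy valid pages from victim into spare, then erase victim.

    Both arguments are lists of (state, data) tuples, one per page.
    Returns the number of pages that had to be copied.
    Assumes spare has enough erased pages to hold the survivors.
    """
    copied = 0
    for state, data in victim:
        if state == VALID:
            # Find the next erased page in the spare block and fill it.
            slot = next(i for i, (s, _) in enumerate(spare) if s == ERASED)
            spare[slot] = (VALID, data)
            copied += 1
    # Stale pages were never copied, so erasing the block discards them.
    victim[:] = [(ERASED, None)] * len(victim)
    return copied


victim = [(VALID, "a"), (STALE, "old-b"), (VALID, "c"), (STALE, "old-d")]
spare = [(ERASED, None)] * 4
copied = collect(victim, spare)
print(copied)  # 2
```

Every page copied here is a write the host never asked for, so the fewer pages the controller believes are valid, the cheaper the cleanup.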

"Garbage collection" is therefore somewhat of a misnomer because it’s actually good pages that are "collected," with the garbage being left behind.

The complicated part comes not when a file is altered by the operating system but rather when a file is deleted. File systems don’t actually "delete" anything—when a file is deleted, the OS simply marks it in some specific way as being overwritable by new data. How this is done varies a bit by file system—NTFS, for example, updates its Master File Table to show the file system clusters occupied by the deleted file as being free for use.

But remember how the OS can’t see the SSD’s blocks and the SSD can’t see the OS’s file system? A file system deletion operation looks to the SSD like just another series of writes. On a hard disk drive, where individual sectors can be easily reused, this isn’t a problem—there’s usually a fixed correlation between the file system’s clusters and the disk’s sectors (this is, after all, what LBA was invented to facilitate). But on an SSD, where there’s no fixed correlation and where in-use pages have to be tracked and picked up by garbage collection, it can be a big deal. Pages containing deleted files look like valid pages, and they keep getting collected along with actually good pages.
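Here is a rough Python sketch of that mismatch, with the file system's bookkeeping on one side and the drive's on the other. All names and cluster numbers are hypothetical, and the model leaves out the metadata write that a real deletion would send to the drive.

```python
# File system side: which clusters belong to which file (hypothetical layout).
file_table = {"report.doc": [10, 11, 12], "photo.jpg": [20, 21]}
free_clusters = set()

# Drive side: logical addresses the controller believes hold live data.
drive_valid = {10, 11, 12, 20, 21}


def delete_file(name):
    """Delete a file by updating file system metadata only."""
    clusters = file_table.pop(name)
    free_clusters.update(clusters)
    # Nothing tells the drive about these clusters, so drive_valid
    # is deliberately left untouched.


delete_file("report.doc")

# Clusters the OS considers free but the drive still treats as live data.
still_tracked = sorted(free_clusters & drive_valid)
print(still_tracked)  # [10, 11, 12]
```

Those three addresses are the ones garbage collection will keep faithfully copying around until the OS happens to overwrite them.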

Enter TRIM

This is the situation TRIM was introduced to remedy. TRIM (which is properly capitalized but is not an acronym) is an ATA command that the operating system can issue when it deletes a file. The TRIM command provides the bridge from the file level to the block level, giving the operating system a way to tell the SSD that it’s deleting files and that those files’ pages can be marked as stale.

With TRIM, an SSD is no longer forced to save pages belonging to deleted files. TRIM doesn’t obviate the need for garbage collection—it works with garbage collection to more properly mark pages as stale. And you don’t need TRIM for garbage collection to work—but TRIM makes an SSD’s garbage collection more efficient.
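A back-of-the-envelope Python sketch shows the difference for a single block. The `pages_copied` helper and the eight-page block are assumptions chosen to keep the numbers small; the point is only the comparison, not the absolute figures.

```python
def pages_copied(block, trimmed):
    """Count pages garbage collection must relocate before erasing a block.

    block:   list of logical addresses stored in the block's pages
    trimmed: set of logical addresses the OS has reported as deleted
    """
    return sum(1 for lba in block if lba not in trimmed)


block = list(range(8))            # eight pages holding addresses 0..7
deleted_by_os = {2, 3, 4, 5, 6}   # files removed at the file system level

without_trim = pages_copied(block, trimmed=set())
with_trim = pages_copied(block, trimmed=deleted_by_os)
print(without_trim, with_trim)  # 8 3
```

In this toy case the controller relocates eight pages without TRIM and only three with it, and the erase at the end is the same either way.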

Without TRIM, garbage collection doesn't know about deleted files and continues to move pages containing deleted data along with good pages, increasing write amplification. TRIM tells the controller that it can stop collecting pages with deleted data so that they get left behind and erased with the rest of the block.

TRIM isn’t magical, and you don’t have to have it. Modern SSDs with garbage collection will work fine without it (and most SSD OEMs have utilities available to "refresh" SSDs that are being used in non-TRIM environments). Indeed, if you’re adding a third-party SSD to a Mac, you won’t be able to enable TRIM support without loading some old and insecure kernel extensions—don’t do this, because although TRIM makes a difference in reducing write amplification and extending the life and performance of an SSD, it’s not worth the barn-sized security holes you have to open up in order to get it.

So, always use TRIM if you can. It will make your SSD’s garbage collection work a lot better. But if you find yourself in a situation where TRIM isn’t available, don’t panic—it’s nice to have, but it isn’t a requirement.
