Chapter 5
Uptime, Reliability, and SLAs
The fundamental storage building block is the disk drive, so we start with drive reliability.
Disk drive spec sheets show two reliability-related specs: Mean Time Between Failure (MTBF) and Annual Failure Rate (AFR).
MTBF is a statistical measure of preproduction drive families. MTBF is defined as the mean time between failure of a group of disk drives. In English, that’s the average amount of operating time the group accumulates per failure. It is not the point at which half (50%) of the drives have failed: under the constant-failure-rate model usually assumed for these figures, roughly 63% of a group would have failed by the time the MTBF elapsed. MTBF is established before drives go into high-volume production, using a process called the Reliability Demonstration Test (RDT). RDT involves putting 1000+ drives into test chambers, running them hard, 24×7, one hundred percent I/O, at ...
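To see how the two spec-sheet numbers relate, here is a minimal Python sketch. It assumes a constant failure rate (an exponential model) and 24×7 operation, which is a common simplification rather than anything a particular vendor guarantees. The function names, the 1,200,000-hour MTBF, and the 1,000-drive fleet are hypothetical values chosen only for illustration.

```python
import math

# Assumption: drives are powered on 24x7, so one year is 8,760 hours.
HOURS_PER_YEAR = 24 * 365


def afr_from_mtbf(mtbf_hours: float, hours_per_year: float = HOURS_PER_YEAR) -> float:
    """Return the annual failure rate as a fraction (0.01 == 1%).

    Assumes a constant failure rate, so the probability that a drive
    fails within t hours is 1 - exp(-t / MTBF).
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return 1.0 - math.exp(-hours_per_year / mtbf_hours)


def expected_annual_failures(drive_count: int, mtbf_hours: float) -> float:
    """Expected number of drives that fail in one year in a fleet."""
    return drive_count * afr_from_mtbf(mtbf_hours)


if __name__ == "__main__":
    mtbf = 1_200_000  # hypothetical spec-sheet MTBF, in hours
    fleet = 1_000     # hypothetical number of drives in service

    print(f"AFR: {afr_from_mtbf(mtbf):.2%}")
    print(f"Expected failures per year: {expected_annual_failures(fleet, mtbf):.1f}")
```

Under these assumptions, a 1,200,000-hour MTBF corresponds to an AFR of about 0.7%, or roughly seven failed drives per year in a fleet of 1,000. The practical point is that a very large MTBF describes a population of drives, not the expected life of any single drive.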