* RAID combinations

RAID 10 (RAID 1+0): stripe (RAID 0) across 2 or more RAID 1 mirrors.
RAID 50 (RAID 5+0): stripe (RAID 0) across 2 or more RAID 5 arrays.

* replacing drives

While a failed drive is being replaced, the system is in "degraded" mode.
Once a new drive is added, it needs to be initialized to bring the array back to normal working mode (a process called "rebuilding" or "resilvering").

In professional systems, you keep a few standby drives called "hot spares" (HS).
If a drive fails anywhere, the system automatically disconnects the bad drive and replaces it with one of the HS drives.

* integrity

With RAID parity you can detect integrity violations, but you can't tell where they came from.
Still, you "waste" one whole drive (in RAID 5) just for parity, to check integrity; in RAID 6 we spend even more storage space on parity.

Q: How can we detect integrity violations with even less space?
A: hash functions

* Hash functions

D: data of any size, usually large
H: a "hash" of the data D, usually much smaller than D
f(): hash function used to convert D to H

Properties:
1. sizeof(H) << sizeof(D)
2. H "uniquely" represents D

If I take different data D1, D2, ... and run each through f(), I get very different hashes H1, H2, ....
This calls for a specific property called the "avalanche property": a change of 1 bit in the input should, on average, flip sizeof(H)/2 (half) of the hash bits.
The number of hash bits flipped by a single input-bit change follows a binomial distribution, which is approximately Gaussian.

Said differently: if a hash is B bits long, there are 2^B possible hash values, but there are far more possible data items.
What we want is to avoid collisions, where two different data items hash to the same hash value; that is, the hashes should be well distributed across the hash space of 2^B values.
Thus hash functions are said to be "probabilistically unique": you can't eliminate all possibility of collisions, but you can make it very, very low.

Non-invertibility: given a hash H, it's very hard to find data D that matches that hash (i.e., you cannot "invert" a hash, which is why hash functions are called "one-way functions").
That property is very helpful when hashes are used in crypto, digital signatures, and more.
In fact, hashes that are "large enough" (e.g., SHA-256) are called "cryptographically strong hashes".

The chance of collisions grows as the hash size shrinks and as the number of items you hash grows.

Examples:
CRC32: 32-bit "hash", really a checksum
MD4: 128 bits (16B)
MD5: 128 bits (16B)
SHA-1: 160 bits (20B)
SHA-256: 256 bits
and more

There are lots of hash functions, trying to minimize the chance of collisions, run faster (smaller hashes are faster), and use less space.

Uses:
1. Hashes are good for detecting integrity violations.
2. Hashes can be used to detect duplicates (the basis of "deduplication" systems).

* virtual device drivers

Take a disk (or any storage that behaves like one) and carve out a portion of it to store hashes.
On a write, write the data, calculate its hash, and store the hash too.
On a read, read the data, calculate its hash, and compare it to the stored hash: on a mismatch, report an error to the caller (integrity violation).
A small sketch of this idea follows the examples below.

Benefit: you get integrity checking with a lot less space taken.

A lot of OSs support all kinds of virtual block drivers: they work on top of 1 or more other block devices, and they export the view of another block device.
In Linux, the technology is called Device Mapper (DM).  Examples:

1. RAID
2. dm-integrity: detects integrity violations using hashes
3. dm-crypt: transparently encrypts/decrypts data
4. a deduplication driver (next section)
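To make the hash-based integrity idea concrete, here is a minimal sketch in Python of a virtual block device that keeps one SHA-256 hash per block and verifies it on every read.
It only illustrates the technique, not how dm-integrity is actually implemented; the class name, the in-memory dicts used as the backing store, and the 4KB block size are assumptions made for the example.

    import hashlib

    BLOCK_SIZE = 4096   # assumed block size for the example

    class IntegrityDevice:
        """Toy virtual block device: stores one SHA-256 hash per block
        (the "carved out" hash area) and verifies it on every read."""

        def __init__(self):
            self.blocks = {}    # LBA -> block data (stands in for the lower device)
            self.hashes = {}    # LBA -> stored hash of that block

        def write(self, lba, data):
            assert len(data) == BLOCK_SIZE
            self.blocks[lba] = data
            self.hashes[lba] = hashlib.sha256(data).digest()   # 32 bytes per 4KB block

        def read(self, lba):
            data = self.blocks[lba]
            if hashlib.sha256(data).digest() != self.hashes[lba]:
                # integrity violation: report an I/O error to the caller
                raise IOError("integrity violation at LBA %d" % lba)
            return data

    dev = IntegrityDevice()
    dev.write(7, b"A" * BLOCK_SIZE)
    assert dev.read(7) == b"A" * BLOCK_SIZE

    dev.blocks[7] = b"B" * BLOCK_SIZE   # simulate silent corruption on the lower device
    try:
        dev.read(7)
    except IOError as e:
        print(e)                        # integrity violation at LBA 7

At 32 bytes of hash per 4KB block, the space overhead is under 1%, versus dedicating a whole drive's worth of space to parity.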
* dedup

Take the raw storage and break it into data containers, plus a data structure that includes:

- the user's logical LBAs (e.g., 17, 343)
- the actual physical LBA where the data is stored (e.g., 2)
- the hash of the block
- maybe the block size (e.g., 2KB)
- a refcount: how many different logical LBAs use this hash

Just like with hard links, you keep a refcount and release the data when the last reference has been released (e.g., users deleting files).

The mapping of logical to physical LBAs is called an "indirection map".

Dedup driver (sketched at the end of this section):

- on a read of logical LBA X, find the corresponding physical LBA Y, read it, and return that data
- hashes are used to detect duplicates, but can also be used for integrity
- on a write to logical LBA Z, calculate the hash of the data and check whether it has been seen before:
  if seen, map Z to the existing entry and increment its refcount;
  if not seen, write the data to a new physical LBA and store a new entry (refcount = 1)

Dedup systems have been shown to get space reductions of 10-40x at times!

Two forms of dedup:

- inline: performs dedup right in the read/write I/O path, as data comes in.
  Can slow down user activity, but detects duplicates right away.
- offline: a dedup process runs in the background periodically, scans files/disks looking for duplicates, then dedups them.
  Doesn't slow down user I/Os, but takes longer to detect duplicates and save the space.

Hash collisions in dedup systems are bad!  Data can be lost.  Vendors have thus increased hash sizes.

Dedup systems can hold a lot of data (petabytes); the hashes alone can take many terabytes.
Access to the hashes looks like random access, so typical caching algorithms a la LRU/LFU don't work well.
Solution: Bloom filters (see the sketch at the end of this section).

* hash collision probabilities

See https://preshing.com/20110504/hash-collision-probabilities/
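As referenced above, here is a minimal sketch of an inline dedup driver's read/write path: an indirection map (logical LBA -> physical LBA), a hash index, and per-block refcounts.
The names and the in-memory dicts are assumptions for illustration only; real systems keep these structures on disk and deal with crash consistency, variable block sizes, and more.

    import hashlib

    class DedupDevice:
        """Toy inline-dedup device: identical blocks share one physical copy,
        tracked with refcounts, just like the notes above describe."""

        def __init__(self):
            self.logical_to_physical = {}   # indirection map: logical LBA -> physical LBA
            self.hash_to_physical = {}      # dedup index: block hash -> physical LBA
            self.physical = {}              # physical LBA -> (data, refcount)
            self.next_physical = 0

        def write(self, lba, data):
            h = hashlib.sha256(data).digest()
            if h in self.hash_to_physical:              # seen before: just add a reference
                phys = self.hash_to_physical[h]
                blk, refs = self.physical[phys]
                self.physical[phys] = (blk, refs + 1)
            else:                                       # new data: allocate a physical block
                phys = self.next_physical
                self.next_physical += 1
                self.physical[phys] = (data, 1)
                self.hash_to_physical[h] = phys
            self.release(lba)                           # drop any old mapping for this LBA
            self.logical_to_physical[lba] = phys

        def read(self, lba):
            data, _ = self.physical[self.logical_to_physical[lba]]
            return data

        def release(self, lba):
            """Decrement the refcount of the block this LBA used to map to;
            free the block when the last reference goes away."""
            phys = self.logical_to_physical.pop(lba, None)
            if phys is None:
                return
            data, refs = self.physical[phys]
            if refs == 1:
                del self.physical[phys]
                del self.hash_to_physical[hashlib.sha256(data).digest()]
            else:
                self.physical[phys] = (data, refs - 1)

    dev = DedupDevice()
    dev.write(17, b"same block")
    dev.write(343, b"same block")        # duplicate: stored once, refcount becomes 2
    assert len(dev.physical) == 1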
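Finally, a minimal sketch of the Bloom filter idea mentioned above: a bit array plus k hash-derived bit positions, which can answer "definitely not seen before" without touching the huge on-disk hash index.
The parameter choices (m, k) and the way the k positions are derived from a single SHA-256 digest are arbitrary choices for the example.

    import hashlib

    class BloomFilter:
        """Minimal Bloom filter: answers "definitely not present" or
        "maybe present"; no false negatives, tunable false positives."""

        def __init__(self, m_bits=1 << 20, k=4):
            self.m = m_bits
            self.k = k
            self.bits = bytearray(m_bits // 8)

        def _positions(self, item):
            # derive k bit positions from one SHA-256 digest (illustrative choice)
            digest = hashlib.sha256(item).digest()
            for i in range(self.k):
                chunk = digest[i * 8:(i + 1) * 8]
                yield int.from_bytes(chunk, "big") % self.m

        def add(self, item):
            for pos in self._positions(item):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def might_contain(self, item):
            return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

    bf = BloomFilter()
    bf.add(b"block-hash-1")
    print(bf.might_contain(b"block-hash-1"))   # True
    print(bf.might_contain(b"block-hash-2"))   # almost certainly False

In a dedup system the filter is consulted before the index: a "no" answer skips the random index lookup entirely, while a "maybe" falls through to the real lookup, so false positives cost time but never correctness.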