Intel VROC RAID Write Hole closure
The Intel VROC product family will support closing the RAID Write Hole in RAID 5 configurations. This applies to Intel VROC on Intel Xeon Scalable Platforms.
RAID Write Hole (RWH) is a fault scenario that affects parity-based RAID. It occurs when a power failure or system crash and a drive failure (e.g., a strip write failure or a complete drive crash) happen at the same time or very close to each other. Unfortunately, system crashes and disk failures are correlated events. Because write operations are not atomic across the member disks of a parity-based RAID volume, this scenario can lead to silent data corruption or irrecoverable data. The parity of a stripe that is active during a power failure may be incorrect and inconsistent with the rest of the data in that stripe. Data on such inconsistent stripes does not have the desired protection and, worse, can be reconstructed incorrectly (silent data errors).
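The failure sequence described above can be illustrated with a minimal Python sketch. It is not Intel VROC code: the three-disk layout, the 4-byte strips, and the helper name xor_strips are assumptions chosen only to show how stale parity leads to an incorrect reconstruction.

```python
from functools import reduce


def xor_strips(strips):
    """XOR equal-length byte strings together (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


# Hypothetical stripe across three member disks: two data strips and one parity strip.
d0_old, d1 = b"AAAA", b"BBBB"
parity_old = xor_strips([d0_old, d1])

# A write replaces d0. The data strip reaches its disk, but power fails
# before the matching parity update is written, so parity is now stale.
d0_new = b"CCCC"
on_disk = {"d0": d0_new, "d1": d1, "parity": parity_old}

# The disk holding d1 then fails. Reconstruction XORs the surviving strips.
d1_rebuilt = xor_strips([on_disk["d0"], on_disk["parity"]])

# d1 was never touched by the interrupted write, yet its rebuilt copy is wrong.
write_hole_hit = d1_rebuilt != d1
print(d1_rebuilt, write_hole_hit)  # b'@@@@' True

# For comparison: with parity updated atomically, the rebuild is correct.
parity_new = xor_strips([d0_new, d1])
assert xor_strips([d0_new, parity_new]) == d1
```

Note that the corrupted strip belongs to data that the interrupted write did not modify, and nothing in the reconstruction signals an error. That is why the result is described as silent data corruption.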
Previous Intel VROC mechanisms for addressing the RAID Write Hole condition combined Dirty Stripe Journaling and Partial Parity Logging. That implementation only partially closed the RAID Write Hole. With the Intel VROC VC product family, the included RWH solution will completely close this condition when RWH is enabled. When RWH is disabled, the old implementation (Dirty Stripe Journaling and Partial Parity Logging) is used.