Bitrot & ZFS Scrubbing: When Data Quietly Rots
"Bitrot" sounds like an internet myth – but it's real, well-documented, and affects every storage system. RAID alone doesn't protect against it. Here's the explanation with real probabilities and how ZFS scrubbing solves it.
Short version
Bitrot is the silent change of stored data – individual bits flip, sectors read incorrectly. Classic RAID often misses this. ZFS (and Btrfs) computes a checksum for every block on write. Scrubbing reads all blocks regularly, verifies checksums, and repairs mismatches from redundancy – the only reliable defense against bitrot.
What bitrot actually is
Several physical causes:
Magnetic drift. HDD magnetization weakens over the years, which can cause bit flips.
Cosmic rays / TID (total ionizing dose). High-energy particles flip bits in DRAM or flash cells. Rare, but statistically relevant at large data volumes.
Controller bugs. Firmware bugs in HDDs/SSDs occasionally write incorrect data.
Cable or power noise. SATA transmission errors with bad cables. Normally caught by the interface, but not 100%.
How often bitrot really happens
Vendors quote worst-case Unrecoverable Read Error (URE) rates: 1 in 10^14 bits read for consumer drives, 1 in 10^15 for enterprise. Taken at face value for 16 TB:
- 16 TB ≈ 1.28 × 10^14 bits
- Consumer: ~1.28 expected UREs per full read, i.e. a ~72% chance of at least one
- Enterprise: ~0.13 expected UREs per full scan, i.e. a ~12% chance of at least one
That's only a lower bound on the problem: URE figures count errors the drive itself detects and reports. Silent corruption, where the drive returns wrong data without signalling a read error, is not covered at all. Backblaze estimates an additional 0.1-0.5% per year for actual flipped bits not reported as UREs.
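The URE arithmetic can be checked with a few lines of Python. This is a sketch under stated assumptions: errors are independent, they occur at exactly the spec-sheet rate, and the drive is read once from end to end. The function name is made up for illustration.

```python
import math

def p_at_least_one_ure(capacity_tb: float, bits_per_error: float) -> float:
    """Chance of one or more UREs during a single full read of the drive.

    Assumes independent errors at exactly the quoted rate (a worst-case
    spec value, not a measured average).
    """
    bits_read = capacity_tb * 1e12 * 8            # decimal terabytes -> bits
    expected_errors = bits_read / bits_per_error  # mean number of UREs
    # Poisson approximation: P(no error) = exp(-expected_errors)
    return 1.0 - math.exp(-expected_errors)

consumer = p_at_least_one_ure(16, 1e14)
enterprise = p_at_least_one_ure(16, 1e15)
print(f"consumer:   {consumer:.0%}")    # consumer:   72%
print(f"enterprise: {enterprise:.0%}")  # enterprise: 12%
```

Because the quoted rates are upper bounds, these percentages are best read as a pessimistic estimate rather than a prediction for a specific drive.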
Why classic RAID isn't enough
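RAID 5/6 computes parity, but doesn't verify data against it on normal reads. A flipped bit on a data drive is passed straight to the application. A parity check can notice that a stripe no longer adds up, but with single parity it cannot tell which drive holds the wrong block. And if the data was already corrupt when it was written, parity was calculated from the (wrong) data and looks perfectly consistent.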
During a rebuild this becomes catastrophic: the missing drive is reconstructed from the others, including the flipped bits → corrupted file is "restored".
mdadm, hardware RAID controllers, NTFS and ext4 all share this problem. None of them checksums file data, so they cannot tell a good block from a silently corrupted one (ext4's optional checksums cover metadata only).
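A toy stripe makes the rebuild problem concrete. The sketch below uses plain XOR parity as in RAID 5, with three tiny data blocks invented for the example; it is not how any particular RAID implementation lays out data.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Toy stripe: three data blocks plus one parity block.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Silent corruption: one bit flips in d1 at rest. Parity is not updated.
d1_rotten = bytes([d1[0] ^ 0x01]) + d1[1:]

# A parity check sees that the stripe is inconsistent,
# but XOR alone cannot say which of the four blocks is the wrong one.
stripe_consistent = xor_blocks(d0, d1_rotten, d2) == parity

# The drive holding d2 fails; the rebuild derives d2 from the survivors.
d2_rebuilt = xor_blocks(d0, d1_rotten, parity)

print(stripe_consistent)   # False
print(d2_rebuilt)          # b'BCCC'
print(d2_rebuilt == d2)    # False: the flipped bit leaked into the rebuilt block
```

After the rebuild the array holds two damaged blocks instead of one, and parity matches both of them, so the damage is no longer even detectable.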
How ZFS solves it
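ZFS stores a checksum (fletcher4 by default, optionally SHA-256) for each data block (up to 128 KiB with the default recordsize) in the parent block pointer, separate from the data it protects, so a damaged block can never vouch for itself. On read: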
- Block is read
- Hash is recomputed
- Compared to stored hash
- If mismatch: ZFS reads the other mirror copy or reconstructs the block from RAIDZ parity, validates the result against the checksum, returns correct data, and rewrites the damaged copy
This happens transparently on every normal read. Plus regular scrubbing forces it proactively.
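The read path can be sketched in Python for the simplest case, a two-way mirror. ToyMirror and its fields are hypothetical names for illustration, and hashlib's SHA-256 stands in for the ZFS checksum; real ZFS does this per block inside the storage pool, not per object.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    # SHA-256 is one of the checksums ZFS offers; hashlib stands in for it here.
    return hashlib.sha256(data).digest()

class ToyMirror:
    """Hypothetical two-way mirror with the checksum kept apart from the data."""

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]  # one copy per "drive"
        self.stored_checksum = checksum(data)             # lives in the "parent"
        self.repairs = 0

    def read(self) -> bytes:
        good = None
        bad = []
        for i, copy in enumerate(self.copies):
            if checksum(bytes(copy)) == self.stored_checksum:
                good = bytes(copy)       # first copy that verifies wins
                break
            bad.append(i)                # remember copies that failed
        if good is None:
            raise OSError("all copies fail verification; restore from backup")
        for i in bad:                    # self-heal: overwrite the damaged copies
            self.copies[i][:] = good
            self.repairs += 1
        return good

block = ToyMirror(b"important data")
block.copies[0][0] ^= 0x01               # simulate a flipped bit on drive 0
data = block.read()
print(data)                              # b'important data'
print(block.repairs)                     # 1
print(bytes(block.copies[0]))            # b'important data'
```

The caller only ever sees verified data. The repair is a side effect of the read, which is why it only helps for blocks that actually get read.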
What scrubbing does
A scrub reads all allocated blocks in the pool, validates checksums and repairs where needed. Runs alongside normal operation – no downtime.
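As a rough model, a scrub is a loop over every allocated block, including cold data that nobody has read in years. The sketch below assumes a toy pool of two-way mirrored blocks held in a dict; Block, make_block and the report keys are invented for the example and do not mirror the output of zpool status.

```python
import hashlib
from dataclasses import dataclass

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

@dataclass
class Block:
    copies: list      # one bytearray per mirror side (hypothetical layout)
    checksum: bytes   # recorded at write time

def make_block(data: bytes) -> Block:
    return Block([bytearray(data), bytearray(data)], sha256(data))

def scrub(pool: dict) -> dict:
    """Walk every block, verify each copy, repair from a copy that verifies."""
    report = {"checked": 0, "repaired": 0, "unrecoverable": []}
    for name, block in pool.items():
        report["checked"] += 1
        good = next(
            (bytes(c) for c in block.copies if sha256(bytes(c)) == block.checksum),
            None,
        )
        if good is None:                 # no redundancy left to repair from
            report["unrecoverable"].append(name)
            continue
        for copy in block.copies:
            if bytes(copy) != good:
                copy[:] = good           # rewrite the damaged side
                report["repaired"] += 1
    return report

pool = {
    "a": make_block(b"hot data"),
    "b": make_block(b"cold archive"),
    "c": make_block(b"old photos"),
}
pool["b"].copies[1][0] ^= 0x01           # one side damaged: repairable
pool["c"].copies[0][0] ^= 0x01           # both sides damaged: not repairable
pool["c"].copies[1][3] ^= 0x80
report = scrub(pool)
print(report)  # {'checked': 3, 'repaired': 1, 'unrecoverable': ['c']}
```

Block "c" shows why the interval matters: the longer damage sits unnoticed on one side, the higher the chance that the second copy rots as well before anything repairs the first.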
Recommended frequency:
- Consumer drives: every 2-4 weeks
- Enterprise: every 1-3 months
- SSDs: every 4-8 weeks (lower bitrot rate)
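Synology DSM, TrueNAS and Proxmox have scrub schedulers built in. On the CLI, zpool scrub tank starts a scrub of a pool named tank, and zpool status tank shows its progress and any errors found or repaired.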
Btrfs as alternative
Btrfs has a similar concept (corruption detection via checksums + scrubbing). But Btrfs RAID 5/6 has been marked unstable for years, and the Btrfs documentation still advises against it for production data. Btrfs is only recommended for RAID 1/10.
More comparison: ZFS vs ext4 vs Btrfs.
Synology DSM setup
DSM supports Btrfs Data Scrubbing on Btrfs volumes:
- Storage Manager → Volume → "Data Scrubbing"
- Schedule: every 1-3 months
- RAID type: SHR/RAID-5/RAID-6 – scrubbing applies
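On RAID 1 with ext4, DSM scrubbing helps less because ext4 has no data checksums. Block-level RAID scrubbing only checks that the mirror copies (or data and parity on RAID 5/6) agree; when they differ, it cannot tell which side is correct.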
TrueNAS setup
By default, TrueNAS scrubs each pool automatically every 35 days. Adjust this under Storage → Pools → Scrub Tasks (the exact menu location varies between TrueNAS versions). Recommendation: every 14-21 days for home use.
Unraid setup
On Unraid with Btrfs/ZFS cache pool: scrub plugin available. Default parity check every 7 days (that's plain RAID-style, not checksum-based).
What scrubbing doesn't replace
- Backup: Scrubbing doesn't repair data already gone. Multiple drive failures or accidental deletion need backup. RAID is not a backup.
- ECC RAM: If a bit flips in RAM before the data is written to disk, ZFS checksums the already corrupted data and stores it with a matching checksum. ECC RAM detects and corrects such single-bit errors.
- Power protection: After a crash or power loss, ZFS comes back in a consistent state, but writes that were not yet synced are gone. A UPS is still mandatory.
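The ECC RAM point is easy to demonstrate. In this sketch a bit flips in a memory buffer before the checksum is computed; the payload is invented, and SHA-256 from hashlib again stands in for the filesystem checksum.

```python
import hashlib

original = bytearray(b"payroll: 1000 EUR")

# Bit flip in (non-ECC) RAM before the filesystem ever sees the buffer.
in_ram = bytearray(original)
in_ram[9] ^= 0x08                 # '1' (0x31) becomes '9' (0x39)

# The checksum is computed over whatever is in memory at write time.
stored_block = bytes(in_ram)
stored_checksum = hashlib.sha256(stored_block).digest()

# Every later read or scrub verifies fine: block and checksum agree.
verifies = hashlib.sha256(stored_block).digest() == stored_checksum
print(verifies)                         # True
print(stored_block)                     # b'payroll: 9000 EUR'
print(stored_block == bytes(original))  # False
```

From the filesystem's point of view nothing is wrong, so no amount of scrubbing will flag or repair this block.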
Recommendation
If data integrity really matters:
- ZFS RAIDZ2 or ZFS mirror as filesystem
- ECC RAM (10-20% premium, worth it)
- UPS against power issues
- Monthly scrub job
- Off-site backup as last line
Further reading
ZFS Encryption Guide: Native Encryption Done Right
ZFS vs ext4 vs Btrfs: Which File System for Your NAS?
Btrfs RAID 5/6: Why You Still Shouldn't Use It in Production in 2026