ZFS-like self-healing: which databases have this technology?

After I came across ZFS and started reading about its self-healing capabilities, I began to wonder what other technologies use something similar. My first thought was that all serious databases should have this.

So I started googling to find answers.

The big commercial databases (Oracle, DB2, MS SQL Server) all claim to support self-healing.

For Windows / SQL Server I found this:
http://www.infoworld.com/d/networking/windows-server-2008-windows-also-rises-286
and this:
http://technet.microsoft.com/en-us/library/cc771388(WS.10).aspx
It says: "Self-healing NTFS repairs file system corruption in the background, without interrupting service."
Sounds quite similar to ZFS, but is it?


As for DB2 v9, the information was very sparse:
http://www.ibm.com/developerworks/data/library/techarticle/dm-0606ahuja2/
It claims to have self-healing capabilities, but how does it work?


For Oracle I found this, which is quite detailed:
http://www.dba-oracle.com/oracle11g/oracle11g_healthchecks.htm
"Data Block Integrity: detects disk image block corruptions such as checksum failures, head/tail mismatch, and logical inconsistencies within the block."
Self-healing is also listed as one of the main features that came with Oracle 11g. So my guess is that Oracle has self-healing capabilities similar to ZFS?
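To make terms like "checksum failure" and "head/tail mismatch" concrete, here is a minimal Python sketch of how such block-level checks can work. The block layout below (a sequence number in the header, repeated in the tail, plus a CRC32 over the body) is a hypothetical illustration for this thread, not Oracle's actual on-disk format:

```python
import struct
import zlib

BLOCK_SIZE = 8192  # hypothetical block size for the example


def make_block(seq: int, payload: bytes) -> bytes:
    # Header carries a sequence number; the tail repeats it, so a torn
    # (partially written) block is detectable as a head/tail mismatch.
    body = struct.pack("<I", seq) + payload.ljust(BLOCK_SIZE - 12, b"\x00")
    checksum = zlib.crc32(body)
    return body + struct.pack("<II", checksum, seq)


def check_block(block: bytes) -> str:
    head_seq = struct.unpack_from("<I", block, 0)[0]
    checksum, tail_seq = struct.unpack("<II", block[-8:])
    if head_seq != tail_seq:
        return "head/tail mismatch (torn write)"
    if zlib.crc32(block[:-8]) != checksum:
        return "checksum failure (bit rot)"
    return "ok"
```

The point is that the two failure modes are distinguishable: a torn write leaves the header and tail disagreeing, while silent bit rot leaves them consistent but breaks the checksum. Detection like this is only half of what ZFS does; the "healing" part also needs a second good copy to repair from.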

PostgreSQL does not have self-healing, according to this: http://www.xaprb.com/blog/2010/02/0...inst-partial-page-writes-and-data-corruption/
But it can be used with ZFS to get the self-healing feature.
 
1. Just put stuff on RAID and keep backups.
2. Replication to slaves can also help greatly.

Nothing saves you from machine hardware failure or a hack except a time-delayed slave.
In a production environment (read: many gigabytes) that means the master machine is dead and customers are waiting at checkout. It can be critical to promote a hot replicated slave to temporarily take the master role, repair the main master, then switch back and continue.

http://www.mysqlperformanceblog.com/2009/01/12/should-you-move-from-myisam-to-innodb/
Business Continuity / Disaster Recovery using InnoDB? We use an internal process based on mysqldump and InnoDB file-per-table.
Our “Database File-Per-Table Archives” web site summarizes like this: “31 Hosts 90 Databases 2919 Tables 91 Dates 354456 Backup Files”

The 31 Hosts are the Master hosts only. Add another 67 slaves in groups replicating up to 6 slave machines per group. Largest instances top out at approx 60M rows. Most tables are < 0.5M rows. Some applications use MyISAM with partitioned tables. Since every machine is a dedicated instance, mixing engines is not an issue. All DBs are fronted by memcached machines (64 instances). All applications are written to use memcache before hitting the DB. We use RAID0 with multiple disks on all SLAVES. MASTER machines use RAID5. Ratio of reads-to-writes is 7,000-to-1. Collectively, about 600M DB transactions per day after peeling off 78% of the reads on the memcaches.

Any table can be selected from the backup archive with a couple of clicks to choose date, table, and host instance. It can be restored to a master with a few more clicks by an appropriately authorized DBA. For failures on a MASTER, we do a quick shift and make one of the slaves the new master. We then repair/reload the new master and SYNC it from one of the slaves before switching it back to MASTER. Slaves are simply rebuilt and added back to the mix when their SYNC is completed. Therefore, the HUGE/SLOW restore of large InnoDB tables is not an issue.

My suggestion is to use all the tools and techniques available to you, increase the use of MASTER-SLAVE replication, and use file-per-table management. Then restore-from-backup can be done at a comfortable pace.
 
I think you are missing the point. RAID and backups are still vulnerable to bit rot.

A database usually stores very important data, which should never be modified or deleted by data corruption. ZFS prevents this, as long as you store the database on a mirror.
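To illustrate what mirror-based self-healing means, here is a minimal Python sketch. The function name and the in-memory "mirrors" are invented for the example; real ZFS does this per block on read (and during scrubs), using checksums stored in the parent block pointer. The idea is: verify every copy, serve a copy that checks out, and rewrite the corrupt ones from it.

```python
import hashlib


def self_healing_read(mirrors: list, expected_digest: str) -> bytes:
    """Return the mirror copy whose SHA-256 matches the stored checksum,
    and repair (rewrite) any copy that fails verification."""
    good = None
    bad = []
    for copy in mirrors:  # each copy is a mutable bytearray
        if hashlib.sha256(copy).hexdigest() == expected_digest:
            good = bytes(copy)
        else:
            bad.append(copy)
    if good is None:
        # No copy verifies: with a two-way mirror this is unrecoverable.
        raise IOError("all mirror copies are corrupt")
    for copy in bad:
        copy[:] = good  # the "self-healing" step: overwrite bad copies
    return good
```

Note the contrast with ordinary RAID1: a plain mirror cannot tell which of two differing copies is the good one, while a checksum stored separately from the data (as in ZFS block pointers) can.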
 
olav said:
I think you are missing the point. RAID and backups are still vulnerable to bit rot.

A database usually stores very important data, which should never be modified or deleted by data corruption. ZFS prevents this, as long as you store the database on a mirror.

Yes, that is true, but ZFS is useless if:
1. a hacker dd's your disk, or an administrator (accidentally) deletes the database
2. the system hardware dies

In those cases, slaves (for serving while the master is gone) and replication (delayed replication included) can help:
you will have a new healthy machine (you can use ZFS there too), as opposed to a useless healthy disk in a dead machine. A database is a system, a whole solution, not just a filesystem; it needs more integrity than just files. And if you read my post, you can see I was talking about high availability and overall integrity, not just checksums.
 
Indeed. fsck(8) has "corrected" away several important files that I would have liked to have kept, but at least things were "consistent" afterwards, right? I don't see ZFS as much different. It might be able to tell when something is corrupt, but I'd rather it didn't try (cack-handedly) to "correct" it without my say-so.

Let's just say I'm dubious about the whole self-healing thing.
 
I agree with gordon that self-healing is a misnomer, but even if it weren't, it would still probably be safer to get the checksum of the backup file itself rather than of the individual blocks.
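A whole-file check like that could look like the following minimal Python sketch (the sidecar `.sha256` file convention here is just an assumption for the example): one digest covers the entire backup file and is verified before any restore, rather than trusting per-block checksums inside the file.

```python
import hashlib
from pathlib import Path


def _sidecar(path: Path) -> Path:
    # Hypothetical convention: store the digest next to the backup file.
    return path.with_name(path.name + ".sha256")


def write_backup(path: Path, data: bytes) -> None:
    # Write the dump, then store one digest of the whole file beside it.
    path.write_bytes(data)
    _sidecar(path).write_text(hashlib.sha256(data).hexdigest())


def verify_backup(path: Path) -> bool:
    # Verify the entire backup against its stored digest before restoring.
    stored = _sidecar(path).read_text().strip()
    return hashlib.sha256(path.read_bytes()).hexdigest() == stored
```

The trade-off: a single whole-file digest is simple and catches any corruption in the archive, but unlike block-level checksums it cannot tell you *where* the damage is or heal it in place; you just know not to restore from that copy.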
 