[HAST] Preventing split-brain on simultaneous node failures

I'm looking into creating a highly available SAN setup with HAST and ZFS, and I'm trying to find a solution for the following scenario:

  1. secondary HAST node dies --> primary starts to accumulate dirty writes
  2. primary HAST node dies as well before secondary comes back up --> SAN is highly unavailable :p
  3. secondary HAST node returns.

The (now unavailable) primary node still holds writes that never reached the secondary, so until those have been synchronised we cannot promote the secondary to primary... or else a split-brain occurs.

How can I figure out reliably whether or not I still need to wait for incoming writes before making myself primary, in case the other node is not available at that point? I cannot rely on CARP information in this scenario because it will simply set the interface to MASTER even though storage-wise we cannot become master yet.
 
Hi,

I think if you know of or have experience of enterprise clustering solutions like Veritas Cluster, then by comparison HAST and CARP are never going to provide you with a solution of a similar level of robustness and functionality. I don't think there is any concept of quorum, and certainly not I/O fencing.
So basically it may be good enough for some in some circumstances, but it isn't really a full-featured cluster solution AFAIK.

cheers Andy.
 
Never might be too strong a word. I don't know how far along the fullsync replication method is, but with that and some kind of arbitrator function on a third machine, you'd have a pretty bulletproof setup.
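
To make the arbitrator idea a bit more concrete, here is a rough sketch in Python of the bookkeeping such a third machine could do. None of this is part of HAST or CARP: the Arbitrator class, its method names and the node names "A" and "B" are all made up for illustration, and it assumes the primary manages to report to the arbitrator before it accepts writes without its peer.

```python
from dataclasses import dataclass, field


@dataclass
class Arbitrator:
    """Hypothetical bookkeeping on a third machine; not a HAST feature."""

    # Nodes whose copy may be missing writes, because they were
    # unreachable while the other node kept accepting writes.
    stale: set = field(default_factory=set)

    def report_degraded(self, primary: str, missing_peer: str) -> None:
        """The primary reports that it is writing without missing_peer."""
        self.stale.add(missing_peer)

    def report_synced(self, node: str) -> None:
        """Resynchronisation of node has completed."""
        self.stale.discard(node)

    def may_promote(self, node: str) -> bool:
        """A node may become primary only if it is not known to be stale."""
        return node not in self.stale


# The scenario from the first post, with A as primary and B as secondary.
arb = Arbitrator()
arb.report_degraded(primary="A", missing_peer="B")  # step 1: B dies
# step 2: A dies as well; step 3: B returns and asks before promoting
print(arb.may_promote("B"))  # B must wait for A
arb.report_synced("B")       # A came back and finished resyncing B
print(arb.may_promote("B"))  # B is now safe to promote
```

The weak spot is the window between the secondary disappearing and the primary's report arriving at the arbitrator. Unless the primary holds off acknowledging writes until the report is confirmed, a write can slip through that the arbitrator never hears about.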
 