I've been daydreaming again and I need some help ironing out the details of this little adventure. I've looked at HAST and I don't think it's applicable, the reasons for which should become apparent.
What I'm thinking here is a three-box setup containing two FreeBSD servers (node1 and node2) connected to a shared DAS JBOD. Both servers can see, and access, all disks at the same time. For the sake of argument the DAS has 16 disks. node1 uses disks 0-7 to create pool1 and node2 uses disks 8-15 to create pool2. CARP is in play here with two virtual IPs: ip1, for which node1 is the master, and ip2, for which node2 is the master. The kernel iSCSI target is used to present a LUN on each node on its virtual IP. So far, at least to me, this seems pretty straightforward.
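To make that concrete, here's a rough sketch of what node1's config might look like (node2 would mirror it with pool2/ip2 and the advskew values swapped). The interface name, addresses, vhids, passwords, target name, and zvol path are all made up for illustration:

```
# /etc/rc.conf on node1 (illustrative addresses/vhids)
ifconfig_em0="inet 192.168.1.11/24"
ifconfig_em0_alias0="inet vhid 1 advskew 0 pass pw1 alias 192.168.1.101/32"   # ip1, node1 master
ifconfig_em0_alias1="inet vhid 2 advskew 100 pass pw2 alias 192.168.1.102/32" # ip2, node1 backup

# /etc/ctl.conf on node1: a LUN backed by a zvol on pool1, listening only on ip1
portal-group pg1 {
        listen 192.168.1.101
}
target iqn.2024-01.org.example:pool1 {
        portal-group pg1
        lun 0 {
                path /dev/zvol/pool1/lun0
        }
}
```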
Now, in the event of a graceful reboot of a node, the hand-off would look something like this:
1. node1 stops ctld(8).
2. node1 does zpool export pool1.
3. node1 shuts down.
4. CARP fails ip1 over; node2 now receives its traffic.
5. node2 becomes aware that node1 is gone.
6. node2 does zpool import pool1.
7. node2 adds the configuration for the LUN on pool1 and reloads ctld(8).
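The two halves of that sequence, sketched as shell (purely illustrative; the pre-staged ctl.conf variant is an assumption of mine, not a ctld feature):

```
#!/bin/sh
# node1, during graceful shutdown: release everything before going away
service ctld stop      # step 1: stop serving the LUN
zpool export pool1     # step 2: cleanly export the pool
# step 3: shutdown proceeds; CARP demotes ip1 (step 4)

# node2, taking over (steps 6-7), once it knows node1 is gone:
zpool import pool1
cp /etc/ctl.conf.both /etc/ctl.conf   # assumed pre-staged config containing both LUNs
service ctld reload
```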
Any holes in this so far? I'm not sure the best way to achieve #5. Perhaps heartbeat?
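For #5, a separate heartbeat may not even be needed: devd(8) reports CARP state transitions, so something like this in /etc/devd.conf (vhid, interface, and the takeover script are assumptions) could kick off the import on node2 when it becomes master for ip1:

```
notify 0 {
        match "system"          "CARP";
        match "subsystem"       "1@em0";
        match "type"            "MASTER";
        action "/usr/local/sbin/takeover-pool1.sh";  # hypothetical script doing steps 6-7
};
```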
In the event of an ungraceful disappearance of node1, I would think things would play out as above, but without steps 1-3, and the import on node2 might take a little longer because the contents of the ZIL may need to be replayed.
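One wrinkle in the ungraceful case: since pool1 was never exported, ZFS will consider it still in use by node1 and refuse a plain import, so node2 would have to force it:

```
zpool import -f pool1   # pool was last used by node1's hostid, so -f is required
```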
Now, when node1 comes back, how do I control the start of ctld(8) (because node2 has to remove the LUN from its ctld config and reload first) and the import of the pool (because node2 has to export it first)? Again, perhaps this is best achieved through a heartbeat? I would also assume that zfs_enable and ctld_enable should stay off in rc.conf so that neither starts automatically.
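The failback ordering I have in mind, again as a sketch (the ordering is the point, not the exact commands; the pre-staged config files and the signalling mechanism are assumptions):

```
# node2, relinquishing pool1:
cp /etc/ctl.conf.own /etc/ctl.conf   # assumed pre-staged config with only node2's own LUN
service ctld reload                  # drop pool1's LUN before touching the pool
zpool export pool1
# ...then tell node1 (heartbeat, ssh, whatever) that it's safe to proceed

# node1, resuming:
zpool import pool1
service ctld start
# CARP then returns ip1 to node1 per the advskew values
```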
So I said that I didn't think HAST was appropriate. The two main reasons are that it has the notion of local and remote disks, and that it wants to replicate data from the active node to the passive one. Neither applies here. None of the disks are remote; they are all local, and it's just a question of who is using them. And there is no replication of data, just a (hopefully) seamless hand-off of the pool and service configuration.
I have set up a pair of FreeBSD VMs with shared disks so I should be able to test this pretty well. Any thoughts on the matter?