SCSI reservations with dual attached disks

Hi,

I have recently been trying to figure out how to use SCSI reservations on FreeBSD. The setup is two FreeBSD servers with shared Fibre Channel storage. Both machines can see the disks and can import a zpool I've configured. I would like to use SCSI reservations to fence the disks so that only one machine can access them at a time. To reserve a disk, I use: camcontrol cmd da1 -c "16". This works fine: the other machine can still see da1, but it can't access it and can't see any of its partitions.

My problem comes when I want to release the reservation. To release it, I run: camcontrol cmd da1 -c "17". After that, the other machine still can't access the disk. If I reboot the other machine, it can see everything fine, so the release evidently works, but the other node doesn't notice it until after a reboot.
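For reference, RESERVE(6) and RELEASE(6) are six-byte commands, so the CDB can also be written out in full instead of as a bare opcode. This is only a minimal sketch, assuming da1 is the shared disk. The cdb6 and run helpers are hypothetical names made up for illustration, and by default the script only prints the commands rather than running them:

```shell
#!/bin/sh
# Hypothetical helper: pad a one-byte opcode (hex) into a 6-byte CDB string.
cdb6() {
    printf '%s 0 0 0 0 0' "$1"
}

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

DISK="${DISK:-da1}"   # assumption: the shared disk is da1

# RESERVE(6), opcode 0x16
reserve() { run camcontrol cmd "$DISK" -v -c "$(cdb6 16)"; }

# RELEASE(6), opcode 0x17
release() { run camcontrol cmd "$DISK" -v -c "$(cdb6 17)"; }
```

Calling reserve or release with DRY_RUN left unset shows the exact camcontrol invocation, which makes it easy to compare what each node would send before touching the disk.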

Does anybody know the proper procedure to do this with FreeBSD? I do a similar thing in Illumos using the mhd driver and it works as expected. Hopefully somebody can tell me what I'm doing wrong / not doing.

Thanks!
 
You may try sysutils/sg3_utils from ports. It has a more functional tool to work with persistent reservation and is known to work.
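A rough sketch of what that could look like with sg_persist from sg3_utils. Note that persistent reservations are a different mechanism from the RESERVE(6)/RELEASE(6) opcodes 16 and 17. The device path, the key value 0x1 and the function names are assumptions for illustration only, and the script prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
DISK="${DISK:-/dev/da1}"   # assumption: the shared disk
KEY="${KEY:-0x1}"          # hypothetical reservation key; use a distinct one per node
TYPE=3                     # persistent reservation type 3 = Exclusive Access

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Register this node's key with the device.
pr_register() { run sg_persist --out --register --param-sark="$KEY" "$DISK"; }

# Take the reservation using the registered key.
pr_reserve() { run sg_persist --out --reserve --param-rk="$KEY" --prout-type="$TYPE" "$DISK"; }

# Give the reservation back.
pr_release() { run sg_persist --out --release --param-rk="$KEY" --prout-type="$TYPE" "$DISK"; }

# Show who currently holds the reservation, if anyone.
pr_status() { run sg_persist --in --read-reservation "$DISK"; }
```

The usual order would be pr_register, then pr_reserve on the node taking over, and pr_release when handing the disk back. pr_status can be run from either node to check the current holder.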

Aside from the SCSI level, you should also consider the GEOM level. If the slave system was unable to access a device during boot because of a reservation held by the master, it may have been unable to read the partition table and so cannot see the file systems. You may need to make GEOM retaste the disk after a reservation handover, for example by running false >/dev/daX.
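The retaste trick can be wrapped in a small function. This is only a sketch: the function name retaste is made up, and it relies on the same idea as the false >/dev/daX command, namely that opening the device node for writing and closing it again makes GEOM retaste the disk. Nothing is written to the device.

```shell
#!/bin/sh
# Ask GEOM to retaste a disk by opening it for writing and closing it again.
retaste() {
    dev="$1"
    # Refuse anything that is not a character device, so a mistyped path
    # cannot create or truncate a regular file.
    if [ ! -c "$dev" ]; then
        echo "retaste: $dev is not a character device" >&2
        return 1
    fi
    # Same open/close as `false >"$dev"`, but the exit status reflects
    # whether the open itself succeeded.
    if ! ( : >"$dev" ) 2>/dev/null; then
        echo "retaste: cannot open $dev for writing" >&2
        return 1
    fi
}
```

Something like retaste /dev/daX && gpart show daX would then show whether the partitions have appeared. If the open fails, the disk is probably still inaccessible at the SCSI level and retasting will not help yet.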
 
Thanks for the reply! Installing ports isn't really possible, but I think you were right about getting GEOM to retaste the disk. I wasn't clear in the original post: the problem occurs when I reserve the disk on one machine and then boot the other. The second machine can see /dev/da10 but not /dev/da10p*.

The problem now is how to retaste the disk. I tried false >/dev/da10 and it said
Code:
/dev/da10: Device not configured.
 
Hmm. Maybe the disk is still reserved and the slave can't access it. I have never used the reservation method you've tried, so the issue may lie there.
 
I had a look through dmesg, and it looks like this is more of a low-level problem during boot:
Code:
# dmesg|grep da12
(da12:isp0:0:11:0): bad underrun (count 8, resid 8, status not marked)
(da12:isp0:0:11:0): bad underrun (count 8, resid 8, status not marked)
(da12:isp0:0:11:0): bad underrun (count 8, resid 8, status not marked)
(da12:isp0:0:11:0): bad underrun (count 8, resid 8, status not marked)
(da12:isp0:0:11:0): bad underrun (count 8, resid 8, status not marked)
(da12:isp0:0:11:0): got CAM status 0x59
(da12:isp0:0:11:0): fatal error, failed to attach to device
(da12:isp0:0:11:0): lost device - 0 outstanding, 2 refs

I'll try using sysutils/sg3_utils as you suggested, and see if I get any different results.
 