iSCSI Initiator hangs after some I/O activity

Hi,

I am trying to get the iscsi_initiator to work on FreeBSD 7.3. Discovery and establishing a connection both work. I can fdisk, label and format the disk with UFS2 without problems. But when I start copying data to the new disk, the connection seems to hang after a couple of hundred megabytes.

In gstat I can see that the queue length (L(q)) jumps from one to a couple of thousand and then just freezes. Every process that tries to access the iSCSI disk, such as rsync, ls or df, freezes as well.

I can issue the shutdown -r command but I will get a bunch of
Code:
g_vfs_done
error = 6
entries and the server won't restart by itself.

I have tried several versions of the initiator driver from http://ftp.cs.huji.ac.il/users/danny/freebsd/ (mirror: http://uminac.com/mirror/ftp.cs.huji.ac.il/users/danny/freebsd/)

/var/log/messages does not seem to log anything useful regarding this.

I have also tried two different iSCSI storage systems, an Isilon and a NexentaStor. I experience this error with both.

My /etc/iscsi.conf just has initiatorname, TargetName, TargetAddress and "tags = 256".
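
For readers following along, a file with only those four settings would look roughly like the sketch below. The nickname, both IQNs and the address are placeholders chosen for illustration, not the values from the actual setup.

```
# /etc/iscsi.conf (sketch; every value except "tags" is a placeholder)
mytarget {
    initiatorname = iqn.2005-01.il.ac.huji.cs:myhost
    TargetName    = iqn.2011-01.com.example:target0
    TargetAddress = 192.0.2.10
    tags          = 256
}
```

A session using such an entry would typically be started with something like iscontrol -c /etc/iscsi.conf -n mytarget, where mytarget is the assumed nickname from the sketch.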

Does anybody know how to fix this?

Thanks and regards
 
Hi,

I did some further testing. It seems that this only occurs when many "small files" (thousands of files below 1k) are transferred to the iSCSI volume.

I have arranged a test setup where I cloned the physical server on which I first encountered this problem to a VM with dump/restore.

I also set up a VM running NexentaStor and sharing an iSCSI LUN. I can successfully transfer small files from the FreeBSD VM to the NexentaStor VM when both are connected to one vSwitch. I am not able to transfer the same files from a physical FreeBSD server to the NexentaStor VM, or from the FreeBSD VM to a physical NexentaStor or Isilon iSCSI share.

Both virtual and physical FreeBSD Servers and both virtual and physical NexentaStor Servers are clones - so basically identical.

From time to time the server does not freeze completely and I am able to initiate a shutdown of iscontrol. Then I find the following in dmesg:

Code:
(da1:iscsi0:0:0:1): WRITE(10). CDB: 2a 0 2 f6 23 7c 0 0 4 0 
(da1:iscsi0:0:0:1): CAM Status: SCSI Status Error
(da1:iscsi0:0:0:1): SCSI Status: Check Condition
(da1:iscsi0:0:0:1): UNIT ATTENTION asc:29,0
(da1:iscsi0:0:0:1): Power on, reset, or bus device reset occurred
(da1:iscsi0:0:0:1): Retrying Command (per Sense Data)

Does anyone have any ideas?

Thanks and regards.
 
Hi,

How should I proceed here? Is there a proper way to get in touch with the developers? Or should I open a bug report? We would like to use iSCSI in a production environment, but due to the described instability that does not seem possible.

Thanks and regards,

eezzeee
 
Hi,

Thanks for the reply. I tested on FreeBSD 8.2 as well. The iscsi_initiator version was 2.2.4.2. I have also tried newer versions on FreeBSD 7.3. But no luck.
 
I ran into a similar problem with 7.3. The solution was to disable the TX offload engine of my network card. Let me know if it works for you too.
 
Hi,

Thanks for the reply. It sounds promising. Can you point me in the right direction on how to disable the engine?

Thanks and regards
 
From man ifconfig
Code:
     rxcsum, txcsum
             If the driver supports user-configurable checksum offloading,
             enable receive (or transmit) checksum offloading on the inter-
             face.  Some drivers may not be able to enable these flags inde-
             pendently of each other, so setting one may also set the other.
             The driver will offload as much checksum work as it can reliably
             support, the exact level of offloading varies between drivers.

     -rxcsum, -txcsum
             If the driver supports user-configurable checksum offloading,
             disable receive (or transmit) checksum offloading on the inter-
             face.  These settings may not always be independent of each
             other.
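
Based on the flags above, turning checksum offloading off at runtime could look like the sketch below. The interface name em0 is only an assumption, so substitute the NIC that carries your iSCSI traffic. The helper name disable_csum_offload is made up for illustration and is not a system command. It needs to be run as root.

```shell
#!/bin/sh
# Sketch: disable RX/TX checksum offloading on one interface.
# "disable_csum_offload" is a hypothetical helper name, not a system command.
disable_csum_offload() {
    # $1 = interface name, e.g. em0 (assumed; check the output of `ifconfig -a`)
    ifconfig "$1" -txcsum -rxcsum
}

# Usage (as root), assuming the iSCSI traffic goes over em0:
#   disable_csum_offload em0
#   ifconfig em0    # the options= line should no longer list RXCSUM/TXCSUM
```

The change does not survive a reboot. To keep it, the same flags can be appended to the interface's ifconfig_em0 line in /etc/rc.conf. If the driver also does TCP segmentation offload, adding -tso may be worth trying as well.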
 
Hi,

I have now tested disabling the offloading on both the VM and the physical server, and with both storage systems, but nothing changed. I have also tested this on FreeBSD 8.2 without any change. I further tested the same transfer on Ubuntu Server, and there it works.

My test only consists of syncing a few thousand files smaller than 1k with rsync.
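
For anyone who wants to reproduce a workload like this, a minimal sketch follows. The helper name make_small_files, the file count, the file size and both paths are assumptions for illustration, not the exact data set used in the test.

```shell
#!/bin/sh
# Sketch: create many small files to reproduce a small-file workload.
# "make_small_files" is a hypothetical helper; the paths below are placeholders.
make_small_files() {
    dir=$1      # directory to fill
    count=$2    # number of files to create
    size=$3     # size of each file in bytes (keep it below 1024)
    mkdir -p "$dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        # write one block of "size" random bytes per file
        dd if=/dev/urandom of="$dir/file_$i" bs="$size" count=1 2>/dev/null
        i=$((i + 1))
    done
}

# Usage, assuming the iSCSI volume is mounted on /mnt/iscsi:
#   make_small_files /tmp/smallfiles 5000 512
#   rsync -a /tmp/smallfiles/ /mnt/iscsi/smallfiles/
```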

regards
 