Other ctld exiting prematurely during discovery

Hi,

I'm currently running 10.1-RELEASE and was setting up a test SAN with ctld(8). However, my iSCSI Initiator (globalSAN iSCSI Initiator on OSX) won't get any information, and I get log messages on FreeBSD showing read: connection lost. I fired up ctld(8) in debug mode, and this is the output:
Code:
ctld: incoming connection; not forking due to -d flag
ctld: accepted connection from 10.211.55.2; portal group "pg0"
ctld: 10.211.55.2: setting session timeout to 60 seconds
ctld: 10.211.55.2: Capsicum capability mode enabled
ctld: 10.211.55.2: beginning Login Phase; waiting for Login PDU
ctld: 10.211.55.2: key received: "InitiatorName=iqn.2014-09.antares.df.mytestsite.com.mx"
ctld: 10.211.55.2: key received: "SessionType=Discovery"
ctld: 10.211.55.2: key received: "InitialR2T=Yes"
ctld: 10.211.55.2: key received: "MaxConnections=1"
ctld: 10.211.55.2: key received: "ImmediateData=Yes"
ctld: 10.211.55.2: key received: "MaxOutstandingR2T=1"
ctld: 10.211.55.2: key received: "DataPDUInOrder=No"
ctld: 10.211.55.2: key received: "DataSequenceInOrder=Yes"
ctld: 10.211.55.2: key received: "ErrorRecoveryLevel=0"
ctld: 10.211.55.2: key received: "MaxRecvDataSegmentLength=65536"
ctld: 10.211.55.2: key received: "MaxBurstLength=524288"
ctld: 10.211.55.2: key received: "FirstBurstLength=524288"
ctld: 10.211.55.2: key received: "DefaultTime2Wait=0"
ctld: 10.211.55.2: key received: "DefaultTime2Retain=0"
ctld: 10.211.55.2: key received: "OFMarker=No"
ctld: 10.211.55.2: key received: "IFMarker=No"
ctld: 10.211.55.2: key received: "HeaderDigest=None"
ctld: 10.211.55.2: key received: "DataDigest=None"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): initiator requests discovery session; auth-group "no-authentication"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): auth-group does not define initiator name restrictions
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): auth-group does not define initiator portal restrictions
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): initiator skipped the authentication, and we don't need it; proceeding with negotiation
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "InitiatorName=iqn.2014-09.antares.df.mytestsite.com.mx"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "SessionType=Discovery"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "InitialR2T=Yes"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "MaxConnections=1"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "ImmediateData=Yes"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "MaxOutstandingR2T=1"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "DataPDUInOrder=No"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "DataSequenceInOrder=Yes"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "ErrorRecoveryLevel=0"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "MaxRecvDataSegmentLength=65536"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "MaxBurstLength=524288"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "FirstBurstLength=524288"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "DefaultTime2Wait=0"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "DefaultTime2Retain=0"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "OFMarker=No"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "IFMarker=No"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "HeaderDigest=None"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "DataDigest=None"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "InitialR2T=Yes"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "MaxConnections=1"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): discovery session; ImmediateData irrelevant
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "ImmediateData=Irrelevant"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "MaxOutstandingR2T=1"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "DataPDUInOrder=Yes"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "DataSequenceInOrder=Yes"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "ErrorRecoveryLevel=0"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "MaxRecvDataSegmentLength=65536"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "MaxBurstLength=524288"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): capping FirstBurstLength from 524288 to 131072
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "FirstBurstLength=131072"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "DefaultTime2Wait=0"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "DefaultTime2Retain=0"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "OFMarker=No"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "IFMarker=No"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): discovery session; digests disabled
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "HeaderDigest=None"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): discovery session; digests disabled
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "DataDigest=None"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): operational parameter negotiation done; transitioning to Full Feature Phase
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): beginning discovery session; waiting for Text PDU
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key received: "SendTargets=All"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): key to send: "TargetName=iqn.2015-01.mx.com.mytestsite:bsdtest0"
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): done sending targets; waiting for Logout PDU
ctld: 10.211.55.2 (iqn.2014-09.antares.df.mytestsite.com.mx): read: connection lost
The weird thing is that the error only happens with the discovery session. If I manually attach the LUN by specifying target and port, I actually get my target LUNs. It seems ctld(8) isn't answering something that the iSCSI initiator is expecting.
 
Ok, I found two issues. One is easily sidestepped: if ctld is listening on 0.0.0.0 (INADDR_ANY) the code searching for the TargetAddress will match 0.0.0.0 and thus discard it as a possible target. The Target information is sent, but the initiator doesn't know what to do with it because it contains the TargetName but doesn't include the address. If I set up ctl.conf to listen on the actual IP address, it will work. (Another solution involves getsockname() and converting this to the actual IP address the initiator is using to connect to ctld.)

The second one is what is actually causing the initiator to disconnect before sending the Logout PDU. It's a bug.

usr.sbin/ctld/discovery.c has the following line to update CmdSN after receiving a PDU, right at the end of text_receive():
Code:
conn->conn_cmdsn = ntohl(bhstr->bhstr_cmdsn);
It should actually read:
Code:
conn->conn_cmdsn = ntohl(bhstr->bhstr_cmdsn) + 1;
as bhstr_cmdsn is sequential. In fact, the code immediately preceding this line will check if the received CmdSN is higher than the one we have in session, and will exit with an error stating "decreased CmdSN" as the reason.

Changing this line and recompiling fixes the issue. :D
 
Would these commits from a few weeks ago be fixing the same bug?

Code:
77 conn->conn_cmdsn = ntohl(bhstr->bhstr_cmdsn);
78+ if ((bhstr->bhstr_opcode & ISCSI_BHS_OPCODE_IMMEDIATE) == 0)
79+      conn->conn_cmdsn++;

135 conn->conn_cmdsn = ntohl(bhslr->bhslr_cmdsn);
136+ if ((bhslr->bhslr_opcode & ISCSI_BHS_OPCODE_IMMEDIATE) == 0)
137+      conn->conn_cmdsn++;
The code changes above only increment cmdsn when the if statement checks out, but it could be that your fix may possibly increment the value in some cases when it shouldn't?

Commit message for above changes:
Fix handling of queued text and logout requests.

While it may have little sense, text and logout requests can be queued.
If they are, they consume cmdsn, so we should increment our conn_cmdsn.
 
Yes, it does! I haven't read the iSCSI RFC, I basically noticed that cmdSN had to increase but didn't check for the ISCSI_BHS_OPCODE_IMMEDIATE flag. The other issue is still standing, so I might submit a bug report for the TargetAddress issue. Meanwhile, adding code to use
Code:
getsockname()
for the listen 0.0.0.0 case or listening on the actual IP address works as a temporary fix. :)
 
iSCSI RFC does not require TargetAddress to be reported during discovery. In particular it tells: "Each target record returned includes zero or more TargetAddress fields", and "It is assumed by the initiator that the IP address and TCP port for this target are the same as those used on the current connection to the default iSCSI target".
 
Back
Top