Adaptec 5805

Similar problems reported elsewhere:
Code:
http://communities.vmware.com/message/975407
http://forums.freebsd.org/showthread.php?t=2311
http://lists.freebsd.org/pipermail/freebsd-scsi/2008-June/003524.html
 
short memory or ignore ...

First answer: update the firmware.

From: "Adaptec Support" <ask_support@adaptec.com>
Reply-To: "Adaptec Support" <ask_support@adaptec.com>
MIME-Version: 1.0
Message-Id: <49A2D90B.000001.00436@adprn01.adaptec.com>
Date: Mon, 23 Feb 2009 09:12:43 -0800 (Pacific Standard Time)
Subject: System freeze [Incident: 090219-000063]
X-Spam: Not detected
X-Mras: Ok

Your question has been received.

**Please note: If your question is regarding RAID issues please attach
the RAID Controller Support Archive to the incident. See Answer ID
14929
(http://ask.adaptec.com/scripts/adaptec_tic.cfg/php.exe/enduser/std_adp.php?p_faqid=14929)
on how to create it.


To update your question from our support site, click the following
link or paste it into your web browser.
http://ask.adaptec.com/Scripts/adap...=myq_upd.php&p_iid=38112&p_created=1235068490


Question Reference #090219-000063
---------------------------------------------------------------
Summary: System freeze
Product Level 1: Serial Attached SCSI (SAS)
Product Level 2: Adaptec RAID 5805
Category Level 1: Troubleshooting / Error Messages
Date Created: 02/19/2009 10:34 AM
Last Updated: 02/19/2009 10:34 AM
Status: Unresolved
Product Details: Serial Number
Number: 8C4310BCEC5
Operating System: FreeBSD
OS Version: 7.0, 7.1


Discussion Thread
---------------------------------------------------------------
Customer (Dmitry Brazhnikov) - 02/19/2009 10:34 AM
Supermicro servers with Adaptec 5805, BIOS firmware b16343.
Each system has 6 disks (ST3500320NS).
Operating system: FreeBSD x86 7.1 and 7.0, with drivers from the base system or from Adaptec (latest).
I created test RAID 5EE and RAID 5 arrays. After a while (5 minutes to 6 hours) the system freezes; just before the freeze, systat shows 100% I/O on the disks.


----------------------------------------------------------------------
Controller information
----------------------------------------------------------------------
Controller Status : Optimal
Channel description : SAS/SATA
Controller Model : Adaptec 5805
Controller Serial Number : 8C4310BCEC5
Physical Slot : 4
Temperature : 56 C/ 132 F (Normal)
Installed memory : 512 MB
Copyback : Disabled
Background consistency check : Enabled
Automatic Failover : Enabled
Global task priority : High
Performance Mode : Default/Dynamic
Defunct disk drive count : 0
Logical devices/Failed/Degraded : 6/0/0

Auto-Response - 02/19/2009 10:34 AM
Thank you for using ASK Us.

The incident has been received and will be handled soon.


Hmmm... This is a FreeBSD forum and not an M$ forum, or am I wrong? Windows was TESTED too: it freezes, with no BSOD. After replacing the controller with a new one, I can't see the problem on FreeBSD, Linux, or M$ Windows.
 
Hi Guys, I have some news for you.

1. This problem was reproduced in the lab; the current assumption is a problem in the driver.

2. Some corrective actions have been worked out and are under test now, and you can take part. Two ways to check it:

- If your board has a BIOS setting for PCI Memory Mapped I/O above 4GB, simply turn it off and the problem should go away.
- Contact me (the address is in the posting above) to get the test driver. Just update the system with the new driver package: install it with pkg_add, and if an earlier test driver is already installed, use the force switch to install. If the installation is successful, the driver will report the following at startup: „THIS IS A TEST DRIVER WITH DMA >4GB DISABLED (NEW!)“. Please verify with dmesg.
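The two checks above can be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and using installed RAM above 4 GiB as a hint that the system can see DMA above 4GB is an assumption, not something Adaptec stated. The banner text is the one quoted above.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical, and treating RAM above
# 4 GiB as a hint that DMA above 4GB can occur is an assumption.

FOUR_GIB=4294967296

# Succeeds when the byte count passed as $1 is larger than 4 GiB.
above_4gib() {
    [ "$1" -gt "$FOUR_GIB" ]
}

# Succeeds when the text on stdin contains the test-driver banner.
has_test_driver_banner() {
    grep -q 'TEST DRIVER WITH DMA >4GB DISABLED'
}

# Query the live system only when called with the argument "check".
if [ "${1:-}" = "check" ]; then
    if above_4gib "$(sysctl -n hw.physmem)"; then
        echo "RAM above 4 GiB: DMA above 4GB is possible"
    else
        echo "RAM at or below 4 GiB"
    fi
    if dmesg | has_test_driver_banner; then
        echo "test driver banner found"
    else
        echo "test driver banner not found"
    fi
fi
```

Saved as a script and run with the argument check on the affected server, it prints one line per check; without that argument it only defines the two helpers.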

New info soon.
 
I don't think the trouble is in the driver.
Code:
server7# sysctl hw.physmem; sysctl hw.usermem
hw.physmem: 8580644864
hw.usermem: 7964139520

server7# uname -a
FreeBSD server7 7.2-RELEASE FreeBSD 7.2-RELEASE #0: Fri May  8 06:04:17 UTC 2009     root@server7:/usr/src/sys/amd64/compile/HOSTING  amd64

server7# arcconf GETCONFIG 1
Controllers found: 1
----------------------------------------------------------------------
Controller information
----------------------------------------------------------------------
   Controller Status                        : Optimal
   Channel description                      : SAS/SATA
   Controller Model                         : Adaptec 5805
   Controller Serial Number                 : 8C4310BCEE5
   Physical Slot                            : 4
   Temperature                              : 54 C/ 129 F (Normal)
   Installed memory                         : 512 MB
   Copyback                                 : Disabled
   Background consistency check             : Disabled
   Automatic Failover                       : Enabled
   Global task priority                     : High
   Performance Mode                         : Default/Dynamic
   Defunct disk drive count                 : 0
   Logical devices/Failed/Degraded          : 3/0/0
   --------------------------------------------------------
   Controller Version Information
   --------------------------------------------------------
   BIOS                                     : 5.2-0 (16501)
   Firmware                                 : 5.2-0 (16501)
   Driver                                   : 2.2-4 (16343)
   Boot Flash                               : 5.2-0 (16501)
   --------------------------------------------------------
   Controller Battery Information
   --------------------------------------------------------
   Status                                   : Optimal
   Over temperature                         : No
   Capacity remaining                       : 99 percent
   Time remaining (at current draw)         : 1 days, 19 hours, 55 minutes

----------------------------------------------------------------------
Logical device information
----------------------------------------------------------------------
Logical device number 0
   Logical device name                      : SYS
   RAID level                               : 5EE
   Status of logical device                 : Optimal
   Status of RAID 5EE                       : Expanded
   Size                                     : 102400 MB
   Stripe-unit size                         : 64 KB
   Read-cache mode                          : Enabled
   Write-cache mode                         : Enabled (write-back)
   Write-cache setting                      : Enabled (write-back) when protected by battery
   Partitioned                              : Yes
   Protected by Hot-Spare                   : No
   Bootable                                 : Yes
   Failed stripes                           : No
   --------------------------------------------------------
   Logical device segment information
   --------------------------------------------------------
   Segment 0                                : Present (0,0) 
   Segment 1                                : Present (0,1) 
   Segment 2                                : Present (0,2) 
   Segment 3                                : Present (0,3) 
   Segment 4                                : Present (0,4) 
   Segment 5                                : Present (0,5) 

Logical device number 1
   Logical device name                      : USER
   RAID level                               : 5EE
   Status of logical device                 : Optimal
   Status of RAID 5EE                       : Expanded
   Size                                     : 2044924 MB
   Stripe-unit size                         : 64 KB
   Read-cache mode                          : Enabled
   Write-cache mode                         : Enabled (write-back)
   Write-cache setting                      : Enabled (write-back) when protected by battery
   Partitioned                              : Yes
   Protected by Hot-Spare                   : No
   Bootable                                 : No
   Failed stripes                           : No
   --------------------------------------------------------
   Logical device segment information
   --------------------------------------------------------
   Segment 0                                : Present (0,0) 
   Segment 1                                : Present (0,1) 
   Segment 2                                : Present (0,2) 
   Segment 3                                : Present (0,3) 
   Segment 4                                : Present (0,4) 
   Segment 5                                : Present (0,5) 

Logical device number 2
   Logical device name                      : BACKUP
   RAID level                               : 5EE
   Status of logical device                 : Optimal
   Status of RAID 5EE                       : Expanded
   Size                                     : 711680 MB
   Stripe-unit size                         : 64 KB
   Read-cache mode                          : Enabled
   Write-cache mode                         : Enabled (write-back)
   Write-cache setting                      : Enabled (write-back) when protected by battery
   Partitioned                              : Yes
   Protected by Hot-Spare                   : No
   Bootable                                 : No
   Failed stripes                           : No
   --------------------------------------------------------
   Logical device segment information
   --------------------------------------------------------
   Segment 0                                : Present (0,0) 
   Segment 1                                : Present (0,1) 
   Segment 2                                : Present (0,2) 
   Segment 3                                : Present (0,3) 
   Segment 4                                : Present (0,4) 
   Segment 5                                : Present (0,5) 


----------------------------------------------------------------------
Physical Device information
----------------------------------------------------------------------
      Device #0
         Device is a Hard drive
         State                              : Online
         Supported                          : Yes
         Transfer Speed                     : SATA 3.0 Gb/s
         Reported Channel,Device            : 0,0
         Reported Location                  : Connector 0, Device 0
         Vendor                             : WDC
         Model                              : WD7502ABYS-0
         Firmware                           : 03.00C05
         Size                               : 715404 MB
         Write Cache                        : Disabled (write-through)
         FRU                                : None
         S.M.A.R.T.                         : No
      Device #1
         Device is a Hard drive
         State                              : Online
         Supported                          : Yes
         Transfer Speed                     : SATA 3.0 Gb/s
         Reported Channel,Device            : 0,1
         Reported Location                  : Connector 0, Device 1
         Vendor                             : WDC
         Model                              : WD7502ABYS-0
         Firmware                           : 03.00C05
         Size                               : 715404 MB
         Write Cache                        : Disabled (write-through)
         FRU                                : None
         S.M.A.R.T.                         : No
      Device #2
         Device is a Hard drive
         State                              : Online
         Supported                          : Yes
         Transfer Speed                     : SATA 3.0 Gb/s
         Reported Channel,Device            : 0,2
         Reported Location                  : Connector 0, 
==skip===
server7# uptime
 2:13PM  up 15 days,  2:37, 1 user, load averages: 1.33, 0.94, 0.81
 
I do agree... it is possible that in this particular case the disks are incompatible with some BACKPLANES (what backplane does this system have?).

The whole logical chain...

Let's check compatibility list for 5805

http://www.adaptec.com/NR/rdonlyres...ompatibilityReport_061109_Series5_LowPort.pdf

Can you see this model there (Model: WD7502ABYS-0, disk FW: 03.00C05)?

Yes. Good.

What do we know about this model? Enter "WD7502ABYS" in the ASK search:

http://ask.adaptec.com/scripts/adap...ccessibility=0&p_redirect=&p_lva=&p_sp=&p_li=

FIND ANSWER

1) For 5xx45 (12, 16, 24 ports) cards

See
http://ask.adaptec.com/scripts/adap...mNoX3RleHQ9V0Q3NTAyQUJZUw**&p_li=&p_topview=1

2) The same applies to 5xx5 cards connected to some backplanes with an expander.

Do you have ANY BACKPLANES?

To fix it, try disabling SSC (Spread Spectrum Clocking); the ASK link above includes a link to the WD knowledge base with a how-to.

As was said, in some cases other problems unrelated to the driver are mixed in!

It can be checked... just send me your SUPPORT.ZIP file.

And if you design/assemble/integrate server systems, subscribe to our tech letters... SAS is more complicated than SCSI.

Some news: this fixed driver is being checked worldwide.

2 cases have been reported so far, and it helped in both. More info soon.
 
Answers from Adaptec support and from the Adaptec site:
1. Update firmware: done, did not help.
2. Check hard disk compatibility: checked, I have compatible Seagate and WD disks, did not help.
3. Check backplane: done, connected the HDDs directly to the Adaptec controller, did not help.
4. Disable SSC: done, did not help.


The problem is now resolved. I can send logs from the systems that are working now, but I can't send the old logs.
 
Hi All,

Only for FreeBSD 7.x=========

In every case where the test driver was sent, IT HELPED.

The R&D department is looking for the root cause and has promised an official driver release in a couple of weeks.

So, if you have the case described above with an Adaptec RAID Series 5, 3, or 2 card, please contact me for the FIXED DRIVER.

Please use the e-mail address russia_sales@adaptec.com
Please include the reference: FREE BSD PROBLEM ADPml11405

or call the official phone numbers from the Adaptec web site: adaptec.ru ====> About ====> Contacts ====> Russia

From 29.06.09, information about this case will be distributed among our integrators.

If you have any other case where you suspect Adaptec cards are involved, please send me detailed information about it. The information needed is described in the postings above.

At the moment I have no open FreeBSD cases among those reported to me.
 
Hi All,

We now have an official release of that driver: 2.2.8-16891. You can refer to it.

It has helped everyone I sent it to.

There was one case with some negative feedback; just keep it in mind:

===direct translation===

Driver helped, but there was a problem, we were forced to migrate from FreeBSD 7.0 to FreeBSD 7.1, as shutdown stopped working normally. System hangs during any attempts to unmount volumes by the sync command.



Disks renamed from aacd to aacdu. Spent 30 min total on working system to fix it.

========================
 
Hi, I can confirm the same problems with the Adaptec 2405 on FreeBSD 6.x and 7.x.

They work for some time and then suddenly, boom:

COMMAND XXXX TIMEOUT AFTER XX SECONDS

Only a hard reset fixes that, for a few more weeks.

This is reproducible on 3 or 4 machines running different hardware: Tyan, Intel, and Supermicro servers.
 
How to act

Hi All,

The postings above give exact instructions on how to proceed if you need direct help from Adaptec. Please use the presales address (above) and send logs (instructions above).

Or send a request directly to Tech Support using the link
http://register.adaptec.com/ask_us.html

and ask a question.

We were able to help everyone who approached us.
 
Workaround

Hi all,

I'm still having this issue with the controller. The latest firmware (5.2-0
(17517)) and drivers (2.2-8 (17517)) don't fix the problem, but Adaptec
suggested that I turn off the write cache (switch it to write-through) on
the controller (something like arcconf SETCACHE 1 LOGICALDRIVE 0 wt). I
did that, and now I'm able to load the server heavily with no problems.
Current uptime after this change is 15 days.
I didn't measure the performance penalty, though.
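On a controller with several logical devices, the same workaround would have to be repeated per device. A minimal dry-run sketch, assuming the SETCACHE syntax quoted above is right (the poster only recalled it as "something like") and assuming that arcconf GETCONFIG 1 LD prints the same "Logical device number N" lines as the full GETCONFIG output earlier in the thread. The function name print_wt_commands is made up for illustration, and nothing is executed on the controller.

```shell
#!/bin/sh
# Dry run: reads arcconf GETCONFIG output on stdin and prints one SETCACHE
# command per logical device found. print_wt_commands is a hypothetical name;
# the SETCACHE syntax is the one recalled in the post and is not verified.
print_wt_commands() {
    controller=$1
    # Keep only the number from lines like "Logical device number 0".
    sed -n 's/^Logical device number \([0-9][0-9]*\).*$/\1/p' |
    while read -r ld; do
        echo "arcconf SETCACHE $controller LOGICALDRIVE $ld wt"
    done
}
```

Typical use would be arcconf GETCONFIG 1 LD | print_wt_commands 1, then reviewing the printed commands before running any of them by hand.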
 
FreeBSD 7.x problem

Hi All,

I have to stress that disabling the cache helps only in some cases, just as the test driver did not help in every case.

I would also like to inform you that this problem (which seems to be quite complex) was reproduced in our lab. Adaptec is working hard to understand its nature and hardware dependence (if any) and to fix it. We are communicating with other manufacturers of server components that could also be involved in the problem.

We already have some preliminary results that allow this problem to be fixed (they are being tested before being released for use).

I have a list of the companies that approached us about it, and as soon as I have any firmware, driver, etc. updates that could fix it, I will send them directly and post the information here.

Please don't worry about it; this problem is under control.
 
As was promised

http://www.adaptec.com/support/files/

Official version.

If it's not difficult, share results, please.

Password: CANDLE



Filename:

- 2045_fw_b17547.zip
- 2405_fw_b17547.zip
- 5085_fw_b17547.zip
- 5405_fw_b17547.zip
- 5405z_fw_b17547.zip
- 5445_fw_b17547.zip
- 5445z_fw_b17547.zip
- 5805_fw_b17547.zip
- 5805z_fw_b17547.zip
- 51245_fw_b17547.zip
- 51645_fw_b17547.zip
- 52445_fw_b17547.zip
 
Fix Confirmed

I can confirm that the updated firmware fixes the lockups I was experiencing running buildworld on this controller with FreeBSD 8.0-RELEASE.

I was able to repeatedly provoke the hang by doing the following, which no longer triggers it after the firmware update.

Hardware:
SuperMicro H8DMU+
2x Opteron 2382
(8x 4GB) = 32 GB RAM
Adaptec 5405 V5.2-0 Build 17544
4x Fujitsu 72GB 15K SAS

Reproduction case:
- Create a RAID10 array out of the 4 disks
- Format and mount as normal
- Copy the src tree to the array
- Point MAKEOBJDIRPREFIX at the array
- while : ; do time sh -c 'make -j8 buildworld && make -j8 buildkernel'; done
- Should hang in 30 minutes or less

If you get a hang with this, then the updated firmware posted above should fix your problems.
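For anyone repeating this test, a bounded variant of the loop that timestamps each pass makes it easier to see afterwards how long the machine survived. This is only a sketch: run_passes is a made-up helper, and the pass count and log location in the usage note are arbitrary choices.

```shell
#!/bin/sh
# run_passes MAX CMD [ARGS...]: run CMD up to MAX times, printing a timestamp
# before each pass and stopping at the first failure.
# Hypothetical helper, not part of the original reproduction case.
run_passes() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        echo "pass $i started $(date '+%Y-%m-%d %H:%M:%S')"
        "$@" || { echo "pass $i failed"; return 1; }
        i=$((i + 1))
    done
    echo "completed $max passes"
}
```

For example: run_passes 20 sh -c 'make -j8 buildworld && make -j8 buildkernel' | tee /var/tmp/repro.log, ideally with the log on a disk that is not behind the controller under test.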
 
Similar issue in 8.2...

Code:
aac1: COMMAND 0xffffff8000a8b6d0 (TYPE 502) TIMEOUT AFTER 133 SECONDS

Tons of these after rsync from another BSD box, both running 8.2 Release amd64.

Firmware 18252 - the latest, even turned off write cache on disks.

Any ideas?
 
I see new drivers posted...

3 Apr 2011 AACRAID driver files b18284 for FreeBSD

Let's give those a try and report back.


celt said:
aac1: COMMAND 0xffffff8000a8b6d0 (TYPE 502) TIMEOUT AFTER 133 SECONDS

Tons of these after rsync from another BSD box, both running 8.2 Release amd64.

Firmware 18252 - the latest, even turned off write cache on disks.

Any ideas?
 
Are you using ARCCONF regularly to poll RAID status? I've had problems with that causing stuck transactions. To diagnose this, run an ARCCONF command while the timeouts are occurring: if they stop, then it's a stuck admin command. A newer ARCCONF might address the root cause if this is the case.

Stuck commands can also indicate SAS stability issues, caused either by a bad drive rapidly attaching to and detaching from the fabric or by a bad cable causing disconnects. Traffic halts while SAS DISCOVERY runs, and if this happens often enough it could stall commands long enough to trigger the warning.

The best thing to do in this case is to use the ARCCONF SAVESUPPORTARCHIVE command (I think that is the name) to dump the controller's event logs and console, and see whether there are a bunch of SAS DISCOVERY events when nothing is otherwise changing on the fabric (i.e., you're not removing drives).

Adaptec Support might be able to help you diagnose the issue and interpret the logs as well.
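To tell whether the timeouts actually stop after running an ARCCONF command, it helps to count them before and after. A rough sketch, assuming the kernel messages look like the aac1 line quoted earlier in the thread; summarize_timeouts is a made-up name.

```shell
#!/bin/sh
# Reads kernel messages on stdin and prints, per aac device, how many
# command timeouts were logged and the longest one in seconds.
# Assumes lines shaped like:
#   aac1: COMMAND 0x... (TYPE 502) TIMEOUT AFTER 133 SECONDS
summarize_timeouts() {
    awk '/COMMAND .* TIMEOUT AFTER [0-9]+ SECONDS/ {
        dev = $1
        sub(/:$/, "", dev)          # "aac1:" -> "aac1"
        count[dev]++
        secs = $(NF - 1) + 0        # the number just before "SECONDS"
        if (secs > max[dev]) max[dev] = secs
    }
    END {
        for (d in count)
            printf "%s timeouts=%d longest=%ds\n", d, count[d], max[d]
    }'
}
```

Running dmesg | summarize_timeouts once while the timeouts are occurring and again a few minutes after an ARCCONF command shows whether the count is still growing.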
 