SAS6IR Adapter Issue on FreeBSD 7.2

Hello, and many thanks to all in advance. I equate my FreeBSD knowledge to that of my 7th grader: enough to be dangerous, lol. I have a Dell 1950 with a SAS6 Adapter in RAID 1 (IM) running FreeBSD 7.2. Disk 0 has failed and the status of the RAID is obviously degraded. The server needed to be hard booted as the console was unresponsive. Upon reboot I am now getting mpt0 errors and the server will no longer boot up fully.

Code:
mpt0:  <LSILogic SAS/SATA Adapter> port 0xec00-0xecff mem 0xfc4fc000-0xfc4fffff, 0xfc4e0000-0xfc4effff irq 16 at device 0.0 on pci1
mpt0:  [ITHREAD]
mpt0:  MPI Version-1.5.18.0
mpt0:  mpt_wait_req(6) timed out
mpt0:  port 0 enable timed out
mpt0:  failed enable port 0
mpt0:  unable to initialize IOC

edited------

Code:
GEOM_LABEL: Label for provider da0s1a is ufsid/4a030d8e4cae7b51.
Trying to mount root from ufs:/dev/da0s1a
Loading configuration files.
kernel dumps on /dev/da0s1b
Entropy harvesting: interrupts ethernet point_to_point
mpt0:  mpt_scsi_reply_handler: req already free

At this point the server hangs and never boots any further. The server had been running for several years and died in the middle of the night last night. Dell is sending a replacement drive, but I'm not sure that will resolve the issue. If it is indeed the drive, why wouldn't the server boot past the mpt failure using the remaining good disk 1? The only debug information I have is the console messages during boot.

Thanks,
mike
 
Mike_Eberline said:
I have a Dell 1950 with a SAS6 Adapter in RAID 1 (IM) running FreeBSD 7.2. Disk 0 has failed and the status of the RAID is obviously degraded. The server needed to be hard booted as console was unresponsive. Upon reboot I am now getting mpt0 errors and the server will no longer boot up fully.
Event and error handling in the mpt driver isn't perfect, as there isn't full documentation from LSI on the controller chips - hence the rather opaque mpt_cam_event messages when it encounters something it doesn't understand. More recent FreeBSD versions may have improved things. Once you get the current situation resolved, you should consider upgrading to a more recent and supported FreeBSD version.

You can probably get a better idea of what is going on from the controller's BIOS menu. I think it displays something like "Press control-C to enter menu". I'd start with the "View array" command to see what the controller thinks is happening. You can refer to the Dell manual (PDF) for more information, in the section "SAS 6/iR BIOS".

Be VERY CAREFUL if you change any options or try to re-synchronize the volume, since specifying the wrong drive can lead to loss of data. This also applies to replacing the failed drive.
 
SAS 6 Issue

Thanks for the insight, Terry. I will upgrade the server once I get it back up. Hopefully replacing the failed DISK0 will at least allow me to get the server up.

I entered the SAS conf utility and it basically says the array is degraded, and I only see a 'primary' device listed. Do you think the mpt timeout on port 0 reflects its inability to see DISK0? I should be receiving the drive anytime now. I assume that if I swap out the disk and boot up the server, the RAID should 'sync' the disks. Is there anything else I need to do? I am not a very knowledgeable server guy. Thanks again.

Mike
 
Mike_Eberline said:
I entered the SAS conf utility and it basically says the array is degraded, and I only see a 'primary' device listed. Do you think the mpt timeout on port 0 reflects its inability to see DISK0? I should be receiving the drive anytime now. I assume that if I swap out the disk and boot up the server, the RAID should 'sync' the disks. Is there anything else I need to do? I am not a very knowledgeable server guy.
The drive could have dropped offline and not come back, or at least the controller thinks so. If you can live with the degraded array until the new drive arrives, I would suggest leaving it alone. The replacement drive may or may not be auto-detected by the controller. If it came from Dell as a warranty replacement for your existing drive, it should have some instructions with it.
 
Unfortunately the drive failed, the server went completely unresponsive, and it had to be hard booted. Since the drive failure the server no longer boots up completely. I'm concerned because the failure of one drive in a mirrored pair should not have resulted in this issue; it obviously defeats the purpose of having a mirrored pair. My fear is that the OS cannot mount the root file system, as I saw a brief message flash across the console while powering it off after the failed reboot that said something to the effect of ':/ bad dir ....'. Dell provided no instructions with the replacement drive (although they were nice enough to enclose a return shipping label), but I gather that entering the SAS conf utility, selecting the array, verifying the drive is present and no longer missing, and selecting sync should do the trick. I just hope the server restores so I can get all my nagios cfg files and scripts off. Time to build that FreeBSD VM server.
 
Quick update. The first replacement drive I received was defective, so a second replacement drive was sent. I installed the new drive and the utility immediately started to sync the drives. After some time the sync completed, the server rebooted, and it was fully restored. Though I am thankful, I do not understand why the OS cared about the failed drive. Thanks, Terry.
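
For anyone who lands here with the same symptoms and wants to catch them earlier, below is a minimal sketch of a check that scans saved console or dmesg text for mpt lines that look like failures. It assumes the boot messages are available as plain text. The function names (`find_mpt_errors`, `nagios_status`) and the keyword list are illustrative only; the keywords come solely from the messages quoted in this thread and are not a complete list of what the mpt driver can print. The exit codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL).

```python
import re

# Failure phrases taken only from the console output quoted in this thread.
# This is an assumed, incomplete list, not the mpt driver's full vocabulary.
FAILURE_PATTERNS = ("timed out", "failed", "unable to", "already free")

# Matches lines such as "mpt0:  port 0 enable timed out".
MPT_LINE = re.compile(r"^(mpt\d+):\s*(.*)$")


def find_mpt_errors(console_text):
    """Return (device, message) pairs for mpt lines that look like failures."""
    hits = []
    for line in console_text.splitlines():
        match = MPT_LINE.match(line.strip())
        if not match:
            continue  # not an mpt driver line
        device, message = match.groups()
        if any(pattern in message.lower() for pattern in FAILURE_PATTERNS):
            hits.append((device, message))
    return hits


def nagios_status(console_text):
    """Map the scan result to a Nagios-style (exit_code, text) pair."""
    hits = find_mpt_errors(console_text)
    if hits:
        return 2, "CRITICAL: %d mpt error line(s), first: %s" % (len(hits), hits[0][1])
    return 0, "OK: no mpt error lines found"


# Sample input copied from the first post in this thread.
SAMPLE = """\
mpt0:  MPI Version-1.5.18.0
mpt0:  mpt_wait_req(6) timed out
mpt0:  port 0 enable timed out
mpt0:  failed enable port 0
mpt0:  unable to initialize IOC
"""

if __name__ == "__main__":
    code, text = nagios_status(SAMPLE)
    print(text)
    raise SystemExit(code)
```

A check like this only reports what the driver has already logged; the controller's BIOS utility remains the place to confirm the actual array state.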
 