Solved System seems to freeze when storage pool is accessed

pming

Hello everybody

I am currently building my first physical FreeBSD box which should soon replace my Synology NAS.

I have put together all the pieces and was able to install FreeBSD just fine, but I have had some issues ever since I created the main storage pool where all the files are supposed to go.

For example, when I issue any zfs or zpool command against my storage pool, the system does not seem to respond: I get no output back, and it just sits there as if it were still processing something. However, yesterday I ran a bonnie++ benchmark on the storage pool, and while the benchmark was running I was able to access and modify the ZFS datasets on the pool without any problem (I also noticed the hard drives were running along nicely). I specified a benchmark size bigger than my RAM, because with smaller sizes the hard drives didn't seem to do anything. Also, when a zfs or zpool command hangs, I don't notice any drive activity, either from the activity LEDs on the case or from the sound of the hard drives.

Sometimes I also experience the following problems:
- The system does not boot up correctly; it freezes while loading devices
- The system suddenly shuts down (this has happened only once so far, I think)
- The system freezes after I run shutdown -p now (this has also happened only once or twice so far). It printed the following:
Code:
uptime: 2h21m51s
agtiapi_resetcard: reset ERROR
(da0:pmspcbsd0:0:0:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00

(da0:pmspcbsd0:0:0:0): CAM status: SCSI Bus Reset Sent/Received
(da0:pmspcbsd0:0:0:0): Error 5, Retry was blocked
(da0:pmspcbsd0:0:0:0): Synchronize cache failed
- The system freezes after I run shutdown -r now and prints the following:
Code:
...
Writing entropy file:.
90 second watchdog timeout expired. Shutdown terminated.
Sun Aug 14 15:04:51 CEST 2016
Sun Aug 14 15:04:51 zfstored init: /bin/sh on /etc/rc.shutdown terminated abnormally, going to single user mode
Sun Aug 14 15:04:51 zfstored syslogd: exiting on signal 15
Sun Aug 14 15:05:11 init: some processes would not die: ps axl advised

On all of these occasions, I received at least:
Code:
agtiapi_resetcard: reset ERROR
Sometimes I also get that error message when I try to run a zfs or zpool command.
Sometimes I am still able to log in via SSH and do certain things (the direct console via VGA and USB keyboard is blocked if I previously ran any zpool or zfs command on the storage pool from it).
Yesterday I set up Samba and Webmin, and everything worked fine while the benchmark was running. But today, after powering the system up again, I can't access the SMB shares on the storage pool.
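
To see how often the driver reset error shows up, one could count its occurrences in the system log. This is only a minimal sketch and not something from the original post: the function name count_reset_errors is made up for illustration, and /var/log/messages is assumed to be where kernel messages are logged.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a log file that contain the
# driver reset error quoted in this post.
count_reset_errors() {
    logfile="$1"
    # grep -c prints the number of matching lines; it exits non-zero
    # when that number is 0, so the exit status is ignored here.
    grep -c 'agtiapi_resetcard: reset ERROR' "$logfile" || true
}

# Usage (ASSUMED log location; skipped if the file is not readable):
if [ -r /var/log/messages ]; then
    count_reset_errors /var/log/messages
fi
```

Comparing the count before and after a hung zfs or zpool command would show whether each hang comes with a new reset error.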

So much for the description of my problem.

The box consists of the following hardware:

Supermicro SC836TQ-R500B chassis
Supermicro X10SRL-F mainboard
Intel Xeon E5-2620-V4
64 GB DDR4 2133-MHz Crucial RAM
2 x Adata SP900 for mirrored boot drive
3 x Samsung SM951 128 GB for L2ARC and ZIL
8 x WD 2 TB Red Pro HDDs hooked up to an
Adaptec 71605H HBA card
Intel XL710-DA2 NIC

I installed the driver for the Adaptec card like so:
Code:
fetch http://download.adaptec.com/sas/unix/adp80xx_freebsd_drivers_b11068.tgz
tar zxpf adp80xx_freebsd_drivers_b11068.tgz
pkg install freebsd_10/x64/pms10x-amd64.txz

kldstat shows:
Code:
Id Refs Address            Size     Name
1   31 0xffffffff80200000 17bc718  kernel
2    1 0xffffffff819bd000 2fc440   zfs.ko
3    2 0xffffffff81cba000 6040     opensolaris.ko
4    1 0xffffffff81cc1000 157c0    aio.ko
5    1 0xffffffff81cd7000 4a70     coretemp.ko
6    1 0xffffffff81cdc000 5a30     aesni.ko
7    2 0xffffffff81ce2000 352d0    crypto.ko
8    1 0xffffffff81d18000 444d00   pmspcv.ko
9    1 0xffffffff82211000 56c6     fdescfs.ko
10    1 0xffffffff82217000 2ba8     uhid.ko
11    1 0xffffffff8221a000 358d     ums.ko
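
A quick way to confirm that the Adaptec driver module from the listing above is loaded is to look for it in the kldstat output. The following is only a sketch: it assumes the column layout shown above, with the module file name in the last column, and the function name module_loaded is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read kldstat output on stdin and succeed (exit 0)
# if a module with the given file name appears in the last column.
module_loaded() {
    awk -v name="$1" '$NF == name { found = 1 } END { exit !found }'
}

# Usage on a FreeBSD system (skipped where kldstat is not available):
if command -v kldstat >/dev/null 2>&1; then
    if kldstat | module_loaded pmspcv.ko; then
        echo "pmspcv.ko is loaded"
    else
        echo "pmspcv.ko is NOT loaded"
    fi
fi
```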

I created my storage zpool like so:
Code:
# (for every WD 2 TB Red Pro)
gpart create -s gpt da0
gpart add -a 1m -t freebsd-zfs -l wd_redpro_2tb_1 da0 # the number indicates the HDD slot da0 is in

# (zpool creation)
zpool create storage raidz2 gpt/wd_redpro_2tb_1 gpt/wd_redpro_2tb_2 gpt/wd_redpro_2tb_3 gpt/wd_redpro_2tb_4 \
gpt/wd_redpro_2tb_13 gpt/wd_redpro_2tb_14 gpt/wd_redpro_2tb_15 gpt/wd_redpro_2tb_16 cache nvd2 log mirror nvd0 nvd1
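
The per-disk partitioning steps can be sketched as a loop. This is an illustration under assumptions, not the exact commands that were run: the post only shows da0 in slot 1, so the mapping of da1 through da7 to slots 2, 3, 4, 13, 14, 15 and 16 is assumed, and the helper name gpart_cmds is hypothetical. gpart add takes the device as its final argument, so it is written out here. The script only prints the commands so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the gpart commands for one disk.
# $1 = device name (e.g. da0), $2 = chassis slot number used in the label.
gpart_cmds() {
    disk="$1"
    slot="$2"
    echo "gpart create -s gpt ${disk}"
    echo "gpart add -a 1m -t freebsd-zfs -l wd_redpro_2tb_${slot} ${disk}"
}

# ASSUMED mapping of devices to slots; only da0 -> slot 1 is from the post.
n=0
for slot in 1 2 3 4 13 14 15 16; do
    gpart_cmds "da${n}" "$slot"
    n=$((n + 1))
done
```

Labelling each partition by slot number, as done in the post, keeps the pool layout readable even if the daN device numbering changes between boots.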

I'm sorry I can't provide any zpool status output at the moment. When I do manage to run it successfully, it shows no errors whatsoever.

Some more information that might be of interest: In the readme that was included with the firmware for the Adaptec card, it said I had to build a custom kernel with
Code:
#device ahd
in my kernel configuration. I did this to update the firmware, but afterwards I reinstalled FreeBSD and did not build a new custom kernel. I also think the error might be connected to the Adaptec driver, since I found "agtiapi_resetcard" in https://github.com/freebsd/freebsd/blob/master/sys/dev/pms/freebsd/driver/ini/src/agtiproto.h

Any help or advice on this would be greatly appreciated!

Greetings
 
pming

Hello again

I can now run zfs create and zfs destroy without any problems, and I haven't seen the error again so far.
It seems it was a mistake to plug the sideband cables into the backplane.
Or is there a problem with the backplane controller and the Adaptec driver?

Anyway, with the sideband cables removed the system works like a charm now.

I still have two questions though, if anybody is reading this:
1) Do I need to plug in the sideband cables? If yes, why?
2) Do I need to partition the NVMe SSDs?

Have a nice Sunday ;-)
 

Phishfry

A lot of disk chassis and subsystems use I2C-SGPIO to transmit fan and thermal information via sideband. I don't see it as a big deal if the system still works without it. Some backplanes flat out won't work without the sideband cabling. You might be able to load some FreeBSD modules and see the temps from the backplane.
 
pming

Thanks for your replies.
The PDF notes:
TODO
– add support for SGPIO/I2C interfaces in more drivers

So probably support isn't (yet) implemented in the driver I'm using ...

I guess I'll just leave the system as it is for now and maybe try again when they release a new driver for FreeBSD 11.

This thread can be considered "solved".

Have a nice day ;-)
 

User23

The mainboard has enough SATA connectors and the backplane is directly connected, so you could use SATA cables to test without the Adaptec controller.
 