Hello!
After upgrading to FreeBSD 10.1-RELEASE-p23, AHCI command timeout messages appear periodically in the kernel log (see the log excerpts below).
When this happens the computer keeps running, but the disk subsystem stops responding and everything freezes.
After pressing "Reset" the system boots normally.
Then, after a day or two (the computer is not turned off in between), the messages appear again.
The file system is ZFS raidz1 (ada0, ada1, ada2).
The system is not heavily loaded; no intensive work is being done.
It runs at nominal settings; the system is not overclocked.
Typical status is shown in the top output below.
smartctl does not report any critical problems or drive errors.
What should I do? How can I find the problem?
The same thing happens on a different system, also AMD-based.
That one also uses ZFS, but with a striped pool.
Could the problem be in the AHCI driver?
Code:
> uname -a
FreeBSD wfid78-172 10.1-RELEASE-p23 FreeBSD 10.1-RELEASE-p23 #0: Thu May 14 13:35:13 UTC 2015
root@amd64-builder.pcbsd.org:/usr/obj/usr/src/sys/GENERIC amd64
> sudo dmidecode -t 2
# dmidecode 2.12
SMBIOS 2.4 present.
Handle 0x0002, DMI type 2, 8 bytes
Base Board Information
Manufacturer: Gigabyte Technology Co., Ltd.
Product Name: GA-880GA-UD3H
Version: x.x
Jun 3 12:28:42 wfid78-172 kernel: CPU: AMD Phenom(tm) II X4 925 Processor (2812.51-MHz K8-class CPU)
Jun 3 12:28:42 wfid78-172 kernel: real memory = 34359738368 (32768 MB)
Jun 3 12:28:42 wfid78-172 kernel: avail memory = 33271947264 (31730 MB)
Code:
May 30 11:43:29 wfid78-172 kernel: ahcich3: Timeout on slot 24 port 0
May 30 11:43:29 wfid78-172 kernel: ahcich3: is 00000008 cs 00000000 ss 00000000 rs 01000000 tfd 40 serr 00000000 cmd 00207817
May 30 11:43:29 wfid78-172 kernel: (ada2:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 bc f5 1e 40 00 00 00 00 00 00
May 30 11:43:29 wfid78-172 kernel: (ada2:ahcich3:0:0:0): CAM status: Command timeout
May 30 11:43:29 wfid78-172 kernel: (ada2:ahcich3:0:0:0): Retrying command
May 30 11:43:29 wfid78-172 kernel: ahcich1: Timeout on slot 15 port 0
May 30 11:43:29 wfid78-172 kernel: ahcich1: is 00000008 cs 00000000 ss 00000000 rs 00008000 tfd 40 serr 00000000 cmd 00206f17
May 30 11:43:29 wfid78-172 kernel: (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 02 bd f5 1e 40 00 00 00 00 00 00
May 30 11:43:29 wfid78-172 kernel: (ada1:ahcich1:0:0:0): CAM status: Command timeout
May 30 11:43:29 wfid78-172 kernel: (ada1:ahcich1:0:0:0): Retrying command
Code:
Jun 3 12:23:32 wfid78-172 kernel: ahcich3: Timeout on slot 14 port 0
Jun 3 12:23:32 wfid78-172 kernel: ahcich3: is 00000008 cs 00000000 ss 00000000 rs 00006000 tfd 40 serr 00000000 cmd 00206e17
Jun 3 12:23:32 wfid78-172 kernel: (ada2:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 04 c9 25 1c 40 0f 00 00 00 00 00
Jun 3 12:23:32 wfid78-172 kernel: (ada2:ahcich3:0:0:0): CAM status: Command timeout
Jun 3 12:23:32 wfid78-172 kernel: (ada2:ahcich3:0:0:0): Retrying command
Jun 3 12:23:32 wfid78-172 kernel: ahcich1: Timeout on slot 13 port 0
Jun 3 12:23:32 wfid78-172 kernel: ahcich1: is 00000008 cs 00000000 ss 00000000 rs 00003000 tfd 40 serr 00000000 cmd 00206d17
Jun 3 12:23:32 wfid78-172 kernel: (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 04 c9 25 1c 40 0f 00 00 00 00 00
Jun 3 12:23:32 wfid78-172 kernel: (ada1:ahcich1:0:0:0): CAM status: Command timeout
Jun 3 12:23:32 wfid78-172 kernel: (ada1:ahcich1:0:0:0): Retrying command
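To see at a glance which channels are affected, the timeout lines can be tallied per AHCI channel. This is only a minimal sketch, assuming the log lines have exactly the format shown above; the function name count_timeouts and the sample list are made up for illustration and are not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Matches lines such as:
#   "May 30 11:43:29 wfid78-172 kernel: ahcich3: Timeout on slot 24 port 0"
# The pattern is an assumption based on the excerpts in this post.
TIMEOUT_RE = re.compile(r"kernel: (ahcich\d+): Timeout on slot (\d+) port (\d+)")


def count_timeouts(lines):
    """Return a Counter mapping AHCI channel name to its number of timeout lines."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the excerpts above (hypothetical test input).
sample = [
    "May 30 11:43:29 wfid78-172 kernel: ahcich3: Timeout on slot 24 port 0",
    "May 30 11:43:29 wfid78-172 kernel: (ada2:ahcich3:0:0:0): CAM status: Command timeout",
    "May 30 11:43:29 wfid78-172 kernel: ahcich1: Timeout on slot 15 port 0",
    "Jun 3 12:23:32 wfid78-172 kernel: ahcich3: Timeout on slot 14 port 0",
    "Jun 3 12:23:32 wfid78-172 kernel: ahcich1: Timeout on slot 13 port 0",
]

if __name__ == "__main__":
    for channel, n in sorted(count_timeouts(sample).items()):
        print(channel, n)
```

To run it against a real log, the lines of /var/log/messages could be passed in instead of the sample list (assuming the default syslog location). In the excerpts above only ahcich1 (ada1) and ahcich3 (ada2) show up, always at the same moment.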
Code:
> zpool list -v
NAME SIZE ALLOC FREE FRAG EXPANDSZ CAP DEDUP HEALTH ALTROOT
zpool0 2,70T 194G 2,51T 12% - 6% 1.04x ONLINE -
raidz1 2,70T 194G 2,51T 12% -
ada0p2 - - - - -
ada1p2 - - - - -
ada2p2 - - - - -
> zpool status -v
pool: zpool0
state: ONLINE
scan: scrub repaired 0 in 1h10m with 0 errors on Wed Jun 3 02:27:02 2015
config:
NAME STATE READ WRITE CKSUM
zpool0 ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
ada0p2 ONLINE 0 0 0
ada1p2 ONLINE 0 0 0
ada2p2 ONLINE 0 0 0
errors: No known data errors
Code:
> top
last pid: 5205; load averages: 0.54, 0.42, 0.41 up 0+00:24:38 12:52:14
181 processes: 1 running, 179 sleeping, 1 zombie
CPU: 0.9% user, 0.0% nice, 0.5% system, 0.1% interrupt, 98.5% idle
Mem: 2153M Active, 1097M Inact, 2262M Wired, 9748K Cache, 26G Free
ARC: 1476M Total, 642M MFU, 784M MRU, 226K Anon, 11M Header, 39M Other
Swap: 17G Total, 17G Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
1514 root 1 -21 r31 1148M 204M select 1 0:49 2.78% Xorg
2206 wcsn 2 28 0 524M 98M select 1 0:07 2.29% kdeinit4
2287 wcsn 12 28 0 1244M 357M uwait 2 0:17 0.20% chrome
2300 wcsn 12 33 0 1067M 187M uwait 2 0:08 0.20% chrome
2293 wcsn 12 28 0 1072M 190M uwait 0 0:08 0.20% chrome