Hello,
I have been running ZFS on a media server for more than a year. It worked well and I had no problems until about a week ago. All of a sudden, one of the two ZFS pools started performing very poorly, both across the network and for transfers within the machine. The pool "tank2tb" dropped to what looked like kilobyte-per-second transfer speeds, while tests against "vault" remain good.
I was on FreeBSD 7.2, but recently upgraded in an attempt to resolve this, so now I'm on:
[CMD=""]uname -a[/CMD]
Code:
FreeBSD filefs.local.com 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Tue Jan 3 07:46:30 UTC 2012
root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
Looking at transfers from network machines to tank2tb, I can see that each one starts great, then after a burst the TCP window size suddenly starts to decrease. Then it hits zero and I see:
Code:
66331 9.258478 192.168.0.102 192.168.0.197 TCP 60 [TCP ZeroWindow] microsoft-ds > 52415 [ACK] Seq=91601 Ack=56973523 Win=0 Len=0
It stays in zero window for tens of seconds, then another burst of traffic gets through... rinse... repeat.
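To put numbers on those stalls, a text export of the capture can be scanned for zero-window periods. This is only a minimal Python sketch, assuming one packet per line in the same column layout as the capture line above (number, time, source, destination, protocol, length, info). The function name `zero_window_stalls` and the second sample packet are made up for illustration, and the pairing is by source and destination address only, so it ignores ports.

```python
import re

# Assumed columns: packet number, relative time (s), source, destination,
# protocol, length, then the free-form info text.
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\d+\.\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(.*)$"
)
WIN_RE = re.compile(r"\bWin=(\d+)")


def zero_window_stalls(lines):
    """Return (start, end) times during which a sender advertised Win=0."""
    stalls = []
    open_since = {}  # (src, dst) -> time the advertised window first hit zero
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a packet line in the assumed layout
        t, src, dst, info = float(m.group(2)), m.group(3), m.group(4), m.group(7)
        win = WIN_RE.search(info)
        if win is None:
            continue
        key = (src, dst)
        if int(win.group(1)) == 0:
            open_since.setdefault(key, t)  # keep the earliest zero-window time
        elif key in open_since:
            stalls.append((open_since.pop(key), t))  # window reopened
    return stalls


if __name__ == "__main__":
    sample = [
        # Real line from the capture above.
        "66331 9.258478 192.168.0.102 192.168.0.197 TCP 60 [TCP ZeroWindow] "
        "microsoft-ds > 52415 [ACK] Seq=91601 Ack=56973523 Win=0 Len=0",
        # Hypothetical follow-up packet, invented to close the stall.
        "66400 21.500000 192.168.0.102 192.168.0.197 TCP 60 [TCP Window Update] "
        "microsoft-ds > 52415 [ACK] Seq=91601 Ack=56973523 Win=65535 Len=0",
    ]
    for start, end in zero_window_stalls(sample):
        print(f"zero window from {start:.3f}s to {end:.3f}s")
```

A stall that is still open at the end of the capture is not reported, which seemed the safer choice for a rough count.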
Reads do roughly the same thing, but just don't hit zero window. They have spikes of good performance, followed by multiple seconds of no activity... spike... pause.
And I'll point out again, in case it wasn't clear, that this is only occurring with the pool tank2tb. The same tests against vault show no problems for reads or writes. There were no (intentional) changes made to the server. In fact, I had not logged on to it for months when this happened. There was a recent power outage that knocked it offline, but I can't directly correlate the loss of performance with that day.
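Since the slowdown also shows up for transfers within the machine, the same comparison can be made locally, without Samba or the network in the path. The following is a rough Python sketch, not a proper benchmark: `timed_write` is a hypothetical helper, and the paths in the usage comment assume the pools are mounted at `/tank2tb` and `/vault`, which may not match the real mountpoints.

```python
import os
import time


def timed_write(path, total_mib=64, chunk_mib=1):
    """Write total_mib of random data to path, syncing after every chunk.

    Returns (average MiB/s, longest single-chunk time in seconds). The longest
    chunk time is the interesting number when a pool stalls between bursts.
    """
    # Random data so that compression, if enabled, cannot hide the writes.
    chunk = os.urandom(chunk_mib * 1024 * 1024)
    chunk_times = []
    begin = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(total_mib // chunk_mib):
            t0 = time.monotonic()
            f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # force the chunk to disk before timing it
            chunk_times.append(time.monotonic() - t0)
    elapsed = time.monotonic() - begin
    os.remove(path)  # clean up the test file
    return total_mib / elapsed, max(chunk_times)


# Example usage (paths are assumptions, adjust to the real mountpoints):
#   for pool in ("/tank2tb", "/vault"):
#       rate, worst = timed_write(os.path.join(pool, "write_test.bin"))
#       print(f"{pool}: {rate:.1f} MiB/s, worst chunk {worst:.2f}s")
```

Running it against both pools back to back should show whether the pause pattern is visible at the filesystem level on tank2tb only.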
I would appreciate any help tracking this down. Let me know what tests or output might help and I'll get them.
Thanks,
Aaron
Some outputs to start with:
[CMD=""]zpool status[/CMD]
Code:
  pool: tank2tb
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank2tb     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada5    ONLINE       0     0     0
            ada6    ONLINE       0     0     0
            ada3    ONLINE       0     0     0

errors: No known data errors

  pool: vault
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault       ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0

errors: No known data errors
[CMD=""]zpool get "all" tank2tb[/CMD]
Code:
NAME     PROPERTY       VALUE                 SOURCE
tank2tb  size           5.44T                 -
tank2tb  capacity       92%                   -
tank2tb  altroot        -                     default
tank2tb  health         ONLINE                -
tank2tb  guid           15262784380037294764  default
tank2tb  version        28                    default
tank2tb  bootfs         -                     default
tank2tb  delegation     on                    default
tank2tb  autoreplace    off                   default
tank2tb  cachefile      -                     default
tank2tb  failmode       wait                  default
tank2tb  listsnapshots  off                   default
tank2tb  autoexpand     off                   default
tank2tb  dedupditto     0                     default
tank2tb  dedupratio     1.00x                 -
tank2tb  free           395G                  -
tank2tb  allocated      5.05T                 -
tank2tb  readonly       off                   -
[CMD=""]zpool get "all" vault[/CMD]
Code:
NAME   PROPERTY       VALUE                SOURCE
vault  size           2.72T                -
vault  capacity       36%                  -
vault  altroot        -                    default
vault  health         ONLINE               -
vault  guid           1719042198761293107  default
vault  version        28                   default
vault  bootfs         -                    default
vault  delegation     on                   default
vault  autoreplace    off                  default
vault  cachefile      -                    default
vault  failmode       wait                 default
vault  listsnapshots  off                  default
vault  autoexpand     off                  default
vault  dedupditto     0                    default
vault  dedupratio     1.00x                -
vault  free           1.73T                -
vault  allocated      1016G                -
vault  readonly       off                  -
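To see at a glance where the two pools differ, the two property listings can be diffed. This is a minimal Python sketch, assuming the four-column `NAME PROPERTY VALUE SOURCE` layout shown above and values without embedded spaces. The helper names `parse_zpool_get`, `differing` and `zpool_get_all` are made up for illustration. Applied to the two listings above, and ignoring `guid`, the properties that differ are `size`, `capacity`, `free` and `allocated`.

```python
import subprocess


def parse_zpool_get(text):
    """Turn `zpool get all <pool>` output into a {property: value} dict."""
    props = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank or short lines and the column header.
        if len(fields) < 4 or fields[:2] == ["NAME", "PROPERTY"]:
            continue
        props[fields[1]] = fields[2]
    return props


def differing(a, b, ignore=("guid",)):
    """Return {property: (value_a, value_b)} for properties whose values differ."""
    return {
        name: (a[name], b[name])
        for name in sorted(a.keys() & b.keys())
        if name not in ignore and a[name] != b[name]
    }


def zpool_get_all(pool):
    """Run `zpool get all <pool>` and return its text output."""
    result = subprocess.run(
        ["zpool", "get", "all", pool],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Example usage on the server itself:
#   tank = parse_zpool_get(zpool_get_all("tank2tb"))
#   vault = parse_zpool_get(zpool_get_all("vault"))
#   for name, (t, v) in differing(tank, vault).items():
#       print(f"{name}: tank2tb={t} vault={v}")
```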