I am trying to destroy a dense, large filesystem and it's not going well.
Details:
- zpool consists of 3 x 12-drive raidz3 vdevs.
- target filesystem to be destroyed is ~2T with ~63M inodes.
- OS: FreeBSD 10.3 amd64 with 192 GB of RAM.
- 120 GB of swap (90 GB recently added as swap-on-disk)
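(The extra 90 GB is a plain partition enabled as swap. For completeness, this is the general shape of the fstab entry, with a made-up device name:)
Code:
# /etc/fstab -- added swap partition (device name hypothetical)
/dev/da36p2  none  swap  sw  0  0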
What happened initially is that the system locked up after a few hours of uptime and I had to reboot. Upon rebooting and starting ZFS, I see sustained disk activity in gstat, *and* the sustained activity is usually just 6 disks reading. Two raidz3 vdevs are involved in the filesystem I am deleting, so there are 6 parity disks ... not sure if that is correlated or not.
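(For reference, this is roughly how I'm watching the disks; -a limits the display to providers that are actually doing something:)
Code:
# refresh every second, show only devices with non-zero activity
gstat -a -I 1s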
At about the 1h40m mark of uptime I see things start to happen in top: a sudden spike in load, and a drop in the amount of "Free" memory as reported in top:
Code:
Mem: 23M Active, 32M Inact, 28G Wired, 24M Buf, 159G Free
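(Besides top, a crude poll like the following should track the ARC against free memory over the run; I believe these sysctl names are stock on 10.x:)
Code:
# ARC size in bytes, free memory in pages, sampled once a second
while :; do
    sysctl kstat.zfs.misc.arcstats.size vm.stats.vm.v_free_count
    sleep 1
done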
It drops down under a GB and then fluctuates up and down until eventually it reaches some small amount (41 MB). As this drop starts, I see gstat activity on the zpool drives cease; there's some light activity on the swap devices, but not much. Also, the amount of swap used is reported as very little, anywhere from under a MB to 24 MB, and swapinfo shows nothing used (commands after the list below). After the memory usage settles, the system eventually ends up in a locked state where:
- nothing is going on in gstat; the only non-zero number is the queue length for the swap device which is stuck at 4
- load drops to nothing, and occasionally I see the zfskern and zpool procs stuck in the vmwait state.
- shell is unresponsive, but carriage returns register
- there are NO kernel messages of any kind on the console indicating a problem or resource exhaustion
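(Swap usage is being read with swapinfo; per its man page it is equivalent to pstat -s, so both should agree that essentially nothing is paged out:)
Code:
# per-device swap usage, human-readable sizes
swapinfo -h
pstat -s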
Finally, I cannot do this:
Code:
# zdb -dddd pool/filesystem | grep DELETE_QUEUE
zdb: can't open 'pool/filesystem': Device busy
(presumably because it is pending destroy ...)
I had set:
Code:
vm.kmem_size="384G"
(and nothing else in loader)
but even removing that and setting more realistic figures like:
Code:
vm.kmem_size=200862670848
vm.kmem_size_max=200862670848
vfs.zfs.arc_max=187904819200
have not resulted in a different outcome, though I no longer see the processes in vmwait; their state is just "-".
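(These all go in /boot/loader.conf; after each reboot the effective values can be read back to confirm they took, e.g.:)
Code:
# confirm the boot-time tunables actually took effect
sysctl vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max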
I've just lowered these to:
Code:
vm.kmem_size=198642237440
vm.kmem_size_max=198642237440
vfs.zfs.arc_max=190052302848
to see if that will make a difference (that is exactly 185 GiB for kmem and a 177 GiB ARC cap, against 192 GB of physical RAM).
No matter how many times I reboot (about 6 so far), I never make it past the 1h40m mark and this memory dip. I don't know if I'm making any progress or just running into the same wall each time.
My questions:
- is this what it appears to be, memory exhaustion?
- if so, why isn't swap utilized?
- how would I configure my way past this hurdle?
- a filesystem has a DELETE_QUEUE ... does the zpool itself have a destroy queue of some kind? I am trying to see whether I can watch the zpool working and how far along it is, but I do not know what to query with zdb
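One lead I plan to check once I can keep a shell alive: if the pool has the async_destroy feature enabled, the destroy runs in the background and its progress is supposed to show up in the pool-level freeing property, which sounds like exactly the "destroy queue" I'm asking about. If I have that right, something like this should show the remaining work (pool name is a placeholder):
Code:
# is background destroy available on this pool?
zpool get feature@async_destroy mypool
# bytes still to be reclaimed by the background destroy; should shrink over time
zpool get freeing mypool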
Thanks!