FreeBSD 9.1 + ZFS deadlock at shutdown

I don't know if there's already a report about this type of problem. I'd appreciate it if someone with the skills could reproduce this and submit a debug report. I don't care about credit; I would just like this fixed.

I tried FreeBSD 9.1 (RELEASE/RELENG/STABLE) with ZFS and iRedMail, and it hangs when I issue a shutdown command. The last message displayed is "All buffers synced", and then it hangs indefinitely until I hit the reset/power button.

I tried FreeBSD 9.1 without ZFS and it shuts down normally.

I tried FreeBSD 9.0 with ZFS and it shuts down normally.

So the problem is FreeBSD 9.1 with ZFS.

This only happens after I install iRedMail. If I uninstall or delete iRedMail, it shuts down normally. If I reinstall iRedMail, it hangs again. iRedMail's mail daemons must be causing some kind of deadlock that prevents FreeBSD 9.1 from shutting down normally. I think this is a serious problem if the server is at a remote location without physical access to the power button.

I was able to reproduce this problem in VirtualBox and on two different computers.

The server can be shut down with 'shutdown -n -o -r now', but that is risky: -n tells it not to flush the file system cache before rebooting.
 
Remington said:
I tried FreeBSD 9.1 (RELEASE/RELENG/STABLE) with ZFS and iRedMail, and it hangs when I issue a shutdown command. The last message displayed is "All buffers synced", and then it hangs indefinitely until I hit the reset/power button.

What is "indefinitely" in your case?
Sometimes I had to wait 5-10 minutes for servers with large ZFS pools to shut down after "All buffers synced" was displayed.
 
I've also seen this problem with FreeBSD 9.1 and ZFS.

It occurs on my hosted server which has 2GB RAM, with a single mirrored zpool containing 22 datasets.

The server operates just fine for months on end, but when I come to reboot it I get the same deadlock. I then have to remotely power cycle it to recover.
 
jem said:
I've also seen this problem with FreeBSD 9.1 and ZFS.

It occurs on my hosted server which has 2GB RAM, with a single mirrored zpool containing 22 datasets.

The server operates just fine for months on end, but when I come to reboot it I get the same deadlock. I then have to remotely power cycle it to recover.

My hosted server has 16GB RAM and two 1TB drives in a mirrored ZFS pool as well, and it hangs indefinitely. I duplicated the problem in a VirtualBox VM with 4GB RAM and mirrored ZFS virtual drives. Same problem.

I have not tested whether it works with three or more drives in a raidz pool.

It works fine in FreeBSD 9.0 and it does not hang or deadlock.
 
User23 said:
What is "indefinitely" in your case?
Sometimes I had to wait 5-10 minutes for servers with large ZFS pools to shut down after "All buffers synced" was displayed.

In my case it was 30 minutes or more. That is unacceptable for a hosted server. I don't like to use 'shutdown -n -o -r now' or press the button remotely during the deadlock and risk data loss or corruption.
 
@Remington
Have you tried 9.1-STABLE to check whether it's already fixed?

As you have ZFS there, you may want to use Boot Environments (sysutils/beadm) so you can get back to 9.1-RELEASE after upgrading to 9.1-STABLE.
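
A rough sketch of that idea, assuming sysutils/beadm is installed and the system boots from ZFS. The boot environment name and the run/be_prepare/be_rollback helpers are made up for illustration; only 'beadm create', 'beadm list' and 'beadm activate' are real beadm subcommands. By default it only prints the commands, so it is safe to try first:

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN=0 is set.
# (run, be_prepare and be_rollback are hypothetical helper names.)
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Before upgrading: keep a copy of the running system as a boot environment.
be_prepare() {
    run beadm create "$1"   # clone the currently active boot environment
    run beadm list          # check that the new environment shows up
}

# If the upgrade misbehaves: boot from the saved environment next time.
be_rollback() {
    run beadm activate "$1"
}
```

For example, 'be_prepare 9.1-RELEASE-backup' before the upgrade and 'be_rollback 9.1-RELEASE-backup' followed by a reboot to go back, both with DRY_RUN=0 once the printed commands look right.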
 
vermaden said:
@Remington
Have you tried 9.1-STABLE to check whether it's already fixed?

As you have ZFS there, you may want to use Boot Environments (sysutils/beadm) so you can get back to 9.1-RELEASE after upgrading to 9.1-STABLE.

I already tried -RELEASE, -RELENG and -STABLE with the same result. I tried HEAD but had no luck getting it to compile, as it's a highly experimental branch.
 
The latest FreeBSD 9.1-STABLE branch does work: it shuts down and reboots normally.

However, the FreeBSD 9.1-RELEASE and -RELENG branches continue to have problems, so I am going to use the -STABLE branch.

I use the following buildworld procedure:
Code:
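# fetch the stable/9 sources, then build world and build/install the kernel
svn co svn://svn.freebsd.org/base/stable/9/ /usr/src
cd /usr/src
make buildworld
make kernel
# <reboot into single-user mode>
# remount root read-write and mount the ZFS datasets
mount -u /
zfs mount -a
cd /usr/src
# pre-install config check, install world, then merge config files
mergemaster -p
make installworld
mergemaster -Ui
# remove obsolete files and libraries
make delete-old
make delete-old-libs
# <reboot>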
 
Remington said:
However, the FreeBSD 9.1-RELEASE and -RELENG branches continue to have problems, so I am going to use the -STABLE branch.
Just to clarify, RELENG_9 = 9.1-STABLE while RELENG_9_1 = 9.1-RELEASE-pX and RELENG_9_1_0 = 9.1-RELEASE (without any -pX).
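
To make that mapping concrete, here is a small sketch that takes a version string as printed by 'uname -r' and reports the matching tag. The function name branch_for is made up, and the svn paths for releng/9.1 and release/9.1.0 are my assumption about the repository layout (only base/stable/9 appears earlier in this thread):

```shell
#!/bin/sh
# Map a `uname -r` style version string to its branch tag and svn path.
# branch_for is a hypothetical helper; it covers only the three cases above.
branch_for() {
    case "$1" in
        9.1-STABLE)     echo "RELENG_9 base/stable/9" ;;
        9.1-RELEASE-p*) echo "RELENG_9_1 base/releng/9.1" ;;       # path assumed
        9.1-RELEASE)    echo "RELENG_9_1_0 base/release/9.1.0" ;;  # path assumed
        *)              echo "unknown"; return 1 ;;
    esac
}
```

Running 'branch_for "$(uname -r)"' on the machine in question should show which of the three it is actually on.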
 