I spent a long time trying to research and fix this, but unfortunately failed.
The actual symptom is very sporadic, which easily misleads people into thinking they have fixed it. Whenever a new terminal is spawned (it doesn't have to be a second terminal), there's a chance the window will close almost immediately. Sometimes it happens almost every time. Sometimes it only happens once in 100 times. Sometimes it won't happen for days.
GTK+-based terminals use a GNOME library called libvte. This provides a GTK+ widget for the terminal itself, and also spawns the user's shell as a child process (fork+exec). libvte then communicates with the shell over a pipe, which it monitors using glib (g_io_add_watch_full); it calls this watcher a reaper. When it detects that the shell has terminated (it receives G_IO_HUP), it destroys its own widget. Both gnome-terminal and xfce4-terminal watch for the vte widget being destroyed, and then terminate the terminal process. Usually, typing exit in the terminal is what triggers this, so it's normally the right thing to do.
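To make the spawn-and-watch pattern above concrete, here is a minimal sketch in Python. This is not libvte's code: it assumes a plain pipe, uses select.poll as a stand-in for g_io_add_watch_full, and watch_child is a made-up helper name. It only shows the general shape: spawn the shell, wait for a hang-up on the pipe, then reap the child and look at its exit code.

```python
import os
import select
import subprocess


def watch_child(argv, timeout_ms=5000):
    """Spawn argv, watch its output pipe for hang-up/EOF, then reap it.

    Hypothetical helper for illustration only. Returns (output, exit_code).
    """
    child = subprocess.Popen(argv, stdout=subprocess.PIPE)
    fd = child.stdout.fileno()

    poller = select.poll()
    poller.register(fd, select.POLLIN | select.POLLHUP)

    chunks = []
    hung_up = False
    while not hung_up:
        events = poller.poll(timeout_ms)
        if not events:
            # Nothing happened within the timeout: give up on the child.
            child.kill()
            break
        for _fd, mask in events:
            if mask & select.POLLIN:
                data = os.read(fd, 4096)
                if data:
                    chunks.append(data)
                else:
                    hung_up = True  # EOF: the other end of the pipe closed
            elif mask & (select.POLLHUP | select.POLLERR):
                hung_up = True  # the equivalent of seeing G_IO_HUP

    child.stdout.close()
    exit_code = child.wait()  # reap the child, as the "reaper" would
    return b"".join(chunks), exit_code
```

Calling watch_child(["/bin/sh", "-c", "exit 0"]) returns as soon as the shell goes away, and the watcher cannot tell a deliberate exit from an unexpected one. That is the same position the terminal is in when the shell exits with code 0 on its own.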
The root of the problem is that the /bin/sh process is exiting normally (exit code 0, EXIT_SUCCESS) for some reason. What makes it so ridiculously insane to debug is that it happens approximately 200ms - 2000ms after libvte has completed initialization. The /bin/sh process starts, prints PS1, and waits a bit, and then boom. It just closes all of a sudden. Your terminal is in the middle of gtk_main_iteration_do when this happens, so you can't place any debug-printf statements inside either xfce4-terminal or libvte to catch what exactly is doing it.
The only way this is going to be fixed properly is for someone to build their own /bin/sh and insert their own debug hooks to determine why it is exiting "normally." I was already getting overwhelmed by the very large codebases of xfce4-terminal and libvte, and didn't want to move on to building debug-hooked versions of system tools that my OS needs in order to run at all, so I had to reluctantly give up. We will likely need a core FreeBSD developer to take interest in this.
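Short of rebuilding /bin/sh, a thin wrapper can at least record how a shell ended. The sketch below is only an illustration under assumptions: describe_exit is a hypothetical helper, and it relies on Python's documented subprocess behavior that a negative returncode means the child was killed by that signal. It distinguishes a normal exit (like the exit code 0 seen here) from death by signal, but it cannot say why the shell chose to exit.

```python
import signal
import subprocess


def describe_exit(argv):
    """Run argv to completion and describe how it ended.

    Hypothetical helper for illustration only.
    """
    code = subprocess.run(argv).returncode
    if code < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        return "killed by " + signal.Signals(-code).name
    return "exited with status %d" % code
```

For example, describe_exit(["/bin/sh", "-c", "exit 0"]) reports a normal exit, which matches what the terminal observes in this bug. Finding the reason would still need hooks inside the shell itself.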
This bug never occurs with non-libvte terminals, such as xterm and urxvt. But of course, those are far less "elegant."
What I believe to be the underlying cause of the issue:
libvte was written by the GNOME team, which mostly focuses on Linux. There, on many distributions, /bin/sh is actually GNU bash in drag, whereas FreeBSD has a real sh without all the bash extensions present. I believe that libvte is doing something that bash is fine with, but that FreeBSD sh sporadically is not okay with. I suspect the fix will belong in libvte, but given all the GNOME 3 / systemd / Linux-only stuff lately ... I am not sure how receptive they'd be. Then it'd be on FreeBSD to handle whatever weird thing it is they are doing, which might possibly also be an issue. It's a nasty gray-area issue.
What is really bizarre is that I don't actually see any writes from libvte to the /bin/sh file descriptor at the time it exits. Yet if it were strictly a bug in /bin/sh, it would manifest itself with xterm as well.
The only reliable workaround I have found is the same as @onyxrev's:
sudo pkg install bash && chsh -s /usr/local/bin/bash && logout
When you log back in to Xorg, your virtual terminals will be using bash as well now, and the problem goes away entirely.
Things I can definitively rule out:
- this is not caused by your shell profile (e.g. set +o emacs, etc.)
- this is not caused by dbus (dbus enables terminal tabs to be dragged between windows)
- this is not caused by the video driver
- this is not caused by the input section of xorg.conf
- this is not caused by a compositor
- this happens with multiple versions of libvte
- this has existed through several FreeBSD release builds; I've seen people reporting it since FreeBSD 6.x, and it is still present in 10.0
If you want the relevant code sections for xfce4-terminal and libvte, I posted my research here, with the interesting research on page 2:
http://board.byuu.org/viewtopic.php?f=7&t=4601 (warning: crass language, intense frustration contained therein.)
Note: I don't know if this also happens with csh, as I don't use it.