Problem opening files

kr0m · Oct 23, 2020

I have narrowed the problem down: if I make a symlink on the ZFS filesystem that points to an ext4 filesystem, the app crashes. It's not the mount point per se, it's the symlink.

Code:

bagheera $ ~> ln -s /mnt/6T/evilDir/ evilDir
bagheera $ ~> ls -la evilDir
lrwxr-xr-x  1 kr0m  kr0m  16 Oct 23 22:57 evilDir -> /mnt/6T/evilDir/

That way it crashes, but it's not related to symlinks in general; it only happens with the name evilDir. If I create the symlink under a different name:

Code:

bagheera $ ~> ln -s /mnt/6T/evilDir/ evilDir2
bagheera $ ~> ls -la evilDir2
lrwxr-xr-x  1 kr0m  kr0m  16 Oct 23 23:09 evilDir2 -> /mnt/6T/evilDir/

It doesn't crash. I suspect it has to do with GTK bookmarks pointing to symlinks on other filesystems.

kr0m · Oct 23, 2020

richardtoohey2 said:
So was everything working before some sort of update? What update(s) did you do if any?
I don't remember whether it started failing after a particular update.

You seem to have moved away from discussing the /etc/fstab errors - are those still reported? Is your /etc/fstab correct?
I reported some strange messages about fstab in my second post.

If it seems to be to do with opening recent files, is there anything wrong with your home file directory or location of dot files? Maybe the editors are crashing trying to read or handle their recent file lists? Can you figure out e.g. where geany's recent file list is stored and clear it?
I have cleared the recent file list in the geany config (the recent_files= parameter in .config/geany/geany.conf), but it still crashes.

If you cat any of the files that cause issues, does the cat work OK? Think it will based on what you've said about things like vi working.
cat and other tools work perfectly; geany and the other graphical editors crash regardless of which file is opened.

I can't figure out why a symlink with one specific name makes the app crash while a symlink with a different name to the same destination directory does not.

kr0m · Oct 23, 2020

I only use pkg for everything. And yes, I update my system regularly, but I can't say when it started to crash.

VladiBG · Oct 24, 2020

Did you run any memtest?

_martin · Oct 24, 2020

I realized the crash happened inside the function, not on entry to it. Hence those regs don't say much about the state (disassembling a few instructions before and after would help). But it's hard to debug it like this, non-interactively.

But truss does show the SIGSEGV at address 0xa0, so it is an access to invalid memory. Hence most likely a bug. The report anonymous9 showed you is interesting.

I don't have any X on my FreeBSD machines, so I can't test it myself. But it might also be worth asking on the mailing list and/or replying to the bug mentioned above.

_martin · Oct 24, 2020

Out of curiosity I spun up a VM and installed gnome3 there. It's a 12.1-RELEASE (r354233) amd64 VM. I've recreated the setup you have (/mnt/6T is the ext4 FS, mounted with fusefs-ext2); the whole system is on ZFS. I tried to open/edit several files with geany but was not able to reproduce the bug. I used the 12.1 install image and then used pkg to install binary packages.

There are probably way too many programs to list, but the most obvious:

Code:

dbus-glib-0.110                GLib bindings for the D-BUS messaging system
glib-2.66.0_1,1                Some useful routines of C programming (current stable version)
geany-1.36                     Fast and lightweight GTK+ IDE
gnome3-3.36                    "meta-port" for the GNOME 3 integrated X11 desktop

EDIT: I didn't have the ext4 FS in fstab but rather used an md device and set the mountpoint manually. Once I put it in fstab I was able to reproduce the crash.
I used geany and tried to open a file following the symlink. flockfile() is passed a NULL stream (now held in rbx), so the read at [rbx+0xa0] faults at address 0xa0:

C:

=> 0x00000008016556df <flockfile+15>:    cmp    QWORD PTR [rbx+0xa0],rax

I'll try to play around with it, though I haven't used any X programs for some time now.

kr0m · Oct 25, 2020

anonymous9 said:
Do you use Gnome? The problem is with glib. Try the link to the bug in my previous post.

I don't use Gnome, I use awesome. I have checked the link and it seems to be related.

olli@ · Oct 25, 2020

kr0m said:
Tested: the only source of problems is the #/dev/ada2p1 /mnt/6T ext2fs rw 0 0 line in my fstab file. I will try fsck.ext4.

I would regard the ext2fs support in FreeBSD as experimental, especially when mounted read+write. I’m not really surprised that it may cause problems. I recommend using it in read-only mode, or avoiding it altogether if possible.

Another option would be to use the ext2fs FUSE module (from packages / ports) instead of FreeBSD’s own ext2fs driver. It’s less efficient because it runs in userland instead of the kernel, but it might be more reliable.

_martin · Oct 26, 2020

I chose geany as the program to debug. The crash occurs *sometimes* when I bring up the open file dialog (Ctrl+O); no file actually has to be opened. Note that the program doesn't always crash in the flockfile() function. That's just an observation, not a conclusion. glib is big-ish and I've never debugged it before.

What is more interesting is that it doesn't matter which FS I'm trying to open files on. The only thing that matters is the contents of /etc/fstab. I was able to use geany without any problem on an ext4 filesystem once I removed the entry from fstab (i.e. mount it, then remove it from fstab afterwards).

I was able to trigger the same issue when no ext4 FS was mounted but one was specified in fstab. I used UFS.
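Since the trigger appears to be what is listed in /etc/fstab rather than what is mounted, it can help to see which entries are actually active (not commented out). The following is only a minimal Python sketch: `active_fstab_entries` is a hypothetical helper, and the sample text is made-up illustrative content modelled on the line quoted earlier in the thread, not anyone's real fstab.

```python
def active_fstab_entries(text):
    """Return (device, mountpoint, fstype) for every non-comment fstab line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and lines that are commented out.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # A usable entry has at least device, mountpoint and fstype.
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


# Illustrative content only; not taken from a real system.
sample = """\
# Device        Mountpoint  FStype  Options  Dump  Pass#
/dev/ada0p2     none        swap    sw       0     0
/dev/ada2p1     /mnt/6T     ext2fs  rw       0     0
#/dev/ada3p1    /mnt/old    ext2fs  rw       0     0
"""

for device, mountpoint, fstype in active_fstab_entries(sample):
    print(device, mountpoint, fstype)
```

To run it against a real system, pass the result of `open("/etc/fstab").read()` instead of `sample`. Commenting out a suspect line and re-running the editor is then the same test kr0m described.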

The actual crash seems to be related to g_unix_mount_point_at(). Sometimes a NULL string is passed to strcmp(); sometimes it crashes in flockfile() on a NULL structure.

geany runs 8 threads. A few of them (pool-geany) are doing the same thing (parsing). It feels like some sort of race condition.

UPDATE: I've fetched the ports tree and installed glib from there. With version glib-2.66.2,1 I can't trigger the bug anymore.

kr0m · Oct 26, 2020

I am using packages, so when glib-2.66.2,1 arrives in the binary packages the bug will be solved, won't it?

_martin · Oct 26, 2020

I'd say yes, you'll get this fixed with the new version of glib.

Also, if you check the fixes that are in the port version:

Code:

------------------------------------------------------------------------
r552776 | fluffy | 2020-10-20 01:33:28 +0200 (Tue, 20 Oct 2020) | 9 lines

devel/glib20: lock getfsent() usage to fix some consumers crashes

Add temporary fix while more correct solution is cooking in GNOME repo
(see details at https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1707)

PR:        250311
Submitted by:    sigsys@gmail.com
Reviewed by:    tijl

It is related to the bug linked by anonymous9.
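The idea behind "lock getfsent() usage" can be pictured with a toy model. The real getfsent() family keeps one process-wide read position, so several threads scanning fstab at once can disturb each other. The Python sketch below is not the glib patch and does not call libc: `setfsent` and `getfsent` here are stand-ins written for illustration, `_ENTRIES` is made-up data, and `read_all_entries_locked` and `run_threads` are hypothetical names. It only shows the general technique of serializing every scan behind a single lock.

```python
import threading

# Toy stand-in for the fstab entries; purely illustrative data.
_ENTRIES = ["/", "/mnt/6T", "/tmp"]

# Shared cursor, mimicking the process-wide state behind getfsent().
_cursor = 0

# One lock guarding every use of the shared cursor.
_fsent_lock = threading.Lock()


def setfsent():
    """Rewind the shared cursor (toy version, not the libc call)."""
    global _cursor
    _cursor = 0


def getfsent():
    """Return the next entry, or None at the end (toy version)."""
    global _cursor
    if _cursor >= len(_ENTRIES):
        return None
    entry = _ENTRIES[_cursor]
    _cursor += 1
    return entry


def read_all_entries_locked():
    """Scan the whole table while holding the lock so scans cannot interleave."""
    with _fsent_lock:
        setfsent()
        result = []
        while True:
            entry = getfsent()
            if entry is None:
                break
            result.append(entry)
        return result


def run_threads(n=8):
    """Run n concurrent scans and collect what each thread saw."""
    results = [None] * n

    def worker(i):
        results[i] = read_all_entries_locked()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With the lock held for the full rewind-and-scan sequence, every thread in `run_threads()` sees the complete list. Without it, one thread's rewind could land in the middle of another thread's scan, which is the kind of interleaving the race-condition guess above points at.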