Solved Enlightenment application menu icons missing due to efreetd crashing

Hello to anybody having issues with Enlightenment menu icons missing or disappearing. By menu I mainly mean the application menu, which loses some or all of its application icons. The other symptom is a message that pops up just after login, mentioning a failure to connect to efreetd.

After 2 to 3 weeks of intensive debugging I have found the root cause and have a clean and proper fix for it, not a dirty hack. The problem is FreeBSD specific and there is an easy, long-standing fix that I intend to send upstream ASAP. I am starting with this thread to help anyone using Enlightenment on FreeBSD get it resolved as quickly as possible. Next I will try to get the FreeBSD ports Enlightenment team to make a patch release until upstream takes official patches.

Symptoms

- Disappearing menu icons in application menu, or perhaps missing icons elsewhere too.
- Dialog pop-up message just after login mentioning efreetd (timeout trying to connect).
- Running /usr/local/bin/efreetd from the command line (terminal) crashes, resulting in an "Abort trap" error message.
- First login to Enlightenment has an apparently never-ending wait for the initial locale (language) selection dialog to appear.
- Potentially any badly behaving process that uses the devel/efl ecore mainloop function, waiting on many file descriptors.

Known Triggers

- Happens after installing the Breeze icons theme package x11-themes/kf5-breeze-icons.
- Happens after installing almost any KDE/KF5 (KDE Frameworks 5) application, which in turn installs Breeze icon theme package.
- Likely happens after installing one or many theme packages that result in a large number of icons, mime or xdg desktop files.


Diagnosis

After a lot of debugging and chasing the crash deep into the belly of the efl code, I found that it lies within the ecore mainloop function. EFL (Enlightenment Foundation Libraries) is the graphical toolkit, services and runtime libraries used by Enlightenment. This particular part of efl is responsible for I/O multiplexing, that is, waiting on many file descriptors at once from a single event loop.

The FreeBSD build of the mainloop function seems to use the quite dated, limited and low-performance select() method, whereas I'm guessing that Linux distributions build it to use epoll. This would explain why the issue exists in Enlightenment on FreeBSD and not on most (probably all) Linux distributions. Enlightenment and EFL are not nearly as well tested and cared for by the devs on FreeBSD as they are on Linux platforms.

The exact cause of the problem is FD_SETSIZE, the compile-time constant that fixes how many descriptors an fd_set passed to select() can hold. The default on FreeBSD is a meager 1024 unless it is specifically set to a higher value. This is just a very conservative value left over from UNIX's long history of low-powered hardware and scarce resources. It is still 1024 on most Linux distros too, e.g. recent RHEL versions and current Arch Linux. The Breeze icon set is huge, so when efreetd gathers those resource files into a cache and keeps it in a memory map, the daemon needs to open and process a great many files at once, hence the need for many file descriptors. In short, select() by default only handles descriptors numbered below 1024, but we need more than that when we have so many resource files (mainly icons).
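To check whether a given machine can run into this, the per-process descriptor limit can be compared with FD_SETSIZE. The following is only a minimal sketch; the function name is hypothetical and nothing in it is taken from the efl sources.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/select.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if the soft RLIMIT_NOFILE limit lets the process open
 * descriptors numbered FD_SETSIZE or higher (which select() cannot
 * track), 0 if it does not, and -1 if the limit could not be read.
 */
int nofile_limit_exceeds_fd_setsize(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    /* rlim_cur is one more than the highest descriptor number allowed. */
    return rl.rlim_cur > (rlim_t)FD_SETSIZE;
}
```

A result of 1 means a process that opens enough files will eventually be handed a descriptor that does not fit into an fd_set.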

The reason this was so hard to debug is that there is no meaningful error message or error code to go on. The "Abort trap" is generated by the OS at the system call level with no explanation. The reason for this can be seen in the man page for select():

The behavior of these macros is undefined if a descriptor value is less than
zero or greater than or equal to FD_SETSIZE, which is normally at least equal
to the maximum number of descriptors supported by the system.

This "undefined behaviour" is a trait of the POSIX standards that gives rise to unpredictable behaviour that is easy to introduce (no compiler warnings) and hard to debug (no warning or error messages). It is left to the developer to make sure at runtime that no file descriptor passed to these macros has a value of FD_SETSIZE or higher.
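The kind of runtime check this calls for can be sketched as a range-checked wrapper around FD_SET(). This is an illustration under my own naming (checked_fd_set is hypothetical), not the code used in efl or in any upstream patch.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/select.h>

/*
 * Hypothetical range-checked replacement for FD_SET().
 * Returns 0 on success. If fd cannot be represented in an fd_set built
 * with the current FD_SETSIZE, it prints a message, sets errno to EBADF
 * and returns -1 instead of invoking undefined behaviour.
 */
int checked_fd_set(int fd, fd_set *set)
{
    if (fd < 0 || fd >= (int)FD_SETSIZE) {
        fprintf(stderr, "fd %d is outside 0..%d, cannot use it with select()\n",
                fd, (int)FD_SETSIZE - 1);
        errno = EBADF;
        return -1;
    }
    FD_SET(fd, set);
    return 0;
}
```

With a guard like this in place, a descriptor that is too large produces a readable message rather than an unexplained abort.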

Quick Fix

It is an easy problem to fix, however: just set FD_SETSIZE to something more suitable. I think 8192 ought to do it. The select() man page tells us how to do this:

The default size of FD_SETSIZE is currently 1024. In order to accommodate
programs which might potentially use a larger number of open files
with select(), it is possible to increase this size by having the program
define FD_SETSIZE before the inclusion of any header which includes
<sys/types.h>.
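A quick way to see whether the override took effect is to look at the size of fd_set after defining FD_SETSIZE first. This is a minimal sketch with a hypothetical helper name. Going by the man page text above, it should report 8192 on FreeBSD. As far as I know, glibc on Linux gives fd_set a fixed size, so the define has no effect there and the result stays at 1024.

```c
/* Must come before any header that pulls in <sys/types.h>. */
#define FD_SETSIZE 8192

#include <limits.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/select.h>

/*
 * Hypothetical helper: number of descriptor slots actually allocated
 * inside an fd_set in this translation unit.
 */
size_t fd_set_slots(void)
{
    return sizeof(fd_set) * CHAR_BIT;
}
```

Every source file that touches the same fd_set has to be compiled with the same value, otherwise the files disagree about the size of the structure.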

For an immediate fix, we can build devel/efl manually from ports with the below patch (also attached). As root|toor user...

- First, install or update x11-wm/enlightenment from packages or ports, so all required binaries and dependencies are in place.
- Then get latest ports tree with either portsnap fetch extract or portsnap fetch update.
- Add the patch:

- Option 1: save the attached patch file into a directory named 'files' just inside the efl port and unzip it.
- Option 2: follow commands below...

Code:
cd /usr/ports/devel/efl
mkdir -p files
cat <<'EOF' > files/patch-src_lib_ecore_ecore__main.c
--- src/lib/ecore/ecore_main.c.orig    2021-05-17 09:13:01 UTC
+++ src/lib/ecore/ecore_main.c
@@ -4,6 +4,8 @@
 
 #define EINA_SLSTR_INTERNAL
 
+#define FD_SETSIZE 8192
+
 #ifdef _WIN32
 # ifndef USER_TIMER_MINIMUM
 #  define USER_TIMER_MINIMUM 0x0a
EOF

- Build the port with the patch: make.
- Uninstall the existing efl package: make deinstall.
- Install the newly patched package: make install clean.
- There should be no need to rebuild the x11-wm/enlightenment package itself.
- Restart Enlightenment session or reboot if already in a login session.

Prognosis

I think we need to get this patch added to a new port revision in the short term. In the medium to long term we need to get it upstream into the efl codebase in a way that the efl developers are happy with. In the long term maybe we should add code to use either poll or, even better, kqueue for the I/O multiplexing in ecore. The latter would be quite a lot of work. It is above my skill level at the moment, but if I have time I may dig into it at a later point.
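For comparison, here is a minimal sketch of waiting on descriptors with poll(), which takes an array of descriptors instead of a fixed-size bitmap and so is not bound by FD_SETSIZE. The function names are hypothetical and this is not how ecore is structured; it only shows the shape of the call.

```c
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: waits up to timeout_ms for any of the n
 * descriptors to become ready (readable, or in an error/hang-up state).
 * Returns the number of ready descriptors, 0 on timeout, -1 on error.
 */
int count_ready(const int *fds, size_t n, int timeout_ms)
{
    struct pollfd *pfds = calloc(n ? n : 1, sizeof(*pfds));
    int ready;

    if (pfds == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        pfds[i].fd = fds[i];      /* any descriptor value is acceptable */
        pfds[i].events = POLLIN;
    }
    ready = poll(pfds, (nfds_t)n, timeout_ms);
    free(pfds);
    return ready;
}

/* Small self-check: a pipe with one byte written should report ready. */
int demo_pipe_is_ready(void)
{
    int p[2], ready;

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], "x", 1) != 1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    ready = count_ready(&p[0], 1, 100);
    close(p[0]);
    close(p[1]);
    return ready;
}
```

kqueue would avoid rebuilding the descriptor array on every call, but it is FreeBSD/BSD specific, so it is left out of this sketch.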
 

Attachments

The patch should now be in the ports tree in efl 1.25.1_9.

Upstream has been patched with a more robust approach that allows FD_SETSIZE to be set with a meson build-time flag, plus a nice check function that outputs a meaningful error message if the fd reaches FD_SETSIZE.


I'm considering this done for now.
 