Toward a smarter ports system

I just had to update the lang/php5 port and all of the related php5-* libs because of a minor change in one of the dependencies of a single php5-* port (for more info see this GitHub commit).

So, basically, I think the ports system should be upgraded with some kind of "delta awareness", in which packages can be "notified" whether a certain change requires a recompile or not. Taking the above linked commit as an example, the maintainer could flag to the ports system that the revision from 5.4.33(_0) to 5.4.33_1 (in all the relevant php5* packages that inherited the PORTREVISION) does not require a recompile.

Recompiling a dozen ports just because one barely related port got a PORTREVISION update wastes energy, and with it power and money. Elsewhere on the forum it was mentioned that the binary pkg system can't be updated more than once because there's no sufficient hardware power. Well, perhaps optimizing the ports system (from which the pkgs are built) is a great start toward having sufficient hardware power for the pkgs?
 
The port versions are usually not bumped unless there is a real need for recompilation of dependents. The real reason why recompiling is needed quite often is not related to the FreeBSD ports system or any kind of flaw in it; it's a direct consequence of the compiler/(dynamic) linker/runtime model used in software development all around. You have source code, object code, link libraries and the run time environment. Any change in the source code propagates changes to the object code, and those changes cannot be tracked completely backwards from the object code because of compiler optimizations. It's also not possible to do partial rebuilds of the object code because the compiler always treats the output as a single unit. It's only on the level of dynamic link libraries that you can potentially break the units apart and rebuild only the needed parts, assuming the dynamic link libraries in use haven't changed. However, when a dynamic link library changes, the dependents must be recompiled: the linker cannot rewrite (edit) already linked binaries because it has no access to the source code in a form it could use for such a task. The only option is to produce a new set of object files from the source code and link them into a new executable with updated references to the dynamic link libraries in use.

Hope this clarifies the issue.
 
kpa said:
Hope this clarifies the issue.

I understand that. But that's just one more reason to make the whole process more sensitive to what change is really required. Perhaps the building/linking process should be made not to require recompilation in cases like this. I'm pretty sure that, for example, ftp/php5-curl has zero dependency on libsybdb.so, which belongs to databases/freetds. I don't know what exactly LIB_DEPENDS does (does it change the linker lines or is it just used to verify the dependency integrity?), but I'm pretty sure /usr/local/lib/php/20100525/curl.so, for example, is not linked to libsybdb.so, according to the output of ldd.
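For anyone who wants to repeat that check, here is a minimal sketch. links_to is a made-up helper name and the LDD override exists only so the function can be exercised without a real binary; it is not part of the ports tree or of pkg. As far as I know, ldd also lists indirectly loaded libraries, so a miss here means the library is not pulled in at all.

```shell
#!/bin/sh
# Sketch: report whether a binary or module is linked against a given library.
# links_to is a hypothetical helper; LDD may be set to another command for testing.
links_to() {
    # $1 = path to the binary or shared object
    # $2 = library name to look for in the ldd output (fixed string, not a regex)
    "${LDD:-ldd}" "$1" 2>/dev/null | grep -F -q -- "$2"
}
```

Usage would be something like: links_to /usr/local/lib/php/20100525/curl.so libsybdb.so && echo linked || echo "not linked"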

The only effective change in the lang/php5 Makefile that triggered the recompilation of everything php5* was this. What's more, it's conditional on the mssql knob, which in my case was off.

Code:
 .if ${PHP_MODNAME} == "mssql"
-LIB_DEPENDS= libsybdb.so:${PORTSDIR}/databases/freetds-msdblib
+LIB_DEPENDS= libsybdb.so:${PORTSDIR}/databases/freetds

On the other hand, devel/ccache might fix the problem of unnecessary recompilations. The trouble is that, in my experience from Gentoo, devel/ccache can't be trusted: it has been known to break compilation and can actually slow things down (wasting even more energy). I'm pretty sure the same would apply to FreeBSD, as the problem is not OS specific but lies in the concept of ccache itself.

Edit: fixed curl path above, was looking at wrong package.
 
Is this something that could be solved by a shell script which, given a port or metaport as a parameter, decides whether it needs upgrading because of a minor version bump, or despite one? And could it work in parallel on a given set of installed ports, or on all installed ports? Maybe it could even rank all installed ports by the "severity" of needing an upgrade. For instance, I once put together a series of pipes or scripts whose output listed, at the bottom, libraries that may need upgrading and, at the top, desktop programs that may need upgrading [and/or, in the case of library ports, those with a large number of ports depending upon them]. One could upgrade, say, the libraries first, then pick and choose ports to upgrade from the remainder, saving time and effort weekly. pkg upgrade can sort of do that, if you answer "N", scroll back and just reinstall the ones in the center, given the three sections of upgrades about to be performed. [IIRC].
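A rough sketch of the ranking part, assuming the input is "name count" pairs where count is the number of installed packages depending on that package. Both function names are made up for illustration. As I read pkg-query(8), pkg query '%n %#r' produces that input, but treat that format string as an assumption and check the man page.

```shell
#!/bin/sh
# Sketch: rank packages by how many other installed packages depend on them.
# Input on stdin: one "name count" pair per line.

# Most depended-upon packages first (libraries tend to land at the top);
# ties are broken alphabetically by name.
rank_by_dependents() {
    sort -k2,2nr -k1,1
}

# Packages nothing else depends on (typically end-user programs).
leaf_packages() {
    awk '$2 == 0 { print $1 }'
}
```

Usage would be along the lines of: pkg query '%n %#r' | rank_by_dependents | head -20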

Though it did not fix chrome...
 
drhowarddrfine said:
This all sounds like what /ports-mgmt/portmaster was created for and /ports-mgmt/pkg otherwise.

Nope. What this is about is adding more information to port descriptions and packages so that unnecessary rebuilds could potentially be avoided. It's not as easy as it sounds. As I described above, there are very few scenarios where a rebuild could be avoided if there was a bit more knowledge about why a port version got bumped. Usually it is because something changed in the source code, or the options or dependencies changed; in those cases no amount of additional information is going to save you from a rebuild, because of just how certain things work. Other changes to ports that do not touch source code/options/dependencies are usually such that there's no need for a version bump, and therefore no rebuild is required.
 
Here's another idea for improvement: pkg awareness of daemons. When a running daemon's dependency, or its own package, is updated, pkg should be aware of it and warn the user, or optionally autorestart the daemon. I believe it can be done from pkg itself. For example, any package that drops a file into /usr/local/etc/rc.d is marked/remembered, so pkg will know when it might require a restart. Maybe it can even be extended to be aware of all running applications, not just daemons.
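The "drops a file into /usr/local/etc/rc.d" test can already be approximated from outside pkg. A minimal sketch, assuming "package path" pairs on stdin (which pkg query '%n %Fp' should produce, one line per installed file); packages_with_rc_scripts is a made-up name and paths containing spaces are not handled.

```shell
#!/bin/sh
# Sketch: list packages that install an rc script.
# Input on stdin: one "package path" pair per line.
# $1 (optional) = rc script directory, defaults to /usr/local/etc/rc.d
packages_with_rc_scripts() {
    awk -v dir="${1:-/usr/local/etc/rc.d}/" '
        # keep only files whose path starts with the rc script directory
        index($2, dir) == 1 { print $1, $2 }
    '
}
```

Usage would be something like: pkg query '%n %Fp' | packages_with_rc_scripts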
 
AzaShog said:
Here's another idea for improvement: pkg awareness of daemons. When a running daemon's dependency, or its own package, is updated, pkg should be aware of it and warn the user, or optionally autorestart the daemon. I believe it can be done from pkg itself. For example, any package that drops a file into /usr/local/etc/rc.d is marked/remembered, so pkg will know when it might require a restart. Maybe it can even be extended to be aware of all running applications, not just daemons.

From the sample pkg.conf(5) /usr/local/etc/pkg.conf.sample:

Code:
#HANDLE_RC_SCRIPTS = false;

Code:
HANDLE_RC_SCRIPTS: boolean
                  When enabled, this option will automatically perform
                  start/stop of services during package installation and dein-
                  stallation.  Services are only started on installation if
                  they are enabled in /etc/rc.conf.  Default: no.

The setting is left at false by default because defaulting to true would give the rc(8) scripts too many opportunities to do something unexpected without informing the user. As it is, you have to turn it on yourself and pay attention to what the scripts are going to do when packages get installed/deinstalled.
 
Interesting, I must have skipped that part... But that's a bit of a poor implementation. The better way would be for pkg to warn or ask the user to restart a certain service, and a command line option could be provided to default that to yes... Also, pkg is just one part of the solution; what about the base system daemons?
 
AzaShog said:
Interesting, I must have skipped that part... But that's a bit of a poor implementation. The better way would be for pkg to warn or ask the user to restart a certain service, and a command line option could be provided to default that to yes... Also, pkg is just one part of the solution; what about the base system daemons?

The matter was discussed a while ago on the mailing lists, and most of those in charge of anything were adamantly against automatic start/stop/restart of any service, port or base system. It would violate the principle of least surprise when doing system maintenance: in a server environment the system admin is the boss and knows what he/she wants, and automatic start/stop/restart on install/deinstall has no place there. It's also easy to argue that the rc(8) framework is a little too simplistic and limited to really facilitate proper automated start/stop/restart operations on install/deinstall, so it's safer to keep the default at false for now. When we get something better, let's say OpenRC or even something like Apple's launchd, it might become feasible to allow automated start/stop/restart of services on install/deinstall.
 
kpa said:
It would violate the principle of least surprise when doing system maintenance: in a server environment the system admin is the boss and knows what he/she wants, and automatic start/stop/restart on install/deinstall has no place there.

That doesn't make much sense. It would be just a convenience option to pkg, I'm not suggesting absolutist all-or-nothing-restart without a question. It doesn't even have to be automatic, pkg can present a message like "Service XY was just updated, restart? (y/N)". It can be both, for instance pkg could have options like pkg update --restart-demons and pkg update --warn-demons (or whatever the better sounding options would be), the former restarting and the latter asking to restart the services.

But the biggest reason why I'm suggesting this is not convenience, but security. If a package X depends on Y and Y had a security fix, how do I know that I have to restart X? What about X depending on Y depending on Z and Z had a patch?
 
It's very hard to properly track which services should be restarted after a package upgrade. I wouldn't rely on pkg doing this for me or having me ask questions about it.

Would it understand that application X puts internal data in memcached, and that application might have changed its internal data formats, so it's better to clear memcached? Would it know if MySQL is upgraded, that afterwards PowerDNS needs to be restarted, as in some configurations it dies when its MySQL connection goes away?

I don't really want to worry about those things, so I simply restart every application after I have upgraded packages. It also makes sure I don't get any surprises the next time the system boots. I use this oneliner: service -e | grep ^/usr/local/etc/rc | xargs -J % -n 1 % restart. This might not be for everyone, but I only upgrade packages at set maintenance times so a few extra restarts are not a problem for me.
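For those who would rather see what is about to be restarted first, here is a sketch of the same idea with a dry-run switch. restart_local_services and DRY_RUN are names invented for this example; the function expects rc script paths on stdin, one per line, as service -e prints them.

```shell
#!/bin/sh
# Sketch: restart every enabled service whose rc script lives under
# /usr/local/etc/rc.d, or only print the commands when DRY_RUN=1.
# Input on stdin: one rc script path per line.
restart_local_services() {
    grep '^/usr/local/etc/rc\.d/' | while IFS= read -r script; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "$script restart"
        else
            # keep the rc script from consuming the rest of the list on stdin
            "$script" restart </dev/null
        fi
    done
}
```

Usage would be: service -e | DRY_RUN=1 restart_local_services, and then the same without DRY_RUN once the list looks right.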
 
It's very hard to properly track which services should be restarted after a package upgrade. I wouldn't rely on pkg doing this for me or having me ask questions about it.

It might be without introducing new property values for package metadata. And why not pkg? Ultimately pkg is the package (de)installer. The ports system just compiles a package for pkg locally.

Would it understand that application X puts internal data in memcached, and that application might have changed its internal data formats, so it's better to clear memcached? Would it know if MySQL is upgraded, that afterwards PowerDNS needs to be restarted, as in some configurations it dies when its MySQL connection goes away?

Eventually, that would be really great and part of a really smart ports system.

All that matters here, though, is that the admin is somehow notified that package Z, which was just updated (or is about to be updated), is a dependency of X (through Y) and that X has a running daemon. pkg can then just inform, do nothing or restart the service, depending on which flags you passed to pkg. The default could be to do nothing, not even inform, as it is right now, for the sake of that minimum surprise principle.
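The "Z is a dependency of X through Y" lookup is a walk over reverse dependencies. A minimal sketch, assuming "package dependency" pairs on stdin (what pkg query '%n %dn' should print, one line per direct dependency); affected_by is a made-up name and this is not how pkg does anything internally.

```shell
#!/bin/sh
# Sketch: list every package that depends, directly or indirectly,
# on the package named in $1.
# Input on stdin: one "package dependency" pair per line.
affected_by() {
    awk -v start="$1" '
        # build a map: dependency -> space-separated list of its dependents
        NF == 2 { rdeps[$2] = rdeps[$2] " " $1 }
        END {
            # breadth-first walk from the updated package
            queue[1] = start; head = 1; tail = 1; seen[start] = 1
            while (head <= tail) {
                n = split(rdeps[queue[head++]], parts, " ")
                for (i = 1; i <= n; i++)
                    if (!(parts[i] in seen)) {
                        seen[parts[i]] = 1
                        queue[++tail] = parts[i]
                        print parts[i]
                    }
            }
        }'
}
```

Usage would be something like: pkg query '%n %dn' | affected_by Z. The resulting list could then be narrowed down to packages that ship an rc script and have a running daemon.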

I don't really want to worry about those things, so I simply restart every application after I have upgraded packages.

Well yeah, that's one solution. Sometimes that's not an option, though. But this thread is a bunch of suggestions for a smarter ports system anyway.
 
The new PKG packages have support for annotations: free-format notes that can be queried once the package is installed. Those would be ideal for any kind of metadata such as the one suggested. I'm still of the opinion, though, that PKG shouldn't know about services if it can be avoided, and should leave those decisions to the service(8) infrastructure.
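As a sketch of how such an annotation could be attached and read back, going by my reading of pkg-annotate(8) (-A adds a tag, -S shows it, -y skips the confirmation). The tag name restart_after_upgrade and both function names are invented for this example; nothing in pkg or service(8) acts on that tag today. The PKG override is only there so the functions can be tried without touching the package database.

```shell
#!/bin/sh
# Sketch: store and read a hypothetical "restart_after_upgrade" hint
# as a package annotation. PKG may be set to another command for testing.

# $1 = package name, $2 = value to store (e.g. yes or no)
annotate_restart_hint() {
    "${PKG:-pkg}" annotate -y -A "$1" restart_after_upgrade "$2"
}

# $1 = package name; prints the stored value, if any
show_restart_hint() {
    "${PKG:-pkg}" annotate -S "$1" restart_after_upgrade
}
```

Usage would be: annotate_restart_hint nginx yes, followed later by show_restart_hint nginx.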
 