Expand Profile-Guided Optimization (PGO) usage across FreeBSD packages


In this post, I want to start a discussion about several optimizations that could be useful for many FreeBSD packages. I hope this forum is the right place to talk about such package improvements.

Profile-Guided Optimization

I have been investigating the effects of Profile-Guided Optimization (PGO) on different kinds of software - all my current results are available at https://github.com/zamazan4ik/awesome-pgo. According to these results, enabling PGO improves overall performance in many cases. I think reducing the CPU usage of FreeBSD packages would be a valuable change for FreeBSD users.

PGO is already a well-known compiler optimization technique. Many OSes use PGO for their packages: some use it less, some (like ClearLinux) use it more. FreeBSD also uses PGO for some packages (like "libtre"). However, after a quick search through the packages, I think PGO support can and should be expanded to more of them.
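For readers who have not used PGO before, here is a minimal sketch of the usual three-stage workflow with Clang: build an instrumented binary, run it on a representative workload, merge the raw profiles, and rebuild using the merged profile. The script only assembles the command lines and does not run anything. The file names (`app.c`, `app`, `pgo-profiles`, `merged.profdata`) and the `--bench` workload argument are my own placeholders, and the tool names `clang` and `llvm-profdata` are assumptions that may differ depending on how LLVM is installed on your system.

```python
import shlex


def pgo_commands(source, binary, profile_dir="pgo-profiles",
                 merged="merged.profdata", workload_args=()):
    """Return the command lines for a basic Clang PGO build.

    All paths are hypothetical placeholders; nothing is executed here.
    """
    instrumented = f"{binary}-instrumented"
    return [
        # 1. Build with instrumentation; raw profiles are written to profile_dir.
        ["clang", "-O2", f"-fprofile-generate={profile_dir}",
         source, "-o", instrumented],
        # 2. Run the instrumented binary on a representative workload.
        [f"./{instrumented}", *workload_args],
        # 3. Merge the raw profiles into a single indexed profile.
        ["llvm-profdata", "merge", f"-output={merged}", profile_dir],
        # 4. Rebuild, letting the compiler use the collected profile.
        ["clang", "-O2", f"-fprofile-use={merged}", source, "-o", binary],
    ]


if __name__ == "__main__":
    for cmd in pgo_commands("app.c", "app", workload_args=["--bench"]):
        print(shlex.join(cmd))
```

The quality of the result depends heavily on step 2: the workload should resemble what users actually do with the package, which is probably the hardest part to define per package.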

As you can see from the link above, PGO helps many packages, and for many of them PGO is currently not enabled in FreeBSD. I am talking about compilers (Rust, Clang), LLVM-based tooling (Clangd, clang-format, etc.), and smaller utilities like difftastic. I guess more PGO-suitable packages can be found.

In FreeBSD Bugzilla, I saw some activity around enabling PGO for some packages. I am talking about the following issues:
I believe PGO is already enabled for more projects, but I haven't found them (yet).

Given all of the above, I have the following questions/proposals:
  • Could anyone tell me what the general opinion among FreeBSD maintainers is about enabling PGO?
  • For which packages is PGO used right now in FreeBSD? If you have performance numbers from enabling PGO, could you please share them?
  • If FreeBSD maintainers are not against enabling PGO for more packages, how do we want to handle it? Do we need to create a per-package issue like "Enable PGO for Rustc" or "Enable PGO for clang-format", or can we use another approach? I am not familiar with FreeBSD packaging policies and would like your opinion on that. As far as I can see, the per-package issue approach has been used for PGO in FreeBSD so far.

Post Link Optimization

Post Link Optimization (PLO) is a technique that optimizes a program further after PGO, with additional tweaks such as improving code locality in the binary according to a runtime profile (reducing I-cache misses). There are currently two main tools in the PLO area - LLVM BOLT and Google Propeller.
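To illustrate what "improving code locality" means, here is a toy model (not how BOLT or Propeller actually work; real tools use far more sophisticated layout heuristics and operate on basic blocks as well as functions). It places functions one after another in a binary and counts how many memory pages contain hot code, before and after moving the hot functions together. The function names, sizes, sample counts, and the 4096-byte page size are all made up for the example, and pages stand in here for cache lines and TLB entries.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass(frozen=True)
class Func:
    name: str
    size: int     # code size in bytes
    samples: int  # profile samples attributed to this function


def pages_touched_by_hot_code(layout, hot_threshold=1):
    """Count distinct pages that contain at least one byte of a hot function."""
    pages = set()
    offset = 0
    for func in layout:
        if func.samples >= hot_threshold and func.size > 0:
            first = offset // PAGE_SIZE
            last = (offset + func.size - 1) // PAGE_SIZE
            pages.update(range(first, last + 1))
        offset += func.size
    return len(pages)


def reorder_by_hotness(layout):
    """Place the most frequently sampled functions first (toy heuristic)."""
    return sorted(layout, key=lambda func: func.samples, reverse=True)


if __name__ == "__main__":
    # Hypothetical binary: hot functions interleaved with large cold ones.
    original = [
        Func("hot_a", 1000, 900),
        Func("cold_1", 6000, 0),
        Func("hot_b", 1000, 700),
        Func("cold_2", 6000, 0),
        Func("hot_c", 1000, 500),
    ]
    print("pages with hot code before:", pages_touched_by_hot_code(original))
    print("pages with hot code after: ",
          pages_touched_by_hot_code(reorder_by_hotness(original)))
```

In this made-up layout the hot code shrinks from three pages to one, which is the kind of effect that reduces instruction cache and TLB pressure in real programs.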

According to the Facebook research paper (https://research.facebook.com/publi...binary-optimizer-for-data-centers-and-beyond/), LLVM BOLT (https://github.com/llvm/llvm-project/blob/main/bolt/README.md) helps achieve better performance for various packages like compilers and interpreters. I think it would be a good idea to enable LLVM BOLT for some packages to deliver faster binaries to users (I suggest BOLT rather than Propeller because Propeller is less stable right now, in my opinion).

Here are some examples of how LLVM BOLT is already integrated into other projects:
So at least for the projects above, the effects of LLVM BOLT have been tested and some preparations have already been done upstream. In this case, it should be easier to enable BOLT for these packages in the future.

For some projects right now there is ongoing work on integrating LLVM BOLT into the build scripts:
More about LLVM BOLT performance results for other projects can be found in:
Some OSes (like Solus) have already integrated BOLT into their build scripts:
https://github.com/getsolus/packages/blob/main/packages/l/llvm/package.yml#L116 (LLVM package)
Unfortunately, LLVM BOLT currently does not work on FreeBSD: https://github.com/llvm/llvm-project/issues/72205 . If you believe that such an optimization would be valuable for FreeBSD too, please vote for the issue in LLVM upstream or write an email to the mailing list. Maybe then the BOLT developers will implement *BSD support sooner.

Thank you very much for reading the post!
P.S. I hope this many links is not against the forum rules - I just want to back up my words with some actual benchmarks/papers and show several existing PGO/PLO examples.