Maintaining many FreeBSD boxes effectively: how do you do it?

Hey all.

I wanted to toss around some thoughts and concerns that I've been losing some sleep over lately. Any insight would be very much appreciated. WARNING: LONG :)

In the past three years, I have created a lot of FreeBSD boxes, and we now have around 40 of them (mostly 8.4, some 9.2), a number that is only rising. There is also some Ubuntu here and there, which feels sorta-FreeBSD-ish to me. They all run miscellaneous applications.

I think we are doing a fair job at keeping the FreeBSD boxes in the air. I use a 'template' FreeBSD image with all our commonly used ports and configuration for quick installs. The FreeBSD base is easy to keep up-to-date. Applications are all installed through ports. I have machines send their pkg_info output to a central host so I know what versions are running.

It runs okay. For now. But with the number of servers growing, I see trouble on the horizon, and I wonder if some of you can share some insights and experiences. :)

Some of the problems I see in the future are the following:

1) Keeping packages updated consistently.
Custom compiling ports at their bleeding edge used to be my main reason for becoming interested in FreeBSD, back when operating systems based on binary packages often lagged behind on updates and had dependency issues. Many years later, I'm afraid this choice turns out not to scale that well. Building ports simply takes so long, and the process involves some manual fixing from time to time. The result is that I find it hard to keep all ports recent all the time. I focus on security vulnerabilities, and the rest is lagging. I really would like to increase consistency by keeping all machines on the same versions, but with ports on 40 machines it could be a full-time job.

Comments: I think going forward, I must simply stop building ports from source. That implies getting rid of nonstandard make options. At this point, being consistent and stable is better than having super tweaked ports. But I'm not experienced in running FreeBSD without ports. People say that pkg is going to make it better. Is it viable to live completely on binary packages from the project repository, in the way that is common on Debian/Ubuntu for instance? Or is it best to create package building infrastructure? (How much time is generally spent on setting that up and keeping it working?) What would be a good way to coordinate pushing updates to servers? I wonder how large installations handle updating their packages consistently and safely? Do they build packages themselves, test and then during a maintenance window push them out?

2) Major version upgrades.
In 2015, we will see the end of extended support for FreeBSD 8.x. While minor updates with freebsd-update are completely trouble-free, I dread doing the suggested portupgrade -f on every box. Rebuilding all ports generally takes around 4-6 hours, takes some manual attention, and feels somewhat experimental on every different box. Most important is the long maintenance time though.

Comments: I have tried this procedure on a few boxes and don't see it as really desirable. Currently I just do a clean install, have people test it, and migrate data. This can mostly be done during business hours. But that is still an enormous amount of work. I guess it partly reduces to my ports problem: if we could quickly reinstall all packages, a lot of the time and risk of major upgrades would go away. On the other hand, there is Ubuntu, which has 5-year LTS versions, and that looks appealing, as at first sight it seems to lower the frequency of disruptive upgrades and it may be easier to outsource. So, I wonder what kinds of things people have considered in order to maintain larger numbers of servers over a long lifespan. Is FreeBSD the best choice in that scenario?

3) Managing configuration.
As I mentioned, I clone a template image to create a new machine. However this makes every box sorta-custom, having sorta-different versions of packages, with developers adding configuration fixes which aren't always retroactively applied to old machines (especially if they are low-priority but potentially dangerous). It's not that bad for production, but it's suboptimal for management. Some port specific parts of /usr/local/etc that change often I keep updated in git.

Comments: I wondered for a long time what people were doing to solve this problem, but now I've been looking at Puppet for some weeks, and I think such a system is the way to go. It seems not a lot of people are using Puppet with FreeBSD, but it probably can be done, and we can start it in a simple way and then expand it as more custom configuration is needed. I also think I need to ditch my template FreeBSD image with all its customizations over the years. It's monolithic and a big unknown. It's also not compatible with many public clouds, which don't offer loading custom templates. It's better to start off each machine from a published FreeBSD ISO. Then through Puppet I should install the minimum of packages and configuration which are needed, and at known versions. Does that sound reasonable?

Concluding
So, these are some of my thoughts and concerns. To sum it up, I think basically my current way of working is not future-proof, and I am preparing for a big operation to completely overhaul it before FreeBSD 8.4 goes EOL. There are really many unknowns, but there is still time. I hope some of you can share your experiences, and if my thinking is off, PLEASE let me know :)

Cheers!
semafoor
 
Interesting thread.

Well, I don't maintain as many servers as you, but because the costs (time) for server maintenance all come out of my own pocket I'm always looking into ways to improve this process. And it's my experience that FreeBSD is quite suitable for that.

There are a few things to keep in mind here: all my servers use the same version of FreeBSD, 9.2 at the time of writing. Another aspect is that all my servers share the same software installation. They're all web servers and as such all use a FAMP approach ;) In case you never heard of this; that's just me being silly: FreeBSD, Apache, MySQL and PHP.

But those are two important details when it comes to mass administration in my opinion, because they allow you to treat your server park as a single unit instead of several individual components. And that can help you save time.

Also, I do use ports, which I compile myself. This has nothing to do with bleeding edge or anything like that; it's merely that it gives me more control over how the package should behave (for example, being able to configure the directory which Apache should allow for SuExec usage), and it also gives me a sense of security: I've seen it being built myself, so I know that the code which was used is 'clean' (to some extent anyway) and that nothing funny was done to, for example, enforce compilation by making the process ignore errors or such.

Please note: I'm not claiming that this is something people do who build and provide a binary repository. I'm merely saying that there's no way for me to know if they did or didn't.

So getting to the setup; the key in saving time on my end is that I build the ports only once. My backup server, which is mainly used for storage, is also my test environment for all the new packages. It shares the same (minimal) software installation which all the web servers have. So I don't build and upgrade ports on production servers without testing.

When a port is built on the backup server, using portmaster, it's also instructed to build (and maintain) a binary package of it. This repository is then provided to the other servers using SMBFS (the other servers don't contact the backup server directly, but go through the "interface jail" which is running there; as such, using NFS wasn't an option).

After that it is simple; the other servers are updated almost the same way (also using portmaster) with the direct exception that they're instructed to use the packages from the main repository. And because all servers share the same installation I don't have to worry about dependencies and such because those will be automatically resolved.

With a very sweet failsafe: because I have plenty of disk space, all servers also have a fully maintained ports tree available. So in the unlikely event that something does go wrong and a required dependency cannot be resolved using my own repository, portmaster will automatically fall back to using the ports collection.

Although this does result in some inconsistency between the servers it also protects the individual server from the risk of software not being updated or worse; not being able to run.

Best of both worlds.
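
A rough sketch of this build-once, install-everywhere split could look like the following. The /mnt/packages mount point and the portmaster_cmd helper are made up for illustration, and the flags are given as understood from portmaster(8), so check the man page for your version first. The helper only prints the command for each role, so nothing is built or installed until the output is run by hand:

```shell
#!/bin/sh
# Sketch only. PKGDIR is a hypothetical mount point for the shared package
# directory; the portmaster flags should be verified against portmaster(8).
PKGDIR="${PKGDIR:-/mnt/packages}"

# Print (do not run) the portmaster command for a given role.
portmaster_cmd() {
    case "$1" in
        build)
            # -g: also create a binary package of every port that is built.
            # PACKAGES tells the ports framework where packages are stored.
            echo "env PACKAGES=$PKGDIR portmaster -g -a"
            ;;
        client)
            # -P: install from packages, build from ports if none exists.
            echo "portmaster -P --local-packagedir=$PKGDIR -a"
            ;;
        *)
            echo "usage: portmaster_cmd build|client" >&2
            return 1
            ;;
    esac
}
```

On the build host, `portmaster_cmd build` prints the command that builds and packages everything; on a web server, `portmaster_cmd client` prints the one that prefers packages and only falls back to the local ports tree when a package is missing.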

Installing new servers is basically done in the same way. I start by installing ports-mgmt/portmaster on it and then provide the software list from another server. It will then install all those individual ports, but using my own repository first, thus saving a lot of time which would otherwise have been spent on building.

The main exception to all this is the base system. As of version 9.2 I'm using the main source tree to keep my base system, and the kernel of course, up to date. I have been toying with the idea to build some kind of meta package or "meta port" which could consist of /boot/kernel for further distribution, but in the end decided against it.

For the simple reason that whenever such an upgrade is in order the server needs to be taken down for a short while anyway, even if you resort to using freebsd-update. And the idea of having full control over how the base system is being built, and actually building it yourself (which is a process I usually start in the evening and then look at the results the next day), is also a very important issue for me.

So basically: common updates for my servers (ports collection) are done once and then distributed as binary packages, whereas the base system is maintained on an individual, per-server basis.

Hope this can give you some ideas.
 
1) I recommend dedicating a machine to building the packages yourself and using the built packages on other machines. The official binary packages are probably good for the majority of desktop users, but once you start changing the options and using customized ports for whatever reason you can no longer use the official binary packages with them, because the official packages are all built with the default options.

2) Once you have the infrastructure in place for 1) you can upgrade the base OS to a newer major version and build a full set of new packages on your package building system for the new OS version and reinstall everything pretty painlessly.

3) I also use devel/git for maintaining some parts of my local configurations. It's not optimal though, and I can see how it wouldn't work on a larger scale.
 
1/2: Don't compile ports on production machines/jails. Once you've tested out the particular options on a development machine/jail you'll want to use ports-mgmt/poudriere as a build host. When they added SSP to ports, I used a Poudriere bulk build overnight to rebuild all my ports and came back the next day and just did a pkg upgrade -fy to force a reinstall of all the now SSP-enabled ports. Took about 2 minutes per jail to reinstall all the new SSP-enabled packages. Far easier to fit that into a production maintenance window than recompiling on each machine.
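
As a minimal sketch of such a build host, assuming a jail called 92amd64, a ports tree called default and a list file at /usr/local/etc/poudriere-list (all three names are placeholders, as are the helper function names), the Poudriere side comes down to three steps: create the jail and tree once, keep a list of port origins, and run a bulk build:

```shell
#!/bin/sh
# Assumed names: jail "92amd64", ports tree "default", list file path.
JAIL="${JAIL:-92amd64}"
PTREE="${PTREE:-default}"
PKGLIST="${PKGLIST:-/usr/local/etc/poudriere-list}"

# One-time setup: create the build jail and a ports tree.
setup_builder() {
    poudriere jail -c -j "$JAIL" -v 9.2-RELEASE -a amd64
    poudriere ports -c -p "$PTREE"
}

# Write one port origin per line, the format "poudriere bulk -f" reads.
write_pkglist() {
    printf '%s\n' "$@" > "$PKGLIST"
}

# Overnight job: refresh the ports tree, then build everything on the list.
build_all() {
    poudriere ports -u -p "$PTREE"
    poudriere bulk -j "$JAIL" -p "$PTREE" -f "$PKGLIST"
}
```

For example, `write_pkglist www/apache22 devel/git` followed by `build_all` would build those two ports and their dependencies. Once the resulting repository is served to the clients, each of them only needs the pkg upgrade -fy mentioned above.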

Tutorials:
http://www.bsdnow.tv/tutorials/poudriere
http://forums.freebsd.org/showthread.php?t=38859

3: Use a configuration management framework like Puppet, Chef, Cfengine, Salt, etc. I use Puppet. How to set up a Puppet master server is linked below. That doesn't cover how to use it for clients, but there are plenty of resources out there.

http://forums.freebsd.org/showthread.php?t=42071
 
I currently maintain about 40 FreeBSD servers for a company. I can definitely recommend setting up poudriere and using puppet. I can reinstall a server from scratch in about 10 minutes. Puppet does most of the hard work, and getting packages from our own repository means I have complete control over the options and when to update. Setting up Puppet is a bit of a pain though. It requires writing a whole bunch of specialized Puppet scripts, but it does pay off in the end.
 
Wow! Lots of interesting answers, thanks :) You have helped me a lot!

To summarize what most people said, the way to go for package management would be to set up a build host. Using just packages in production should be doable, and as soon as that works, major updates shouldn't be a problem. In the meantime, configuration can be brought under control of Puppet or something like it.

Working with Poudriere looks like it won't take too much time, so I think I'll just go experiment with it and try to convert an existing box fully to packages :) If that works, I can just gradually start replacing my old ports with consistent versions from my package builder. As soon as a box is running on known port versions, it won't be such a problem to do a major upgrade. If port versions stay the same, it should just work out of the box. Excellent!

Maybe I don't even need to do clean installs of all machines after all! Although I'd probably like to have the boxes completely configured by Puppet. I have time to learn it, fortunately.

I've played with Puppet over the weekend and I am already starting to love it. There's not much in the way of FreeBSD modules for it, though. But that seems okay. Ideally, I'd just use cross-platform modules. Well, I've already made one: a simple firewall class that works with ipfw and iptables. Now, I can configure either a clean FreeBSD or Ubuntu install with my user/authkeys, bash settings, sshd, and a firewall using just a few lines:

Code:
node 'test' {

	include homes
	include utils
	include sshd

	class { 'firewall':
		allow => {
			'http' => ['any'],
			'ssh' => ['192.168.100.0/24'],
		}
	}
}

I haven't experimented with driving package updates from it yet; I think I am not going to spend time on pkg_* and will just use pkgng alone. It looks extremely promising!

Anyway! Thanks again, looks like I have lots of stuff to play with in the coming months :)
 
For Puppet you might want to run puppet module install zleslie/pkgng. That will give you a package provider for pkgng so you can use Puppet's package resource type. After that it's a simple matter of declaring a package resource to have something installed.

Code:
package { 'apache':
  ensure => installed,
  name   => 'www/apache22',
}

http://docs.puppetlabs.com/references/latest/type.html#package

Before you start writing modules as I did I highly recommend getting to grips with hiera as it will simplify a lot of things.
 
I'd like to give a status update and thank everyone again for their great input. :)

Puppet and Poudriere have both turned out to be essential and very much what I needed. I would definitely recommend them for anyone running more than a few machines. Together they really bring the control back that I always loved with FreeBSD when I was only responsible for a few boxes.

I've created Puppet modules for 95% of our required functionality and am now adding stuff as needed. Writing Puppet code was a serious effort, but not super crazy; I think a few hundred hours over the course of 6 months to set up everything about the OS, from firewalling to webservers, database servers, mail, DNS, etc. Basic scripting experience is enough to do it, and I get lots of suggestions and fixes just from team members reviewing my commits.

I bought "Puppet 3 Beginner's Guide", which explains most of what you need. It recommends just writing your own Puppet scripts so you fully understand what happens, and I would agree, especially on FreeBSD where various things are just slightly different from the assumptions made by the average Red Hat user who puts their code on GitHub. Most published Puppet modules have flexibility in the wrong places, are rigid in others, and would generally be a pain to make portable. As an experiment, I started making my code work for FreeBSD as well as Ubuntu, and this turned out to be realistic and cost only a little extra effort. Most of the work was in an Apache module that configures Apache, its modules and PHP extensions in exactly the same way on both OSes, including creating Debian's apache24/{conf|sites}-enabled snippet structure which I like.

I am not running a puppetmaster; I don't see why I should add a single point of failure where Git is perfectly capable of maintaining text files and keeping branches. I test out stuff on a few play boxes on a dev branch, and merge to master after some testing. I haven't had huge breakages yet.

We no longer have a "golden image" which is outdated the minute you make it and can only run where they allow you to deploy it. Just do a standard ISO install or rent any cheap VPS, and run the shell script which bootstraps the package system, installs Puppet, clones the Git repository with Puppet code and runs it. Then 5 minutes of installing and the machine is done. Then it updates the base system and if necessary reboots. Pretty cool and probably all the automation I need.
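
A bootstrap script along those lines might look roughly like this. It is only a sketch: the Git URL, the checkout directory and the modules/manifests layout inside the repository are invented for the example, and the package origins may differ per release. The small run helper echoes each command and skips execution when DRY_RUN=1 is set, which makes it possible to read through the steps on any machine first:

```shell
#!/bin/sh
# Hypothetical values: repository URL and checkout directory.
REPO_URL="${REPO_URL:-https://git.example.org/ops/puppet.git}"
CHECKOUT="${CHECKOUT:-/usr/local/etc/puppet-repo}"

# Echo each command; execute it unless DRY_RUN=1 is set.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Bootstrap pkg, install Puppet and Git, fetch the Puppet code, apply it.
# The && chain stops at the first step that fails.
bootstrap() {
    run env ASSUME_ALWAYS_YES=yes pkg bootstrap &&
    run pkg install -y sysutils/puppet devel/git &&
    run git clone "$REPO_URL" "$CHECKOUT" &&
    run puppet apply --modulepath="$CHECKOUT/modules" "$CHECKOUT/manifests/site.pp"
}
```

Calling `DRY_RUN=1 bootstrap` lists the four steps without touching the system; calling `bootstrap` as root on a fresh install runs them for real.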

I run my own Poudriere builder that builds packages for 8.4, 9.2 and 10.0. That was surprisingly quick to do. The ports and options are in a Hiera file maintained in Puppet, so it's easy to add new packages, and Poudriere takes care of the dependencies. I got rid of all my nonstandard install hacks, and am now living with a standard set of packages with the desired make options. I haven't settled on a definite schedule for building packages, but for now I build them every two weeks except when there's a Heartbleed-class vulnerability.
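
With one builder serving three releases, every client has to point at the repository that matches its own release. A sketch of how a client could work that out for itself, assuming the jails are named like 92amd64, the ports tree is called default, and the packages are published under a made-up URL (pkg.example.org); the function names are placeholders as well:

```shell
#!/bin/sh
# Assumptions: repository base URL, ports tree name, and jail naming scheme.
REPO_BASE="${REPO_BASE:-http://pkg.example.org/packages}"
PTREE="${PTREE:-default}"
REPO_DIR="${REPO_DIR:-/usr/local/etc/pkg/repos}"

# Map "9.2-RELEASE-p3" + "amd64" to the assumed jail name "92amd64".
jail_for() {
    release="${1%%-*}"                      # "9.2-RELEASE-p3" -> "9.2"
    echo "$(echo "$release" | tr -d .)$2"   # drop the dot, append the arch
}

# Write a pkg repository definition for the release this host is running.
write_repo_conf() {
    jail="$(jail_for "$(uname -r)" "$(uname -m)")"
    mkdir -p "$REPO_DIR"
    cat > "$REPO_DIR/local.conf" <<EOF
local: {
  url: "${REPO_BASE}/${jail}-${PTREE}",
  enabled: yes
}
EOF
}
```

On a 10.0 amd64 host this would select the 100amd64-default package set. Whether to also disable the official repository is a separate decision that this sketch leaves out.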

I am not doing unattended upgrades yet. Maybe never. To make sure that packages are always in the current version (also for instance when I change a make option, or a port is rebuilt because of a dependency change), my upgrade script now simply forces a reinstall of all packages with pkg upgrade -f -y and then rebuilds the configuration using Puppet (some ports tend to overwrite their configuration on reinstall, but Puppet deals with this, just doing the same it would do on a clean install). This goes fine, but I'm keeping a close watch on it for now.
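
Such an upgrade script can stay very small. The version below is a sketch rather than the actual script: the manifest path is hypothetical, and it assumes Puppet's module path is already configured on the host. It runs the steps in order, stops at the first one that fails, and only prints the steps when DRY_RUN is set:

```shell
#!/bin/sh
# Hypothetical path to the site manifest in the Puppet checkout.
MANIFEST="${MANIFEST:-/usr/local/etc/puppet-repo/manifests/site.pp}"

# Refresh the catalogue, force-reinstall all packages, then let Puppet
# put the configuration files back into their known state.
upgrade_host() {
    for step in "pkg update" "pkg upgrade -f -y" "puppet apply $MANIFEST"; do
        echo "==> $step"
        # With DRY_RUN set, only show what would be done.
        [ -n "${DRY_RUN:-}" ] && continue
        # Unquoted on purpose: each step is split into command and arguments.
        $step || { echo "FAILED: $step" >&2; return 1; }
    done
}
```

Because Puppet runs last, configuration files that a port overwrote during the forced reinstall are rewritten from the templates before the script reports success.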

When I started this thread, I had many outdated boxes and none had configuration management. I'm now running 15 boxes on Puppet, which is better than my original goal, but this is mostly due to growth, as I wanted all new boxes to be completely Puppetized. I did a few major upgrades of old machines, which turned out to be relatively easy with pkg and Puppet. Simply do the major upgrade, force a reinstall of all packages, re-run Puppet to restore configuration files to the known good state, and it's done. It still requires careful testing of applications, but without Puppet and a Poudriere repo, it would have been an enormous task.

Another thing that has helped tremendously is getting great in-depth monitoring. If you run mostly web applications as I do, they are usually pretty easy to check from outside. Just doing shallow tests on TCP ports is not enough (although it's better than nothing). I created a small script that runs through a list of URLs for every site, but there are lots of enterprisey QA systems too that can fetch them and ping you whenever some page returns an error or an unexpected response. When you are making changes to lots of machines at the same time, it feels a lot better if you just know that the machines are still doing their job without looking at them individually.
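
A URL check of that kind can be sketched in a few lines of shell. This is an illustration, not the script described above: it assumes curl is installed (it is not part of the FreeBSD base system), the function names are made up, and "healthy" is taken to mean any 2xx or 3xx response:

```shell
#!/bin/sh
# Treat 2xx and 3xx as healthy; anything else, including the "000" curl
# reports when it cannot connect, counts as a failure.
status_ok() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Fetch one URL and print only its HTTP status code.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Read URLs from the file in $1 (one per line, "#" comments allowed),
# print every failing one, and return non-zero if any check failed.
check_urls() {
    failed=0
    while read -r url; do
        case "$url" in ''|'#'*) continue ;; esac
        code="$(http_status "$url")"
        if ! status_ok "$code"; then
            echo "FAIL $code $url"
            failed=1
        fi
    done < "$1"
    return "$failed"
}
```

Run from cron as `check_urls /path/to/urls.txt`, it stays silent while everything answers normally and prints one line per broken page otherwise, so cron's mail doubles as the alert.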

What I always found great about FreeBSD is that it feels like a system of basic composable units in the unix way of using textfiles and simple tools for management. This turns out to be great in combination with Puppet's templates and its service/notify functions which can automatically reload daemons when their state changes. It supports the unix way of working pretty well. I'm glad that I've been able to kick our way of working with FreeBSD into the next decade, if only so that I can still ignore systemd for now!

Cheers!
 
I'm glad the suggestions were useful. I'm still using Puppet and I've recently been evaluating using Foreman on a Puppetmaster to centralize both management and provisioning. I've tested doing a TFTP installation of a Linux client completely through Foreman/Puppet in a VM environment, and on my real hardware I've used my existing PXEboot configuration and handed the box to Puppet at the end for provisioning. In both cases I've only done management end to end on Linux clients since that is the biggest part of my environment. FreeBSD clients only get basic config management. I haven't tied everything together yet and still have some work to do to make things work the way I want. Sounds like that might be the logical next step for you as well: get rid of walking around with an ISO to get things started and fully automate everything.