FreeBSD on Google Cloud, hardware timer, PostgreSQL

I’m in the middle of the migration from bare metal to Google Cloud.

The first serious issue I encountered was the abysmal performance of certain PostgreSQL queries. Let me save you from reading how many hours and how many resources I wasted on finding the root of the problem.

root@xxx ~ # psql -U mage xxx -c 'explain analyze select count(1) from messages'
QUERY PLAN                                                                                             
 Aggregate  (cost=81024.35..81024.35 rows=1 width=0) (actual time=30578.241..30578.248 rows=1 loops=1)
   ->  Index Only Scan using xxx
         Heap Fetches: 0
 Planning time: 64.572 ms
 Execution time: 30578.507 ms
(5 rows)

root@xxx ~ # sysctl kern.timecounter.hardware
kern.timecounter.hardware: ACPI-fast

root@xxx ~ # sysctl kern.timecounter.hardware=TSC-low
kern.timecounter.hardware: ACPI-fast -> TSC-low

root@xxx ~ # psql -U mage xxx -c 'explain analyze select count(1) from messages'
QUERY PLAN                                                                                           
 Aggregate  (cost=81024.35..81024.35 rows=1 width=0) (actual time=374.444..374.444 rows=1 loops=1)
   ->  Index Only Scan using xxx
         Heap Fetches: 0
 Planning time: 0.529 ms
 Execution time: 374.539 ms
(5 rows)

Yes, the first one is about 80 times slower. This has nothing to do with the disk cache. It’s the timer.
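For reference, the ratio between the two execution times above can be checked with a quick calculation. This is only a sketch; the function name slowdown is made up for illustration.

```shell
# slowdown SLOW_MS FAST_MS
# Prints the ratio of two EXPLAIN ANALYZE execution times, to one decimal.
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# The two "Execution time" values from the plans above.
slowdown 30578.507 374.539   # prints 81.6
```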

It’s true that not all queries are affected. The explain analyze adds profiling overhead to the execution. That’s how I figured out the issue was the timer. I wish I had noticed earlier that the system clock was screwed up too.

As you might have guessed, it wasn’t me who set the timecounter to ACPI-fast. I have little idea what a timecounter is. It’s in the "official" FreeBSD installation on Google Cloud Compute.

I’ve created my own version since the image is running on UFS and I wanted ZFS root. However, the system itself was copied from the image, as the image has certain daemons and packages from Google that are probably better left running.

I’m wondering about:

1. Am I the only one who is trying to use FreeBSD on Google Cloud in production? It sounds scary. Okay, not everyone will install PostgreSQL, but sooner or later the screwed up system clock should be noticed. It might also explain why I had much higher load than I expected on two other instances, even without PostgreSQL (exim, ruby daemons).

2. Is it the FreeBSD team that creates the images, or is it Google? I have no idea where to report this issue.

3. Is TSC-low a safe setting? I see that it’s the default value on my bare metal servers, which is why I tried it.

I got a bit discouraged. One of the main reasons for the move is that Google offers transparent encryption for my data.

(Yes, I have to trust them. The other option would be trusting the whole world. I don’t do that.)

But now I have the impression that there can’t be many people running FreeBSD on Google Cloud, which means few if any resources when I encounter the next issue. Why is that?

I had to dig into this issue. Now I know more about ntpd and the timecounters than before (which was close to zero).

It’s more complicated.

The available timecounters on Google Compute instances are random. As far as I noticed, they can change even on the same instance after a restart. For this reason, the setting from the official image won’t work properly. Anyway, it should not be in /boot/loader.conf.
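To see what a given boot actually offers, the kern.timecounter.choice sysctl lists the available counters as NAME(quality) pairs. Below is a small sketch that splits such a list into one counter per line, for example to log it at every boot. The function name tc_list is made up. As far as I know the kernel prefers the counter with the highest quality and never auto-selects a negative one, but treat that as an assumption.

```shell
# tc_list CHOICE_STRING
# Splits a kern.timecounter.choice style list, e.g.
# "TSC-low(1000) ACPI-fast(900)", into "NAME QUALITY" lines.
tc_list() {
    for entry in $1; do
        name=${entry%%\(*}        # text before the first "("
        quality=${entry#*\(}      # text after the first "("
        quality=${quality%\)}     # drop the trailing ")"
        printf '%s %s\n' "$name" "$quality"
    done
}

# Possible use on a FreeBSD instance (not run here):
#   tc_list "$(sysctl -n kern.timecounter.choice)"
```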

The /etc/sysctl.conf on my instances ends like this:
# One of these timecounters should work
kern.timecounter.hardware=ACPI-fast
kern.timecounter.hardware=ACPI-safe

As far as I noticed, either the TSC or the TSC-low is available, but not both. That’s why I use the double setting. None of my instances has ACPI-safe. That was in the FreeBSD image’s config file, so I kept it.
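An alternative to listing two settings and letting one of them fail is to check which counter exists first and set only that one. This is a rough sketch and not something I have in production; pick_timecounter is a made-up name, and it assumes the NAME(quality) format of kern.timecounter.choice.

```shell
# pick_timecounter "PREFERRED NAMES" "CHOICE_STRING"
# Prints the first preferred name that appears in the choice list.
# Returns non-zero if none of the preferred names is available.
pick_timecounter() {
    for want in $1; do
        for entry in $2; do
            if [ "${entry%%\(*}" = "$want" ]; then
                printf '%s\n' "$want"
                return 0
            fi
        done
    done
    return 1
}

# Possible use on a FreeBSD instance, e.g. from a boot script (not run here):
#   tc=$(pick_timecounter "ACPI-fast ACPI-safe" "$(sysctl -n kern.timecounter.choice)") &&
#       sysctl kern.timecounter.hardware="$tc"
```

The preference order is the first argument, so the same function works if you would rather try a TSC variant first.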

I also modified /etc/ntp.conf:

server iburst minpoll 4 maxpoll 4

(I also tried the tinker option, but it didn’t help.)

The result is that when the timecounter is one of the TSCs then PostgreSQL behaves normally. However, the system clock gets messed up. The difference can be a few seconds in every 16s (that’s maxpoll 4). I could not make ntpd correct that.

When the timecounter is the ACPI-fast then PostgreSQL’s explain analyze gets crazy (and who knows what else suffers). However, the system clock seems to stay normal. Anyway, I will keep the maxpoll 4.

The messed up system clocks I saw before applying these settings came from the original image. It sets ACPI-safe, which has never been available so far, so FreeBSD fell back to whatever it picked, which was either one of the TSCs or ACPI-fast.

There is a related page:

I’m not sure that the lag with ACPI-fast affects only PostgreSQL’s explain analyze and nothing else.

I too am using the FreeBSD images in GCloud. Clocks are running WAY too fast. For every 10s I am gaining about 3.

% sysctl kern.timecounter.hardware
kern.timecounter.hardware: TSC
What setting should I be using?
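To put a number on that drift: gaining about 3s in every 10s is a frequency error of roughly 300000 ppm. ntpd is usually documented as able to compensate only up to about 500 ppm, which would explain why no ntp.conf tuning fixes a clock that far off. A quick sketch of the arithmetic (drift_ppm is a made-up helper name):

```shell
# drift_ppm GAINED_SECONDS INTERVAL_SECONDS
# Prints the clock drift in parts per million.
drift_ppm() {
    awk -v g="$1" -v i="$2" 'BEGIN { printf "%.0f\n", g * 1000000 / i }'
}

drift_ppm 3 10   # prints 300000
```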

My own images didn't have the problem. I didn't realize that by using 'official' images I would be introducing issues.
It would appear that two of these settings are not available on my instances...
% sudo sysctl kern.timecounter.hardware=TSC-low
kern.timecounter.hardware: TSC
sysctl: kern.timecounter.hardware=TSC-low: Invalid argument

% sudo sysctl kern.timecounter.hardware=ACPI-safe
kern.timecounter.hardware: TSC
sysctl: kern.timecounter.hardware=ACPI-safe: Invalid argument
Leaving only...
sudo sysctl kern.timecounter.hardware=ACPI-fast
kern.timecounter.hardware: TSC -> ACPI-fast
Also updated /etc/ntp.conf:
server iburst minpoll 4 maxpoll 4
Clock seems to be running better now. Will monitor...


On a different instance, the available timecounters were different. On that one, kern.timecounter.hardware=TSC-low was available. So I guess it depends on the underlying hardware an instance lands on? I set that one to ACPI-fast too, since it seems to work well.

Clocks still running correctly now.

The available clocks are random. At least the TSC ones. ACPI* seems to work, except it makes PostgreSQL’s explain analyze useless. I can live with that.

The clock might be different after an instance restart. That’s why you should add both lines. Yes, you’ll see an error in the logs for the clock that is not present, but on the next reboot, that might be the one that is present.

kern.timecounter.hardware=ACPI-fast
kern.timecounter.hardware=ACPI-safe

Put this into your /etc/sysctl.conf and delete the entry from your /boot/loader.conf (in case you used the provided images).
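If you are not sure where the timecounter is currently being set, a small helper like this can show every uncommented entry in the files you pass it. It is only a sketch and tc_settings is a made-up name.

```shell
# tc_settings FILE...
# Prints FILE:LINE for every uncommented kern.timecounter.hardware entry.
# Returns non-zero if no file contains such an entry.
tc_settings() {
    grep -H '^[[:space:]]*kern\.timecounter\.hardware' "$@"
}

# Possible use (not run here):
#   tc_settings /boot/loader.conf /etc/sysctl.conf
```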

It will work.

$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
*   2 u    1   16  377    0.207   -0.022   0.032
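If you want to keep an eye on this from a cron job, the offset of the currently selected peer (the row marked with *) can be pulled out of the ntpq -p output. A sketch, assuming complete rows with the usual ten columns, where offset is the ninth and is given in milliseconds; selected_offset_ms is a made-up name.

```shell
# selected_offset_ms
# Reads `ntpq -p` output on stdin and prints the offset (ms) of the
# peer marked with '*'. Returns non-zero if no peer is selected.
selected_offset_ms() {
    awk '/^\*/ { print $9; found = 1 } END { exit !found }'
}

# Possible use (not run here):
#   ntpq -p | selected_offset_ms
```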

If you don’t do this, FreeBSD will pick a hardware clock at random at boot. Should it choose a TSC, your system clock will be screwed.

I personally haven’t seen the ACPI-safe yet, but it can be present in other locations as the hardware is different.

The FreeBSD images at least respect Google’s advice to not mix the ntp servers. Unlike the Ubuntu images.

Once you have an instance that works fine, you can use it for creating new instances.