How to use Core i7 turbo in FreeBSD

I'm running FreeBSD 8.2-RELEASE/amd64 and I just upgraded my server to a Sandy Bridge Core i7 2600K, which out of the box runs at 3.4 GHz and turbos to 3.8 GHz on 1 core or 3.5 GHz on all 4 cores.

As I understand it, it operates by kicking in when "at 100% load". I don't know if this means the hardware detects all 4 cores or 8 threads maxed or if there is some requirement on the OS side to indicate it is "fully loaded".

What I did try was to set the turbo to a multiplier of 35 for all 4 cores, so effectively under full load it should go to 3.5 GHz on each core. I timed an ffmpeg encode which ran in 40.41 seconds according to time.

I then set the turbo multiplier on each core to 40, so effectively 4 GHz. However, timing the same ffmpeg encode, it ran in the same amount of time: 40.40 seconds.

I read in a few places that in order for this turbo thing to work, cx_lowest had to be set to C3, so I did that for all 8 logical cores, but now when I time the ffmpeg encode, it takes 62 seconds. So it's only about 65 % as fast as when cx_lowest was set to C1.

Do I need to be running powerd for the turbo to work? Or do I need to somehow tell FreeBSD to inform the CPU it's at "full load" so the turbo will kick in? If there's a guide or documentation on which loader.conf and/or sysctl.conf tunables I need to set, I'd really appreciate any insight here. I found a few mentions of turbo on some of the freebsd lists, but nothing definitive on how to properly allow turbo to work.

Thanks in advance!
 
To test whether turbo was being used at all, I timed a single threaded ffmpeg encode in the following two scenarios:

1. set BIOS turbo to a 40 multiplier (4.0 GHz), so base frequency is 3.4 GHz, turbo to 4.0 GHz

2. disabled turbo mode in the BIOS entirely

The resulting ffmpeg encode took the exact same amount of time with both configurations. I tested configuration #1 with cx_lowest set to both C1 and C3, it did not affect the performance.

So it appears that turbo is not working, and I suspect FreeBSD has to notify the CPU somehow before turbo mode can kick in.
 
Wouldn't it be a better idea to leave things alone in BIOS, and test something that you can set the number of threads on? If the given app is able to run well in parallel, then if turbo works, running on 1 thread should be more than 1/4 of the speed of running it with 4 threads going. Or you could spawn several instances of a single threaded app.
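
As a rough illustration of that test, here is a small Python sketch of the ratio to expect. It assumes throughput is proportional to clock speed and scales perfectly across cores, which a real encoder will not quite manage, and it uses the stock clocks quoted earlier in the thread (3.8 GHz for 1 core, 3.5 GHz for all 4). The function name is made up for this example.

```python
def single_to_multi_ratio(single_core_ghz, all_core_ghz, cores):
    """Throughput of one thread relative to `cores` threads.

    Assumes throughput is proportional to clock speed and that the
    workload scales linearly with the number of cores.
    """
    return single_core_ghz / (all_core_ghz * cores)


# Stock turbo clocks quoted in this thread: 3.8 GHz on 1 core, 3.5 GHz on 4.
with_turbo = single_to_multi_ratio(3.8, 3.5, 4)
# No turbo at all: every core stays at the 3.4 GHz base clock.
without_turbo = single_to_multi_ratio(3.4, 3.4, 4)

print(f"with turbo:    {with_turbo:.3f}")
print(f"without turbo: {without_turbo:.3f}")
```

Under these assumptions a single thread reaches about 27% of the 4-thread throughput with turbo, versus exactly 25% without it, so the gap being looked for is small and needs repeatable timings.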

I have a Xeon that should be able to do turbo, but I must admit that I've never tested it.
 
I'm testing with ffmpeg with 1 thread or 8 threads, and the performance is unchanged in either workload. So turbo appears to do absolutely nothing for me, but I'm hoping I'm just missing some tunable/knob in FreeBSD that communicates with the CPU to enable the turbo.

I've asked in various irc channels, forums, etc, but I cannot get a straight answer on whether this "turbo mode" can be enabled on ALL FOUR cores (e.g. 4 cores operating at 4 GHz) or if it's restricted to ONE core running at some higher frequency. If it's just one core, then I agree it's probably a pointless feature for a server. But I would think all the people used to overclocking their CPUs (gamer types, etc) would be screaming bloody murder if they couldn't overclock all of the cores...But maybe that's truly how it works - run 4 cores at stock speed or 1 core at turbo speed. If so, I don't want turbo enabled anyway. But again, I haven't gotten a straight answer on whether the turbo thing can be done on 4 cores simultaneously or not.
 
The answer is no. And maybe I'm showing my age, but how many games are going to be CPU bound, let alone bound by more than the performance of a single core?

As far as I understand it, the whole idea of turbo is that the system overclocks one core at a time, each of which rapidly gets hot. After a small time, the CPU then switches the thread to another core, that gets hot, and so on and so forth. Obviously none of this switching can go on if all cores are in use, because the chip would overheat. Turbo is basically just a bone tossed towards single threaded apps. Using 4 cores is still roughly 4 times the speed increase if you can do it.

It seems that your CPU is a higher-end one whose cores already run so fast that there is little point to turbo mode. A 17% increase is not bad, but nothing to write home about, and in a single-threaded app it won't be much different from an i5-750. What you are paying for is that all your cores run much faster in the stock configuration under a parallel load. Because the cores in an i5-750 run at 2.66 GHz, turbo is a big speed boost there. Turbo is really only a big selling point on CPUs with a low base clock speed, where it lets things like games perform well.
 
Actually, I think the Sandy Bridge cores can turbo all 4 cores simultaneously, within the limits of the thermal and electrical envelopes, which is a vast improvement over the previous generation i7 which is exactly as you described - 1 core can turbo at a time.

So I'm not sure if it was something I fiddled with in the BIOS, a switch on the motherboard (the board's documentation describes the switch as being in the off position, yet the LED that indicates it is on is lit), or the addition of these to loader.conf:

Code:
hint.p4tcc.0.disabled=1
hint.acpi_throttle.0.disabled=1

But my ffmpeg run time has improved by about 15% for a single thread and also 15% when running with multiple threads (ffmpeg -threads 0 is automatic, not sure what # of threads it's actually using). So somehow I've properly enabled turbo at 4.0 GHz for all 4 cores. I'll have to see if it was that BIOS and motherboard switch I changed or those changes to loader.conf. I'll report back shortly.
 
jkcarrol said:
Actually, I think the Sandy Bridge cores can turbo all 4 cores simultaneously, within the limits of the thermal and electrical envelopes, which is a vast improvement over the previous generation i7 which is exactly as you described - 1 core can turbo at a time.
Ah, you are probably overclocking then. In that case I'm not sure what applies. I leave everything at the stock clock. I like reliability and low power consumption. The most demanding game I have played was already old 3 years ago.
 
Yes, only to see if I can determine if the turbo is actually kicking in or not. I plan to run it stock once I figure this out. And stock will kick in turbo at 3.8 GHz for 1 core or 3.5 GHz for all 4 cores (the stock frequency is 3.4 GHz). I could disable turbo, but I wanted to see if it was actually working.

What's odd is the dev.cpu.0.freq sysctl is at 2900 when I set the turbo to 4.0 GHz (40x multiplier), but when I set the turbo down to 3.6 GHz (36x multiplier), dev.cpu.0.freq is 3300. Setting it (in either case) to 3400 then allows the turbo to kick in. The results pretty clearly indicate turbo IS working and is operating on all 4 cores:

Code:
time for ffmpeg (multi-thread) encode (no turbo): 40.55 seconds
time for ffmpeg (multi-thread) encode (3.6 GHz turbo): 37.37 seconds
time for ffmpeg (multi-thread) encode (4.0 GHz turbo): 34.72 seconds

time for ffmpeg (single-thread) encode (no turbo): 25.87 seconds
time for ffmpeg (single-thread) encode (3.6 GHz turbo): 24.40 seconds
time for ffmpeg (single-thread) encode (4.0 GHz turbo): 22.07 seconds

I guess the act of setting the cpu freq explicitly to the highest frequency available from dev.cpu.0.freq_levels causes the CPU's turbo to properly kick in. :)
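
To check how closely those timings track the clock change, here is a minimal Python sketch that compares the measured speedup with the ratio of turbo clock to the 3.4 GHz base clock. The numbers are copied from the results above; the data layout and function name are just for illustration.

```python
BASE_GHZ = 3.4  # non-turbo clock reported in the thread

# (no-turbo seconds, {turbo clock in GHz: seconds}), copied from the results above.
timings = {
    "multi-thread": (40.55, {3.6: 37.37, 4.0: 34.72}),
    "single-thread": (25.87, {3.6: 24.40, 4.0: 22.07}),
}


def speedups(baseline_seconds, turbo_runs):
    """Map each turbo clock to (measured speedup, clock ratio vs. base)."""
    return {
        ghz: (baseline_seconds / seconds, ghz / BASE_GHZ)
        for ghz, seconds in turbo_runs.items()
    }


for label, (baseline, runs) in timings.items():
    for ghz, (measured, clock_ratio) in speedups(baseline, runs).items():
        print(f"{label} @ {ghz} GHz: measured {measured:.3f}x, "
              f"clock ratio {clock_ratio:.3f}x")
```

At 4.0 GHz the measured speedup comes out close to the 4.0/3.4 clock ratio for both the single-thread and multi-thread runs, which is what you would expect if every busy core is really running at the turbo clock.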
 
jkcarrol said:
I guess the act of setting the cpu freq explicitly to the highest frequency available from dev.cpu.0.freq_levels causes the CPU's turbo to properly kick in. :)

This is a documented feature. Turbo Boost works only when the CPU is running in the P0 state (highest frequency).
 
mav@ said:
This is a documented feature. Turbo Boost works only when the CPU is running in the P0 state (highest frequency).

Hi,

Yes, that was the piece I was missing. I had assumed that, without powerd running, the CPU would default to this P0 state.

What I actually observed was the following:

- with the normal base frequency multiplier of 34 (3.4 GHz)
- And the turbo multiplier set to 40 (4.0 GHz)
- dev.cpu.0.freq defaulted to 2900

So is the CPU/BIOS basically tricking the OS into thinking "2900" is the base (3.4 GHz)? And then setting it to anything above 2900 will set the turbo accordingly? E.g. setting it explicitly to 3401, it uses the full 4.0 GHz turbo, right?

So we know these two relationships:

1. actual clock = 4000, sysctl freq = 3400
2. actual clock = 3400, sysctl freq = 2900

If I extrapolate for the freq_levels between 2900 and 3400 (calculate the slope, etc), it looks like:

Code:
freq    actual_freq
2900    3400
3100    3640
3200    3760
3300    3880
3400    4000
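
For reference, the extrapolation in that table can be reproduced with a few lines of Python. This is only a sketch of the straight-line guess between the two observed points (2900 -> 3400 and 3400 -> 4000), not a claim about how the CPU really maps levels to clocks; the function name is made up.

```python
def extrapolate_actual_mhz(sysctl_mhz, low=(2900, 3400), high=(3400, 4000)):
    """Linearly interpolate the actual clock from the sysctl freq value.

    `low` and `high` are (sysctl freq, actual clock) pairs taken from the
    two relationships observed in this thread.
    """
    (x0, y0), (x1, y1) = low, high
    slope = (y1 - y0) / (x1 - x0)  # 600 / 500 = 1.2
    return y0 + slope * (sysctl_mhz - x0)


print("freq    actual_freq")
for level in (2900, 3100, 3200, 3300, 3400):
    print(f"{level}    {round(extrapolate_actual_mhz(level))}")
```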

Of course, the way it actually works is by bumping the multiplier, so it wouldn't actually scale along a linear curve (y = mx + b), but instead would step up:

Code:
mult    actual_freq
34      3400
35      3500
36      3600
37      3700
38      3800
39      3900
40      4000

So I'm not exactly sure how the frequency scaling between the base frequency and the turbo would work.

So given all this, and that setting the freq via sysctl to "3400" (4 GHz) works, does this mean I could configure powerd to scale under demand and set the freq to "3400"? Would there be much of a latency penalty in how quickly powerd ramps up to the full turbo?
 
Why would you want to use powerd in this scenario? I believe you would want to do this on a laptop to conserve batteries. The idea of Turbo Boost is to make your CPU appear faster for shorter periods of time (until it overheats at that frequency). That is, if you have light loads, it would run at higher clock for short bursts.

What happens if you are not overclocking and not tuning anything in FreeBSD?
 
danbi said:
Why would you want to use powerd in this scenario? I believe you would want to do this on a laptop to conserve batteries. The idea of Turbo Boost is to make your CPU appear faster for shorter periods of time (until it overheats at that frequency). That is, if you have light loads, it would run at higher clock for short bursts.

What happens if you are not overclocking and not tuning anything in FreeBSD?

I thought I would need powerd in order for it to "request the P0 state". In other words, I thought powerd was the solution to the dynamic speed request the chip needs to kick in the turbo. If there's some other way without powerd, I'm all ears. Without either powerd or explicitly setting dev.cpu.0.freq, the top speed is never requested, thus turbo doesn't kick in.

So to answer your question about what it does if I leave everything at auto (3.4 GHz + 3.8 GHz turbo on 1 core or 3.5 GHz on >= 2 cores), what I see is the following:

- dev.cpu.0.freq sits at 3000 (or it might have been 3100)
- performance is indicative of the cores operating at a consistent 3.4 GHz (i.e. no turbo boost)
- dev.cpu.0.freq_levels shows 3200, 3300 and 3400 as higher profiles than the default, but setting any of those causes turbo to kick in.
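
A minimal Python sketch of picking the highest entry from dev.cpu.0.freq_levels follows. It assumes the usual "MHz/milliwatts" pair format of that sysctl. The sample string is hypothetical (its power figures are invented), and the helper names are made up for this example. read_freq_levels() only works on a live FreeBSD system and is not called here.

```python
import subprocess


def parse_freq_levels(raw):
    """Parse 'MHz/milliwatts' pairs such as '3400/95000 3300/90000' into MHz ints."""
    return [int(pair.split("/")[0]) for pair in raw.split()]


def highest_level(raw):
    """Return the highest frequency level listed, in MHz."""
    return max(parse_freq_levels(raw))


def read_freq_levels(cpu=0):
    """Read the levels from a live FreeBSD system (not exercised in this sketch)."""
    return subprocess.check_output(
        ["sysctl", "-n", f"dev.cpu.{cpu}.freq_levels"], text=True
    )


# Hypothetical sample; the power figures after each slash are invented.
sample = "3400/95000 3300/90000 3200/85000 3100/80000 2900/70000"
print(highest_level(sample))
```

The value it returns is the one to write back, as root, with sysctl dev.cpu.0.freq=3400, which is the explicit setting that let turbo kick in earlier in this thread.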
 