Hi All,
I wanted to ask if anyone here knows how CPU topology detection works in ULE. Specifically, I'm running 8.1-STABLE:
Code:
FreeBSD pflog.net 8.1-STABLE FreeBSD 8.1-STABLE #0 r208898M: Sat Oct 30 16:59:12 PDT 2010
root@pflog.net:/usr/obj/usr/src/sys/PFLOG amd64
My KERNCONF is just GENERIC with a few additions:
Code:
device pf
device pflog
device coretemp
device uchcom
device sound
device snd_hda
option NETATALK
option ALTQ
option ALTQ_CBQ
option ALTQ_HFSC
option ALTQ_NOPCC
option ALTQ_PRIQ
option ALTQ_RED
option ALTQ_RIO
option COMPAT_LINUX32
option GEOM_MIRROR
option LIBICONV
option LIBMCHAIN
option NETSMB
option NULLFS
option SMBFS
option UDF
As you can see, I don't have mptable in there, because I wasn't sure whether it is required for optimal performance.
For the sake of this discussion, my particular processor is an Intel Q9550, which has two 6 MB L2 caches, each shared by two cores. However, the reported topology looks to me like it thinks there is a single cache shared by all four cores:
Code:
% sysctl kern.sched.topology_spec
kern.sched.topology_spec: <groups>
 <group level="1" cache-level="0">
  <cpu count="4" mask="0xf">0, 1, 2, 3</cpu>
  <children>
   <group level="2" cache-level="2">
    <cpu count="4" mask="0xf">0, 1, 2, 3</cpu>
   </group>
  </children>
 </group>
</groups>
Correct me if I'm wrong, but that means it thinks the L2 cache is shared by all 4 cores, right?
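To make my reading of that output concrete, here is a minimal Python sketch that walks the XML and lists which CPUs sit under each group and cache level. This is only an illustration and not a FreeBSD tool: the function name cache_groups is made up, and the XML is the sysctl output above with the "kern.sched.topology_spec: " prefix stripped.

```python
import xml.etree.ElementTree as ET

# Topology as reported by `sysctl kern.sched.topology_spec` in this post,
# minus the leading "kern.sched.topology_spec: " label.
TOPOLOGY_XML = """<groups>
<group level="1" cache-level="0">
<cpu count="4" mask="0xf">0, 1, 2, 3</cpu>
<children>
<group level="2" cache-level="2">
<cpu count="4" mask="0xf">0, 1, 2, 3</cpu>
</group>
</children>
</group>
</groups>"""


def cache_groups(xml_text):
    """Return (level, cache_level, cpu_ids) for every <group>, in document order.

    Hypothetical helper written for this post; it only reads the attributes
    that appear in the output above.
    """
    root = ET.fromstring(xml_text)
    result = []
    for group in root.iter("group"):
        cpu = group.find("cpu")  # the <cpu> element directly inside this group
        cpu_ids = [int(tok) for tok in cpu.text.split(",")]
        result.append((int(group.get("level")),
                       int(group.get("cache-level")),
                       cpu_ids))
    return result


for level, cache_level, cpus in cache_groups(TOPOLOGY_XML):
    print(f"group level {level}: cache-level {cache_level}, CPUs {cpus}")
```

Both groups come back with the same four CPUs, and the only group tagged with cache-level 2 covers all of them, which is why I read it as one L2 shared by all four cores.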
Currently the kern.smp.topology sysctl is 0, which I guess means auto?
Can someone help me with any ULE tunings I should set for this specific CPU? I have to believe a multi-threaded application will run better when ULE knows there are two separate caches each shared by 2 cores vs. 1 single shared L2 cache.
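To show what I mean by "two separate caches", here is a rough Python sketch that compares the reported topology with what I would have expected. The EXPECTED XML is entirely hypothetical: I wrote it by hand, and the pairing of cores (0, 1) and (2, 3) is an assumption on my part, not something the kernel printed. The helper name l2_sharing_sets is also made up.

```python
import xml.etree.ElementTree as ET

# What the kernel actually reported (from the sysctl output in this post).
REPORTED = """<groups>
<group level="1" cache-level="0">
<cpu count="4" mask="0xf">0, 1, 2, 3</cpu>
<children>
<group level="2" cache-level="2">
<cpu count="4" mask="0xf">0, 1, 2, 3</cpu>
</group>
</children>
</group>
</groups>"""

# HYPOTHETICAL: hand-written guess at a topology with two L2 groups of two
# cores each. The (0, 1) / (2, 3) pairing is an assumption.
EXPECTED = """<groups>
<group level="1" cache-level="0">
<cpu count="4" mask="0xf">0, 1, 2, 3</cpu>
<children>
<group level="2" cache-level="2">
<cpu count="2" mask="0x3">0, 1</cpu>
</group>
<group level="2" cache-level="2">
<cpu count="2" mask="0xc">2, 3</cpu>
</group>
</children>
</group>
</groups>"""


def l2_sharing_sets(xml_text):
    """Return one set of CPU ids per group tagged cache-level="2".

    The CPU ids are decoded from the hexadecimal mask attribute.
    """
    root = ET.fromstring(xml_text)
    sets = []
    for group in root.iter("group"):
        if group.get("cache-level") != "2":
            continue
        mask = int(group.find("cpu").get("mask"), 16)
        sets.append({cpu for cpu in range(mask.bit_length())
                     if (mask >> cpu) & 1})
    return sets


print("reported:", l2_sharing_sets(REPORTED))
print("expected:", l2_sharing_sets(EXPECTED))
```

The reported topology yields a single set of four CPUs, while the hypothetical one yields two sets of two. That difference is what I'd like the scheduler to know about.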
So I'm looking for the sysctl.conf/loader.conf tunables I should set, and whether I need mptable or anything else in my kernel, beyond what I already have here, to take full advantage of this CPU.
Any tips would be greatly appreciated!