ZFS Not even an answer

It's not unusual for traditional lists to lack attention.
I know. Some topic-lists are entirely dead.
And this happens on a much larger scale, not limited to FreeBSD or IT in general, but all over the Internet: newsgroups/mailing lists, and later also web forums, are losing the relevant, sophisticated part of their content.

This doesn't bother me so much in IT - if I don't get answers, I just read the source and do what I like to do.
It bothers me much more in my other specialised realms of interest, where I very much suffer from lack of quality exchange.
Bear in mind the ratio of automated to non-automated email.
Well, one might ask: if automated mail implies that it is not read, then what is the point of it? ;)
 
I suspect it was read, and then everyone intentionally ignored it. In 1-on-1 email, if someone sends me something that I'll ignore, I'll at least reply with "acknowledge receipt". On e-mail lists, that would be rude (since there would be dozens or hundreds of empty ACK messages).

Suggestion: Ask the same question again, but this time asking what difference it makes, why it is a meaningful improvement, and explaining why people should care.
 
I suspect it was read, and then everyone intentionally ignored it. In 1-on-1 email, if someone sends me something that I'll ignore, I'll at least reply with "acknowledge receipt". On e-mail lists, that would be rude (since there would be dozens or hundreds of empty ACK messages).
What you mean is: acknowledge that you received and read a message, but have no background knowledge or anything specific to say about the matter so far - right?

Suggestion: Ask the same question again, but this time asking what difference it makes, why it is a meaningful improvement, and explaining why people should care.
Well, I don't know why people might care. I know why I care, but that is not a typical use case.
Also, the filesystems list is not the very best addressee. A NUMA list would be best - but I don't know who is there, or if there is anybody, anywhere.

So, background for the layman - what is this all about: NUMA affinity is when you have multiple CPU sockets and want to keep the memory (and probably the I/O) near the socket where you do the compute, because that makes things a bit faster.
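On FreeBSD you can get a quick look at how many NUMA domains the kernel sees via sysctl. These OIDs exist on NUMA-enabled 13.x/14.x kernels to the best of my knowledge, but names can vary by version - a sketch, not gospel:

```
# number of NUMA domains the kernel detected
$ sysctl vm.ndomains
# relative distance table between the domains
$ sysctl vm.phys_locality
```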

Then, as you may know, I am running a scrap heap. It has four NUMA domains and assorted memory - that is, whatever I got cheap on eBay: not the same type, not the same size.
What I want to do is run some applications within a given NUMA domain, with NUMA affinity, and use the remainder of the memory for the ZFS ARC, distributed over all NUMA domains, without NUMA affinity. Since the ARC is just a filesystem cache, it is expected to be faster than the disks, but it does not necessarily need the very last bit of speedup from NUMA affinity.
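For the application side, cpuset(1) can do the pinning; its -n option takes a memory-allocation policy. The invocation below is a sketch from memory - the CPU list is made up for illustration, and policy spellings may differ on your release, so check the manpage:

```
# run a program on CPUs 8-15 (assumed to sit in domain 1), with memory
# preferentially allocated from domain 1 (policy name per cpuset(1))
$ cpuset -l 8-15 -n prefer:1 ./myapp
```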

But when the ARC runs in first-touch (which is the default), it will put the data into the memory near whichever CPU it currently happens to compute on - that is almost impossible to plan. With round-robin, the ARC will be (a bit slower, but) evenly distributed over all NUMA domains - and from there one can do some math in order to utilize all the memory best.
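Whether the distribution actually comes out even can be checked with the per-domain VM counters; the OID names below are what I believe recent kernels export, so treat them as an assumption:

```
# free pages per NUMA domain (OID names assumed from a 13.x/14.x kernel)
$ for d in 0 1 2 3; do sysctl vm.domain.$d.stats.free_count; done
```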

So that's the use case. Now comes the fun part, if you read uma(9):
round-robin and first-touch policies for NUMA systems
but there is nothing specific on how to achieve this!

Then there is a bunch of flags given in the manpage, like UMA_ZONE_NOTOUCH etc. There is, however, no flag UMA_ZONE_ROUNDROBIN mentioned anywhere. But the source shows that the flag does exist.
Even more fun: you can put it in, and the compiler swallows it. And still more fun: the crap runs!
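Roughly, this is what passing the flag at zone-creation time looks like. The uma_zcreate(9) signature is real and UMA_ZONE_ROUNDROBIN is the flag from the source; the zone name and surrounding function here are a made-up illustration, not the actual OpenZFS code:

```c
#include <sys/param.h>
#include <sys/kernel.h>
#include <vm/uma.h>

/* hypothetical zone; the real ZFS zones (abd_chunk etc.) are created in OpenZFS */
static uma_zone_t example_zone;

static void
example_init(void)
{
	example_zone = uma_zcreate("example_cache", 4096,
	    NULL, NULL, NULL, NULL,	/* no ctor/dtor/init/fini */
	    UMA_ALIGN_PTR,
	    UMA_ZONE_ROUNDROBIN);	/* undocumented: round-robin over NUMA domains */
}
```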

So now I have a beautifully evenly distributed ARC in round-robin:

Code:
$ sysctl vm.uma | egrep "(dmu_buf_impl_t|dnode_t|abd_chunk|zio_buf_comb_[0-9]+)..*(flags|pages):"
vm.uma.dmu_buf_impl_t.keg.domain.3.pages: 55552
vm.uma.dmu_buf_impl_t.keg.domain.2.pages: 55992
vm.uma.dmu_buf_impl_t.keg.domain.1.pages: 55523
vm.uma.dmu_buf_impl_t.keg.domain.0.pages: 56007
vm.uma.dmu_buf_impl_t.flags: 0x1020000<CTORDTOR,ROUNDROBIN>
vm.uma.dnode_t.keg.domain.3.pages: 117411
vm.uma.dnode_t.keg.domain.2.pages: 117657
vm.uma.dnode_t.keg.domain.1.pages: 117522
vm.uma.dnode_t.keg.domain.0.pages: 117540
vm.uma.dnode_t.flags: 0x1a20000<CTORDTOR,VTOSLAB,OFFPAGE,ROUNDROBIN>
vm.uma.abd_chunk.keg.domain.3.pages: 1970458
vm.uma.abd_chunk.keg.domain.2.pages: 1971578
vm.uma.abd_chunk.keg.domain.1.pages: 1973933
vm.uma.abd_chunk.keg.domain.0.pages: 1969957
vm.uma.abd_chunk.flags: 0xa20000<VTOSLAB,OFFPAGE,ROUNDROBIN>
...
vm.uma.zio_buf_comb_131072.keg.domain.3.pages: 81472
vm.uma.zio_buf_comb_131072.keg.domain.2.pages: 83360
vm.uma.zio_buf_comb_131072.keg.domain.1.pages: 80672
vm.uma.zio_buf_comb_131072.keg.domain.0.pages: 81120
vm.uma.zio_buf_comb_131072.flags: 0xa20000<VTOSLAB,OFFPAGE,ROUNDROBIN>
...

And I have no idea why that isn't even documented, or what the downsides or lingering perils might be.
Machine runs stable so far, so if nothing bad happens, I'm done with this.
 