max value of ARG_MAX

antolap · Dec 13, 2017

I need to recompile the kernel on FreeBSD 11.1.
What is the maximum value I can set for ARG_MAX?

Is it possible to have an unlimited value?

antolap · Dec 14, 2017

I tried ARG_MAX = 4294967296,
but make buildworld failed.

What is the maximum value I can use?

acheron · Dec 14, 2017

Code:

/usr/include/sys/syslimits.h:#define    ARG_MAX                 262144  /* max bytes for an exec function */

antolap · Dec 14, 2017

I need to increase that value. What is the maximum value I can set in syslimits.h?

tobik@ · Dec 14, 2017

Why do you need to increase ARG_MAX?

acheron · Dec 14, 2017

The size of an integer (int nchr)

SirDice · Dec 14, 2017

To prevent an XY Problem, why do you want to increase ARG_MAX?

tobik@ · Dec 14, 2017

Previous discussion: Thread 54898

SirDice · Dec 14, 2017

Previous thread also doesn't really explain why ARG_MAX needs to be increased.

antolap · Dec 14, 2017

Now I need to run wget with tons of files as arguments.

SirDice · Dec 14, 2017

I would really suggest fixing the original problem instead of constantly trying to combat the symptoms.

antolap · Dec 14, 2017

acheron said:
The size of an integer (int nchr)

How do I get that value?
With getconf INT_MAX?

SirDice said:
I would really suggest fixing the original problem instead of constantly trying to combat the symptoms.

It would be easier if the kernel accepted unlimited text as arguments than to find a workaround for each program.

P.S.: In FreeBSD, how do I get the list of all variables with getconf, like getconf -a on Linux?

acheron · Dec 14, 2017

antolap said:
How do I get that value?
With getconf INT_MAX?

Sorry, badly phrased. I meant the maximum value an "int" can hold; cf. https://en.wikibooks.org/wiki/C_Programming/limits.h
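
As a rough illustration of that limit, the sketch below checks a few candidate values against the range of a signed C int. It assumes that Python's ctypes module reports the same int width the kernel is compiled with, and the helper name fits_in_int is made up for this example. Fitting in an int is only a necessary condition; it says nothing about whether a kernel builds or runs with that value.

```python
import ctypes

# Width of a C "int" on this platform, in bits (assumed to match the kernel's int).
INT_BITS = ctypes.sizeof(ctypes.c_int) * 8

# Largest value a signed int of that width can hold.
INT_MAX = 2 ** (INT_BITS - 1) - 1


def fits_in_int(value):
    """Return True if value can be stored in a signed C int without overflow."""
    return -INT_MAX - 1 <= value <= INT_MAX


# Candidate ARG_MAX values: the stock default and some much larger ones.
for candidate in (262144, 1_000_000_000, 2147483647, 4294967296):
    print(candidate, fits_in_int(candidate))
```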

antolap said:
It would be easier if the kernel accepted unlimited text as arguments than to find a workaround for each program.

Sure. FreeBSD is an open source OS; you can patch it and submit the patch so the whole community can benefit from it.

antolap said:
P.S.: In FreeBSD, how do I get the list of all variables with getconf, like getconf -a on Linux?

This is a FreeBSD board; don't assume everybody knows what getconf is supposed to do (personally, I don't).

tobik@ · Dec 14, 2017

antolap said:
It would be easier if the kernel accepted unlimited text as arguments than to find a workaround for each program.

PR 58803 and https://reviews.freebsd.org/D6999 as a starting point.

Then wblock@'s advice from the related thread still rings true:

wblock@ said:
If you want that to happen, pitch in. Test the last revision of that patch. Find problems, and fix them, or hire someone to fix them. Update the bug report. Contact the mailing lists about it, asking about current problems and how to get it included.

SirDice · Dec 14, 2017

antolap said:
It would be easier if the kernel accepted unlimited text as arguments than to find a workaround for each program.

Combating symptoms might be easier short-term. Long-term however it will mean you will be constantly trying to work around the issue. If you fix the original problem you will never have to worry about those workarounds any more.

You've already spent quite a bit of time trying to work around the symptoms, and you will constantly need to do that. On the other hand, if you spent that time fixing the issue you would be done for now and the foreseeable future.

antolap · Dec 14, 2017

I have tried with ARG_MAX = 2147483647

make buildworld is OK, while make kernel gives this error:

Can you help me calculate the maximum value of ARG_MAX that still lets the kernel compile?

antolap · Dec 14, 2017

I successfully compiled the kernel and base system with 1,000,000,000 as the ARG_MAX value.
It's a good starting point.
Do you know whether, by changing only a few lines somewhere in the kernel, I can get rid of the int value and use a long instead?

ralphbsz · Dec 15, 2017

Before you change ARG_MAX to ludicrously large values: Do you understand what it does, and how it is used, and what the side-effects of the change are going to be? Please read the kernel source, and the library source for where programs get executed, and understand what really happens. I only know some small aspects of this, so I can't give you the full solution without spending several hours researching it.

Let me repeat what I just said: Please spend 4 hours studying the current design, and be ready to explain how your proposed change of a huge ARG_MAX fits into it. Your next post in this thread should be about 5 paragraphs long, and should contain a detailed design review of the existing code base.

My hunch: Once you do that, you will understand why ARG_MAX cannot be very large without creating massive performance problems. Personally, I've never liked the way all Unixes (since Dennis and Ken) have implemented passing parameters to programs, because globs are expanded too early, which makes passing parameters with globbing metacharacters harder. Why do you have to escape the star in the shell command "find . -name *.txt"? Because the shell insists on expanding *.txt; a better design would be to pass the parameters unmolested as strings to a program, and then the program could choose to perform file name glob expansion where appropriate. This would in particular make things like "grep foo*bar *.txt" more logical. But alas, that is NOT the way Unixes (including FreeBSD) are designed, and we'll have to live with that.

And most importantly: Please explain to us, in considerable detail, what the real problem you are trying to solve is. In the previous thread, you said that you have very large directories, with up to 500,000 directory entries (probably files). In and of itself, directories of such size are not illegal, and much larger directories do happen on large compute clusters. I vaguely remember that one recent customer demanded that the file system be able to create 30,000 files per second, and sustain that rate. But: People who build systems of that scale understand the design of the underlying system, and adjust their usage at scale to the design. One aspect of that is: never use globbing or complete argument lists to pass a list of all directory entries; instead loop over the directory. In scripts, that is typically done with find and xargs, although more often systems at this scale write programs that use readdir() and process the result, usually in parallel on multiple computers.
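
The "loop over the directory" approach can be sketched as follows. This is only an illustration, assuming Python 3.6 or later; the helper name iter_files and the throwaway demo directory are invented for the example. The point is that entries are handed out one at a time, so no argument list of the whole directory is ever built.

```python
import os
import tempfile


def iter_files(directory):
    """Yield the path of each regular file in directory, one at a time."""
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                yield entry.path


# Demo on a temporary directory holding a handful of empty files.
with tempfile.TemporaryDirectory() as tmp:
    for n in range(5):
        open(os.path.join(tmp, f"file{n}.txt"), "w").close()

    # Process entries as they arrive instead of collecting them first.
    processed = 0
    for path in iter_files(tmp):
        processed += 1

print(processed, "files processed")
```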

For a design example, think about how iterators work in programming languages. Python is a good example: Stupid people think that a call to the range() function returns a list (of consecutive entries, like range(4) is the list [0,1,2,3]). But that's not really true. If you write the following code:

Code:

for i in range(1000000000):
    do_something(i)
print('Done with one billion things.')

then Python does not create a list of a billion integers for the "for" primitive to run over. Instead, in Python 3, range() returns a lazy range object that produces each value on demand as the loop asks for it. That is exactly the same design pattern that is used by find and xargs in scripts.
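
A small sketch of that laziness, assuming Python 3 (in Python 2, range() did build a full list and xrange() was the lazy variant):

```python
import itertools

# Creating the range is immediate; no billion-element list is allocated.
numbers = range(1_000_000_000)

# Pull only the first five values; the rest are never produced.
first_five = list(itertools.islice(numbers, 5))
print(first_five)

# Length and indexing are computed arithmetically, not by walking a list.
last = numbers[len(numbers) - 1]
print(len(numbers), last)
```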

Finally, let's talk about the sociological aspects of your interaction with people here. You are demanding that people spend hours researching the kernel and exec interface, to do your homework for you. You are unwilling to spend some time changing your scripts, but expect that others invest time for you to solve your problem. And you are unwilling to give back to the community, by not contributing to a flexible solution yourself. The reason you are getting so much pushback is that you are actually being obnoxious, probably not intentionally.

antolap · Dec 16, 2017

Thanks for the reply.
I haven't studied computer science; I work in another sector.
I just *use* FreeBSD. I'm not a developer, and I can write and modify little C programs, but I'm unable to write a patch for the kernel.

ralphbsz said:
Please spend 4 hours studying the current design

Please recommend some books to read about kernel design.

ralphbsz said:
You are demanding that people spend hours researching the kernel and exec interface, to do your homework for you.

No, I'm not demanding that people give up sleep to help me. I was only asking if there was a *little* patch to the kernel to compile it with a large value of arg_max, something like commenting out one line, not writing tons and tons of C code.
I don't need a kernel to install on 100 machines, only on one, to accomplish a job.
I have spent a lot of hours trying to compile the kernel with various patches, and with some patches I have reached an arg_max of 2 billion.

ralphbsz · Dec 17, 2017

You still haven't answered why you don't switch to using find / xargs. It is the sensible and common solution to this problem.

I understand that you can't do the kernel and API changes yourself. In that case, you're at the mercy of others; and those others have had this problem for decades, and seen no need to address it, because the problem is best worked around. As I said above: when working with large directories (my experience is file systems for supercomputing, so a million files in a directory and a billion files in a file system is what I deal with), the answer is to never use command line globbing and arguments, but to iterate over directories: Either use find/xargs, or write scripts or programs that implement this. I just looked it up: Python has a nice directory walking utility; look up the os.walk() function in the standard library.
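
What xargs does with a long input list can be sketched roughly like this: split the arguments into groups that each stay below a size limit, then run the command once per group. This is a simplified illustration and not how xargs is actually implemented; the batches helper, the example URLs and the safety factor of 4 are all assumptions made for the example.

```python
import os


def batches(args, limit):
    """Yield lists of args whose combined size (each string plus a NUL byte) stays within limit."""
    batch, size = [], 0
    for arg in args:
        cost = len(os.fsencode(arg)) + 1
        # Start a new batch when adding this argument would exceed the limit.
        # A single argument larger than the limit still gets a batch of its own.
        if batch and size + cost > limit:
            yield batch
            batch, size = [], 0
        batch.append(arg)
        size += cost
    if batch:
        yield batch


# Hypothetical input: 500,000 made-up URLs standing in for the real list.
urls = [f"http://example.com/file{n}.dat" for n in range(500_000)]

# Leave generous headroom below the system limit, because the environment
# and the argv pointer array also count against it.
limit = os.sysconf("SC_ARG_MAX") // 4

groups = list(batches(urls, limit))
print(len(urls), "arguments split into", len(groups), "invocations")

# Each group could then be handed to one command, for example:
#   subprocess.run(["wget", *group], check=True)
```

With this approach the stock ARG_MAX never has to change, because no single invocation receives the whole list.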

I think switching arg_max from int to long in the kernel and API is outside your skillset, given your description. There are people who can do it (I won't volunteer for it, because I've never done FreeBSD kernel work, and for legal/political reasons I shouldn't start). You could just try it and see what explodes, but I suspect it is a waste of time, since something is very likely to explode. Still, if you feel like playing with the kernel, try it; even if you waste your time, you have learned something.

What I would suggest: You should read the "daemon book", a.k.a. "The Design and Implementation of the 4.4 BSD Operating System" by Kirk McKusick and others. Not because you need to learn it by heart, and not because you'll immediately become a kernel coder, but to have some background. I just looked it up on Amazon, and found that it is remarkably expensive (over $50), and it doesn't even exactly match FreeBSD 11.X, but it still gives an excellent overview of how kernel work is done, and how major subsystems work.

Snurg · Dec 17, 2017

antolap said:
I have spent a lot of hours trying to compile the kernel with various patches...

antolap said:
...arg_max of 2 billion.

I am quite sure nobody before you earnestly thought of doing that (at least practically), and stayed so determined.

antolap said:
I was only asking if there was a *little* patch to the kernel to compile it with a large value of arg_max, something like commenting out one line, not writing tons and tons of C code.

I find it amazing that you apparently didn't attempt to think of a different approach to your problem for the last two years...

antolap said:
...I can write and modify little C programs...

Then you should maybe create some file database, or a blob manager (quasi file-in-file) or the like. Or make/use a database.
If your CS knowledge is too small, then improve that first. Learn, make small things first, and with time comes experience and you can make bigger projects.
Read ralphbsz' posts two, three times to make a start.

antolap said:
... to accomplish a job...

What "job"?
Retrieving millions of files from the internet, about whose characteristics you do not reveal the slightest bit?
That alone sounds, humm, already a bit fishy to me, to be polite.

If it is some legit data you want to leak or such, I'd suggest you contact competent people who have experience dealing with a lot of files, like Wikileaks.
Otherwise I have real doubts that people here would like to support your "project", if they knew what it is actually about.
