If you want that to happen, pitch in. Test the last revision of that patch. Find problems, and fix them, or hire someone to fix them. Update the bug report. Contact the mailing lists about it, asking about current problems and how to get it included.
Combating symptoms might be easier short-term. Long-term, however, it will mean you will be constantly trying to work around the issue. If you fix the original problem, you will never have to worry about those workarounds again.
You've already spent quite a bit of time trying to work around the symptoms, and you will constantly need to do that. On the other hand, if you spent that time fixing the issue you would be done for now and the foreseeable future.
I successfully compiled the kernel and base system with 1,000,000,000 as the ARG_MAX value.
It's a good starting point.
Do you know whether, by changing only a few lines somewhere in the kernel, I could get rid of the INT value and use a LONG value instead?
Before you change ARG_MAX to ludicrously large values: Do you understand what it does, and how it is used, and what the side-effects of the change are going to be? Please read the kernel source, and the library source for where programs get executed, and understand what really happens. I only know some small aspects of this, so I can't give you the full solution without spending several hours researching it.
Let me repeat what I just said: Please spend 4 hours studying the current design, and be ready to explain how your proposed change of a huge ARG_MAX fits into it. Your next post in this thread should be about 5 paragraphs long, and should contain a detailed design review of the existing code base.
My hunch: Once you do that, you will understand why ARG_MAX cannot be very large without creating massive performance problems. Personally, I've never liked the way all Unixes (since Dennis and Ken) have implemented passing parameters to programs, because globs are expanded too early, which makes passing parameters with globbing metacharacters harder. Why do you have to escape the star in the shell command "find . -name *.txt"? Because the shell insists on expanding *.txt; a better design would be to pass the parameters unmolested as strings to the program, and the program could then choose to perform file-name glob expansion where appropriate. This would in particular make things like "grep foo*bar *.txt" more logical. But alas, that is NOT the way Unixes (including FreeBSD) are designed, and we'll have to live with it.
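To make the early-expansion point concrete, here is a small illustrative sketch (the temporary directory and file names are made up for the demonstration):

```shell
# Why the star must be quoted: the shell, not find, expands globs.
demo=$(mktemp -d) && cd "$demo"
touch a.txt b.txt notes.md

# Unquoted: the shell expands *.txt to "a.txt b.txt" BEFORE find runs,
# so find receives already-expanded names instead of the pattern:
#   find . -name *.txt     # fragile: behavior depends on cwd contents

# Quoted: the pattern reaches find unmolested; find does the matching:
find . -name '*.txt'
```

With the quotes, find itself matches the pattern against every entry it visits, which is exactly the "program does its own expansion" behavior described above.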
And most importantly: Please explain to us, in considerable detail, what the real problem you are trying to solve is. In the previous thread, you said that you have very large directories, with up to 500,000 directory entries (probably files). In and of themselves, directories of such size are not illegal, and much larger directories do happen on large compute clusters. I vaguely remember that one recent customer demanded that the file system be able to create 30,000 files per second, and sustain that rate. But: People who build systems of that scale understand the design of the underlying system, and adjust their usage at scale to the design. One aspect of that is: never use globbing or complete argument lists to pass a list of all directory entries; instead loop over the directory. In scripts, that is typically done with find and xargs, although more often systems at this scale write programs that use readdir() and process the result, usually in parallel on multiple computers.
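A minimal sketch of the find/xargs pattern, with a throwaway directory and three placeholder files standing in for a real 500,000-entry directory:

```shell
# Process many files without ever building one giant argument list.
work=$(mktemp -d) && cd "$work"
touch file1 file2 file3    # stand-ins for the real half-million entries

# BAD at scale: the shell expands * into a single enormous argv,
# which is exactly what runs into ARG_MAX:
#   wc -c *

# GOOD: find streams the names and xargs batches them into
# argv-sized chunks, running wc as many times as needed.
# -print0 / -0 keeps filenames with spaces or newlines safe.
find . -type f -print0 | xargs -0 wc -c
```

No matter how many entries the directory holds, each individual wc invocation stays comfortably under the ARG_MAX limit.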
For a design example, think about how iterators work in programming languages. Python is a good example: Stupid people think that a call to the range() function returns a list (of consecutive entries, like range(4) is the list [0,1,2,3]). But that's not really true. If you write the following code:
for i in range(1000000000):
    pass   # the loop body would do one unit of work per iteration
print('Done with one billion things.')
then the implementation is not that Python creates a list of a billion integers and the "for" primitive then runs over them. Instead, range() returns an iterator, which gets called repeatedly. That is exactly the design pattern used by find and xargs in scripts.
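The same lazy pattern can be sketched with a hand-written generator (the function name here is illustrative, not part of Python's library):

```python
# A hand-written equivalent of what range() does internally:
# yield one value at a time instead of materializing a list.
def count_up_to(n):
    i = 0
    while i < n:
        yield i      # produce one value, then suspend until asked again
        i += 1

total = 0
for i in count_up_to(1000):   # no 1000-element list is ever built
    total += i
print(total)                   # prints 499500, the sum of 0..999
```

The consumer pulls values one at a time, so memory use stays constant no matter how large n gets; find piping names to xargs works the same way.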
Finally, let's talk about the sociological aspects of your interaction with people here. You are demanding that people spend hours researching the kernel and exec interface, to do your homework for you. You are unwilling to spend some time changing your scripts, but expect that others invest time to solve your problem. And you are unwilling to give back to the community by contributing to a flexible solution yourself. The reason you are getting so much pushback is that you are actually being obnoxious, probably not intentionally.
Thanks for the reply
I haven't studied computer science; I work in another sector.
I just *use* FreeBSD. I'm not a developer; I can write and modify little C programs, but I'm unable to write a patch for the kernel.
No, I'm not demanding that people give up sleep to help me. I was only asking if there is a *little* patch to the kernel to compile it with a large value of ARG_MAX, something like commenting out one line, not writing tons and tons of C code.
I don't need a kernel to install on 100 machines, only on one, to accomplish a job.
I have spent a lot of hours trying to compile the kernel with various patches, and with some patches I have reached an ARG_MAX of 2 billion.
You still haven't answered why you don't switch to using find / xargs. It is the sensible and common solution to this problem.
I understand that you can't do the kernel and API changes yourself. In that case, you're at the mercy of others; and those others have had this problem for decades and seen no need to address it, because the problem is best worked around. As I said above: when working with large directories (my experience is file systems for supercomputing, so a million files in a directory and a billion files in a file system is what I deal with), the answer is to never use command-line globbing and arguments, but to iterate over directories: either use find/xargs, or write scripts or programs that implement this. I just looked it up: Python has a nice directory-walking utility; look up os.walk in the os module.
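A minimal sketch of that approach in Python, assuming the per-file work is just totaling file sizes (the process() helper is a placeholder name, not a library function):

```python
import os

def process(path):
    # Placeholder for the real per-file work.
    return os.path.getsize(path)

# os.walk yields one directory at a time, so no million-entry
# argument list is ever constructed, and ARG_MAX never comes into play.
total = 0
for dirpath, dirnames, filenames in os.walk('.'):
    for name in filenames:
        total += process(os.path.join(dirpath, name))
print('total bytes:', total)
```

This is the script-level equivalent of the readdir() loop mentioned above: the directory is consumed incrementally instead of being flattened into one command line.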
I think switching arg_max from int to long in the kernel and API is outside your skillset, given your description. There are people who can do it (I won't volunteer for it, because I've never done FreeBSD kernel work, and for legal/political reasons I shouldn't start). You could just try it and see what explodes, but I suspect it is a waste of time, since something is very likely to explode. Still, if you feel like playing with the kernel, try it; even if you waste your time, you have learned something.
What I would suggest: You should read the "daemon book", a.k.a. "The Design and Implementation of the 4.4 BSD Operating System" by Kirk McKusick and others. Not because you need to learn it by heart, and not because you'll immediately become a kernel coder, but to have some background. I just looked it up on Amazon, and found that it is remarkably expensive (over $50), and it doesn't even exactly match FreeBSD 11.X, but it still gives an excellent overview of how kernel work is done, and how major subsystems work.
Then you should maybe create some file database, or a blob manager (quasi file-in-file) or the like. Or make/use a database.
If your CS knowledge is too small, then improve that first. Learn, make small things first, and with time comes experience and you can make bigger projects.
Read ralphbsz' posts two, three times to make a start.
Retrieving millions of files from the internet, about whose characteristics you do not reveal the slightest bit.
That alone sounds, hmm, already a bit fishy to me, to be polite.
If it is some legitimate data you want to leak or the like, I'd suggest you contact competent people who have experience dealing with lots of files, like Wikileaks. Otherwise I have real doubts that people here would want to support your "project" if they knew what it is actually about.