Solved Where should "help" output go?

ralphbsz · Jun 16, 2023

kpedersen said:
Git is weird.

Yes. And it makes perfect sense for it to be weird. To begin with, it was created by a very unpleasant person, who has no respect for tradition and for good engineering, and likes to offend people (Linus). As a matter of fact, its design goal was the "principle of the most surprise": do everything in a fashion most backwards and unusual, deliberately the opposite of cvs. And it was created as a deliberate reaction to two unpleasant interactions Linus had with people (Andy T and Larry M). The name "git" is a deliberate insult to either Andy and Larry (that's their interpretation), or to himself (that's his excuse).

and it is a general rule that good software should do the opposite of what Windows does.

No. In some cases, Windows made really good design decisions. Those we should copy. But only those. For example, the "dir /?" auto paging should go. On the other hand, SMB/CIFS is a much better file sharing protocol than NFS. The attitude of "let's do the exact opposite of what has been done before", when used uncritically, leads to exactly the kind of disaster that git has become.

Anyway, to the topic at hand. Here are the ways I like to do things:

If the program runs successfully, stderr will have zero bytes of output. If the program fails, it will have error messages and helpful suggestions on stderr. This immediately implies that length(stderr) > 0 happens if and only if status != 0.
stdout contains the output that can be processed by a downstream program. If the output is in a specific format (for example a table where every line has exactly 7 columns), every line on stdout will be in that format.
This implies that it is quite possible for there to be output on both stdout (real results) and stderr, if the program fails after it started working.
It also means that if you print a brief usage message (the one that explains what options exist), usually in reaction to a "-h" flag, it violates the rule about formatting above. The way I work around this is: if -h is given, it must be the only option, and the program will output a brief help message (usually explaining the options and parameters only), perhaps with a link to more extensive documentation, for example "See man elephant" or "See /usr/local/share/elephant/readme.txt" or "See http://localhost/elephant/readme.html".
That brief usage message is also given (on stderr!) in case the parsing of command line options and parameters doesn't work.
There is a significant difference between a brief help message (which you should get from "elephant -h"), a user manual (which is available as "man elephant"), and in-depth documentation (which might be in a readme file or web page). The extensive documentation will contain the brief help message and user manual as subparts, but it should also include things like extensive examples, usage patterns, theory of operations, and software design and debug / maintenance guides.
Most command-line Unix programs are written as filters or generators: They may read information (on stdin by default, from files when necessary), and they send results to stdout. For those, there is no reason to support sending the output to a file, nor is there a need to redirect it to a pager. If the user wants to run the output through more, they're free to do so, and it is easy to do.
Another reason to not automatically page the output is this: Today, we universally use virtual terminals; nobody uses a Wyse or a VT100 any longer. That means that nearly all CLI users have a scroll bar. For an output of 100 or 200 lines, some users (including me) prefer to have the output simply show up on the screen, and instead of saying "elephant | more", I just scroll the window around. Forcing me to hit the space bar 20 times to get all the output just causes pain, for no gain.
The exception to this rule is a whole other class of programs, namely those that process commands internally. The standard mail program or debugger is an example: You start them, and they give you a prompt. At that prompt you are not dealing with a shell, but with a built-in command parser (often built around readline). Here, the user doesn't have easy facilities available to run the output of a command through more (or grep or awk), so automatically paging the output (based on the $PAGER variable) is a good idea.
On human-readable output, we can today assume that all terminals can handle ANSI color/rendering escape sequences and common Unicode characters. So it's perfectly fine to report a temperature as 69.2°F, or a measurement and regression accuracy as "54.7 ± 0.7 psi χ²ₙ=1.37", because I'm sure that degree, chi, plus minus and subscript n will be available on all terminals (including the console). Similarly, a diff tool can use red and green text to mark changed and identical sections, and boldface changed words. But if you do that, make sure to provide a command line switch to turn escape sequences and Unicode off, both for processing the output with downstream tools (such as awk) that don't like these, and to help users who are color-blind or neurodiverse and would rather not deal with rendering.
Changing output formats (including turning a pager on or off) depending on whether stdout goes to a terminal device or a pipe or a file is evil. It is an example of making decisions based on heuristics that are sometimes right, but often enough wrong. For example, whether I run "diff elephant hippo" or "diff elephant hippo | more" should not change whether the results are colorized or not. Too many programs violate that rule.
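The rules in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the program name "elephant" comes from the examples above, but the run function, the two-column output format and the exit codes 1 and 2 are made up for the sketch, and Python is used for brevity, not because the rules depend on it.

```python
USAGE = "Usage: elephant [-h] word [word ...]\nSee man elephant for details.\n"


def run(argv, out, err):
    """Return the exit status; results go to `out`, diagnostics go to `err`."""
    args = argv[1:]
    if "-h" in args:
        if args == ["-h"]:
            out.write(USAGE)          # help was requested, so it is the result: stdout, status 0
            return 0
        err.write("elephant: -h must be the only argument\n" + USAGE)
        return 2
    if not args:
        err.write("elephant: missing operand\n" + USAGE)   # parse failure: usage on stderr
        return 2
    for word in args:
        if not word.isalpha():
            # Failing after work has started: rows already written stay on stdout.
            err.write(f"elephant: cannot process {word!r}\n")
            return 1
        # Hypothetical fixed format: every stdout line has exactly two tab-separated columns.
        out.write(f"{word}\t{len(word)}\n")
    return 0
```

A real program would call something like sys.exit(run(sys.argv, sys.stdout, sys.stderr)). With this shape, stderr is non-empty exactly when the status is non-zero, and every line on stdout is either a data row or, for a lone -h, the help text.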

Just a historical note: The Unix shell was never really designed; it was an accident. DMR and Ken wrote a quick hack for the first few versions, and they wanted to show off the ability for programs to be piped into each other, which was really novel at the time. A lot of decisions from that time have become traditions, and we now have to stick to those traditions to honor POLA. In contrast, when Digital wrote the VMS operating system, it actually designed it coherently, and learned from the mistakes of products (including Unix) that came before it. That's why all VMS programs that can create interestingly complex output have two command-line options: /page (to enable automatically running the output through the standard pager), and /output=..., to save the output in a file. You can even use the obvious /nooutput to suppress output, so there is no need to do "> /dev/null". The standard implementation of the /page and /output switches is provided as a library (as is the options parser, and actually the complete command-line parser), so implementing programs and making them consistent is easier than on Unix.

zirias@ · Jun 17, 2023

ralphbsz after all this rambling against git and Linus, there's probably no point going into detail too much, therefore just very short: Even with all this verbose reasoning you give, the design of man is (still) the first contradiction. There's a reason isatty(3) and $PAGER exist, and man made (intended) use of them for a very long time.

So, let's better focus on the part I agree with: Unix (POSIX?) is lacking both in conventions and (library) tooling to implement them for sure. The design of command-line arguments is already a good example. In POSIX, getopt(3) is offered, a minimal and clumsy helper to parse a command line that only supports single-letter flags. And POSIX convention recommends using only those. This quickly leads to pretty cryptic stuff when tools get a bit more complex, so GNU's extension for "long" flags actually made sense (and is indeed adopted a lot outside the GNU world as well). But whether you use that or not, for a full-featured command-line parser, you're on your own, which of course leads to a lot of differences in implementation details... even traditional Unix tools use a variety of alternative syntaxes for their command line.
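To make the difference concrete: Python's standard library ships a getopt module modelled on the C function, including the GNU-style long flags, so it can stand in for a quick illustration. The flag letters -v and -g and the sample arguments are assumptions chosen for the example, and the long names "verbose" and "group" are invented.

```python
import getopt

# POSIX style: single-letter flags only; "g:" means -g takes an argument.
short_opts, short_rest = getopt.getopt(["-v", "-g", "wheel", "host:443"], "vg:")

# GNU-style long flags are declared separately; "group=" means it takes a value.
long_opts, long_rest = getopt.getopt(
    ["--verbose", "--group=wheel", "host:443"], "vg:", ["verbose", "group="]
)

# An unknown flag raises an exception. Deciding what to print and which exit
# status to use is still left entirely to the caller.
try:
    getopt.getopt(["-X"], "vg:")
    unknown = None
except getopt.GetoptError as exc:
    unknown = exc.opt   # the offending flag letter, without the dash
```

Here short_opts is [("-v", ""), ("-g", "wheel")] and long_opts is [("--verbose", ""), ("--group", "wheel")], with ["host:443"] left over in both cases. Everything beyond this tokenizing (type checks, mutually exclusive flags, usage text) is what each program ends up writing for itself.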

How a daemon should work is yet another example. It should make sure to properly detach from the parent process and the controlling terminal, it should make sure not to keep any files/directories open, it should properly handle and lock some pidfile, it should signal success or failure to start by the exit code of its parent process (so, this process must not exit early, only when whatever startup has to be done completed), and so on. And this makes sense, as a daemon following all of this can be easily integrated with any init and rc system and doesn't need any "tight coupling". BUT: There's no library support for all of that either. Every single daemon is expected to "get it right" with quite some complex custom code. At least, BSD offers daemon(3), which is definitely a good direction, but it's a bit incomplete and other Unix flavors don't have it...
(edit: random thought about that: If a good and complete library function to daemonize existed e.g. in POSIX, maybe this would have spared the world idiotic ideas like "systemd"?)
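To show how much custom code the daemon checklist above implies, here is a minimal sketch of it. It is not daemon(3) and not any existing library: the daemonize function, its parameters and the one-byte pipe used to report the startup result are all assumptions made for illustration. It also leaves out things a production daemon needs, such as closing other inherited descriptors, setting the umask and handling signals.

```python
import fcntl
import os


def daemonize(pidfile_path, startup):
    """Become a daemon; returns the locked pidfile descriptor in the daemon.

    The process that called this function never returns from it: it waits for
    the daemon's verdict and exits 0 if startup() succeeded, 1 otherwise.
    pidfile_path should be absolute, because the daemon changes directory to /.
    """
    read_fd, write_fd = os.pipe()
    if os.fork() > 0:
        # Original process: do not exit early, wait until startup is decided.
        os.close(write_fd)
        verdict = os.read(read_fd, 1)    # b"" if the daemon died without reporting
        os._exit(0 if verdict == b"\0" else 1)

    os.close(read_fd)
    os.setsid()                          # new session, no controlling terminal
    if os.fork() > 0:
        os._exit(0)                      # session leader exits; the daemon is its child
    os.chdir("/")                        # do not keep any directory busy
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)             # stdin, stdout, stderr now point at /dev/null
    if devnull > 2:
        os.close(devnull)
    try:
        pidfile = os.open(pidfile_path, os.O_RDWR | os.O_CREAT, 0o644)
        # Non-blocking exclusive lock: raises if another instance holds it.
        fcntl.flock(pidfile, fcntl.LOCK_EX | fcntl.LOCK_NB)
        os.ftruncate(pidfile, 0)
        os.write(pidfile, f"{os.getpid()}\n".encode())
        startup()                        # e.g. bind sockets, read configuration
    except Exception:
        os.write(write_fd, b"\1")        # report failure to the original process
        os._exit(1)
    os.write(write_fd, b"\0")            # report success
    os.close(write_fd)
    return pidfile                       # keep it open: closing it drops the lock
```

A caller would run something like daemonize("/var/run/elephant.pid", open_listen_sockets), with both names hypothetical, and then enter its main loop. The rc system only sees the original process, whose exit status tells it whether startup worked.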

You could probably find many more examples.

kpedersen · Jun 17, 2023

ralphbsz said:
No. In some cases, Windows made really good design decisions. Those we should copy. But only those.

I probably should have included "ignore all Windows / NT designs other than those David Cutler borrowed from VMS"

zirias@ · Jun 17, 2023

Adding yet another thought: If Unix had been designed with more consistent guidelines, conventions and library functionality to support them, we wouldn't have the chance for endless debates today about how things should be done.

Still, you won't convince me that offering some automatic paging for "large" output to a terminal is a bad idea. There are tools doing it (even the traditional man), it doesn't break anything you might want to do with the output, and just the fact that few tools do it isn't really convincing. It's most likely just caused by not having supporting library functionality in the system, so people rarely bother to spend time implementing it...
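As a rough sketch of the behaviour argued for here (page only when the output goes to a terminal and does not fit on one screen, and honour $PAGER), something like the following would do. The function names should_page and emit and the fallback to "more" are assumptions for illustration. Note that this is exactly the terminal-dependent switching that ralphbsz objects to above.

```python
import os
import shlex
import shutil
import subprocess
import sys


def should_page(n_lines, term_rows, is_tty):
    """Page only when a human is watching and the text exceeds one screen."""
    return is_tty and n_lines > term_rows


def emit(text):
    """Write text to stdout, through the user's pager when that seems useful."""
    rows = shutil.get_terminal_size().lines
    if should_page(text.count("\n"), rows, sys.stdout.isatty()):
        # $PAGER may carry arguments (e.g. "less -R"), so split it like a shell would.
        pager = shlex.split(os.environ.get("PAGER", "")) or ["more"]
        try:
            subprocess.run(pager, input=text, text=True, check=False)
            return
        except OSError:
            pass                        # pager not installed: fall back to plain output
    sys.stdout.write(text)
```

When stdout is a pipe or a file, emit writes the text unchanged, so "elephant | grep foo" keeps working. Only the interactive case changes.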

That said, you did convince me that requiring flags like -h/--help to be the only argument given is a good idea, leading to a more consistent user interface. In my generic parser, they have a special "action" type (they don't add any value to the configuration but instead trigger some immediate action), so it wasn't too hard to implement exactly that in a generic way.

Now with these latest changes in my lib, my "tlsc" tool using the same configuration description as linked above is able to generate output like that:

Code:

$ tlsc -Vug frob floo --help -X host:lalala:bar:23i:r=0:s=17
Usage: tlsc [-fnv] [-g group] [-p pidfile] [-u user] tunspec [tunspec ...]
       tlsc -V
       tlsc -h

Flag -V must be the only argument
Unknown user: frob
Unknown group: floo
Flag --help must be the only argument
Unknown flag: -X
Argument `lalala' for key `port' is not a valid integer
Argument `23i' for key `remoteport' is not a valid integer
subsection: unknown key `r'
subsection: invalid boolean value `17'
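The output above reports every problem in one pass instead of stopping at the first one. A minimal sketch of that idea follows. It is not the code of the library discussed here: check_args and its parameters are hypothetical, and bundled flags such as -Vug, flag arguments and "--" are deliberately not handled.

```python
def check_args(args, known_flags, action_flags):
    """Return every problem found in args; an empty list means no errors.

    known_flags holds ordinary flags (e.g. {"-f", "-n", "-v"}); action_flags
    holds flags that trigger an immediate action and must stand alone
    (e.g. {"-V", "-h", "--help"}).
    """
    errors = []
    for arg in args:
        if not arg.startswith("-"):
            continue                    # positional argument, not checked in this sketch
        if arg in action_flags:
            if len(args) != 1:
                errors.append(f"Flag {arg} must be the only argument")
        elif arg not in known_flags:
            errors.append(f"Unknown flag: {arg}")
    return errors
```

For example, check_args(["-V", "-f", "--help", "-X"], {"-f", "-n", "-v"}, {"-V", "-h", "--help"}) returns three messages: one each for -V and --help not standing alone, and one for the unknown -X. The caller can then print the usage text once, followed by all collected errors, and exit with a non-zero status.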

astyle · Jun 17, 2023

kpedersen said:
I probably should have included "ignore all Windows / NT designs other than those David Cutler borrowed from VMS"

so, instead of windows, we get jails?
