ls -R behavior

Why does
Code:
ls -R
not work with wildcards? For example if a number of subdirectories have .txt files why does
Code:
ls -R *txt
not show any results? To show all the .txt files I have to involve grep.
 
Ah, you don't understand how the Unix shell and globs (a.k.a. wildcards) work.

Suggestion: Find a book that explains Unix basics. No, I don't know what book that would be.

Here is a nutshell explanation, perhaps hard to understand, and highly incomplete. Wildcard expansion is done by the shell, before calling the program. Say in my current directory /home/ralphbsz/, I have 3 things: a subdirectory ax/, a file ay.txt, and a python source file az.py. If I say "ls", the shell starts the ls executable (which is a program), which lacking any particular input will show a listing of files in the current directory. If I say "ls ay.txt", the shell will start the ls program with argument ay.txt, and ls will show a listing (perhaps with details) of file ay.txt. If I say "ls *.py", the shell will expand that by searching THE CURRENT DIRECTORY for all files that match *.py, and it will run the ls program with argument az.py, which will show me exactly that one file.

Where it gets interesting: If I say "ls a*", the shell will translate that into "ls ax ay.txt az.py". The ls program will show me all the content of directory ax, plus the two files ay.txt and az.py. So for example if the directory ax/ contains files ax/1.foo and ax/2.bar, the output will be something like "ax/1.foo ax/2.bar ay.txt az.py" (in reality, ls formats the output a little more nicely, but you get the basic idea).

OK, so what would happen in my example if I do "ls *.txt"? I would get ay.txt. What if I do "ls *.foo"? It would output NOTHING useful, because the glob *.foo matches no file in the current directory. In that case the shell leaves the pattern unexpanded, so the ls program would be told to please give details of the file *.foo, which does not exist, and all you get is an error message. (Footnote: A file named *.foo might exist; it is not illegal to create files whose names contain * or ?, but it is also a very bad idea, unless you want to confuse beginners.)
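The expansions described so far can be checked with echo, which simply prints the arguments the shell hands to it. This is a minimal sketch, assuming a POSIX-style sh or bash with default settings and a scratch directory from mktemp; the variable names are made up for illustration.

```shell
# Recreate the example layout in a throwaway directory (assumed, not the real /home/ralphbsz).
demo=$(mktemp -d) || exit 1
trap 'rm -rf "$demo"' EXIT
cd "$demo" || exit 1
mkdir ax
touch ax/1.foo ax/2.bar ay.txt az.py

# The shell expands each pattern against the current directory before echo runs.
expanded_a=$(echo a*)        # three names match
expanded_txt=$(echo *.txt)   # one name matches
expanded_foo=$(echo *.foo)   # nothing matches, so the pattern is passed through as-is

echo "a*    -> $expanded_a"
echo "*.txt -> $expanded_txt"
echo "*.foo -> $expanded_foo"
```

Note that the files in ax/ never take part in the matching: only names in the current directory are considered.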

All the "-R" in the world doesn't help you here: The shell takes the command line "ls *.foo" and tries to expand it right here, and that does not give the result you want.
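Applied to the original question, a rough sketch of what ls -R actually receives might look like this. It assumes a POSIX-style sh or bash with default settings; the directory and file names (sub1, sub2, a.txt, b.txt) are invented for the demonstration.

```shell
# Scratch layout: .txt files exist only inside subdirectories.
demo=$(mktemp -d) || exit 1
trap 'rm -rf "$demo"' EXIT
cd "$demo" || exit 1
mkdir sub1 sub2
touch sub1/a.txt sub2/b.txt

# Nothing in the current directory ends in "txt", so the pattern stays literal.
set -- *txt
args_seen_by_ls=$*
echo "ls would receive: $args_seen_by_ls"

# ls is asked about a file literally named "*txt", complains, and never recurses.
ls_status=0
ls -R *txt || ls_status=$?
echo "ls exit status: $ls_status"
```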

There are many ways to get what you want. The most common one (I think?) would be to not use the -R option on ls, but instead to use the command that's designed for this usage: "find . -name \*.foo" (yes, there is a backslash in front of the * which I will explain in a moment). The find command automatically recurses from all the starting points you list (here just ".", the current directory), so it will read all files and directories below it. It then can do wildcard (glob) expansion and checking internally, so if it finds any files whose names match the pattern *.foo, it will print their names. Why the backslash? To protect the * character from shell wildcard expansion: If the current directory contained a file named blatz.foo, the shell would expand the unprotected command line to "find . -name blatz.foo", which is not what you want.
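The difference the backslash makes can be seen side by side. This is a sketch under the same assumptions as before (POSIX-style sh or bash, scratch directory from mktemp); the file names other than blatz.foo are invented.

```shell
# Scratch layout: one .foo file at the top level, two more in a subdirectory.
demo=$(mktemp -d) || exit 1
trap 'rm -rf "$demo"' EXIT
cd "$demo" || exit 1
mkdir ax
touch ax/1.foo ax/2.foo blatz.foo

# Escaped: find itself receives the pattern *.foo and matches it at every depth.
escaped=$(find . -name \*.foo | LC_ALL=C sort)

# Unescaped: the shell rewrites the pattern to blatz.foo first,
# so find only ever looks for that one name.
unescaped=$(find . -name *.foo | LC_ALL=C sort)

echo "escaped:";   echo "$escaped"
echo "unescaped:"; echo "$unescaped"
```

Writing the pattern as '*.foo' in single quotes protects it in the same way. Had several top-level names matched the unescaped pattern, find would have received extra arguments it cannot interpret and would normally refuse to run.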

Hope this helps. It takes a while to internalize this, but is vitally important knowledge.

P.S. Personally, I think the fact that the shell auto-expands globs, and that called programs can not even go back to the shell and ask "what did the user really type" is a giant misfeature of Unix. It was a quick hack which allowed two overworked computer science researchers (Dennis and Ken) to get their quick hack operating system running faster, without having to implement globbing in all the called programs, or in a shareable library. If Unix had been designed by real software engineers (like VMS or VM or MVS or RSX-11 or Primos or ... were), instead of computer science researchers, this would have been handled sensibly. Alas, Unix was not intended to be a production environment that is user friendly, but it was intended as a research tool. Rant off.
 
ls */*.txt

I'll let you figure out why this works on your own ;)
 
As explained by ralphbsz, ls(1) does not work with wildcards, the sh(1)ell does. When designing your command, you can “preview” the expanded wildcards:
Bash:
echo *txt
What (the shell built-in) echo prints is what gets passed to ls(1), not the string *txt (keep in mind time of check vs. time of use). To actually see what command is executed, invoke
Bash:
set -o xtrace   # and disable with   set +o xtrace
If you know in advance how deeply the assorted txt files are nested, and it is guaranteed that at least one txt file is present at each of those levels, you may write something like:
Bash:
ls */*.txt */*/*.txt */*/*/*.txt
With respect to the number of matched pathnames, I’m not sure exactly which limit applies. At any rate, everything should work fine as long as the expanded command line (together with the environment) does not exceed getconf ARG_MAX.
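Why the guarantee of at least one match per nesting level matters can be sketched as follows, again assuming a POSIX-style sh or bash with default settings. The layout (d1, d2, a.txt, b.txt) is hypothetical and deliberately has no txt file at the third level.

```shell
# Scratch layout: txt files at depth 1 and 2 below the current directory, none at depth 3.
demo=$(mktemp -d) || exit 1
trap 'rm -rf "$demo"' EXIT
cd "$demo" || exit 1
mkdir -p d1/d2
touch d1/a.txt d1/d2/b.txt

# Capture the argument list exactly as ls would receive it.
set -- */*.txt */*/*.txt */*/*/*.txt
args=$(printf '%s\n' "$@")
echo "$args"
```

The first two patterns expand to real pathnames, while the third is handed over literally, so ls would list two files and then complain about the unmatched pattern.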
 
You might want to switch to zsh as your login shell. It has a huge number of useful features.
In this particular case, you can use its recursive globbing feature (“**”). It can replace certain typical use cases of find(1).
For example,

ls -l **/*.txt

will list all *.txt files in the current directory and all subdirectories. When you enable the option GLOB_STAR_SHORT, you can do the same thing even more easily:

ls -l **.txt

If you want to see which files are actually matched before executing the command, then you can simply hit the Tab key. This causes zsh to replace the wildcard expression with all the file names that it expands to. Beware, this might lead to a very long command line. You can press Ctrl-U to quickly remove the whole line and start over.

Zsh provides many more features for globbing, like approximate matching. See the section “Filename Generation” in the zshexpn(1) manual page.
 