Shell Learning shell scripting: Lists or arrays?

JohnK · Wednesday at 5:07 PM

I apologize for the vagueness, but I'm trying to learn some shell scripting--NOTE: currently on line ~200 of 10,000--and I've hit a wall with "lists/arrays". From what I can tell from the reading resources I found, /bin/sh does not have a list type, so to speak.

Basic support:

Bash:

#! /bin/sh

set -- 1 \
       2 \
       3
# print each item in list...
#for item in "$@"; do echo "$item"; done

# add items to list
set -- "$@" 4 \
            5 \
            6 \
            7
# print each item in list...
for item in "$@"; do echo "$item"; done

The above works but at this point in the post, I will use some hybrid syntax (sorry) to demonstrate my goal.

From sh(1) I see a brief on "lists" with curly braces (but I obviously need a slightly better explanation than that because I'm making a mess here).

From my tests, something like this isn't possible.

Bash:

#! /bin/sh
dog={
        mkdir -p ./test/level1;
        mkdir -p ./test/level2;
}

cat={
        mkdir -p ./test/level3;
        mkdir -p ./test/level4;
}

set -- $dog
# ... decision, condition, etc.
set -- "$@" $cat

# eval each item in list
for item in "$@"; do $item; done

Question:
1. What is the intended use of "{ item; }"?
2. Can I create a named array/list?
2a. Can I evaluate each item in a list?

Sub question/request: if there is any sort of "good resource" you like, please post.

covacat · Wednesday at 6:06 PM

like this ?

Bash:

#!/bin/sh

dog() {
echo  "woof!"
}
cat() {
echo  "meow!"
}
for i in "$@"
do
$i
done


$ sh t.sh dog cat dog dog cat cat
woof!
meow!
woof!
woof!
meow!
meow!
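
On question 1: in POSIX sh, "{ list; }" is a grouping command, not a data literal. It runs the enclosed commands in the current shell and lets them share one redirection or act as a single unit. That is why `dog={` cannot hold commands: it only assigns the literal character `{` to dog. A minimal sketch (the variable name count is only for illustration):

```shell
#!/bin/sh
# A brace group runs its commands in the current shell, so the
# assignments to count are still visible after the group ends.
count=0
{ count=$((count + 1)); count=$((count + 1)); }

# Parentheses start a subshell instead; this change is lost.
( count=$((count + 10)) )

# One redirection applies to every command inside the group.
{ echo "first"; echo "second"; } > /dev/null

echo "count=$count"
```

This prints count=2. To keep a reusable set of commands under a name, a function such as dog() above is the usual tool.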

zirias@ · Wednesday at 6:26 PM

There are no lists or arrays in (POSIX/Bourne) shell script, with one specific exception: you have a list of positional parameters for the script itself (from the command line) and for each shell function. These can be accessed individually as $1, $2, $3 etc., as a whole as $@ (where "$@" expands each value as its own word, as if individually quoted), and $# gives you the number of parameters.

The shell builtin set¹ is a jack of all trades, one of its functions is to manipulate the positional parameters list as used in your examples.

So, no, you can't have some named list in POSIX sh, all you can do is to re-purpose the only "real" list available, the parameters...
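
A small sketch of the above, assuming a hypothetical helper named show_list. It relies on each function call getting its own positional parameters, so `set --` inside the function does not disturb the caller's list:

```shell
#!/bin/sh
# show_list is a hypothetical helper used only for illustration.
show_list() {
    set -- "$@" "last item"      # append one value to this call's list
    echo "count: $#"
    for item in "$@"; do
        echo "item: $item"
    done
}

set -- "first item" second       # the script's own list: two values
show_list "$@"
echo "caller still has: $#"
```

Quoting "$@" is what keeps "first item" together as one element; an unquoted $@ would be split into two words.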

Note that you can have "fields" in a variable; it's split on expansion. How exactly it is split depends on the value of the special variable $IFS, which defaults to the whitespace characters space, tab and newline. But beware: fiddling with $IFS is a slippery slope and can have "funny" consequences.
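
For illustration, a cautious sketch of field splitting with a custom $IFS. The helper name split_colon is hypothetical, and the code assumes $IFS is set (the default) when the function is called, since it saves and restores the old value:

```shell
#!/bin/sh
# split_colon is a hypothetical helper: it splits a colon-separated
# string into this function's positional parameters.
split_colon() {
    old_ifs=$IFS
    IFS=:
    set -f               # turn off globbing so a field like * stays literal
    set -- $1            # unquoted on purpose: this is where splitting happens
    set +f               # note: re-enables globbing unconditionally
    IFS=$old_ifs         # restore the previous separator right away
    for field in "$@"; do
        echo "[$field]"
    done
}

split_colon "/bin:/usr/bin:/usr/local/bin"
```

Keeping the changed $IFS confined to a few lines, and restoring it immediately, avoids most of the "funny" consequences.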

Also note that quite some shells do have a notion of arrays (like bash, zsh, ...), but that's outside the POSIX standard.

--
¹ https://pubs.opengroup.org/onlinepubs/009696799/utilities/set.html

mer · Wednesday at 6:48 PM

zirias@ said:
Also note that quite some shells do have a notion of arrays (like bash, zsh, ...), but that's outside the POSIX standard.

This is a very important point.
Bash has some very useful constructs, but they are only portable to bash.
But most sh scripts run just fine under bash (think of bash as a superset of sh).
A bash script can run under sh only if it sticks to the sh subset.

JohnK · Wednesday at 7:44 PM

covacat, very interesting. Thank you. That's a nice thread for me to pull on for a bit.

zirias@, thank you for the info and warning; I will NOT be messing with `$IFS` because I can get myself in enough trouble with simple loop constructs alone.

mer, I caught the point too. `sh` only.

cracauer@ · Wednesday at 9:23 PM

You won't get very far that way. To use any kind of collections of values in sh you either need to use temp files or you need to use strings to store them, in which case $IFS is needed in most cases.
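
One possible sketch of the string approach, with hypothetical helpers list_add and list_each. The list is a single variable holding one element per line, so elements may contain spaces but not newlines:

```shell
#!/bin/sh
# list_add and list_each are hypothetical helpers for illustration.
animals=""

list_add() {
    # Append the element followed by a newline.
    animals="${animals}$1
"
}

list_each() {
    # IFS= keeps leading and trailing blanks; -r keeps backslashes literal.
    printf '%s' "$animals" | while IFS= read -r item; do
        echo "item: $item"
    done
}

list_add "dog"
list_add "house cat"
list_each
```

Here $IFS is only changed for the duration of each read call, which is the least invasive way to use it. The loop runs on the receiving end of a pipe, so variables set inside it may not survive after the loop.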

JohnK · Wednesday at 11:02 PM

So this was an interesting concept/experiment. The code below is probably horribly wrong (and most likely very easy to trip up), but it sort of demonstrates the concept I was going for. I doubt I would get very far--past concept stages--using something like this, but I will try out this idea on a slightly bigger scale and see what I can break.

cracauer@, noted. I'll nonetheless approach `$IFS` cautiously. I'm still very, very new to `sh` code.

Bash:

#!/bin/sh
# test.sh
# ./test.sh cow
# ./test.sh cat dog
# ./test.sh hi john cat dog
# ./test.sh dog hi john cat hi tom
bark() {
  echo "Woof!"
}
wag() {
  echo "wag!"
}
meow() {
  echo "Meow!"
}
trip() {
  echo "*trip* down the stairs..."
}
hello() {
  local name=$1
  echo "Hello, $name"
}
for item in "$@"
do
   case "$item" in
      dog) { bark; wag; shift; } ;;
      cat) { trip; meow; shift; } ;;
      hi) { hello "$2"; shift; } ;;
      *) shift ;;
   esac
done
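
The loop above stays in step only because every branch shifts exactly once, so $1 always matches item. A hedged alternative sketch loops on $# instead, which makes that explicit and lets "hi" consume its argument. The wrapper name dispatch is hypothetical, and the animal actions are inlined for brevity:

```shell
#!/bin/sh
hello() {
    echo "Hello, $1"
}

# dispatch is a hypothetical wrapper so the loop can be called directly.
dispatch() {
    while [ "$#" -gt 0 ]; do
        case $1 in
            dog) echo "Woof!"; echo "wag!" ;;
            cat) echo "*trip* down the stairs..."; echo "Meow!" ;;
            hi)
                # "hi" takes the next word as its argument, if there is one.
                if [ "$#" -ge 2 ]; then
                    hello "$2"
                    shift
                fi
                ;;
            *) ;;   # ignore unknown words
        esac
        shift       # drop the word that was just handled
    done
}

dispatch "$@"
```

Unlike the for loop, this version does not treat the name after "hi" as a command word of its own.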

ralphbsz · Thursday at 12:23 AM

May I ask a meta-question: Why are you using sh?

If this is a somewhat complex program, there are much better programming languages, many sharing the interpreted character. I'm partial to Python, but there are other choices.

If the integration with shell-based tools is too intense for using a language where "x=`foo`" isn't quite as easy, then why not switch to a modern shell, like zsh or bash?

Yes, I've seen (and worked on) systems that had tens of thousands of lines of shell scripts (the largest script I worked on was 18K lines), using an intentionally old standard of sh (pretty much the public domain Korn shell), but it was a maintenance nightmare, requiring dedicated engineers.

zirias@ · Thursday at 7:07 AM

IMHO, the canonic reasons to script in sh are:

The job is simple enough to be straightforward in a shell script
You want to maintain maximum portability with minimum dependencies and decide the job is still manageable in shellscript

I wouldn't start scripting in some "extended" shell (bash, zsh ...) because when you add a specific runtime dependency anyways, why not make it a scripting language designed for (complex) programming? But that's just a personal opinion.

Actually, although POSIX sh has lots of limitations and strange pitfalls, it's still surprisingly "powerful". In case portability is the reason to use it, I'd recommend testing your script in multiple shells. Especially pbosh from shells/bosh is IMHO very helpful (it claims to implement POSIX and really nothing else), I used it last to verify this little monster, and it indeed found something to fix: My code to iterate over the characters of a string was broken in any locale other than C, so I had to add temporarily switching the locale.

JohnK · Thursday at 10:06 PM

Firstly, I am sorry. The "10,000 lines" reference was to something stupid; a mentor once told me: "STFU until you've written 10,000 lines of code in X. (meaning: you don't have enough experience so, sit back and learn)" So, I should have said: "I'm new to writing shell scripts." instead, and not made some stupid remark that no one would ever have understood. I apologize.

The actual example I have in mind may only be a hundred or so lines of code. Nothing like what your situation was. But I agree with you; should I go C or Shell is kind of what I'm evaluating at the moment. I like the method above (but that's only because I'm ignorant at the moment) and I could save myself tons of backend problems (like compiling, makefile, packaging, etc.).

ralphbsz · Friday at 6:14 AM

C or shell is a false dichotomy. Those are the extremes. There is lots of stuff in the middle. Scripting languages (perl and python are popular, but there are many others). Or a shell programming style where you mostly rely on programs to do the heavy lifting, and leave things in intermediate files or pipes. Awk, join, sed are remarkably powerful when put together.

JohnK · Friday at 1:21 PM

Yes of course, there are other languages but, mostly, I do not know them. ...Again, I'm just getting back into managing my server so it will take some time for me. The current need/project is Shell or bust at this point.

If you've got any good links/tips/whatnot on Awk, sed, and join I will take them! I would absolutely love to have that knowledge. Although, I'm hesitant about learning from other people's code because it doesn't teach the "why" they chose THIS over THAT, and that leads to incomplete knowledge.

ralphbsz · Friday at 7:29 PM

JohnK said:
Yes of course, there are other languages but, mostly, I do not know them. ...Again, I'm just getting back into managing my server so it will take some time for me. The current need/project is Shell or bust at this point.

If you've got any good links/tips/whatnot on Awk, sed, and join I will take them! I would absolutely love to have that knowledge. Although, I'm hesitant about learning from other people's code because it doesn't teach the "why" they chose THIS over THAT, and that leads to incomplete knowledge.

There is a really good awk book, incidentally written by Aho, Weinberger and Kernighan. I have a copy somewhere in a cardboard box. There are also lots of good sh (or bash) books and tutorials around. In general, look for O'Reilly books.

One thing I don't know a good resource for is how to idiomatically take all the tools, and string them together in sh. For example, there is a famous relational database implementation that only uses grep, sed, awk and join, organized by shell scripts. So by using each tool in its comfort zone, you can build powerful systems. But as far as I can see, everyone learns that by themselves, or by example from other people.
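
As a rough illustration of stringing the tools together (not the database implementation mentioned above), here is a sketch with a hypothetical function report. It assumes two small text files of space-separated "id name" and "id count" lines:

```shell
#!/bin/sh
# report is a hypothetical function for illustration.
#   $1: file with lines "<id> <name>"
#   $2: file with lines "<id> <count>"
report() {
    tmp=${TMPDIR:-/tmp}/report.$$
    # join needs both inputs sorted on the join field.
    sort -k 1,1 "$1" > "$tmp.users"
    sort -k 1,1 "$2" > "$tmp.logins"
    # join pairs lines that share an id; awk sums the counts per name;
    # the final sort makes the output order predictable.
    join "$tmp.users" "$tmp.logins" |
        awk '{ total[$2] += $3 } END { for (n in total) print n, total[n] }' |
        sort
    rm -f "$tmp.users" "$tmp.logins"
}

# Usage: report users.txt logins.txt
```

Each tool does one job here: sort orders the data, join acts as the relational step, and awk does the arithmetic.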

JohnK · Friday at 7:58 PM

Okay, now you're starting to sound like one of my mentors. "On large strings, loop instructions can be significant. Try unrolling the loop a bit..." *my: jaw-drops*

So many (better/worse/alternate) methods! I'd like to be a fly on some of you guys' walls.

Erichans · Friday at 9:27 PM

In addition to the above, my 2 cents, from a slightly different UNIX point of view.

If you choose to "align" with FreeBSD's sh(1) usage (i.e. almost a POSIX conformant shell; ref: POSIX - entry point -> shell & utilities), to start: have a look at the Grymoire's (Bruce Barnett) Sh - the POSIX Shell and its excellent related tutorials on quoting & regular expressions to get you started. There are no books that I know of that focus exclusively on POSIX sh programming. If "alignment" of your scripts and/or portability is important, be aware that fine books on bash are ok, but confusing POSIX sh programming with bash leads to the phenomenon known as "bashism" (Linux distros are known to have bash as their "implementation" of sh).

Being able to combine the standard UNIX/FreeBSD tools, including sed & awk, constitutes a powerful toolbox; it takes some time to get comfortable with it. Basically, as an analogy, you have to get comfortable with individual Lego building blocks and how to combine them to build and combine your scripts; usually there are many different ways that lead to Rome. A then much younger Brian Kernighan phrased it as a general UNIX style way, as a lead in to the UNIX "pipe" construct:

[...] and what you can do is think of these UNIX system programs basically as in some sense the building blocks with which you can create things. And the thing that distinguishes the UNIX system from many other systems is the degree to which those building blocks can be glued together in a variety of different ways, not just obvious ways, but in many cases very unobvious ways to get different jobs done; the system is very flexible in that respect. I think the notion of pipelining is the fundamental contribution of the system is you can take a bunch of programs, two or more programs, and stick them together end to end so that the data simply flows from the one on the left to the one on the right and the system [...]

For a jump start to get some feeling what that might be like, perhaps have a look at Combining the Bourne-shell, sed & awk in the UNIX environment for language analysis.

When, as mentioned, you need or are creating scripts longer than, say, a few pages, you may want to consider alternatives. Python comes to mind, as it ranges from scripting to full-blown programming, but other options exist as well. As for AWK, I can highly recommend the recent The AWK Programming Language, 2nd Edition, as it will also introduce you to the way of programming (read: scripting) AWK was meant for.

fraxamo · Saturday at 11:40 AM

Erichans said:
There are no books that I know of that focus exclusively on POSIX sh programming

The closest book I know of that attempts to teach POSIX sh programming is the following: https://www.oreilly.com/library/view/beginning-portable-shell/9781430210436/. This line is from the beginning of the book: "This book is about programming in the Bourne shell and its derivatives; the Korn shell, the Bourne-again shell, and the POSIX shell are among the obvious relatives."

Erichans · Saturday at 10:25 PM

It seems that OS X (as it is POSIX certified) is a pretty good candidate to search for as well. Based on a quick scan, this also seems a book to consider: Shell Programming in Unix, Linux and OS X: The Fourth Edition of Unix Shell Programming, 4th Edition