Shell simple sed in a bash script.

1. I have this script I wrote on Linux, and I'm finding out that I need the absolute path to bash?
I'm having to change my scripts from #!/bin/bash to #!/usr/local/bin/bash

is there a way to put bash in the path so to not have to change all of my scripts like that?

more importantly.

this one script I wrote in Linux has a call to sed, and I get this error.

Code:
$ fetchwp

sed: 1: "/home/userx/bin/usersta ...": invalid command code u


When this script calls changePrams, that function uses the runscript= path, and it looks like it has a length issue? The error only shows half of the path it needs in order to find that other script and run sed on it, so that it can change/update a parameter within that script before it gets called to run again.

I have no idea why this happens.

Code:
#!/usr/local/bin/bash

runscript="$HOME/bin/userstartups/wallhaven.sh"

ResizeImages=$HOME/bin/userstartups/resizeImages
PathToImages=$HOME/Images/wallhaven-papers
SaveNumber=$PathToImages/.savenumber

mkdir -p $PathToImages

#to pick up where it left off on a next run
touch "$SaveNumber"

[[ -f "$SaveNumber" ]] && { x=$(cat "$SaveNumber") ; } || { x=1 ; }

DealWithAmountOfImages()
{
    workingFile1=$HOME/.RemoveImages1
    workingFile2=$HOME/.RemoveImages2
    touch $workingFile1 $workingFile2
    
    sizeOfDir=$(du -sh $PathToImages | awk '{print $1}')
    num=${sizeOfDir::-1} 
    #if greater than 500 MB of files, remove half of that amount
    [[ $num -gt '500' ]] &&
    {     amountOfFiles=$( find $PathToImages \( -type f -iname "*.jpg" -o -type f -iname "*.png" \) | wc -l) ;
        find $PathToImages \( -type f -iname "*.jpg" -o -type f -iname "*.png" \) >> $workingFile1 ;
        cat $workingFile1 | shuf > $workingFile2 ;
        
        while read f
        do 
             rm "$f" 
            
            [[ $((d++)) -eq $((amountOfFiles/2)) ]] && break
        
        done < $workingFile2 ; } ||
        { echo "
            Nothing to be done.
            $(du -sh $PathToImages | awk '{print $1}')" ; }
#reset d
d=0
}

changePrams()
{
    sed -i  '/^STARTPAGE/s/\(^S.*\)/STARTPAGE='$x'/' "$runscript"
    x=$((x>64?0:x+2))
    #save last number used
    echo "$x" > "$SaveNumber"
}
#frist run
changePrams
"$runscript"
#sleep 1
#"$ResizeImages" "$PathToImages"

    echo "out-and going into while loop."
    while sleep 1800
    do
        changePrams
        #echo "ran $x"
        "$runscript"
       # sleep .5
        DealWithAmountOfImages
        "$ResizeImages" "$PathToImages"

    done
 
is there a way to put bash in the path so to not have to change all of my scripts like that?
Bash is saved to /usr/local/bin/, which is in the default PATH. You're still going to need to change the shebang line. But this way should work for both Linux and FreeBSD.
Code:
#!/usr/bin/env bash
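To see which bash the env shebang would pick up on a given machine, you can ask the shell where it finds bash on PATH. This is only a small sketch; the variable name bash_path is made up for the example.

```shell
#!/usr/bin/env bash
# env(1) searches PATH for bash, so the same shebang works whether bash
# lives in /bin (typical on Linux) or /usr/local/bin (FreeBSD package).
bash_path=$(command -v bash)
echo "bash found at: $bash_path"
```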
 
Is your script working on Linux?
If yes, try to use gsed or ssed instead of the original sed installed on FreeBSD.
Can you please send us the expanded line passed to sed when the error occurs?
 
this one script I wrote in Linux has a call to sed, and I get this error.
Code:
sed: 1: "/home/userx/bin/usersta ...": invalid command code u
Both Linux (GNU) and BSD have implementations of the sed(1) command that are supposed to be POSIX-compliant, but both of them have added their own features that are not portable.
This is the line from your script:
Code:
    sed -i  '/^STARTPAGE/s/\(^S.*\)/STARTPAGE='$x'/' "$runscript"
On Linux, the -i flag takes an optional argument. On FreeBSD, the argument is mandatory. Since you didn't specify it, it is taking your sed script argument as the argument for -i, and your file name ($runscript) as the sed script. The latter probably isn't a valid sed script, so it's causing the syntax error.
To fix that, replace -i with -iBAK or similar.
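A minimal sketch of that fix, run against a throwaway copy rather than the real script. The temporary directory and the stand-in file contents are assumptions for the demo; only the STARTPAGE line and the -iBAK form come from this thread.

```shell
# Build a stand-in for wallhaven.sh in a scratch directory (hypothetical content).
work=$(mktemp -d "${TMPDIR:-/tmp}/sedbak.XXXXXX")
conf="$work/wallhaven.sh"
printf '%s\n' '# What page to start downloading at' 'STARTPAGE=46' > "$conf"

x=3
# The suffix is attached directly to -i, so both BSD sed and GNU sed
# read it as the backup extension and not as the sed script.
sed -iBAK "/^STARTPAGE=/s/.*/STARTPAGE=$x/" "$conf"

result=$(grep '^STARTPAGE=' "$conf")
backup=$(grep '^STARTPAGE=' "${conf}BAK")
echo "edited: $result, backup: $backup"
rm -rf "$work"
```

The backup ends up next to the original with BAK appended to the file name, so it can simply be deleted afterwards if it isn't wanted.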
 
For what it's worth bortzy's suggestion will work. We have a couple of scripts we use that include (Linux) sed -i. I just installed gsed, changed all references to sed to gsed and this solved the problem (which is the same one, that FreeBSD's sed -i option requires you to use sed -i '' to show that there is no backup file being created). The other method would be to change sed -i to sed -i ''
 
For what it's worth bortzy's suggestion will work. We have a couple of scripts we use that include (Linux) sed -i. I just installed gsed, changed all references to sed to gsed and this solved the problem (which is the same one, that FreeBSD's sed -i option requires you to use sed -i '' to show that there is no backup file being created). The other method would be to change sed -i to sed -i ''
If you have to modify the script anyway, why would you want to install gsed? Just fix the -i option in the script and be done with it.

By the way, specifying -i '' is not a good idea, because it's not portable either (won't work if you move the script back to Linux). It's better to use -iBAK (or a different suffix) which will work with BSD and Linux. Having a backup copy of the file won't hurt anyway.
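If no backup file is wanted at all, one portable option that nobody has mentioned so far is to skip -i entirely and write to a temporary file. This is just a sketch under that assumption; the scratch directory, file contents and the .new suffix are invented for the example.

```shell
# Stand-in for the real script, created in a scratch directory (hypothetical).
work=$(mktemp -d "${TMPDIR:-/tmp}/sedtmp.XXXXXX")
target="$work/wallhaven.sh"
printf '%s\n' 'STARTPAGE=46' 'echo run' > "$target"
chmod +x "$target"

x=8
# Plain sed writes to stdout on every implementation; copying the result
# back with cat (instead of mv) keeps the original file's permissions.
sed "/^STARTPAGE=/s/.*/STARTPAGE=$x/" "$target" > "$target.new" &&
    cat "$target.new" > "$target" &&
    rm -f "$target.new"

first_line=$(head -n 1 "$target")
still_exec=$( [ -x "$target" ] && echo yes || echo no )
echo "$first_line (executable: $still_exec)"
rm -rf "$work"
```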
 
If you have to modify the script anyway, why would you want to install gsed?

In this case you are right, but with awk the differences from gawk are subtle and concern the code itself.
I prefer to use the original commands, but sometimes, when you must find a "bug" in a hundred lines of code, the temptation is to install the GNU version.
 
In this case you are right, but with awk the differences from gawk are subtle and concern the code itself.
That's true, I've installed gawk because it has a few useful features that are missing from our BSD awk.
For example, I often find myself having to sort an array. GNU awk has a sort function built-in. In BSD awk you would either have to implement it yourself (for example, implementing quick sort isn't too difficult), or pipe your array data into the external sort(1) command and then parse it back. Both of those work-arounds are rather awkward ;) and inefficient.
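To give an idea of how awkward the sort(1) work-around is, here is a rough sketch that should run with any awk. The input words, the temporary file and the variable names are made up for the illustration.

```shell
tmp=$(mktemp "${TMPDIR:-/tmp}/awksort.XXXXXX")
sorted=$(printf '%s\n' pear apple fig | awk -v tmp="$tmp" '
    { item[NR] = $0 }                      # collect the input lines in an array
    END {
        cmd = "sort > " tmp                # external sort(1) writes to the temp file
        for (i = 1; i <= NR; i++) print item[i] | cmd
        close(cmd)                         # wait until sort has finished
        n = 0
        while ((getline line < tmp) > 0)   # parse the sorted data back in
            sorted[++n] = line
        close(tmp)
        for (i = 1; i <= n; i++)
            printf "%s%s", sorted[i], (i < n ? " " : "\n")
    }')
rm -f "$tmp"
echo "$sorted"
```

That is a pipe, a temporary file and a second read loop for something gawk does with a single function call.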

Other useful things in GNU awk are the time functions (strftime, systime, mktime), and the ability to open TCP sockets via pseudo file names /inet/tcp/....
 
There is also mawk. I like it. It is much closer to gawk than awk.

One strange thing with awk:
Bash:
echo '2*3' | awk '{ n = split($0, arg, "[/+-*]")
                    for(k = 1 ; k <= n ; k++)
                    { print "arg[" k "] = " arg[k]
                    }
                  }'

This problem occurs only with the sequence "-*".
 
Is your script working on Linux?
If yes, try to use gsed or ssed instead of the original sed installed on FreeBSD.
Can you please send us the expanded line passed to sed when the error occurs?
Yes, it works perfectly on Linux as it is written there.

To me it looks like it cannot read the complete path to the file it needs to sed, that being the path plus the file name.
Code:
runscript="$HOME/bin/userstartups/wallhaven.sh"
which equates to
/home/userx/bin/userstartups/wallhaven.sh
with the error showing only part of that path:

sed: 1: "/home/userx/bin/usersta ...": invalid command code u

The line it is sed'ing is this one; it finds and changes the number part in the first instance at the very top of that script. The script was written by someone else, I got it off GitHub.
Code:
# What page to start downloading at, default and minimum of 1.
STARTPAGE=46
 
There is also mawk. I like it. It is much closer to gawk than awk.
Yes, there are even more awk variants, each with its own set of non-standard features and subtle differences. Personally I try to restrict my scripts to the subset implemented by our BSD awk (which should work on Linux with GNU awk, too). Only in rare circumstances I resorted to using GNU-specific features like the sort functions. However, recently I tend to port the stuff to Python instead. :)

One strange thing with awk:
Bash:
echo '2*3' | awk '{ n = split($0, arg, "[/+-*]")
[...]
This problem occurs only with the sequence "-*".
That's because +-* within a bracket expression is a range expression that includes all characters from + to * (inclusive). Since + comes after * in the ASCII order (see the ascii(7) manual page), the behavior is undefined. GNU awk exits with an error message (“fatal: Invalid range end”), as does sed(1) (“RE error: invalid character range”). BSD awk does not exit, instead it ignores the invalid bracket expression, so the whole regular expression matches the empty string, which is probably not the expected behavior at all. I think that's a bug in our awk; it should exit with an error, too.

If you need to specify a literal - within a bracket expression, it must be the first character, for example: [-+*/]
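For instance, the split() call from the earlier post could be written with the hyphen moved to the front. This is only a sketch of that one change; the output formatting is mine.

```shell
# With "-" as the first character it is a literal hyphen, not a range operator.
result=$(echo '2*3' | awk '{ n = split($0, arg, "[-+*/]")
                             for (k = 1; k <= n; k++)
                                 printf "%s%s", arg[k], (k < n ? " " : "\n")
                           }')
echo "$result"
```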
 
to me it looks like it cannot read the complete path to the file it needs to sed.
Read posts #4 and #5, again:
On Linux, the -i flag takes an optional argument. On FreeBSD, the argument is mandatory. Since you didn't specify it, it is taking your sed script argument as the argument for -i, and your file name ($runscript) as the sed script. The latter probably isn't a valid sed script, so it's causing the syntax error.
To fix that, replace -i with -iBAK or similar.

sed(1) is parsing the file name as a command. Because of the slashes, /home/ is parsed as a search pattern (an address), and the "u" that follows it is not a valid command.
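You can reproduce this on any system by deliberately handing the path to sed as its script. A small sketch; the status variable is just for the demo, and the exact error wording differs between GNU and BSD sed.

```shell
# The path is passed as the sed *script* here, not as a file to edit:
# /home/ becomes an address and "u" is then rejected as a command.
if sed '/home/userx/bin/userstartups/wallhaven.sh' /dev/null 2>/dev/null; then
    status="accepted"
else
    status="rejected"
fi
echo "path used as a sed script: $status"
```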
 
Both Linux (GNU) and BSD have implementations of the sed(1) command that are supposed to be POSIX-compliant, but both of them have added their own features that are not portable.
This is the line from your script:

On Linux, the -i flag takes an optional argument. On FreeBSD, the argument is mandatory. Since you didn't specify it, it is taking your sed script argument as the argument for -i, and your file name ($runscript) as the sed script. The latter probably isn't a valid sed script, so it's causing the syntax error.
To fix that, replace -i with -iBAK or similar.
Yeah, that is it: Linux sed changes the file automatically with plain -i, without the BAK added.

I changed the code to this:
Code:
    case $OSTYPE in
    freebsd* )
        sed -iBAK  '/^STARTPAGE/s/\(^S.*\)/STARTPAGE='$x'/' "$runscript"
        ;;
    linux* )
        sed -i  '/^STARTPAGE/s/\(^S.*\)/STARTPAGE='$x'/' "$runscript"
        ;;
    esac

I ballparked the Linux pattern for OSTYPE for now. When I get on Linux I'll have to check whether that matches the actual value of that variable.

thanks.
 
Cleaner:
Code:
case $(uname -s) in
    FreeBSD)
        BAK="BAK"
        ;;
    *)
       BAK=""
       ;;
esac
sed -i $BAK '/^STARTPAGE/s/\(^S.*\)/STARTPAGE='$x'/' "$runscript"

If you need to make a change to the sed(1) line you only have to do it once. Less risk of issues, if you forget to edit the other options for example.
 
I ballparked the Linux as the output to OSTYPE for now. when I get on Linux I'll have to check that if that is the actual output of that Var.
Actually the -iBAK variant works on Linux (GNU sed), too, so there's no need to make a difference here.
 
Read posts #4 and #5, again:


sed(1) is parsing it as a command and /home/u uses an invalid "u" command. Due to the slashes it's parsed as a search pattern.
Yes, I was working my way down the list; I got it fixed in post #6.
Thanks for the follow-up.
 
Cleaner:
Code:
case $(uname -s) in
    FreeBSD)
        BAK="BAK"
        ;;
    *)
       BAK=""
       ;;
esac
sed -i $BAK '/^STARTPAGE/s/\(^S.*\)/STARTPAGE='$x'/' "$runscript"

If you need to make a change to the sed(1) line you only have to do it once. Less risk of issues, if you forget to edit the other options for example.
Thanks, that cuts out the "I forgot to change the other line" mistake.
 
Actually the -iBAK variant works on Linux (GNU sed), too, so there's no need to make a difference here.
That too is correct, but then it turns into a "needs BAK" versus "does not need BAK" argument. 'Need' being the key word there.
 
If you need to specify a literal - within a bracket expression, it must be the first character, for example: [-+*/]
Or the last.

GNU awk exits with an error message [...] BSD awk does not exit
And mawk understands that it is not a range, since the range is impossible, as you explained.

But what about that?
Code:
echo '2,3' | awk '{ n = split($0, arg, "[%-/]")
                    for(k = 1 ; k <= n ; k++)
                    { print "arg[" k "] = " arg[k]
                    }
                  }'
 
I think you should create a symlink: ln -s /usr/local/bin/bash /bin/bash and install textproc/gsed. Then replace sed with ${SED} and in the beginning of the scripts use uname to check, if Linux then export SED=sed, if BSD then export SED=gsed ;)
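Roughly like this, as a sketch. It assumes gsed is actually installed on the BSD side and falls back to plain sed on anything else.

```shell
# Pick the sed binary once, near the top of the script.
case "$(uname -s)" in
    Linux)  SED=sed ;;
    *BSD)   SED=gsed ;;   # assumes textproc/gsed is installed
    *)      SED=sed ;;    # fallback for other systems
esac
export SED
echo "using: $SED"
```

Every later call is then written as "$SED" -i ... in place of sed -i ...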
 
And mawk understands that it is not a range, since the range is impossible, as you explained.
So, does it exit with an error message? That would be the correct behaviour.
But what about that?
Code:
echo '2,3' | awk '{ n = split($0, arg, "[%-/]")
                    for(k = 1 ; k <= n ; k++)
                    { print "arg[" k "] = " arg[k]
                    }
                  }'
What do you mean? It works as expected, at least in the POSIX locale (a.k.a. “C” locale). It might behave differently in other locales, because range expressions are locale-dependent.
 
So, does it exit with an error message? That would be the correct behaviour.
No, as said before, mawk understands that it is not a range but a character sequence, like gawk does if you protect the hyphen with a backslash.

What do you mean? It works as expected, at least in the POSIX locale (a.k.a. “C” locale).
Yes, you are right, it is correct with C. My LANG was not set to C.
So:
Code:
echo '2,3' | LANG=en_US.UTF-8 awk '{ n = split($0, arg, "[%-/]")
                                     for(k = 1 ; k <= n ; k++)
                                     { print "arg[" k "] = " arg[k]
                                     }
                                   }'

What I mean is that the behavior is not the same as gawk and mawk.
 
Yes, you are right, it is correct with C. My LANG was not set to C.
So:
Code:
echo '2,3' | LANG=en_US.UTF-8 awk '{ n = split($0, arg, "[%-/]")
                                     for(k = 1 ; k <= n ; k++)
                                     { print "arg[" k "] = " arg[k]
                                     }
                                   }'

What I mean is that the behavior is not the same as gawk and mawk.
I guess that means that gawk and mawk ignore the locale setting. In particular, they ignore the collation rules for range expressions. I'm not sure if that's a bug or a feature, but it is a POSIX violation. On the other hand, at least gawk contains quite a lot of POSIX violations that are intentional. The --posix option fixes some, but not all of them.

You can also experiment with this variant:
Code:
$ echo 'aAbBcCdDeE' | LANG=C.UTF-8     awk '{gsub(/[b-d]/, "-"); print}'
aA-B-C-DeE
$ echo 'aAbBcCdDeE' | LANG=en_US.UTF-8 awk '{gsub(/[b-d]/, "-"); print}'
aA-----DeE
For similar reasons you should not use tr 'a-z' 'A-Z' to convert from lower case to upper case. The manual page tr(1) strongly advises against that, because it will break with non-C locales. Also, it's probably a good idea to avoid range expressions completely, if possible.
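The locale-safe way to do the case conversion is with character classes. A minimal sketch; the input word is arbitrary.

```shell
# [:lower:] and [:upper:] are defined by the locale itself, so no
# collation-dependent range is involved.
upper=$(echo 'hello' | tr '[:lower:]' '[:upper:]')
echo "$upper"
```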

By the way, if you only want to be able to use UTF-8 characters, it's better to just set LC_CTYPE and leave LANG unset (or set to “C”). Alternatively, set LANG=C.UTF-8 (same effect). It is almost always a bad idea to set LANG or (even worse) LC_ALL to anything other than “C” or “C.UTF-8”. It can cause all kinds of unexpected behaviour because scripts will be unable to parse date/time output from programs, or even simple numbers.
For example:
Code:
$ LANG=de_DE.UTF-8 awk 'BEGIN {print 1 / 2}'
0,5
Note that awk prints the number with a decimal comma: 0,5, not 0.5. If that output is fed into another program or script, things are likely to break.
Another example:
Code:
$ LANG=de_DE.UTF-8 ls -l /tmp/test
drwx------   3 olli  olli       512  4 März 17:33 /tmp/test
Same thing. Yes, there are scripts that call ls(1) and then try to parse its output. Chances are that they'll become confused by that output.

For the above reasons, many of my own scripts start with export LC_ALL=C.UTF-8 in order to make them more resistant against such problems.
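As a sketch of what that looks like at the top of a script. Plain C is used here instead of C.UTF-8 only so that the example also runs on older systems that may lack the C.UTF-8 locale; that substitution is my assumption, not part of the advice above.

```shell
# Pin the locale before anything else runs, so number and date formats
# produced by child processes stay predictable.
export LC_ALL=C
half=$(awk 'BEGIN {print 1 / 2}')
echo "$half"
```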

If you're interested in seeing the actual definitions of the collation rules, they can be found in /usr/src/share/colldef.
 
Thanks olli@ for your complete answer. It's a pleasure to learn from you.
Robust code is not so simple to write...

I found an explanation similar to yours here. It is an old page, but now I understand.
I will try to follow your recommendations for the LC settings, as I must use accents.
In my scripts I use character classes and have never had problems; it seems classes were made to solve this problem.

If you're interested in seeing the actual definitions of the collation rules, they can be found in /usr/src/share/colldef.
What can I see there? I can't find my locale fr_FR.UTF-8, and the files contain lists of codes.
 
Another question:
This page says that the FreeBSD awk is the bwk awk, also known as nawk.
But I can find a nawk version in the packages.
Is it a newer version?
 