Loading a variable

JoeSchmuck · Oct 19, 2012

This is going to seem like a very stupid question to a lot of you out there, but I am a novice and just starting to very slowly learn FreeBSD. I have minimal experience with the language, but nonetheless I am slowly writing scripts.

My question is how do I take the results from a 'find' and place them in a variable without line feeds and even more fun, add quotes around each line?

The results from the find command look like this example (the real life use could have dozens of directory listings which also would change over time and is why automating this is key):

Code:

/media
/media/Movies A-M
/media/Movies N-Z

And the output I desire is:

Code:

"/media" "/media/Movies A-M" "/media/Movies N-Z"

Here is the script I'm trying to make work...

Code:

find /media -type d > temp.dat
# somehow convert temp.dat into a continuous string and save as variable 'directories'
wait_on -w $directories

The goal is to search in all sub-directories for any file change. Wait_on performs this task but if I can't feed it all the sub-directories then it will not work properly.

I'm not necessarily looking for the complete solution although it would be appreciated, but pushing me in the right direction would be helpful.

chatwizrd · Oct 19, 2012

This should work:

# directories=$(find /media -type d -exec echo "\"{}\"" \; | tr -s "\n" " ")

So your script might now look like:

Code:

#!/bin/sh

directories=$(find /media -type d -exec echo "\"{}\"" \; | tr -s "\n" " ")

wait_on -w $directories

wblock@ · Oct 19, 2012

That does not work for me. find(1) suggests using -print0 and piping output into xargs(1) -0.

JoeSchmuck · Oct 20, 2012

Holy Cow! Now that is something to study.

Both versions appear to work to some extent.

Here is what is going on now...

Code:

#!/bin/sh
directories=$(find /media -type d -exec echo "\"{}\"" \; | tr -s "\n" " ")
echo "Directories: "${directories}
echo " "
wait_on -w ${directories}

Which produces this result:

Code:

Directories = "/media" "/media/Movies A-M" "/media/Movies N-Z"
wait_on: can't open ""/media"" for reading: No such file or directory

Now if I replace the main line with this one...

Code:

directories=$(find /media -type d -print0 | xargs -0)

The results are:

Code:

directories=/media /media/Movies A-M /media/Movies N-Z
wait_on: can't open "/media/Movies" for reading: No such file or directory

If I remove the space in the directory names the second version works fine but I don't want to have to remove any spaces as this won't be for just my use.

I can flat out define the variable and it works, not sure why the difference but I'm working it. All I can think about is there is some character slipping by.

Code:

directories="/media" "/media/Movies A-M" "/media/Movies N-Z"

So I'm working on it.
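
A note on the hard-coded assignment above, offered as a sketch rather than a diagnosis: in sh, only the first quoted word after the = becomes the variable's value. The remaining words are parsed as a command and its arguments, and the assignment applies only to that command's environment. The example below uses echo and made-up names (greeting, out) to show the effect.

```sh
#!/bin/sh
# "one" is assigned to greeting only for the duration of the echo command;
# "two" is the argument that echo prints.
out=$(greeting="one" echo "two")
printf 'command output: %s\n' "$out"

# The same form outside a command substitution: greeting is not set
# afterwards, because echo is a regular (not special) built-in.
greeting="one" echo "two" > /dev/null
printf 'greeting is %s\n' "${greeting:-unset}"
```

To keep several names that contain spaces in sh, the positional parameters are one option: set -- "/media" "/media/Movies A-M" stores each name as its own word, and "$@" expands them again with the spaces intact.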

wblock@ · Oct 20, 2012

Run the find by itself to see the problem. Get that step working first. Then move on to xargs(1), and read the man page for it.

JoeSchmuck · Oct 20, 2012

Been looking at this for hours, I must be very thick headed.

This code lists all the sub-directories but they are all separated by a null character. No quotes around anything but I didn't expect it here. I can see I need to insert a quote at the beginning and end of each sub-directory and insert a space between each of those.

Code:

find /media -type d -print0

And when I'm here I can see that the nulls have now become spaces, so I'm halfway there. I was hoping xargs would have the "-i" parameter like some other versions of it do, so it would treat a space as a space and the entire sub-directory as a single entry. I would hope that ends up putting quotes around the entries.

Code:

find /media -type d -print0 | xargs -0

I'll look into this again tomorrow with fresh eyes.

Thanks, guys, for pushing me in the right direction. I will learn something before the weekend is out.

wblock@ · Oct 20, 2012

See -I in xargs(1), which lets you set a string to be replaced in the xargs command string. Notice also that the final argument to xargs is the program it should run on all these arguments.

JoeSchmuck · Oct 20, 2012

I think I need a book because finding materials to learn FreeBSD as a novice is proving difficult. Maybe I'm not cut out to do this simple stuff or maybe there is just an issue with wait_on not accepting the quoted string.

Here is where I sit right now and I actually think I understand a few minor things now and I believe I simplified the code, or condensed it, whatever you would call it.

This code will place single quotes around the directory names; however, wait_on refuses to operate on it. Same error message as before, but now it's "'media'" that can't be found.

Code:

#!/bin/sh
wait_on -w $(find /media -type d -exec echo "'{}'" \; | tr -s "\n" " ")

This code will work fine on directories without a space in the name and I thought it would place single quotes around the directory but it doesn't.

Code:

#!/bin/sh
wait_on -w $(find /media -type d | xargs -0 -I "'{}'")

See, I thought the {} means the value is in the brackets and whatever is on the outside gets added. I thought about using backslashes to retain the special characters but that didn't work out well for me either, although it did work some.

I was thinking some of my problems might be because I'm working inside a jail on FreeNAS, which is really a customized FreeBSD. I may load a VM with FreeBSD 9.0 and give it another try, but I need to play with the kids and do the family thing. I can't spend all day and night on a computer trying to figure out a small piece of code, because I very well would never know if I'm barking up the wrong tree and it might not ever work.

35 years ago I knew Fortran IV, Basic, Macro Assembler, I was building device drivers and eventually learned C. I haven't had the need to program in quite a few years, to say the least. I really feel out of my element.

If someone could recommend a good book to start moving me forward it would be appreciated. I'll see what the local library has in stock; they might have FreeBSD or some version of Linux that would help me out. And at this point, if someone wanted to give me the answer, I'll gladly take it. It would have been nice to figure it out myself.

Cheers

wblock@ · Oct 21, 2012

A book would say to get one command working before trying to combine it all into one thing. There are a couple of non-obvious things going on, and trying to make one big fancy command before each piece works is making it more difficult.

Each of those highlighted command names is a link to that command's manual page. Click those, and read them. In particular, see xargs(1). Again, xargs does not do anything by itself, you have to tell it a command to run.

find(1) will replace {} it finds with the found names, but xargs(1) does not. -I lets you define a string to use. For example, %:
% find /media -type d -print0 | xargs -0 -I % echo "'%'"

JoeSchmuck · Oct 21, 2012

I understand about running pieces of the items to get them working but I have a severe lack of knowledge with operators and the like. I got find working perfectly the other day, but trying to figure out how to put quotes around the data was way out of my area. Reading the manual page for xargs didn't help because I don't understand what "utility" is nor the syntax.

So the percent sign is the identifier for a string passed to it. See, that actually helps a lot and so does your example. The manual page for xargs doesn't state that but I'm sure it's basic stuff if you're programming in this language.

Thanks for your help. I'm still going to find a book on FreeBSD programming basics, or FreeBSD for Dummies might be a better start. I really need to learn the basic syntax.

Edit: The example didn't work, or should I say 'wait_on' appears to not like any variable passed to it that contains quotes (single or double). This includes setting up a variable with predefined directory information. I wrote the programmer of 'wait_on' an email to see if it is a limitation or if there is a way to pass a variable with quotes to 'wait_on'.

wblock@ · Oct 21, 2012

No, xargs(1) does not use % as the string to be replaced--unless you tell it to, with -I. As an example:
% find /media -type d -print0 | xargs -0 -I magicfilenamestring echo "'magicfilenamestring'"

So we're telling xargs three things: that input will be separated with nulls (-0); that when it sees magicfilenamestring in the command, it should replace that with the input (-I); and finally the command to run on each input string, echo.
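
As a small sketch of that behaviour, the lines below feed three NUL-terminated names to xargs with printf standing in for find -print0, so it can be tried without a /media tree. NAME is an arbitrary replacement string picked for this example, and quoted is a made-up variable name.

```sh
#!/bin/sh
# printf emits each name followed by a NUL, like find -print0 would.
# -0 tells xargs to split on NUL; -I NAME makes it run echo once per
# item, substituting the item wherever NAME appears in the command.
quoted=$(printf '%s\0' "/media" "/media/Movies A-M" "/media/Movies N-Z" |
    xargs -0 -I NAME echo "'NAME'")
printf '%s\n' "$quoted"
```

Each name comes out on its own line wrapped in single quotes, with the embedded spaces preserved. The quotes are only text added by echo; they do not tell a later command where one argument ends.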

This is not really FreeBSD stuff, it's shell stuff. sh(1) tells some, but it's not really a tutorial. There are books and websites with sh(1) tutorials. Be careful: many Linux distributions use bash as sh, and it's not the same.

JoeSchmuck · Oct 23, 2012

sh vs. csh --- trying to run a csh command

I agree completely that it's a shell thing. I now have a partial solution that actually works, but how to implement it properly is the next question.

My test.sh code is:

Code:

#!/bin/sh
echo "Waiting for a change"
wait_on -w "`find /media -type d -print `"
echo "A change occurred"

But to make it work I have to run it as "tcsh test.sh" or "csh test.sh". It must be one of the C shells, and then it properly detects the sub-directories.

Here is the complete code I've been working on; getting wait_on to function properly is key. Please note that I have made some changes: the 'directories' variable isn't used in this version (it shouldn't be needed), and I can slim a ton of this down if I can get this to work.

Code:

#!/bin/sh
# File name 'scanmedia'
# Place this file into /etc/rc.d
# Edit /etc/rc.conf to include scanmedia_enable="YES"

. /etc/rc.subr

name="scanmedia"
rcvar=scanmedia_enable

PATH="$PATH:/usr/local/bin"

start_cmd="${name}_start"
stop_cmd=":"

load_rc_config $name
eval "${rcvar}=\${${rcvar}:-'NO'}"

scanmedia_start()
{
  while :; do
# Things to know...  The purpose of this script is to trigger a rescan of the media for the
# minidlna plugin.  This uses the "wait_on" command which, although it works, has its limitations.
# Limitations are: Will not recurse into sub-directories, triggers on first event (does not wait
# for completion of a write for instance).

# First let's locate all the subdirectories.  Since the minidlna plugin only allows one path for
# media we will use the default path of /media for this example and find all the sub-directories.
# NOTE: This currently only works if your sub-directories do not have any spaces " " in them.
# If you have a space in the sub-directory name, you will need to manually define the folders.

  directories=$(find /media -type d -print0 | xargs -0)

# To manually define the sub-directories, comment out the above line with a hash # and remove
# the hash from the below command.  Edit the paths as needed.

 # directories="/media" "/media/Movies A-M" "/media/Movies N-Z"

# Wait_on will trigger on any event which writes to the directories listed.  Wait_on does not
# trigger on subdirectories.

echo "Waiting for a change"
csh -c "wait_on -w "`find /media -type d -print `""

echo "Change Detected"

# Since wait_on triggers on the start of a write event and not the ending, sleep long enough to
# ensure the changes have been written, if not this script will only repeat itself but we should
# try to minimize it.  5 Minutes is reasonable so 300 seconds=5 minutes.  You pick your own time.
# For testing I suggest a value of 10 seconds.

   sleep 10
# We must use pkill to stop the minidlna service; nothing else quite works properly.

   pkill minidlna

# We wait 10 seconds to allow the service lots of time to stop.

   sleep 10

# And now to start up the service again.

   service minidlna start

done
}

run_rc_command "$1"

Is there a better forum thread I should be posting this problem in?

Thanks,
Mark

EDIT: Something I forgot to mention is that when I use 'csh -c "wait_on -w "`find /media -type d -print `""' in this script, it only senses the first directory. I was hoping that would be the fix, but going this route the find instruction doesn't pass anything more than /media.

usdmatt · Oct 23, 2012

I thought I'd have a quick look at this as it looked like a simple problem but damn, that was a right PITA. I tried wrapping the arguments in quotes (but not /media) and even prefixing the spaces in the folder names with a '\' to escape them.

Even though these exact commands work great from the command line, when run from a shell script, it interprets the " and \ characters as 'actual' characters in the arguments, so it ends up looking for folders called "/media" and /media/Movies\ A-M. Something's trying to be too clever and automatically escaping these rather than passing them directly (i.e., when there's a '\' in the variable it assumes I actually want a '\' in the argument value, not that I put it there to escape a space).

In the end the following seemed to work for me in a simple test script. (No errors, it blocked like it's supposed to until I created a file in one of the folders. I even tried the last folder on the argument list first to make sure it wasn't just watching the first).

Code:

DIR=$(find /media -type d -print0 | xargs -0 -I % echo -n '"%" ')
echo "$DIR"
sh -c "wait_on -w $DIR"

This probably isn't the way to do it. There's got to be a simple way to say "I'm passing a variable to this command which contains special characters but I want them to be treated as such, not as part of the arguments", but I can't find it. I've looked through a bunch of the default FreeBSD shell scripts hoping for a trick somewhere and searched the net to no avail.
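
One possible simpler route, sketched here under assumptions rather than tested against wait_on itself: skip the variable and let find (or xargs -0) hand each name to the command as its own argument. In the example below, sh -c 'echo "$#"' is a stand-in that just reports how many arguments it received, and list, top, split_count, exec_count and xargs_count are made-up names. The first part also shows why the quotes in a variable end up as literal characters.

```sh
#!/bin/sh
# Quote characters stored in a variable are data, not syntax. The shell
# parses quotes before it expands $list, then splits the result on
# whitespace, so the quote characters stay inside the words.
list='"/media/Movies A-M" "/media/Movies N-Z"'
set -- $list
split_count=$#                      # four words, not two names

# Throwaway tree with spaces in the names, standing in for /media.
top=$(mktemp -d "${TMPDIR:-/tmp}/demo.XXXXXX")
mkdir "$top/Movies A-M" "$top/Movies N-Z"

# find hands each name over as one argument; nothing is re-parsed.
exec_count=$(find "$top" -type d -exec sh -c 'echo "$#"' sh {} +)

# The same idea with NUL-separated names and xargs.
xargs_count=$(find "$top" -type d -print0 | xargs -0 sh -c 'echo "$#"' sh)

rm -rf "$top"
printf 'split: %s  find -exec: %s  xargs -0: %s\n' \
    "$split_count" "$exec_count" "$xargs_count"
```

With wait_on in place of the stand-in, that would read find /media -type d -exec wait_on -w {} + (not verified here). For a very long list, find or xargs may split the names across more than one invocation of the command.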

JoeSchmuck · Oct 23, 2012

Looks like I was on the right track but honestly, not sure I would have gotten there.
Thanks for the help, it is greatly appreciated.

As for whether it can be done a better way, this is very little code and the application sits and waits until a change is detected, so even if it would be considered not optimized, it loops so rarely that it doesn't make a difference.

And yes I laughed when you said it was a PITA because I know it was, and I'm a nub with respect to FreeBSD and Unix in general.

wblock@ · Oct 24, 2012

Let's back up a second. Knowing sh(1) is useful, but it is a poor, poor alternative to a modern language. Quoting is a continual problem, and there are severe limits that have to be solved with hacks like xargs(1). Because sh(1) is not really a language, but really a way of calling multiple other commands, you end up having to learn all those other commands.

All of that is less of a problem with Perl, Ruby, or Python. They are all vastly more powerful than sh(1), and easier to use because of that.

There really aren't many reasons to use sh(1). People mention performance differences, but benchmarks show the modern scripting languages are sometimes faster. In many cases, it's insignificant anyway.

usdmatt · Oct 24, 2012

I probably would have written a simple Perl script myself. In fact, I knew nothing of wait_on, so I probably would have just written something to find files modified in the last 300/600 seconds and run it from cron every 5/10 minutes. A quick find -> wc -> awk and you'd have a simple count of the number of changed files.
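
A rough sketch of that polling idea, assuming a find that supports -mmin (the FreeBSD and GNU versions do; it is not in POSIX). changed_count is a hypothetical helper name, and tr is used here in place of awk to strip the padding that wc may add.

```sh
#!/bin/sh
# changed_count DIR MINUTES
# Print how many regular files under DIR were modified less than
# MINUTES minutes ago. File names containing newlines would be
# counted more than once.
changed_count() {
    find "$1" -type f -mmin "-$2" | wc -l | tr -d ' '
}

# Small demonstration on a throwaway directory.
demo=$(mktemp -d "${TMPDIR:-/tmp}/demo.XXXXXX")
touch "$demo/new file"
printf 'changed in the last 5 minutes: %s\n' "$(changed_count "$demo" 5)"
rm -rf "$demo"
```

Run from cron every few minutes, a non-zero count could trigger the rescan. Unlike wait_on, this walks the whole tree on every run.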

Problem is, Perl and all the other languages mentioned require installation. There's a reason the rest of the rc system is written in sh: it's simple, lightweight and pretty much guaranteed to be available and working. I'd be pretty p*ssed off if some startup script required Perl just because it wanted to do something clever that could have been worked around. (In fact I'm starting to get a bit miffed at the number of ports reliant on Python these days that I have no other use for.)

I consider sh to be the correct language for rc scripts such as the one here. I can't believe that no-one has ever written a shell script that produced a list of files/directories and needed to pass that list to another command. Maybe 'sh -c' is the correct way to do this... In the case here, maybe a different language is a viable option (it's not like this script is going to be distributed), but I don't really consider 'use a different language (that might not be installed)' a valid answer to a question which is basically 'how do I pass a list of arguments that includes spaces to a command in a shell rc script'. We've already identified a workaround which requires no third party software and probably adds practically 0 overhead.

wblock@ · Oct 24, 2012

Look around you and count how many FreeBSD systems you have that don't already have Perl. It's difficult to install any ports without getting Perl very quickly.

It wasn't that many years ago that the startup scripts actually did use Perl. They were rewritten in sh. It's a long story involving porting, licensing, and importing (AFAIR).

Sometimes the best answer to "how do I use this hammer to drive that screw" is to suggest rethinking the choice of tool.

UNIXgod · Oct 24, 2012

It is funny. No matter what port you install first it will have perl as a dependency. IIRC mergemaster(8) was originally a system level perl script. There may have been more which were replaced with POSIX shell scripts when perl was removed from base.

Either way I'm not sure what the purpose of the script in this thread is. POSIX commands are great for automation and reporting, for quick single threaded `unix way` programming.

Some commands which may be worth looking into are dirname(1) and basename(1).

There is no reason to use find(1) when you know which directory you're working in.

Finally, as has been mentioned, once you hit the point where xargs(1) is needed, either a refactoring step needs to be taken or a different tool (or language) should be considered.

wblock@ · Oct 24, 2012

Certainly big and complicated programs can be written successfully in sh(1). ports-mgmt/portmaster, for example. The trick is finding the spot where working around the lack of features starts costing more time than it's worth, or a reason that it must be in sh(1) rather than something easier to write and maintain.

JoeSchmuck · Oct 24, 2012

Hey I appreciate all the help. The purpose of the script was to find a way to scan for changed files within sub-directories that had a space in their name. This is a stopgap solution to stop and restart MiniDLNA in the FreeNAS program. The FreeNAS build is based on nanobsd and runs from a USB Flash drive. Resources are tight so I really believe the solution presented is the best one until I can see if I can rebuild MiniDLNA with automatic scanning and rebuilding of the database, something which could be outside my programming abilities but I'm going to give it a shot.

un_x · Nov 9, 2012

JoeSchmuck said:
Hey I appreciate all the help. The purpose of the script was to find a way to scan for changed files within sub-directories that had a space in their name.

It's "retarded" to use spaces in names. It was never allowed until Microsoft wanted to make their "retarded" users happy - you know - the ones that want to save a sentence as a filename. After that, everyone else had to follow, but it is still a fairly retarded thing to do

Spaces are intended to separate words/commands, and they really aren't "proper" in filenames.

Without getting into details of your program, as a 20 year FreeBSD user, I would offer you this advice: study sh(1) and awk(1). There is almost nothing that can't be scripted with those 2 languages very effectively. Mawk(1) is one of the fastest languages around, scripted or compiled. I have NEVER had a need for more than sh(1) and awk(1) ... sometimes sed(1) is valuable because it is small and fast, but awk can do everything sed can do. I disagree with much of what was said in this thread, most memorably, that "sh is not needed" - I wouldn't even know where to begin bashing that comment. Study sh(1) and awk(1), and you will be able to script almost anything you can imagine.

un_x · Nov 9, 2012

Without thinking too much or too deeply about your problem, the 1st thing that comes to my mind would be using ls(1) with the appropriate timestamp option to dump a listing of the directories into a file (ls -options > file1), and then repeating the process periodically, while using diff(1) to detect any differences between the 2 files (diff file1 file2). The timestamp of a modified file will change, and diff(1) will dump the records that have different timestamps (the modified files).
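
A minimal sketch of that snapshot-and-compare idea. snapshot and changed are hypothetical helper names, and plain ls -lR is an assumption made for portability: its timestamps have one-minute resolution, so a finer-grained listing (ls -lT on FreeBSD, for instance) may be preferable.

```sh
#!/bin/sh
# snapshot DIR FILE: write a recursive long listing of DIR to FILE.
# FILE should live outside DIR so it does not show up in the listing.
snapshot() {
    ls -lR "$1" > "$2"
}

# changed OLD NEW: succeed (exit status 0) when the two listings differ.
changed() {
    ! diff "$1" "$2" > /dev/null
}

# Demonstration on throwaway directories.
media=$(mktemp -d "${TMPDIR:-/tmp}/media.XXXXXX")
snaps=$(mktemp -d "${TMPDIR:-/tmp}/snaps.XXXXXX")
snapshot "$media" "$snaps/file1"
touch "$media/new track"
snapshot "$media" "$snaps/file2"
if changed "$snaps/file1" "$snaps/file2"; then
    echo "Change detected"
fi
rm -rf "$media" "$snaps"
```

Every snapshot reads the whole directory tree, which is the trade-off compared with blocking in wait_on.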

JoeSchmuck · Nov 9, 2012

un_x said:
Without thinking too much or too deeply about your problem, the 1st thing that comes to my mind would be using ls(1) with the appropriate timestamp option to dump a listing of the directories into a file (ls -options > file1), and then repeating the process periodically, while using diff(1) to detect any differences between the 2 files (diff file1 file2). The timestamp of a modified file will change, and diff(1) will dump the records that have different timestamps (the modified files).

The purpose of using "wait_on" is to prevent spinning up the hard drives just to check whether a change has occurred, so the use of ls(1) would have the opposite effect. Also it's safe to expect users with thousands of files. I had one person state he had over 20K files in over 2K directories, and since most of that was music there were a lot of long file names and directory names. I think I should be able to create an alternate version of the scanning for folks that hit the magic maximum input length of wait_on and just let them know that the drives will be spinning all the time. That will be fine for some and not fine for others but I can't please everyone all the time. The other way I could go might be to run multiple instances of the script I've already created but I haven't gone down that path yet.

I agree with your comment about spaces in directories and file names. The underscore works well for me and since I've been using BSD & Linux I have started changing my naming convention in Windoze.