awk's rand() not really random?

I'm trying to select a random item from a list with awk, but I'm finding that awk's rand() function isn't working as I expected.

Code:
% awk 'BEGIN{ srand(); print rand(); }'
0.889922
% awk 'BEGIN{ srand(); print rand(); }'
0.88993
% awk 'BEGIN{ srand(); print rand(); }'
0.88993
% awk 'BEGIN{ srand(); print rand(); }'
0.889938
% awk 'BEGIN{ srand(); print rand(); }'
0.889938
% awk 'BEGIN{ srand(); print rand(); }'
0.889946
% awk 'BEGIN{ srand(); print rand(); }'
0.889946
% awk 'BEGIN{ srand(); print rand(); }'
0.889954
% awk 'BEGIN{ srand(); print rand(); }'
0.889962
% awk 'BEGIN{ srand(); print rand(); }'
0.889962
% awk 'BEGIN{ srand(); print rand(); }'
0.889969
% awk 'BEGIN{ srand(); print rand(); }'
0.889969

...which isn't really random.

Am I doing something wrong? Is there a different way I should be using this?
 
I'm not sure if this will help or not (because the documentation is for - and I am using - gawk), but:

http://www.gnu.org/software/gawk/manual/html_node/Numeric-Functions.html#Numeric-Functions

In your example, if you're re-running the command in quick succession, I suppose the seed (based on the date/time if one is not explicitly provided) could be the same or very similar from one run to the next.

Here's an example session I just tried, waiting ~4 seconds between each press of Enter:
Code:
$ awk '{ srand() ; print rand() }'

0.669459

0.291102

0.614256

0.286518

0.362619

0.485851
^C
 
Yes, I just confirmed. Look what happens if I hit Enter rapidly:
Code:
$ awk '{ srand() ; print rand() }'

0.372023

0.372023

0.372023

0.372023

0.372023

0.602575

0.602575

0.602575
^C

So you are probably going to need to either provide a seed or slow things down a bit.
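
To make the seed effect visible without relying on timing, here is a small sketch that gives two separate awk processes the same explicit seed. The seed 42 and the variable names first and second are arbitrary picks for illustration, not anything from the posts above.

```shell
# Two independent awk processes, both seeded with the same explicit value.
# 42 is an arbitrary seed chosen for illustration.
first=$(awk 'BEGIN { srand(42); print rand() }')
second=$(awk 'BEGIN { srand(42); print rand() }')
echo "$first $second"   # the two values match
```

The same thing happens when srand() is called with no argument twice within the same second, because both runs end up with the same time-based seed.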
 
Well, I have it kind of working with gawk, but I would really prefer to have a solution using awk from base for portability.

Code:
gawk 'BEGIN { srand(systime() + PROCINFO["pid"]); print rand() }'

(taken from here)

Unfortunately, systime() is only provided with gawk. Any good ideas on how to do this with /usr/bin/awk?
 
This time with awk:
Code:
$ awk --version
awk version 20070501 (FreeBSD)

OK for a single run:
Code:
$ _seed=`date +%s` ; awk '{ srand('${_seed}') ; print rand()}'

Each time that entire command is invoked, you'll have Epoch seconds assigned to _seed.
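
A possible variation, in case the quote splicing gets awkward: hand the seed to awk with -v and do the work in BEGIN, so no input line is needed. This is only a sketch; the variable names seed and value are made up here, and it assumes your date supports +%s as in the command above.

```shell
# Pass the epoch seconds in as an awk variable instead of splicing quotes.
# "seed" and "value" are illustrative names.
value=$(awk -v seed="$(date +%s)" 'BEGIN { srand(seed); print rand() }')
echo "$value"
```

Two runs within the same second still get the same seed, so this has the same limitation as before.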
 
Awesome, thanks. But now it seems like awk doesn't respect PROCINFO like gawk does.

Code:
% gawk 'BEGIN{ print PROCINFO["pid"]}'
10580
% awk 'BEGIN{ print PROCINFO["pid"]}'

According to the gawk documentation, PROCINFO["pid"] returns the process ID of the current process. Any idea how to get something like this working?
 
Dunno. All I can think of is to use the Bourne shell variable to get its pid (assuming you're going to be running this from Bourne shell / or from a Bourne shell script).
Code:
$ awk '{ print '$$' }'

228
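
Putting the two ideas together, something like the following might do as a stand-in for the gawk one-liner. It is a sketch under a few assumptions: date supports +%s, the shell does $(( )) arithmetic, and the names seed and value are made up. Note that $$ is the PID of the shell, not of awk, so it only changes between shell or script invocations.

```shell
# Build the seed in the shell: epoch seconds plus the PID of the shell.
seed=$(( $(date +%s) + $$ ))
value=$(awk -v seed="$seed" 'BEGIN { srand(seed); print rand() }')
echo "$value"
```

Judging by the output in the first post, some awk implementations return similar first values for seeds that are close together, so this may not spread the results as much as you'd hope.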
 
Alt said:
I tried the following construct and it seems to work OK

If you run it just once, yes. If you run it multiple times in a row, the "random" numbers aren't really all that random. This is because the seed that srand() uses is the system time.

I also found out that this whole thread seems kind of pointless, since awk's srand() apparently takes no arguments, unlike gawk's srand(). Even though I can pass it the same values as with gawk, it doesn't make any difference.
 
Then that's going to be much harder, huh :p
The next version is:
awk -v rn=`jot -r 1 1 10000000` "BEGIN { srand(srand()+rn); } { print rand(); }"

Tested with
echo "Test" | awk -v rn=`jot -r 1 1 10000000` "BEGIN { srand(srand()+rn); } { print rand(); }" ; echo "Test" | awk -v rn=`jot -r 1 1 10000000` "BEGIN { srand(srand()+rn ); } {print rand(); }"
 
Wow, that works much nicer. How exactly does that work, and how are you able to give arguments to srand()?
 
Code:
awk -v rn=`jot -r 1 1 10000000` "BEGIN { srand(srand()+rn); } { print rand(); }"

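jot -r 1 1 10000000 - gives one random number from 1 to 10000000 =)
-v rn=.... - this passes a variable into the awk interpreter
srand() - it actually takes an optional seed number and returns the previous seed. Called with no argument, it reseeds from the system time, so srand(srand()+rn) reseeds from the time first and then seeds again with the returned previous seed plus 'rn'. On the first call the returned value is awk's start-up seed rather than the time itself, so 'rn' is what varies between runs.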
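
If you want to check the return value of srand() on your own awk, a quick sketch like this should show it. The seeds 5 and 7 and the name prev are arbitrary.

```shell
# srand(expr) sets a new seed and returns the seed that was in effect before.
prev=$(awk 'BEGIN { srand(5); prev = srand(7); print prev }')
echo "$prev"   # prints 5
```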

UPD: it can be simplified to
Code:
awk "BEGIN { srand(srand()+`jot -r 1 1 10000000`); } {print rand(); }"
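
Since jot isn't installed everywhere, one possible substitute is to read a few bytes from /dev/urandom. This is a sketch that assumes /dev/urandom exists and that od accepts -An, -N and -t u4; the names seed and value are made up.

```shell
# Read 4 random bytes as one unsigned decimal number and keep only the digits.
seed=$(od -An -N4 -tu4 /dev/urandom | tr -dc '0-9')
value=$(awk -v seed="$seed" 'BEGIN { srand(seed); print rand() }')
echo "$value"
```

The seed no longer depends on the clock, so running it many times in the same second should not be a problem.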
 
I think the problem is that srand() is being called for every call to rand(). When srand(3) is seeded with the same value, the pseudo-random number sequence is repeated (awk's srand() is probably implemented as srand(time(0))).

Try: awk 'BEGIN {srand()} {print rand()}'
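
For the original goal of picking a random item from a list, seeding once in BEGIN could look roughly like this. The three item names and the variable pick are placeholders for illustration.

```shell
# Feed the list in one item per line and print one of them at random.
pick=$(printf '%s\n' apple banana cherry |
  awk 'BEGIN { srand() }                            # seed once, before any input
       { items[NR] = $0 }                           # remember every line
       END { print items[int(rand() * NR) + 1] }')  # index in 1..NR
echo "$pick"
```

Because srand() is still time-based here, two runs within the same second will pick the same item; swap in an explicit seed if that matters.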
 