Shell: How to do lookups in a shell script

Could someone suggest the easiest way of doing lookups using a shell script?

For example, I have a file containing:-

abc
***
1234

xyz
***
5678

I want to lookup 'xyz' and return '5678'. What would be an appropriate tool? awk?
 
Bash:
#!/bin/sh
# Find the line matching $1 exactly, pull in the next two lines (the ***
# separator and the value), then strip everything up to the last newline
# so only the value is printed.
sed -ne "/^${1}$/ {
    N
    N
    s/.*\n//
    P
}" "${2}"

Code:
$ ./fetch.sh xyz test
5678

Edit: I prefer sed, but I'm pretty sure it could be solved with awk as well. You might also want to match the line containing *** to make it more robust...
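
A rough awk equivalent might look like this (a sketch only, assuming the same fixed three-line record layout and an exact match on the key):
Code:
#!/bin/sh
# Match the key line exactly, skip the *** separator line, print the value line.
awk -v key="${1}" '$0 == key { getline; getline; print; exit }' "${2}"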
 
I'm trying to use sed and have got to a point where I have produced some output such as:

some_test_file

abc/1234_xyz
abc/9911_pozit
zzz/2222_qrty
zzz/974g_sofff

and I'd like to create two output files, abc and zzz, containing

abc:-

1234
9911

zzz:-

2222
974g

I'm attempting to create abc using:

sed '/^abc/s/_.*$//p' some_test_file > abc
but what I get in abc is

1234
9911
zzz/2222
zzz/974g

I can't figure out how to ignore the lines that don't start with 'abc'...
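
One way to do that (a sketch, assuming the abc/1234_xyz layout shown above) is to pass -n so sed prints nothing by default, and p only the lines where the substitution actually matched:
Code:
sed -n 's|^abc/\([^_]*\)_.*|\1|p' some_test_file > abc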
 
Try:
1) Create a file a.pl with the following content:
Perl:
my $fn = shift @ARGV;
open(F_INP,$fn) or die "unable to open $fn $!";
my @arr = <F_INP>;
close(F_INP);
unless(@arr){die "file $fn is empty"}
# For each line like abc/1234_xyz, append 1234 to a file named abc.
for my $a (@arr){
    if( $a =~ /^(\w+?)\/(\w+?)_/ ){
        my $_1=$1;
        my $_2=$2;
        open (F_OUT, ">>", $_1) or die "unable to open $_1 $!";
        print F_OUT $_2,"\n";
        close(F_OUT);
    }
}

2) Run it in a terminal: perl ./a.pl some_test_file

Edit: Sorry, this is not a shell script
Edit2: My script is inefficient with a large number of input records. It's better to sort 'arr',
open the output file, and keep it open until the value of '$1' changes, then close the current output file, and open the next one...
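
That sort-and-stream approach might look roughly like this in awk (a sketch, assuming the abc/1234_xyz format from above and that the order of values within each output file doesn't matter):
Code:
# Sort so each key's lines are contiguous, then keep one output file open at a time.
sort some_test_file | awk -F'[/_]' '
    !/\// { next }                                      # skip blank or malformed lines
    $1 != cur { if (cur != "") close(cur); cur = $1 }   # key changed: switch output file
    { print $2 > cur }                                  # ">" truncates each file on first open, then appends
'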
 
Code:
cat << EOF | perl -ne 'BEGIN {my %recs} push @{$recs{$1}},$2 if /(\w+)\/(\w+)_/; END {for my $e (sort keys %recs) {print "$e:-\n\n".join("\n",@{$recs{$e}})."\n\n"}}'
> abc/1234_xyz
> abc/9911_pozit
> zzz/2222_qrty
> zzz/974g_sofff
> EOF
abc:-

1234
9911

zzz:-

2222
974g

PS: I hate both of your record formats.

Edit: Turns out my Perl was buggy. This is my shocked face.

Edit 2: Or if you want it shorter and even harder to follow
Code:
cat << EOF | perl -ne 'push @{$recs{$1}},$2 if/(\w+)\/(\w+)_/;END{for $e (sort keys %recs){print"$e:-\n\n".join("\n",@{$recs{$e}})."\n\n"}}'
> abc/1234_xyz
> EOF
abc:-

1234

Edit 3: Removed the useless cat, as suggested by schweikh. Slightly less readable, too.
Code:
perl -ne 'push @{$r{$1}},$2 if/(\w+)\/(\w+)_/;END{for $e (sort keys %r){print"$e:-\n\n".join("\n",@{$r{$e}})."\n\n"}}' << EOF
> abc/1234_xyz
> abc/9911_pozit
> zzz/2222_qrty
> zzz/974g_sofff
> EOF
abc:-

1234
9911

zzz:-

2222
974g
 
Quoting my earlier edit: "My script is inefficient with a large number of input records. It's better to sort 'arr', open the output file, and keep it open until the value of '$1' changes, then close the current output file, and open the next one..."

This is a more efficient (than my previous version) solution:
Perl:
my $fn = shift @ARGV;
open(F_INP,$fn) or die "unable to open $fn $!";
# Collect the values per key, e.g. $h{abc} = [1234, 9911].
my %h;
while (<F_INP>){
    if( /^(\w+?)\/(\w+?)_/ ){
        push @{$h{$1}},$2;
    }
}
close(F_INP);
unless(scalar(keys(%h))){die "no pattern match found in file $fn or file is empty"}
# Write each key's values to a file named after the key, opening it only once.
for my $f (keys %h){
    open (F_OUT, ">", $f) or die "unable to open $f $!";
    for my $s (@{$h{$f}}){
        print F_OUT $s,"\n";
    }
    close(F_OUT);
}

Edit: sorting is not required here.
 
These aren't terribly efficient since they use multiple tools to do the job, rather than relying on a single tool. But I think they end up being a bit more readable, at least for me.

For your first post, presuming this is a set format and there is only one instance of each "key":
Code:
#!/bin/sh
# grep -A2 prints each matching line plus the two lines after it;
# tail -1 keeps only the last of those, i.e. the value below the key.
grep -A2 "${1}" "${2}" | tail -1

Code:
./fetch.sh xyz some_input_file
5678

If there are multiple copies of "xyz" then this would only return the last one.

For your second post:

Code:
#!/bin/sh
# For each line like abc/1234_xyz, append the part between "/" and "_" (1234)
# to a file named after the part before the "/" (abc).
while read -r line; do
    basename "${line}" | sed 's/_.*//' >> "$(dirname "${line}")"
done < "${1}"

Code:
foo %>ls
format.sh*      some_test_file
foo %>./format.sh some_test_file
foo %>ls
abc             format.sh*      some_test_file  zzz
foo %>cat abc
1234
9911
foo %>cat zzz
2222
974g

One of the problems/features here is that if the files abc or zzz already exist then this script will append to them, so multiple runs will cause duplicates:

Code:
foo %>ls
format.sh*      some_test_file
foo %>./format.sh some_test_file
foo %>./format.sh some_test_file
foo %>./format.sh some_test_file
foo %>ls
abc             format.sh*      some_test_file  zzz
foo %>cat abc
1234
9911
1234
9911
1234
9911

If this is a problem then something like this may work for you:

Code:
#!/bin/sh
# Delete any previously generated output files first so repeated runs don't append duplicates.
rm -f $(awk -F/ '{print $1}' "${1}" | sort | uniq | tr '\n' ' ')

while read -r line; do
    basename "${line}" | sed 's/_.*//' >> "$(dirname "${line}")"
done < "${1}"

None of the solutions have any error handling; that's left to the reader to implement :)
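
A minimal sketch of what basic checks might look like at the top of a script like fetch.sh (the messages and exit codes here are just placeholders):
Code:
#!/bin/sh
# Bail out early on a wrong argument count or an unreadable input file.
if [ $# -ne 2 ]; then
    echo "usage: $0 key file" >&2
    exit 1
fi
if [ ! -r "$2" ]; then
    echo "$0: cannot read '$2'" >&2
    exit 1
fi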

Similar to what Jose said - the formats you're working with are weird, and I'm sure you have a bunch of edge cases that your examples don't show and my examples don't handle...
 