Shell IFS / BASH_REMATCH: multi-character separators

Is it possible to define a multi-character string as the delimiter with IFS / BASH_REMATCH?

Code:
string="[house-b 10] [river-a 5] [wood-e 15] [car-c 20]"
split="] ["
result0="[house 10"
result1="river 5"
result2="wood 15"
result3="car 20]"

Code:
str="[house-b 10] [river-a 5] [wood-e 15] [car-c 20]"
while [[ $str =~ ([\[][-|0-9 a-zA-Z.:]+[\]]) ]]; do echo ${BASH_REMATCH[1]:1:${#BASH_REMATCH[1]}-2}; str=${str:${#BASH_REMATCH[1]}}; done
Error: the first part of the string is not returned completely.
 
From bash(1): "The shell treats each character of IFS as a delimiter".

The components are easy to split out with sed(1), but you need to tell us more about what you want to achieve:
Code:
$ string='[house-b 10] [river-a 5] [wood-e 15] [car-c 20]'
$ echo "$string" | sed -e 's/ *\[\([^]]*\)\]/\1:/g'
house-b 10:river-a 5:wood-e 15:car-c 20:
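If the goal is a bash array, the colon-joined sed output above can be read back with a single-character IFS. This is only a sketch: it assumes the data never contains a colon, and the array name words is made up for the example.

```shell
string='[house-b 10] [river-a 5] [wood-e 15] [car-c 20]'

# Let sed turn each "[...]" into "...:" and split the result on ":".
# IFS is changed only for this one read command.
IFS=: read -ra words < <(printf '%s\n' "$string" | sed -e 's/ *\[\([^]]*\)\]/\1:/g')

# Quote the expansion so fields containing spaces stay intact.
printf '%s\n' "${words[@]}"
```

The single trailing colon does not produce an extra empty element, so words ends up with four entries.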
 
Is there a way to do this without the while loop in this code?
Code:
str="[house-b 10] [river-a 5] [wood-e 15] [car-c 20]"
while [[ $str =~ ([\[][-|0-9 a-zA-Z.:]+[\]]) ]]; do echo ${BASH_REMATCH[1]:1:${#BASH_REMATCH[1]}-2}; str=${str:${#BASH_REMATCH[1]}}; done

I always want the three characters "] [" treated together as one complete delimiter string.
 
So long as you want to use a multi-character delimiter, I think that you are going to have to iterate in the shell with a loop.

There are lots of pretty good examples available.
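One such iteration, as a minimal sketch: it uses only parameter expansion and treats "] [" as a literal string. The names delim, rest and parts are chosen for this example.

```shell
str="[house-b 10] [river-a 5] [wood-e 15] [car-c 20]"
delim="] ["
parts=()
rest=$str

# As long as the delimiter still occurs, cut off the text in front of it.
while [[ $rest == *"$delim"* ]]; do
    parts+=("${rest%%"$delim"*}")   # text before the first delimiter
    rest=${rest#*"$delim"}          # drop that text and the delimiter
done
parts+=("$rest")                    # whatever follows the last delimiter

printf '%s\n' "${parts[@]}"
```

Quoting "$delim" inside the expansions matters here, because an unquoted [ or ] would be read as part of a glob pattern.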
Quote:
I always want the three characters "] [" treated together as one complete delimiter string.

Your problem definition is skewed by your view that "] [" is a delimiter. It's not. What you actually have is a sequence of bracketed fields: "[data-field] ..."

Why bash? Its string handling is clumsy, at best. It's also a resource pig.

Perl's split and Python's str.split (or PHP's explode) are designed for this sort of problem. For example, to gather the data fields into a @words array in Perl:
Code:
$ perl
use Data::Dumper qw(Dumper);
my $str = "[house-b 10] [river-a 5] [wood-e 15] [car-c 20]";
$str =~ s/ *\[([^]]*)\]/$1:/g;
print Dumper $str;
my @words = split /:/, $str;
print Dumper \@words;
$VAR1 = 'house-b 10:river-a 5:wood-e 15:car-c 20:';
$VAR1 = [
          'house-b 10',
          'river-a 5',
          'wood-e 15',
          'car-c 20'
        ];
 
I got this working:
Code:
[[ "[house-b 10] [river-a 5] [wood-e 15] [car-c 20]" =~ [\[]([-a-z]+ [0-9]+)[\]]( )[\[]([-a-z]+ [0-9]+)[\]]( )[\[]([-a-z]+ [0-9]+)[\]]( )[\[]([-a-z]+ [0-9]+)[\]] ]]
echo ${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}" "${BASH_REMATCH[5]}" "${BASH_REMATCH[7]}

Do you have an idea how to make it less cumbersome with BASH_REMATCH without a loop?
 
As you have shown above, if you know the exact number of data items, you can unroll the loop, in-line.

I don't know of any generic solution to read an array when the data delimiter is more than one character.
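One possible workaround, offered as a sketch rather than a generic solution: replace every occurrence of the multi-character delimiter with a single character that cannot appear in the data, then let mapfile split on it. This assumes bash 4 or later (for mapfile) and data without embedded newlines; the names delim, nl, tmp and parts are made up for the example.

```shell
str="[house-b 10] [river-a 5] [wood-e 15] [car-c 20]"
delim="] ["
nl=$'\n'

# Replace each literal "] [" with a newline. The double quotes around
# $delim keep [ and ] from being treated as glob characters.
tmp=${str//"$delim"/$nl}

# Read one array element per line; -t strips the trailing newlines.
mapfile -t parts <<<"$tmp"

printf '%s\n' "${parts[@]}"
```

No explicit loop is needed, and the result matches the result0 to result3 layout asked for at the top of the thread, with the outer brackets still attached to the first and last items.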

It's possible to clarify your code a little:
Code:
str="[house-b 10] [river-a 5] [wood-e 15] [car-c 20]"
RE='[[]([-a-z]+ [0-9]+)[]]'
[[ $str =~ $RE( )$RE( )$RE( )$RE ]]
echo ${BASH_REMATCH[1]} ${BASH_REMATCH[3]} ${BASH_REMATCH[5]} ${BASH_REMATCH[7]}
house-b 10 river-a 5 wood-e 15 car-c 20

You may find this observation handy (it's data dependent, but may work for your specific content):
Code:
$ str="[house-b 10] [river-a 5] [wood-e 15] [car-c 20]"
$ IFS='][' read -ra items <<<"$str"
$ for i in ${items[@]}; do echo \"$i\"; done
"house-b"
"10"
"river-a"
"5"
"wood-e"
"15"
"car-c"
"20"
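The output above is split on spaces as well because ${items[@]} is unquoted in the for loop, so the default IFS applies a second time. A sketch of a variation that quotes the expansion and skips the empty or blank elements that read produces between the brackets (the array name fields is an assumption for this example):

```shell
str="[house-b 10] [river-a 5] [wood-e 15] [car-c 20]"

# Split on "]" and "[" individually; IFS is changed only for this read.
IFS='][' read -ra items <<<"$str"

fields=()
for i in "${items[@]}"; do
    # Keep an element only if something is left after removing whitespace.
    if [[ -n ${i//[[:space:]]/} ]]; then
        fields+=("$i")
    fi
done

printf '"%s"\n' "${fields[@]}"
```

This keeps "house-b 10" together as one item, but it is still data dependent: it breaks as soon as a field itself contains [ or ].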
 