BSD sed command shows unexpected behavior

I found some unexpected behavior in the BSD sed command.

Try the following command on your FreeBSD host and you will probably see the following output.
Code:
$ seq 1 10 | awk '{print $1 ",A"}' | sed '3,4N; s/\n/-/g'"]
1,A
2,A
3,A-4,A
5,A-6,A
7,A-8,A
9,A
10,A
$

However, GNU sed, which is available from the textproc/gsed port or
on a Linux host, produces different output, as follows:
Code:
$ seq 1 10 | awk '{print $1 ",A"}' | gsed '3,4N; s/\n/-/g'"]
1,A
2,A
3,A-4,A
5,A
6,A
7,A
8,A
9,A
10,A
$

I don't know why the two versions of sed produce different output. I suspect that GNU sed is working correctly, because "3,4N" tells sed to append the next line only for lines #3 through #4, and NOT for line #5.

Is something wrong with BSD sed? Or is this just my misunderstanding?
 
Oh, sorry! Those ("]) are typos.
Thank you for pointing them out.
You are certainly right.


But the unexpected behavior does not go away when the typo characters are removed.
Do you know the reason?
 
Personally, I don't know, but different implementations might behave differently.
I think you should ask on the @stable mailing list.

Currently it looks like a bug, but I'm not that much into sed.
 
On OpenBSD the output is as follows:
Code:
$ jot 10 1 | awk '{print $1 ",A"}' | sed '3,4N; s/\n/-/g'
1,A
2,A
3,A-4,A
5,A-6,A
7,A-8,A
9,A-10,A
 
Thank you for your advice.
You also think it looks like a bug, don't you?

I will ask on the mailing list.
Thanks again.
 
J65nko said:
On OpenBSD the output is as follows:
Code:
$ jot 10 1 | awk '{print $1 ",A"}' | sed '3,4N; s/\n/-/g'
1,A
2,A
3,A-4,A
5,A-6,A
7,A-8,A
9,A-10,A

That confuses me even more!
Hmm, isn't this a complicated problem? :(
 
Hm, I can't comment on the sed part, but for the sake of comparison I'm attaching results from HP-UX (all 11i versions: 11.11/11.23/11.31):

# printf "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n" | awk '{print $1 ",A"}' | sed '3,4N; s/\n/-/g'
Code:
1,A
2,A
3,A-4,A
5,A
6,A
7,A
8,A
9,A
10,A
There is neither seq nor jot in HP-UX, so I had to improvise.
 
Just to pitch in: it could be a bug, or it could be a difference in semantics between GNU sed and BSD sed. Does the man page provide any hints as to what is expected BSD sed behaviour in this case?
 
Thank you for reporting, everyone.

I can also report on some other implementations of sed:
  1. sed on AIX 6.1.0.0
  2. sed on HP-UX B.11.23
  3. sed on SunOS 5.9(Solaris 9)
They all produce the same output as GNU sed, although they are probably separate implementations and not GNU's.

> matoatlantis

How about the following command for OSes that have neither seq nor jot?
$ yes A | head -n 10 | awk '{print NR "," $1}' | sed '3,4N; s/\n/-/g'
 
Isn't sed in Solaris 9 the same as GNU sed? I ask, because I know there is GNU stuff on Solaris (on newer versions). Don't know about other Unixes.
 
richmikan said:
How about the following command for OSes that have neither seq nor jot?
$ yes A | head -n 10 | awk '{print NR "," $1}' | sed '3,4N; s/\n/-/g'

# yes A | head -n 10 | awk '{print NR "," $1}' | sed '3,4N; s/\n/-/g'
Code:
1,A
2,A
3,A-4,A
5,A
6,A
7,A
8,A
9,A
10,A

The output is from 11.31; the other releases gave the same output.
I highly doubt that sed in HP-UX is GNU sed, but according to the docs it conforms to the following standards:

Code:
 STANDARDS CONFORMANCE
      sed: SVID2, SVID3, XPG2, XPG3, XPG4, POSIX.2

On Solaris 10 you can choose a different sed depending on the standard:

# printf "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n" | awk '{print $1 ",A"}' | /usr/bin/sed '3,4N; s/\n/-/g'
Code:
1,A
2,A
3,A-4,A
5,A
6,A
7,A
8,A
9,A
10,A
The output is the same when using /usr/xpg4/bin/sed instead.
 
The gsed man page at http://www.freebsd.org/cgi/man.cgi?....0-RELEASE+and+Ports&arch=default&format=html describes the N command as follows:
Code:
n N    Read/append the next line of input into the pattern space.
The FreeBSD description from sed(1):
Code:
     [2addr]N
	     Append the next line of input to the pattern space, using an
	     embedded newline character to separate the appended material from
	     the original contents.  Note that the current line number
	     changes.

The OpenBSD man page also has this note.
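
The effect of that note can be observed with the = command, which prints the current line number. This is only a minimal sketch: the three-line input and the variable name linenum are made up for illustration, and a single-line address is used on purpose so that the result should not depend on how a particular sed closes a range.

```shell
# On line 1, N appends line 2 to pattern space; '=' then prints the
# current line number, which has advanced past the line the cycle began on.
linenum=$(printf 'a\nb\nc\n' | sed -n '1{N;=;}')
printf '%s\n' "$linenum"
```

This prints 2, not 1. Applied to '3,4N', it means the cycle that starts on line 3 ends with the line number at 4, and the next cycle starts on line 5, so no cycle ever starts on line 4. What a sed does with a range whose end line was consumed this way is not something this sketch settles, and it may be where the implementations diverge.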
 
But the man page of plan9port sed states the same thing, while plan9port sed does not give the same result as FreeBSD sed:

Code:
`N    Append the next line of input to the pattern
      space with an embedded newline.  (The current
      line number changes.)'

Mark.
 
Thanks to everyone, again.

I suppose that...
Even if the behavior of BSD sed is not a bug but a matter of semantics,
I can't concretely understand or explain the reason for the behavior.

I don't know why
"3,4N" produces "3,A-4,A", "5,A-6,A" and "7,A-8,A"
while "3,5N" produces only "3,A-4,A"
on FreeBSD sed. :(
 
sed(1) works on lines, and a line is a sequence of non-linefeed characters followed by a linefeed character ("\n"). The example code that we have been looking at "messes around" with that critical linefeed. We replace it with a "-":

Code:
3,4 {
N
s/\n/-/
}
So we change the marker that defines the chunk of data sed(1) is working with.
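
To see the embedded linefeed before it is replaced, the l command can be used: it prints pattern space in an unambiguous form. A small sketch, with a two-line input and the variable name visible chosen only for illustration:

```shell
# N joins the two input lines in pattern space; 'l' shows the embedded
# newline as \n and marks the end of pattern space with $.
visible=$(printf '3,A\n4,A\n' | sed -n 'N;l')
printf '%s\n' "$visible"
```

The output is a single line, 3,A\n4,A$, which shows that after N the two input lines form one chunk of data with the linefeed inside it.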

In the following attempts I use this text file:
Code:
$ cat 1-10a.txt
1,A
2,A
3,A
4,A
5,A
6,A
7,A
8,A
9,A
10,A

The sed(1) command file:
Code:
$ cat cmd4.sed
3,4 {
H
}

5 {
x
s/\n/-/g
}
Lines 3-4 are transferred from pattern space to Hold space. At line 5 we swap Hold space and pattern space, and substitute the newline with the hyphen.
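
One detail that may be worth checking on your own sed: H appends a newline followed by pattern space, so when hold space starts out empty the saved text begins with a newline, and a global s/\n/-/g would turn that leading newline into a hyphen as well. The following is a hedged variant, not the script above: it uses h (overwrite) for line 3 and H (append) for line 4, generates the input inline, and uses the illustrative variable names input and joined.

```shell
# Same ten lines as 1-10a.txt, generated inline to keep the sketch self-contained.
input=$(awk 'BEGIN { for (i = 1; i <= 10; i++) print i ",A" }')

# 3h overwrites hold space with line 3 (no leading newline),
# 4H appends "\n4,A", and on line 5 we swap, join and print only the result.
joined=$(printf '%s\n' "$input" | sed -n -e '3h' -e '4H' -e '5{x;s/\n/-/g;p;}')
printf '%s\n' "$joined"
```

With -n in effect, only the joined text is printed: 3,A-4,A.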

Code:
$ sed -f cmd4.sed 1-10a.txt
1,A
2,A
3,A
4,A
3,A-4,A
6,A
7,A
8,A
9,A
10,A
Now lines 3-4 are still being displayed and line 5 is missing.
An ugly hack is to instruct sed(1) not to echo the lines, with the -n option, and to use p explicitly to print:

Code:
# cat cmd5.sed
1,2 {
p
}

3,4{
H
}

5 {
x
s/\n/-/g
p
x
p
}

6,10 {
p
}
Yes, it is ugly, but it produces the desired output:

Code:
$ sed -nf cmd5.sed 1-10a.txt
1,A
2,A
3,A-4,A
5,A
6,A
7,A
8,A
9,A
10,A
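
If the goal is only to join lines 3 and 4, a shorter alternative is to avoid the address range altogether and attach N to line 3 alone. This is a sketch under that assumption, not a statement about what '3,4N' ought to do; the variable names input and result are illustrative.

```shell
# Same ten lines as 1-10a.txt, generated inline to keep the sketch self-contained.
input=$(awk 'BEGIN { for (i = 1; i <= 10; i++) print i ",A" }')

# Only line 3 is addressed, so there is no range whose end could be skipped.
result=$(printf '%s\n' "$input" | sed '3{N;s/\n/-/;}')
printf '%s\n' "$result"
```

Because the address is a single line number, the question of when a range ends never arises, so this form should behave the same on the seds compared in this thread: lines 3 and 4 come out as 3,A-4,A and every other line is printed unchanged.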
 
richmikan said:
Thanks to everyone, again.

I suppose that...
Even if the behavior of BSD sed is not a bug but a matter of semantics,
I can't concretely understand or explain the reason for the behavior.

Because GNU wrote gsed afterwards, and changed the behaviour.
 