Other BSD awk regexp atoms with {n,m} bounds

mbb · Oct 7, 2017

I notice that FreeBSD's (and OpenBSD's) awk treats a repetition quantifier like "{2,4}" in a regular expression as a literal string instead of interpreting it as documented in the manual pages. Below are some examples. The FreeBSD awk(1) manual refers to grep(1) to define the regular expressions it supports, but as you can see, awk treats the same pattern differently than egrep (and perl) do. GNU awk behaves like egrep and perl.

Code:

$ uname -v
FreeBSD 11.0-RELEASE-p9 #0: Tue Apr 11 08:48:40 UTC 2017     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC 
$ echo aaa | awk '/a{2,4}/'
$ echo 'a{2,4}' | awk '/a{2,4}/'
a{2,4}
$ echo aaa | awk '/a+/'        
aaa
$ echo aaa | egrep 'a{2,4}'    
aaa
$ echo aaa | perl -wne 'print if /a{2,4}/'
aaa

I looked in the awk source code for handling of brace expressions, which I would expect to find in unary() in /usr/src/contrib/one-true-awk/b.c, but I don't find it. Forgive me if this has come up before, but this seems like a bug. Is there a historical reason it's like this?

aragats · Oct 9, 2017

According to the man page awk(1):

Code:

....
STANDARDS
     The awk utility is compliant with the IEEE Std 1003.1-2008 (“POSIX.1”)
     specification, except awk does not support {n,m} pattern matching.
....

I guess this is because braces have a different meaning in awk's syntax.
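
Since this awk takes {n,m} literally, one possible workaround is to spell the bound out by hand with ?, which this awk does support. What follows is only a minimal sketch, not something taken from the man page: the function name match_a_2_to_4 is made up for illustration, and the pattern is anchored so that the upper bound actually matters.

```shell
# Hypothetical helper: emulates /^a{2,4}$/ without interval expressions.
# Two mandatory a's, followed by up to two optional ones.
match_a_2_to_4() {
    awk '/^aaa?a?$/'
}

# Only the lines with two to four a's should be printed.
printf '%s\n' a aa aaa aaaa aaaaa | match_a_2_to_4
```

This only scales to small bounds, since the expanded pattern grows with m.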

mbb · Oct 9, 2017

Thank you. I don't know how I missed that.
