fgrep excessive time increase after upgrade from 12.4 to 14.3 - regex library issue?

This is a weird one. Since upgrading FreeBSD (jumping a couple of major versions), a script that used to run in less than a minute now takes literal hours. I've managed to distill the delay down to this use of fgrep:

cat filename.txt | fgrep --text -if filter.txt

- filename.txt contains 64 million lines, 2.4 GB
- filter.txt contains 615 lines, 6 KB

fgrep CPU usage is at 100% consistently.

I do have a backup of the 12.4 system, and I did notice one possible explanation:

fgrep 12.4: libgnuregex.so.5
fgrep 14.3: libregex.so.1

I cannot find any information about this change, or anyone else mentioning a regression. Hope someone has some insight. Thanks.
 
Yes, this is normal, unfortunately. I think fixing it might be covered by the FreeBSD Foundation's SIMD work on string functions.

There is a port of GNU grep.
 
-i sucks in the BSD lib
because instead of a normal case-insensitive comparison it expands the search text from fubarbaz to [fF][uU][bB][aA][rR][bB][aA][zZ]

Ouch, that might explain it.
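
To make the expansion described in the reply above concrete, here is a small sketch that rewrites a literal string into per-character bracket expressions. It only illustrates the idea: expand_ci is a hypothetical helper name, it is not the actual libregex code, and it does not escape regex metacharacters.

```shell
# Illustration only: expand_ci is a hypothetical helper, not the libregex code.
# It turns each letter of a literal into a two-character bracket expression
# and leaves non-letters unchanged.
expand_ci() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            lo = tolower(c)
            up = toupper(c)
            if (lo != up)
                out = out "[" lo up "]"
            else
                out = out c
        }
        print out
    }'
}

expand_ci fubarbaz   # prints [fF][uU][bB][aA][rR][bB][aA][zZ]
```

If the library really does this for every pattern, each of the 615 filter lines stops being a plain fixed string and becomes a regex, which would fit the CPU-bound slowdown.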

I use grep -i in so many places, routinely, I'm now wondering where else may have been impacted...

In this instance, I've installed gnugrep from packages, and the filtering now takes 25 seconds. A massive difference.

grep: 41 lines/sec
ggrep: 4,000 lines/sec
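
For scripts that have to run on machines with and without the gnugrep package, a wrapper along these lines might help. It is a minimal sketch: pick_grep and filter_ci are made-up names, and it assumes GNU grep is on PATH as ggrep, as in the numbers above.

```shell
# Hypothetical helper: prefer GNU grep (ggrep) when it is installed,
# otherwise fall back to the base system grep.
pick_grep() {
    if command -v ggrep >/dev/null 2>&1; then
        echo ggrep
    else
        echo grep
    fi
}

# Case-insensitive fixed-string filter (grep -F is the same as fgrep).
# $1 = pattern file, $2 = input file
filter_ci() {
    "$(pick_grep)" -F --text -i -f "$1" "$2"
}

# Usage: filter_ci filter.txt filename.txt
```

Passing the file name as an argument also avoids the extra cat process, although the pipe is not what makes the original command slow.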
 