/usr/bin/file is very slow

file(1) from base is extremely slow on some types of files (usually text files), while file from ports is fast:
Code:
[root@ns /var/soft/realtek/rtl_bsd_drv_v189]# ls -la if_re.c
-rw-rw-rw-  1 root  wheel  1066579 Aug 24 10:55 if_re.c
[root@ns /var/soft/realtek/rtl_bsd_drv_v189]# time /usr/bin/file if_re.c
if_re.c: C source, ASCII text

real  0m18.582s
user  0m18.559s
sys  0m0.014s
[root@ns /var/soft/realtek/rtl_bsd_drv_v189]# time /usr/local/bin/file if_re.c
if_re.c: C source, ASCII text

real  0m0.189s
user  0m0.181s
sys  0m0.008s

On big archives it works well:
Code:
[root@ns /usr/ports/distfiles/KDE/Qt/5.4.1]# ls -la qtbase-opensource-src-5.4.1.tar.xz
-rw-r--r--  1 root  wheel  46132220 Aug  9 18:23 qtbase-opensource-src-5.4.1.tar.xz
[root@ns /usr/ports/distfiles/KDE/Qt/5.4.1]# time /usr/bin/file qtbase-opensource-src-5.4.1.tar.xz
qtbase-opensource-src-5.4.1.tar.xz: XZ compressed data

real  0m0.005s
user  0m0.005s
sys  0m0.000s
[root@ns /usr/ports/distfiles/KDE/Qt/5.4.1]# time /usr/local/bin/file qtbase-opensource-src-5.4.1.tar.xz
qtbase-opensource-src-5.4.1.tar.xz: XZ compressed data

real  0m0.003s
user  0m0.000s
sys  0m0.008s

My computer and OS:
Code:
FreeBSD 10.2-STABLE #0 r287023M: Sat Aug 22 22:31:32 IRKT 2015
  root@ns.el.local:/usr/obj/usr/src/sys/el-new amd64
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
CPU: Intel(R) Core(TM) i3-2105 CPU @ 3.10GHz (3093.04-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206a7  Family=0x6  Model=0x2a  Stepping=7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> Features2=0x1d9ae3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,POPCNT,TSCDLT,XSAVE,OSXSAVE,AVX>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 6442450944 (6144 MB)
avail memory = 6130532352 (5846 MB)
 
Interesting:
Code:
time /usr/bin/file /usr/src/sys/dev/re/if_re.c
/usr/src/sys/dev/re/if_re.c: C source, ASCII text

real   0m2.522s
user   0m2.395s
sys   0m0.010s
Code:
time file /usr/src/sys/amd64/conf/GENERIC
/usr/src/sys/amd64/conf/GENERIC: ASCII text

real   0m0.440s
user   0m0.425s
sys   0m0.003s
It has nothing to do with the size; I guess it has something to do with the paths (it seems to need longer to find the file).
 
It seems my guess was wrong. Take a look at man file for the tests and the order in which they are performed.
Where in the man page is there any information about performance?

And more tests on my computer:
Code:
[root@ns /usr/src]# time /usr/bin/file /usr/src/ObsoleteFiles.inc
/usr/src/ObsoleteFiles.inc: ASCII text

real  0m5.375s
user  0m5.364s
sys  0m0.008s

[root@ns /usr/src]# time /usr/local/bin/file /usr/src/ObsoleteFiles.inc
/usr/src/ObsoleteFiles.inc: ASCII text

real  0m0.028s
user  0m0.028s
sys  0m0.000s

[root@ns /usr/src/sys/amd64/conf]# time /usr/bin/file /usr/src/sys/amd64/conf/GENERIC
/usr/src/sys/amd64/conf/GENERIC: ASCII text

real  0m0.235s
user  0m0.217s
sys  0m0.008s

[root@ns /usr/src/sys/amd64/conf]# time /usr/local/bin/file /usr/src/sys/amd64/conf/GENERIC
/usr/src/sys/amd64/conf/GENERIC: ASCII text

real  0m0.098s
user  0m0.098s
sys  0m0.000s
So file from base processes text files very slowly, which also slows down viewing and editing in Midnight Commander.
With small files the difference is almost imperceptible, but with large files it is significant.
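
The comparison above can be repeated on any file with a small helper. This is only a sketch: elapsed_seconds and compare_file_speed are made-up names, the two paths are the ones used in this thread, and date +%s only resolves whole seconds, so it shows multi-second gaps like the one on if_re.c but not millisecond differences.

```shell
# elapsed_seconds CMD [ARGS...]
# Print the whole wall-clock seconds CMD took; its own output is discarded.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# compare_file_speed FILE
# Time base and ports file(1) on the same input (paths assumed from this thread).
compare_file_speed() {
    for bin in /usr/bin/file /usr/local/bin/file; do
        [ -x "$bin" ] || continue   # skip a binary that is not installed
        printf '%s\t%ss\n' "$bin" "$(elapsed_seconds "$bin" "$1")"
    done
}
```

For example, compare_file_speed /usr/src/ObsoleteFiles.inc prints one line per installed binary.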

I deleted /usr/bin/file and linked /usr/local/bin/file (from ports) into /usr/bin; now MC opens large files quickly again.
 
The man page says nothing about performance.
But:
Code:
If a file does not match any of the entries in the magic file, it is
  examined to see if it seems to be a text file.  ASCII, ISO-8859-x, non-
  ISO 8-bit extended-ASCII character sets (such as those used on Macintosh
  and IBM PC systems), UTF-8-encoded Unicode, UTF-16-encoded Unicode, and
  EBCDIC character sets can be distinguished by the different ranges and
  sequences of bytes that constitute printable text in each set.  If a file
  passes any of these tests, its character set is reported......

So a text file takes the longest to test, since the text checks run only after nothing in the magic file has matched.
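
To make the "ranges of bytes" idea from that excerpt concrete, here is a rough sketch of the simplest case, plain ASCII. It is not how file(1) is implemented; looks_ascii is a hypothetical name, and the accepted byte set (tab, LF, CR, and 0x20-0x7e) is an assumption that is narrower than what file(1) accepts.

```shell
# looks_ascii FILE
# Succeed if every byte of FILE is tab, LF, CR or printable ASCII (0x20-0x7e).
looks_ascii() {
    [ -r "$1" ] || return 2
    # Delete all accepted bytes and count what is left over.
    other=$(( $(LC_ALL=C tr -d '\11\12\15\40-\176' < "$1" | wc -c) ))
    [ "$other" -eq 0 ]
}
```

Running looks_ascii if_re.c && echo "ASCII text" is a single pass over the file, so on a 1 MB source file it should return almost immediately.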
 
I don't know enough about file(1) or how base's implementation differs from sysutils/file, but if I had to figure it out I would use devel/valgrind to profile /usr/bin/file and identify where it is spending most of its time. Note: compiling a debug version of this utility would probably help too.
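
A minimal sketch of that profiling run, assuming the callgrind tool from devel/valgrind works on this system (not verified here). profile_with_callgrind is a made-up wrapper name; --tool=callgrind, --callgrind-out-file and callgrind_annotate are standard Valgrind pieces. Without debug symbols the report may show addresses instead of function names.

```shell
# profile_with_callgrind CMD [ARGS...]
# Run CMD under callgrind and print the top of the annotated profile.
profile_with_callgrind() {
    if ! command -v valgrind >/dev/null 2>&1; then
        echo "valgrind not found; install devel/valgrind first" >&2
        return 127
    fi
    out="callgrind.out.$$"   # profile data file, named after this shell's PID
    valgrind --tool=callgrind --callgrind-out-file="$out" "$@" >/dev/null || return
    callgrind_annotate "$out" | head -n 40
}
```

Usage would be profile_with_callgrind /usr/bin/file if_re.c; expect the run to take far longer than the 18 seconds measured above, since valgrind slows the program down considerably.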
 