HDD problems or something else?

OP
D

Dendros

Member

Reaction score: 7
Messages: 73

I did some more tests and during this I got an idea: what if I use the time command with pkg?
Result:

Code:
# time pkg install flat-remix-icon-themes
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
    flat-remix-icon-themes: 20201017

Number of packages to be installed: 1

The process will require 290 MiB more space.

Proceed with this action? [y/N]: y
[1/1] Installing flat-remix-icon-themes-20201017...
[1/1] Extracting flat-remix-icon-themes-20201017: 100%
17.530u 36.655s 31:15.55 2.8%    2631+29066k 413+344842io 0pf+0w


As you can see, it took a long time to complete.

Here is top:

Code:
last pid: 89522;  load averages:  0.04,  0.11,  0.10    up 0+00:46:41  23:29:24
67 processes:  1 running, 66 sleeping
CPU:  1.0% user,  0.0% nice,  1.1% system,  0.2% interrupt, 97.6% idle
Mem: 1149M Active, 229M Inact, 1549M Wired, 1154M Buf, 13G Free
Swap: 3598M Total, 3598M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
18441 user         35  22    0   252M   108M select   3   0:40   5.44% Xorg
44837 user          5  20    0    84M    51M select   5   0:21   3.74% wrapper-
72156 root          1  20    0  1050M   893M biowr    7   0:31   2.65% pkg
31887 user         20  20    0   224M   105M select   0   0:12   1.69% xfwm4
53467 user          4  20    0   121M    56M select   6   0:11   0.87% xfce4-te
79511 root          1  20    0    11M  2868K nanslp   0   0:02   0.31% vmstat
67684 root          1  20    0    13M  3920K CPU7     7   0:01   0.06% top
84637 root          1  20    0    11M  2776K pause    7   0:00   0.01% iostat
61730 root          1  20    0    11M  2264K select   2   0:00   0.01% powerd
36101 user          4  20    0   111M    77M select   6   0:01   0.00% xfdeskto
34744 user          4  20    0    75M    43M select   3   0:01   0.00% xfce4-pa
19258 user         19  20    0   193M    79M select   6   0:01   0.00% xfce4-se
36864 user          4  20    0    75M    44M select   7   0:00   0.00% wrapper-
46092 user          4  20    0    72M    43M select   4   0:00   0.00% wrapper-
33926 user          4  20    0    60M    34M select   1   0:00   0.00% xfsettin
46854 user          4  20    0    71M    43M select   2   0:00   0.00% wrapper-


What does biowr state mean for pkg?

Here is vmstat:

Code:
vmstat -w 1
procs  memory       page                    disks     faults         cpu
r b w  avm   fre   flt  re  pi  po    fr   sr ad0 ad1   in    sy    cs us sy id
0 0 0 5.0G   12G     2   0   0   0   544  504   0   0  382  1979  2762  0  0 100
0 0 0 5.0G   12G     0   0   0   0   528  499   0   0  357  1541  2443  0  1 99
0 0 0 5.0G   12G     2   0   0   0   464  505   0   0  422  3097  3708  0  1 98
0 0 0 5.0G   12G     3   0   0   0    48  500   0   0  363  6580  5701  1  1 98
0 0 0 5.0G   12G   413   0   0   0  1017  497   0   0  549  5555  5084  1  1 98
0 0 0 5.0G   12G     2   0   0   0   681  501   0   0  212  4921  2353  0  0 99
0 0 0 5.0G   12G     0   0   0   0   384  502   0   0  310  2592  2358  0  0 100
0 0 0 5.0G   12G     2   0   0   0  1081  455   0   0  499  3562  4305  1  1 99
0 0 0 5.0G   12G     3   0   0   0   161  507   0   0  430  6637  5974  1  2 97
0 0 0 5.0G   12G     2   0   0   0   560  500   0   0  400  2479  3204  0  1 99
0 0 0 5.0G   12G     0   0   0   0   432  507   0   0  307  1239  2059  0  0 99
0 0 0 5.0G   12G     1   0   0   0   544  454   0   0  359  1267  2282  0  1 99
0 0 0 5.0G   12G     1   0   0   0    56  509   0   0  219  3817  3191  1  0 98
1 0 0 5.0G   12G     3   0   0   0   576  502   0   0  638  6906  7066  1  1 97
1 0 0 5.0G   12G     1   0   0   0   304  504   0   0  255  1606  2068  0  0 99
0 0 0 5.0G   12G     0   0   0   0   616  508   0   0  359  1340  2348  0  1 99
0 0 0 5.0G   12G     1   0   0   0     8  509   0   0   81   459   653  0  0 100
0 0 0 5.0G   12G     2   0   0   0   529  491   0   0  515  4880  5270  1  1 98
0 0 0 5.0G   12G     3   0   0   0   216  494   0   0  453  5768  5517  1  1 98
0 0 0 5.0G   12G     0   0   0   0   576  499   0   0  324  1150  2095  0  0 100
0 0 0 5.0G   12G     0   0   0   0   368  507   0   0  279   868  1720  0  0 99


And finally iostat:

Code:
iostat -w 1
       tty            ada0             ada1             ada2             cpu
tin  tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   158  0.00   0  0.00   0.00   0  0.00  15.54  84  1.28   0  0  0  0 99
   3   461  0.00   0  0.00   0.00   0  0.00  18.35 249  4.46   0  0  0  0 99
  12   173  0.00   0  0.00   0.00   0  0.00  15.94 322  5.02   0  0  0  0 100
   4   162  0.00   0  0.00   0.00   0  0.00  15.76 203  3.12   1  0  0  0 99
   0   160  0.00   0  0.00   0.00   0  0.00  16.26 305  4.85   1  0  0  0 99
   0   160  0.00   0  0.00   0.00   0  0.00  16.43 279  4.48   0  0  1  0 99
   0   159  0.00   0  0.00   0.00   0  0.00  16.75 100  1.64   0  0  0  0 99
   0   157  0.00   0  0.00   0.00   0  0.00  16.10 330  5.18   1  0  1  0 99
   0   157  0.00   0  0.00   0.00   0  0.00  16.40 257  4.12   0  0  0  0 100
   0   162  0.00   0  0.00   0.00   0  0.00  12.97 292  3.69   1  0  0  0 99
   0   160  0.00   0  0.00   0.00   0  0.00  15.72 104  1.59   1  0  0  0 99
   0   160  0.00   0  0.00   0.00   0  0.00  15.52 372  5.63   0  0  0  0 99
   0   159  0.00   0  0.00   0.00   0  0.00  11.88  79  0.92   0  0  0  0 100
   0   157  0.00   0  0.00   0.00   0  0.00  14.52 343  4.87   0  0  0  0 99
   0   216  0.00   0  0.00   0.00   0  0.00  15.33 302  4.51   1  0  1  0 98
   0   162  0.00   0  0.00   0.00   0  0.00  13.60 209  2.78   2  0  1  0 97
   0   153  0.00   0  0.00   0.00   0  0.00  16.98 253  4.20   1  0  1  0 99
   0   163  0.00   0  0.00   0.00   0  0.00  14.71 304  4.37   1  0  0  0 99
   0   159  0.00   0  0.00   0.00   0  0.00  14.39 197  2.77   0  0  1  0 99
   0   161  0.00   0  0.00   0.00   0  0.00  15.34 245  3.67   1  0  1  0 99
   0   158  0.00   0  0.00   0.00   0  0.00  14.35 277  3.88   1  0  1  0 97


These reports are just samples (my FreeBSD disk is ada2). Do you see something amiss?
 

PMc

Aspiring Daemon

Reaction score: 383
Messages: 886

I did some more tests and during this I got an idea: what if I use the time command with pkg?
Result:

Code:
# time pkg install flat-remix-icon-themes
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
    flat-remix-icon-themes: 20201017

Number of packages to be installed: 1

The process will require 290 MiB more space.

Proceed with this action? [y/N]: y
[1/1] Installing flat-remix-icon-themes-20201017...
[1/1] Extracting flat-remix-icon-themes-20201017: 100%
17.530u 36.655s 31:15.55 2.8%    2631+29066k 413+344842io 0pf+0w


As you can see, it took a long time to complete.

Oh well. Not really. That piece is 160'000 files.

I remember Usenet, there was a thing when there were really many files in a directory - I think it's called "indirect blocks", and it became ugly slow. This can also happen with not so very many files, but long filenames (as is the case here).
Then, I do not precisely know what pkg has to do, but probably it reads all the files back again to fill it's database. And the inode cache is probably exhausted with such a number.
Then, we have a mechanical disk with one single filesystem 1GB long, and it's a deskstar-alike that might take up to 18 ms to walk it.

So, lets suppose these indirect blocks make us need 4 seeks per file, and average 5 ms per seek (at least 2ms to settle in on a track, anyway), and we get 4 * 5 * 160'000 * 2 = 106 minutes.

As a matter of fact, most filesystems do not scale too well with 5- and 6-digit file numbers. There begins the realm where databases are better off. It does not hurt so much to access individual files from such a set, but moving them all as a bulk will become exponentially slower.

What does biowr state mean for pkg?

Something along the line of <waiting for a write command to complete>.

Code:
   3   461  0.00   0  0.00   0.00   0  0.00  18.35 249  4.46   0  0  0  0 99
  12   173  0.00   0  0.00   0.00   0  0.00  15.94 322  5.02   0  0  0  0 100
   4   162  0.00   0  0.00   0.00   0  0.00  15.76 203  3.12   1  0  0  0 99

So there is about 250 transactions per seconds, which means some 4 ms seek time. And we're moving about 4 MB/sec. That is interesting: the whole set is supposed to be 290 MB, but 4*60*31 = 7440 MB. And that is probably the overhead from the indirect blocks read again and again.

There is some info about optimizing UFS for such use-cases in tuning(7). But since installation is usually a one-time action and not a use-case, that may or may not be worthwhile.
 
OP
D

Dendros

Member

Reaction score: 7
Messages: 73

Well, I said that I have another system with FreeBSD installed on and which also has a spinning HDD, not a SSD. It's a HP SFF with a single 250 GB HDD and with FreeBSD 12.1-p10/amd64 as its sole OS.
Some info about its HDD, camcontrol:

Code:
# camcontrol identify ada0
pass0: <WDC WD2500AAKX> ATA8-ACS SATA 3.x device
pass0: 600.000MB/s transfers (SATA 3.x, UDMA5, PIO 8192bytes)

protocol              ATA8-ACS SATA 3.x
device model          WDC WD2500AAKX
firmware revision     
serial number         
WWN                   
additional product id
cylinders             16383
heads                 16
sectors/track         63
sector size           logical 512, physical 512, offset 0
LBA supported         268435455 sectors
LBA48 supported       488397168 sectors
PIO supported         PIO4
DMA supported         WDMA2 UDMA5
media RPM             7200
Zoned-Device Commands no

Feature                      Support  Enabled   Value           Vendor
read ahead                     yes    yes
write cache                    yes    yes
flush cache                    yes    yes
Native Command Queuing (NCQ)   yes        32 tags
NCQ Priority Information       yes
NCQ Non-Data Command           no
NCQ Streaming                  no
Receive & Send FPDMA Queued    no
NCQ Autosense                  no
SMART                          yes    yes
security                       yes    no
power management               yes    yes
microcode download             yes    yes
advanced power management      no    no
automatic acoustic management  no    no
media status notification      no    no
power-up in Standby            no    no
write-read-verify              no    no
unload                         no    no
general purpose logging        yes    yes
free-fall                      no    no
sense data reporting           no    no
extended power conditions      no    no
device statistics notification no    no
Data Set Management (DSM/TRIM) no
Trusted Computing              no
encrypts all user data         no
Sanitize                       no
Host Protected Area (HPA)      yes      no      488397168/488397167
HPA - Security                 yes      no
Accessible Max Address Config  no


dumpfs:

Code:
# dumpfs /dev/ada0s1 | grep fsize
fsize    4096    shift    12    mask    0xfffff000


gpart:

Code:
# gpart show ada0
=>       63  488397105  ada0  MBR  (233G)
         63          1        - free -  (512B)
         64  488397104     1  freebsd  [active]  (233G)


And diskinfo:

Code:
# diskinfo -v -c ada0
ada0
    512             # sectorsize
    250059350016    # mediasize in bytes (233G)
    488397168       # mediasize in sectors
    0               # stripesize
    0               # stripeoffset
    484521          # Cylinders according to firmware.
    16              # Heads according to firmware.
    63              # Sectors according to firmware.
    WDC WD2500AAKX    # Disk descr.
                # Disk ident.
    No              # TRIM/UNMAP support
    7200            # Rotation rate in RPM
    Not_Zoned       # Zone Mode

I/O command overhead:
    time to read 10MB block      0.081489 sec    =    0.004 msec/sector
    time to read 20480 sectors   2.168983 sec    =    0.106 msec/sector
    calculated command overhead            =    0.102 msec/sector


So, I did a test on this HP system, by installing the same flat-remix-icon-themes package and measuring the time taken to complete it. Here are the results:

time

Code:
# time pkg install flat-remix-icon-themes
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
    flat-remix-icon-themes: 20201017

Number of packages to be installed: 1

The process will require 290 MiB more space.
38 MiB to be downloaded.

Proceed with this action? [y/N]: y
[1/1] Fetching flat-remix-icon-themes-20201017.txz:   3%    1 MiB   1.5MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  11%    4 MiB   3.0MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  18%    7 MiB   2.9MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  25%   10 MiB   3.0MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  33%   13 MiB   2.9MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  40%   15 MiB   2.9MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  47%   18 MiB   3.0MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  55%   21 MiB   2.9MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  62%   24 MiB   3.0MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  70%   27 MiB   3.0MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  77%   30 MiB   3.0MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  85%   32 MiB   3.0MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  92%   35 MiB   3.0MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz:  99%   38 MiB   3.0MB/s    0[1/1] Fetching flat-remix-icon-themes-20201017.txz: 100%   38 MiB   2.9MB/s    00:14   
Checking integrity... done (0 conflicting)
[1/1] Installing flat-remix-icon-themes-20201017...
[1/1] Extracting flat-remix-icon-themes-20201017: 100%
18.914u 44.916s 3:08.39 33.8%    2664+28948k 287+346107io 1pf+0w


As you can see, installing this package was much faster on the HP system despite it having a smaller and presumably older HDD.

iostat

Code:
# iostat -w 1

      tty            ada0              cd0            pass0             cpu
 tin  tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   134 15.78 2889 44.51   0.00   0  0.00   0.00   0  0.00   2  0  8  1 89
   0   235 15.09 3264 48.10   0.00   0  0.00   0.00   0  0.00   1  0  9  1 89
   0   133 14.65 3375 48.27   0.00   0  0.00   0.00   0  0.00   0  0  7  1 92
   0    80 15.27 3526 52.58   0.00   0  0.00   0.00   0  0.00   1  0  6  1 92
   0    79 11.48 3151 35.32   0.00   0  0.00   0.00   0  0.00   0  0  4  2 94
   0    75  5.86 2624 15.00   0.00   0  0.00   0.00   0  0.00   0  0  1  1 99
   0   143 16.43 2272 36.46   0.00   0  0.00   0.00   0  0.00   0  0  5  2 93
   0    79 17.15 3289 55.08   0.00   0  0.00   0.00   0  0.00   1  0  8  1 89
   0   134 17.16 3303 55.34   0.00   0  0.00   0.00   0  0.00   1  0 11  1 87
   0    79 17.15 3099 51.91   0.00   0  0.00   0.00   0  0.00   1  0 10  0 88
   0   134 19.35 2147 40.56   0.00   0  0.00   0.00   0  0.00   6  0  8  1 86
   0   189 18.27 2294 40.93   0.00   0  0.00   0.00   0  0.00   4  0  9  1 86
   0   134 17.89 2558 44.68   0.00   0  0.00   0.00   0  0.00   4  0  8  1 86
   0   134 17.25 3026 50.97   0.00   0  0.00   0.00   0  0.00   1  0  8  2 89
   0   134 14.85 2891 41.93   0.00   0  0.00   0.00   0  0.00   1  0  6  1 93
   0    79 14.95 3344 48.83   0.00   0  0.00   0.00   0  0.00   1  0  7  1 92
   0   134 16.07 3438 53.95   0.00   0  0.00   0.00   0  0.00   2  0  7  1 90
   0    79 14.71 3487 50.09   0.00   0  0.00   0.00   0  0.00   0  0  6  2 92
   0   134 15.15 3440 50.88   0.00   0  0.00   0.00   0  0.00   0  0  7  2 91
   0    79 15.33 3651 54.65   0.00   0  0.00   0.00   0  0.00   1  0  7  2 91
   0   134 15.23 3628 53.95   0.00   0  0.00   0.00   0  0.00   1  0  8  1 90


What do you think about these differences? I think it's clear that something is not OK with my current system, it should be faster or at least on par with the HP system regarding HDD speeds yet it's not.
 

ralphbsz

Son of Beastie

Reaction score: 1,717
Messages: 2,675

I think your answer is here:
17.530u 36.655s 31:15.55 2.8% 2631+29066k 413+344842io 0pf+0w
That tells me that the pkg command took 31:15 minutes, or 1875 seconds. Of those, 54 seconds were spent using the CPU, or 2.8%. Which means lack of CPU power was completely not your bottleneck.

Next: Number of IOs. It did 413 read operations and 345K write operations. Let's ignore the reads (there are too few of them), the writes work out to about 180 write operations per second. Above you had measured that you can create small files in a little less than 1ms, or about 1000 per second. So 180 writes/second (assuming they are nearly all file creates) is in the ballpark, but still a factor of 5 slower than expected.

But we also know that pkg said it would write a total of 290 MiB. Divide that by the 180K write operations: average write size is 0.86KiB. That means our 1ms per file test (which ran at 1/2 KiB) was pretty close. Clearly, the file size difference can not explain the factor of 5. The total amount of IO won't save us: Modern disks run at 100-200 MByte/s, and writing 290 MiB total should take no more than 3s.

Summarizing what we know so far: There is a factor of 5 between expected performance and measured performance is not explained by anything.

And then you add the measurement on the other machine, which is in a reasonable range (actually a little faster than expected). That indicates that something is serious wrong with your slow system. However, it is neither the disk nor the file system itself, as the small benchmark demonstrated that those can run at full speed. Honestly, I have no idea what it could be.
 

olli@

Aspiring Daemon
Developer

Reaction score: 863
Messages: 827

Code:
gpart show ada2
=>        63  1953525105  ada2  MBR  (932G)
          63           1        - free -  (512B)
          64  1953525104     1  freebsd  [active]  (932G)
That looks ok. But you’re using an “old-style” MBR partitioning scheme (not GPT) with a separate BSD partition table, which is not visible in the above output. So would you please show the output from gpart show ada2s1? It’s unlikely that it causes misalignment, but it doesn’t hurt to check.
 

PMc

Aspiring Daemon

Reaction score: 383
Messages: 886

I tried to unpack that archive, and I also see ~3 minutes when writing it into pre-existing directories. So something seems to be wrong indeed, but from here I do not see what it might be.
 

olli@

Aspiring Daemon
Developer

Reaction score: 863
Messages: 827

BTW, just to make you drool ;) This is how long it takes to extract those 156.944 files to my NVMe SSD:
Code:
tar xf    1.76s user  8.42s system  99% cpu  10.174 total  6520kB RSS

Admittedly it’s twice as slow as my RAM disk (TMPFS) where it takes just 5 seconds. (This is an AMD Ryzen with 32 GB RAM.)

But seriously… I think that 3 minutes is a very good value for a modern HDD. But I wouldn’t be worried either if it took 10 minutes. Extracting a 6-digit number of small files is really a special case that exercises a lot of parts of the hardware, firmware, kernel (drivers, I/O scheduler, bus subsystem and so on) and userland software. Small details can make a huge difference.

For example, someone mentioned atime updates. These should be switched off for the /usr filesystem (mount option noatime), unless you need it for specific purposes. Do not switch it off on the /var filesystem, or otherwise certain things will break, for example the detection of unread mail in /var/mail (this is just one of the reasons why you should partition your disk, instead of putting everything into a single filesystem).

When you use UFS (UFS2, to be exact), be sure to enable “soft updates” on the file system (-U option). I’m not sure if the FreeBSD installer does this by default nowadays, or if it still asks the user. Conversely, do not enable journaling on the file system because it will increase the number of writes for meta data.

Another thing that might be worth a look is the sysctl vfs.ufs.dirhash_maxmem. Basically it specifies the maximum amount of “dirhash” memory, which is related to caching of directory entries. When extracting nearly 160.000 files, this will play a role. If you have plenty of RAM, you might consider increasing that sysctl. There are more tuning tips in the tuning(7) manual page.

By the way, I recently read in an article that a certain HDD manufacturer sold “7200 rpm class drives” that actually had only 5400 rpm. The rpm value has a significant impact on the random access speed of a drive. Also, some drives have “cheap” NCQ implementations in their firmware – basically just good enough to claim that NCQ is supported, but in practice they support only 2 tags (instead of the standard of 32 tags).
 
OP
D

Dendros

Member

Reaction score: 7
Messages: 73

Again, thanks to all for your help and patience, I really appreciate this.

@oll@, here is gpart show ada2s1:

Code:
# gpart show ada2s1
=>         0  1953525104  ada2s1  BSD  (932G)
           0  1946157056       1  freebsd-ufs  (928G)
  1946157056     7368048       2  freebsd-swap  (3.5G)


Yes, I do use MBR and a simple partitioning scheme (root partition + swap partition) but I don't think it influences the results because I have the same partitioning scheme applied to both systems, the "slow" one and the HP SFF yet there is a significant difference between the regarding HDD speeds.

I found something strange: on the "slow" system I manually unpacked the .txz archive of the flat-remix-icon-themes package and it was quite fast, much faster than when it was unpacked by pkg[B]:[/B]

Code:
time tar xf flat-remix-icon-themes-20201017~816bbd6223.txz -C /home/user/Pictures/remix
tar: Removing leading '/' from member names
       28.33 real         4.62 user         7.56 sys


It seems as if the problem is exhibited only by pkg when it extracts a lot of small files. It's possible that pkg's performance is somehow influenced by the system's configuration?
 

CyberCr33p

Well-Known Member

Reaction score: 29
Messages: 311

I had issues with random 4K read/writes with Toshiba disks. Check this thread:

 

ralphbsz

Son of Beastie

Reaction score: 1,717
Messages: 2,675

All these ideas about it being SMR, ultra-slow Toshiba drives, and 5400 rpm drives pretending to be 7200 have to deal with one ugly fact: The OP has run a benchmark of creating 1000 small files, and that benchmark runs really fast (better than 1ms per file). On the other hand, when running pkg (which from the file system end is probably doing nothing other than creating small files), performance drops to ~5.5 ms per file created. What is pkg doing the other 4.5 ms? Or is it doing the file creates in a fashion that causes the disk or file system to get really bad?

I've met experienced storage people who can always break a system (Hi Andy!). You tell them how your system works, they spend an hour coding up a little benchmark, and if you get lucky, your system runs really slow ... if you're unlucky, it crashes. It seems that pkg manages to exactly hit the achilles' heel of the OPs system. But we don't know why, we can't reproduce it on other systems, so this for now remains a mystery.
 

PMc

Aspiring Daemon

Reaction score: 383
Messages: 886

I had issues with random 4K read/writes with Toshiba disks. Check this thread:

Now this is interesting, as this is supposed to be a server disk. From what we know, Toshiba has aquired the Fujitsu enterprise disk technology in 2009, and the IBM/HGST/WD 3.5" consumer technology (aka Deskstar) in 2012.
But what do they actually build? From the marketing babble, it is not an enterprise disk, it's an enterprize capacity disk. Whatever that would mean.

But then, the pictures seem familiar. And what I believe is what I see:

ada0 = Western Digital Caviar Blue (~2011)
ada2 = Seagate Barracuda 3.5 (?? 2017 ??)
ada5 = Hitachi HDS5C1010CLA382 (aka Deskstar, ~2012)

Code:
        NAME            STATE     READ WRITE CKSUM
        bm              ONLINE       0     0     0
          raidz1-0      ONLINE       0     0     0
            ada0p5.eli  ONLINE       0     0     0
            ada2p1.eli  ONLINE       0     0     0
            ada5p1.eli  ONLINE       0     0     0

# ( cd /usr/ports ; tar cf - ) | tar xf -
L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    1    239     56    223    3.9    183   6870    0.5   26.7| ada0
    0    240     57    231    3.4    183   6858    0.5   24.6| ada2
    1    135     54    219    8.6     82   6082   12.4   57.6| ada5


These Deskstar can nicely run 24/7, and they are reliable - they just don't perform on transactional workload.
 

PMc

Aspiring Daemon

Reaction score: 383
Messages: 886

All these ideas about it being SMR, ultra-slow Toshiba drives, and 5400 rpm drives pretending to be 7200 have to deal with one ugly fact: The OP has run a benchmark of creating 1000 small files, and that benchmark runs really fast (better than 1ms per file).
That is easy to deal with: these 1000 files will appear on the same track. No seek involved. UFS has a maxcontig parameter so large files will be distributed. AFAIK small files will not be distributed.
 

PMc

Aspiring Daemon

Reaction score: 383
Messages: 886

I found something strange: on the "slow" system I manually unpacked the .txz archive of the flat-remix-icon-themes package and it was quite fast, much faster than when it was unpacked by pkg[B]:[/B]

Certainly pkg does a lot more than just unpack the archive - there are checksums logged and such things.

I tried similar things today (but didn't get a real clue from that). What I noticed is: if I just unpack such a big collection to some empty directury, then that runs rather fast (less than a minute).
But if I unpack it into a directory tree that does already exist, then that can take a couple of minutes.

The explanation would be: if all directories and files are newly created, then they will all be created on the same disk tracks (which happen to be free). But when the directories do already exist, they can be anywhere on the disk (depending on how the filesystem was used before) and must be accessed where they are, and the seeks become longer.
But it didn't appear like this could make the process reach half an hour over-all time.

If there were some spare disk space on Your system. one could create a fresh filesystem there and direct pkg to install there, and see how that behaves.
 

ralphbsz

Son of Beastie

Reaction score: 1,717
Messages: 2,675

But: all the writes, in total, only amount to 290 MiB. I don't remember how much memory the OP's system has, but modern systems start at a handful of GB, so the total write footprint of the pkg command should be in RAM. That's why the write performance of the small benchmarks is so good: creating 1K or 100K files or directories can all be done without actually writing to the disk synchronously.

Could it be that pkg is using some fsync calls or something similar? What I'm trying to find out is: What is the bottleneck here? It's clearly not the CPU, nor should it be networking. The ~5.5ms per write indicates that seeks are being done for a large fraction of all IO operations, or we're waiting for platter rotation (either cache miss reads, or synchronous writes), and IO is the bottleneck. But that makes no sense, it should all be cacheable.
 

shkhln

Daemon

Reaction score: 714
Messages: 1,746

Could it be that pkg is using some fsync calls or something similar? What I'm trying to find out is: What is the bottleneck here? It's clearly not the CPU, nor should it be networking. The ~5.5ms per write indicates that seeks are being done for a large fraction of all IO operations, or we're waiting for platter rotation (either cache miss reads, or synchronous writes), and IO is the bottleneck. But that makes no sense, it should all be cacheable.
Pkg uses SQLite, so likely one transaction per package installed. That shouldn't create noticeable overhead, though.
 
OP
D

Dendros

Member

Reaction score: 7
Messages: 73

My "slow" system's configuration: AMD FX 8350, 16 GB DDR3, RX 470 graphics and as I said 4 HDDs (they're independent, no RAID configuration whatsoever). And I configured pkg to use the -latest repo.

Someone suggested to use DTrace, it's possible to use it with pkg and could this help? I don't know how to use it but I'm willing to learn and give this a try too, as I'm quite baffled about this whole situation.
 

shkhln

Daemon

Reaction score: 714
Messages: 1,746

Someone suggested to use DTrace, it's possible to use it with pkg and could this help? I don't know how to use it but I'm willing to learn and give this a try too, as I'm quite baffled about this whole situation.
Start with pkg's own debug output: env DEBUG_LEVEL=4 pkg install <the-slowest-to-install-package-you-can-think-of>. The log unfortunately lacks timestamps, but it's still better than nothing.
 

PMc

Aspiring Daemon

Reaction score: 383
Messages: 886

Someone suggested to use DTrace, it's possible to use it with pkg and could this help? I don't know how to use it but I'm willing to learn and give this a try too, as I'm quite baffled about this whole situation.
What about truss -D 2>LOGFILE pkg install <whatver> ?
 

shkhln

Daemon

Reaction score: 714
Messages: 1,746

I suggested DTrace because these are the types of problems that DTrace and eBPF (Linux) were developed to solve. <appeal to authority removed>
That only would help you if the file system performance is indeed the bottleneck. You did not do the necessary work to establish that. This problem could easily be caused by pkg running an enormous file list through some n^2 algorithm, for example.
 

PMc

Aspiring Daemon

Reaction score: 383
Messages: 886

I suggested DTrace because these are the types of problems that DTrace and eBPF (Linux) were developed to solve. Brendan Gregg who is a senior performance engineer at Netflix has created a number of DTrace scripts for FreeBSD.
I know, and You're welcome. But these things go quite deep into the muscles of the kernel.

In our case, it looks like pkg runs two cycles of operations. In the first it repeats creating temporary files::

Code:
0.000063517 openat(8,"usr/local/bin/.pkgtemp.raw2tiff.Ou9TA21KzOOh",O_WRONLY|O_CREAT|O_EXCL ...
0.000028340 ftruncate(10,0x4df8)                 = 0 (0x0)
0.000014510 fstat(10,{ mode=-rwxr-xr-x ,inode=6144,size=19960,blksize=19968 }) = 0 (0x0)
0.000038017 write(10,"\^?ELF\^B\^A\^A\t\0\0\0\0\0\0\0"...,19960) = 19960 (0x4df8)
0.000014902 close(10)                            = 0 (0x0)
0.000038922 utimensat(8,"usr/local/bin/.pkgtemp.raw2tiff.Ou9TA21KzOOh", ....
0.000031613 fchownat(8,"usr/local/bin/.pkgtemp.raw2tiff.Ou9TA21KzOOh",0,0, ...
0.000036272 fchmodat(8,"usr/local/bin/.pkgtemp.raw2tiff.Ou9TA21KzOOh",0100755,...
0.000013764 ioctl(1,TIOCGETA,0x7fffffffd1c0)     = 0 (0x0)


And later are repeats of this:

Code:
0.000024589 fstatat(8,"usr/local/bin/raw2tiff",{ mode=-rwxr-xr-x ...
0.000067371 unlinkat(8,"usr/local/bin/raw2tiff",0x0) = 0 (0x0)
0.000071079 renameat(8,"usr/local/bin/.pkgtemp.raw2tiff.Ou9TA21KzOOh",8,"usr/local/bin/raw2tiff"... 


Now if these lines sum up to the major part of the run time, we don't need to look at anything else. And then we have the problematic syscall(s), and can start to analyse that with DTrace.
 

shkhln

Daemon

Reaction score: 714
Messages: 1,746

In our case, it looks like pkg runs two cycles of operations. In the first it repeats creating temporary files::

Code:
0.000063517 openat(8,"usr/local/bin/.pkgtemp.raw2tiff.Ou9TA21KzOOh",O_WRONLY|O_CREAT|O_EXCL ...
0.000028340 ftruncate(10,0x4df8)                 = 0 (0x0)
0.000014510 fstat(10,{ mode=-rwxr-xr-x ,inode=6144,size=19960,blksize=19968 }) = 0 (0x0)
0.000038017 write(10,"\^?ELF\^B\^A\^A\t\0\0\0\0\0\0\0"...,19960) = 19960 (0x4df8)
0.000014902 close(10)                            = 0 (0x0)
0.000038922 utimensat(8,"usr/local/bin/.pkgtemp.raw2tiff.Ou9TA21KzOOh", ....
0.000031613 fchownat(8,"usr/local/bin/.pkgtemp.raw2tiff.Ou9TA21KzOOh",0,0, ...
0.000036272 fchmodat(8,"usr/local/bin/.pkgtemp.raw2tiff.Ou9TA21KzOOh",0100755,...
0.000013764 ioctl(1,TIOCGETA,0x7fffffffd1c0)     = 0 (0x0)


And later are repeats of this:

Code:
0.000024589 fstatat(8,"usr/local/bin/raw2tiff",{ mode=-rwxr-xr-x ...
0.000067371 unlinkat(8,"usr/local/bin/raw2tiff",0x0) = 0 (0x0)
0.000071079 renameat(8,"usr/local/bin/.pkgtemp.raw2tiff.Ou9TA21KzOOh",8,"usr/local/bin/raw2tiff"... 


Now if these lines sum up to the major part of the run time, we don't need to look at anything else. And then we have the problematic syscall(s), and can start to analyse that with DTrace.

Something like this?
Code:
DBG(3)[79822]> Scripts: executing
--- BEGIN ---
set -- papirus-icon-theme-20200602
/usr/local/bin/gtk-update-icon-cache  -q -f /usr/local/share/icons/Papirus 2>/dev/null || /usr/bin/true
/usr/local/bin/gtk-update-icon-cache  -q -f /usr/local/share/icons/Papirus-Dark 2>/dev/null || /usr/bin/true
/usr/local/bin/gtk-update-icon-cache  -q -f /usr/local/share/icons/Papirus-Light 2>/dev/null || /usr/bin/true
/usr/local/bin/gtk-update-icon-cache  -q -f /usr/local/share/icons/ePapirus 2>/dev/null || /usr/bin/true
Scripts: --- END ---

Yes, gtk-update-icon-cache performance should be tested.
 

PMc

Aspiring Daemon

Reaction score: 383
Messages: 886

Ok, I take this back. I didn't realize you selected the tiff package for testing (instead of one of the icon theme packages OP mentions).
Never mind. I just grabbed the next best that was installed in that jail where I was logged in. I only wanted to see what it does.
 
Top