Hard drive activity freezes the whole system

Setup:
FreeBSD 12.3 amd64, 2x Intel Xeon E5-2690, X79 chipset, 2x 1 TB HDD in a gmirror mirror (gmirror label -vb round-robin gm0)
The mirror carries a GPT with boot, UFS and swap partitions. While a large file is being copied, all other disk reads and writes on the entire system freeze almost completely. I have never seen this on my servers before, but until now I only ran 8.4. I have now upgraded them, and it has become almost impossible to work. Why is this happening, and is there any way to fix it?

PS: the kernel has the standard set of VirtualBox modules loaded: vboxdrv.ko vboxnetadp.ko vboxnetflt.ko
 

Nothing. Silence. It slows down only during disk activity, and only disk access is affected. It behaves like a single-tasking system: while a copy is running, only that one process gets the disk, and everything else works with the disk at the speed of a drunken turtle. Sometimes other software slows down during the copy too, even when it is not working with the disk. But as soon as the file is copied, everything is as fast as before.
 
Maybe there is simply not enough RAM left after your virtual machines?

You can check swapouts in `vmstat 3` during copy.
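
One way to run that check, as a minimal sketch: on FreeBSD the `po` column of `vmstat` counts pages paged out, so a small filter can flag every sample where it is non-zero. The helper name `flag_pageouts` is made up for this illustration, and the filter assumes the column holds plain integers.

```shell
# Hypothetical helper: read vmstat output on stdin and report samples
# whose "po" (pages paged out) column is non-zero.
flag_pageouts() {
    awk '
        # Header row: remember which field is labelled "po", then skip it.
        { for (i = 1; i <= NF; i++) if ($i == "po") { col = i; next } }
        # Data row: report it when the po value is a positive integer.
        col && $col ~ /^[0-9]+$/ && $col > 0 { print "page-outs: " $col }
    '
}

# Intended use on the server while the copy is running:
#   vmstat 3 | flag_pageouts
```

If lines keep appearing during the copy, memory pressure from the virtual machines is a plausible suspect; if nothing is printed, the cause is more likely elsewhere.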
 
I did miss that this was happening under virtualbox.

My experience with round robin gmirror was not positive.
Give the default 'load' setting a try. See if it helps.
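
A minimal sketch of trying that on an existing mirror, assuming it is named gm0 as in this thread. `gmirror configure -b` changes the balance algorithm of a mirror that is already in place, so it should not require recreating it. The helper `mirror_balance` is a made-up name that only reads the current setting out of `gmirror list` output.

```shell
# Hypothetical helper: print the balance algorithm found in
# `gmirror list` output supplied on stdin.
mirror_balance() {
    awk -F': ' '$1 ~ /^[[:space:]]*Balance$/ { print $2; exit }'
}

# Intended use on the server (as root), mirror name gm0 assumed:
#   gmirror list gm0 | mirror_balance    # show the current algorithm
#   gmirror configure -b load gm0        # switch to "load" in place
#   gmirror list gm0 | mirror_balance    # confirm the change
```

Switching back is the same command with `-b round-robin`, so the two settings can be compared on the same copy workload.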
 
round robin

It was once considered the fastest algorithm. Has something changed since then? The drives are the same model from the same batch, fully identical.

/dev/ada0

The syntax is fine, I just didn't finish writing the command here.
# gmirror list
Geom name: gm0
State: COMPLETE
Components: 2
Balance: round-robin

this was happening under virtualbox

No. This is a server running VirtualBox, not a virtual server. I wrote that the VirtualBox modules are loaded into the server's kernel. There are 3 virtual workstations running on it (vboxheadless).

Give the default 'load' setting

Oh... Can this be done on an existing RAID? I can't rebuild everything from scratch right now; there isn't enough disk space to move the data. I've been using round-robin since 2012 (I think), and it has worked fine so far. True, the newest version I ran before this upgrade was FreeBSD 8.3))
 
Switching the balance algorithm to load is useful, but it is not a panacea. It can speed things up, but it cannot completely solve the problem with interrupts and load balancing on the operating system and hardware side. It became faster, but not by much. It was also found that without a software raid, the speed of overwriting to disk is up to 100 MB / sec, and with a raid it drops to 30. For a machine with 2 processors (16 physical cores and 32 threads in total), this is not very serious. I would like to run an experiment with Ubuntu Server. A colleague has already reported that Windows runs without problems or freezes on a server with the same configuration as mine. It's sad. Of course, the experiments should be carried out on the same hardware, but that will only be possible when I get another kit for the next server. Perhaps my hardware has some problem with its chips. I really don't want to believe that Windows works better than FreeBSD.
 
It was also found that without a software raid, the speed of overwriting to disk is up to 100 MB / sec, and with a raid it drops to 30.
I did some testing when setting up a fanless fileserver on an E3826 Atom:
a gmirror of two 2TB Samsung mSATA drives running in SATA2 mode.
A single mSATA drive was ~270MB/sec.
The gmirror pair of the same drives lost a paltry 2MB/sec, giving 268MB/sec in RAID1.

Code:
/dev/mirror/gm0
    512             # sectorsize
    2000398933504    # mediasize in bytes (1.8T)
    3907029167      # mediasize in sectors
    0               # stripesize
    0               # stripeoffset
    Yes             # TRIM/UNMAP support
    Unknown         # Rotation rate in RPM



Transfer rates:
    outside:       102400 kbytes in   0.383345 sec =   267122 kbytes/sec
    middle:        102400 kbytes in   0.382090 sec =   268000 kbytes/sec
    inside:        102400 kbytes in   0.381931 sec =   268111 kbytes/sec

So you are not doing something right or your hardware choice is not ideal.
100MB/sec is less than SATA1 speeds.
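
To repeat this kind of comparison, a rough sketch: `diskinfo -t` prints transfer rates like the ones above, and a small filter can reduce them to one figure per zone. The helper name `diskinfo_rates_mib` is made up here, and the conversion assumes diskinfo's "kbytes" are 1024-byte units. The device names are taken from this thread and may differ on another machine.

```shell
# Hypothetical helper: turn the "Transfer rates" lines of `diskinfo -t`
# output (on stdin) into MiB/s, one line per zone.
diskinfo_rates_mib() {
    awk '$NF == "kbytes/sec" && $1 ~ /:$/ {
        printf "%s %.1f MiB/s\n", $1, $(NF - 1) / 1024
    }'
}

# Intended use on the server (as root), device names assumed:
#   diskinfo -t /dev/ada0 | diskinfo_rates_mib          # single drive
#   diskinfo -t /dev/mirror/gm0 | diskinfo_rates_mib    # the mirror
```

This is a naive sequential read test, so it says little about the write slowdown by itself, but a large gap between the single drive and the mirror would point at the mirror layer rather than the disks.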
 
Unfortunately all the howtos on the internet tell you to use round robin :(
Yes even Lucas's books. All you can do is test them out. No two systems are alike.

I have found that round robin on LAGG was faster on my network.
It defies logic because the Cisco switch recognizes LACP but it is not as fast. Enough difference to change it.
Maybe that is a weak spot on my network I dunno.

Basic testing is all that is needed. Trust through verification.
 

The method of choosing which disk to read from cannot seriously change the speed, especially if both drives are from the same batch, as is most often the case.
 
You say there's nothing in dmesg, but have you installed smartmontools and checked the individual devices with it? Typically mirrors "work to the slowest" when writing and "work to the fastest" on reading, so maybe one of the devices has a marginal cable that is causing issues under load.
Once you install smartmontools you can add some flags to /etc/periodic.conf to probe the devices and get that included in the daily.log
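
A minimal sketch of that check, assuming the two mirror members are /dev/ada0 and /dev/ada1. The helper `smart_suspects` is a made-up name; it picks out a few attributes from `smartctl -A` output that commonly point at cabling or media trouble. Attribute names vary between drive vendors, so treat the list as an example rather than a complete one.

```shell
# Hypothetical helper: from `smartctl -A` output on stdin, print selected
# attributes whose raw value (10th column) is non-zero.
smart_suspects() {
    awk '$2 ~ /^(UDMA_CRC_Error_Count|Reallocated_Sector_Ct|Current_Pending_Sector)$/ && $10 + 0 > 0 {
        print $2 "=" $10
    }'
}

# Intended use on the server (as root), device names assumed:
#   smartctl -A /dev/ada0 | smart_suspects
#   smartctl -A /dev/ada1 | smart_suspects
#
# For the daily report, the smartmontools port reads this variable
# from /etc/periodic.conf:
#   daily_status_smart_devices="/dev/ada0 /dev/ada1"
```

A growing UDMA_CRC_Error_Count on only one drive would fit the marginal-cable theory.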
 
Why would you want to use UFS for servers? Wouldn't ZFS be better for RAID? Also, why are you passing it through to VirtualBox? You could try bhyve instead, but I think the best thing to do is run it on bare metal. A lot of packages and versions have changed between 8.4 and 12.3, so it's probably freezing because of an incompatibility somewhere.
 