Hi Everyone,
I am trying to come back to FreeBSD after turning to Linux in 1998. What I am building is a host with ZFS and several MySQL instances in jails. Well, it works but I am hitting some weird behavior which I can't explain.
First, a bit about my hardware. Yes, I remember that the hardware has to be picked to suit FreeBSD, not the other way around. Here it is:
1TB of RAM
2x Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz
16x TOSHIBA PX05SVB096Y SAS3 SSD
Dell HBA330+ controller (it is a Dell machine)
Intel(R) Ethernet Connection XL710
4 NUMA domains
FreeBSD 11.1-RELEASE-p9
There are 2 ZFS pools:
- system root with 2x NVMe M.2 drives running as a mirror
- data root with 16x Toshiba in mirrors (8 vdevs)
- compression enabled (lz4), no dedup
My plan was to use most of the RAM as ARC and run the MySQL instances with direct flush or small buffers. I started testing by running some well-known SQL statements, and it worked very well, though not always.
Let's rewind to the very first day.
I had to restore the MySQL data from xtrabackup files. I had them as tar+lzop: 800G as a file, around 1.8T unpacked. I ran "lzop -d file | tar xf -" and bang: after unpacking around half of it, the system stopped responding properly. I repeated this a few times with exactly the same result:
- Whatever was already open, say, a file in vim, still worked
- Any other action, such as ls, vi or top, hung
After a few tries I started top first and kept watching. That gave me a clue: these processes were waiting for another process that was trying to free ARC. I left it for hours and it made no progress, so I did a warm reboot.
The next step was to limit ARC to 640G and set the memory target and minimum to 128/64. That worked, and I was able to unpack the files. Then I ran some tests against MySQL with no issues; however, I noticed wired memory growing fast and far beyond ARC. Disabling UMA helped a bit but made it slow, so I enabled UMA again.
Then came further testing with unpacking. After two test runs against MySQL, with ARC sitting around 350G and wired around 500G, I started "pbzip2 -d file > deleteme". file.bz2 was 400G in size, with 1.8T of content packed in it.
The whole process runs at 15% of the drives' speed until around 400G is unpacked. Then it slows down to next to nothing. ARC is full (640G), wired memory leaves only around 63G free (less than the minimum I set) and is still growing slowly. The system is terribly slow: I am able to launch top but it takes minutes to start, I cannot open a file, and ls sometimes returns after minutes. Killing bzip doesn't help. Once I tried to delete the "deleteme" file while the system was dying; actually I had two files, one from a previous test. I ran "rm file1; rm file2" and then ls. The files were gone but the system deteriorated further. I did a warm reboot after around 30 minutes in which nothing but top was still running, reporting 59G of free RAM. After the reboot, surprise: the files were back. Conclusion: the system was unable to flush the transaction.
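To put rough numbers on the memory accounting above, here is a small sketch of the arithmetic. It assumes that "1TB of RAM" means 1024G and that all figures are in the same unit as top reports them; the function name non_arc_gap is made up for illustration only.

```python
def non_arc_gap(total_g, arc_g, free_g):
    """Memory that is neither ARC nor free, in the same unit as the inputs."""
    return total_g - arc_g - free_g


TOTAL_G = 1024  # assumption: "1TB of RAM" taken as 1024G

# Earlier MySQL tests: ARC around 350G, wired around 500G.
wired_beyond_arc = 500 - 350

# Stalled state during unpacking: ARC full at 640G, around 63G free.
unaccounted = non_arc_gap(TOTAL_G, 640, 63)

print("wired beyond ARC during the MySQL tests:", wired_beyond_arc, "G")
print("neither ARC nor free during the stall:", unaccounted, "G")
```

So in both cases a large amount of memory, far more than the free minimum, is held somewhere outside ARC.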
After that I limited the number of blocks deleted in a single transaction to 200000 and then to 50000, but that made no difference.
I switched the timers to HPET because I had allowed the C3 state. Disallowing C3 (with C1/C2 still allowed and powerd running) made the SQL test faster but improved nothing else.
So where am I now?
I am able to kill the system every time I run unpacking for long enough. I tested with compression disabled, with no difference. Once it is dead, it deteriorates slowly even when left alone, showing a load of around 5 while only top runs; a reboot fixes it. BTW, it also doesn't reboot cleanly: it stops after the famous "uptime" message, and disabling the USB wait doesn't help.
What should I look at or read about to fix it?
Cheers,