Hello!
I wonder if anyone here has recent experience running several thousand jails, each with its own IP address, on a multi-node NUMA machine?
I use ezjail to set up 2048 jails, and this works smoothly. However, when I access many of these jails concurrently over ssh (up to 100 connections at a time), executing remote commands that read the file system, the operating system hangs in the low tens of minutes.
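The access pattern described above can be approximated with a minimal sketch like the one below. It is an illustration rather than the exact script in use: the starting address, the remote command, and the helper names (`jail_addresses`, `run_remote`, `hammer`) are all assumptions, and it presumes key-based ssh login to every jail.

```python
import ipaddress
import subprocess
from concurrent.futures import ThreadPoolExecutor


def jail_addresses(first_ip, count):
    """Return `count` consecutive IPv4 addresses starting at `first_ip`.

    The addressing scheme is an assumption; substitute the real jail IPs.
    """
    start = ipaddress.IPv4Address(first_ip)
    return [str(start + i) for i in range(count)]


def run_remote(host, command, timeout=60):
    """Run `command` on `host` over ssh.

    Returns the ssh exit status, or None if the call timed out
    (a timeout is a hint that the host may have stopped responding).
    """
    try:
        proc = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", host, command],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return proc.returncode
    except subprocess.TimeoutExpired:
        return None


def hammer(hosts, command, max_parallel=100, runner=run_remote):
    """Run `command` on every host, at most `max_parallel` at a time.

    Returns a dict mapping each host to its result from `runner`.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = pool.map(lambda h: runner(h, command), hosts)
        return dict(zip(hosts, results))


# Example use (hypothetical address range and command):
#   hosts = jail_addresses("10.0.0.1", 2048)
#   status = hammer(hosts, "find /usr -type f | wc -l")
#   stuck = [h for h, rc in status.items() if rc is None]
```

Logging which jails time out first, and when, may help narrow down whether the hang starts in one jail or hits the whole host at once.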
The system is running the 9.0-RELEASE GENERIC kernel. The sysctl kern.maxfiles has been raised to 128000, but everything else is stock. At the time of the hang, only a small fraction of memory is in use.
At this point I'm interested in whether this kind of setup is known to work or known to fail.
I'll try 9.1-RC1 later, possibly with a custom kernel that has more run-time checks enabled, to figure out what goes wrong.
My platform is a Supermicro 2U rack server with four 8-core AMD Opteron 6100 series CPUs, 512 GB of ECC RAM, and four network adapters. I'd like to run 20000 hosts on this system.
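To put the target in perspective, here is a rough back-of-the-envelope sketch of the average resources available per jail, using only the figures mentioned in this post. It assumes 512 GB means 512 GiB and treats kern.maxfiles as the only file limit in play; since kern.maxfiles is a system-wide limit, the per-jail number is an average share, not an enforced quota. The function name is made up for illustration.

```python
def per_jail_budget(jails, ram_gib=512, maxfiles=128000, cores=32):
    """Average share of host resources per jail (simple division only)."""
    return {
        "ram_mib": ram_gib * 1024 / jails,   # MiB of RAM per jail
        "open_files": maxfiles / jails,      # share of kern.maxfiles per jail
        "jails_per_core": jails / cores,     # jails competing for each core
    }


for n in (2048, 20000):
    budget = per_jail_budget(n)
    print(n, budget)
```

With 2048 jails this works out to 256 MiB of RAM and 62.5 open files per jail on average; with 20000 jails it drops to about 26 MiB and 6.4 open files, so kern.maxfiles would likely need to be raised well before memory becomes the limit.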
Any advice on how to approach the issue is welcome.
P.S. I guess the five-minute delay immediately after loading the kernel is due to memory tests. Can these tests be disabled somehow?
P.P.S. It doesn't work any better with Linux, either.
BR
--
Tero M
				