Jails on same interface can't communicate

salimoneus · Feb 18, 2018

Running 11.1-RELEASE. I have created two jails using qjail, each with a different IP address but on the same subnet and physical interface (em0).

Both jails have internet connectivity, and can communicate fine in both directions with other computers on that subnet.

But neither jail can connect to the other. I have enabled icmp and ssh on both jails, so I know those are working with other computers on that subnet.

Jails were created with qjail, the only options set were the interface and IP addresses. I also added a line to each of their conf files to enable raw sockets.

Here is the interface on the host where jail aliases have been created:

Code:

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
        ether 44:8a:5b:35:d0:67
        hwaddr 44:8a:5b:35:d0:67
        inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
        inet 192.168.1.98 netmask 0xffffffff broadcast 192.168.1.98
        inet 192.168.1.99 netmask 0xffffffff broadcast 192.168.1.99
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

I was expecting the jails to have connectivity with each other by default, without requiring any special pf/NAT rules, since the traffic should never even hit the router.
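The same-subnet assumption can be sanity-checked with a few lines of Python. This is only a sketch: the /24 network is derived from the host address and netmask in the ifconfig output above, and the helper name same_subnet is made up for illustration.

```python
import ipaddress

# Host address and netmask from the em0 output above (0xffffff00 is a /24).
HOST_NET = ipaddress.ip_interface("192.168.1.1/24").network

def same_subnet(a, b, network=HOST_NET):
    """Return True if both addresses fall inside the given network."""
    return (ipaddress.ip_address(a) in network
            and ipaddress.ip_address(b) in network)

# Both jail aliases sit inside the host's network, so no gateway hop is needed.
print(same_subnet("192.168.1.98", "192.168.1.99"))
```

Being on the same subnet only rules out routing through a gateway; it says nothing about whether a packet filter on the host gets in the way.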

Do the jails need to be on different interfaces in order to talk to each other? I've been digging through the man pages and Google without any luck.

dch · Feb 19, 2018

You'll need to post the exact setup you used to create the qjails. I tried a
naive config and got the same results as you, BTW. I specifically needed to tell
qjail to use the correct interface for this to work as expected. I'm using a cloned
lo1 interface, but your em0 example should also be fine.

Code:

## install bits

- qjail test run on vanilla FreeBSD 11.1p6 amd64

root@continuity /usr # pkg install qjail
Updating FreeBSD repository catalogue...
Fetching meta.txz: 100%    944 B   0.9kB/s    00:01
Fetching packagesite.txz: 100%    6 MiB   6.2MB/s    00:01
Processing entries: 100%
FreeBSD repository update completed. 29262 packages processed.
All repositories are up to date.
New version of pkg detected; it needs to be installed first.
The following 1 package(s) will be affected (of 0 checked):

Installed packages to be UPGRADED:
        pkg: 1.10.4 -> 1.10.5
Number of packages to be upgraded: 1
3 MiB to be downloaded.
Proceed with this action? [y/N]: y
[1/1] Fetching pkg-1.10.5.txz: 100%    3 MiB   3.0MB/s    00:01
Checking integrity... done (0 conflicting)
[1/1] Upgrading pkg from 1.10.4 to 1.10.5...
[1/1] Extracting pkg-1.10.5: 100%
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        qjail: 5.4

Number of packages to be installed: 1
73 KiB to be downloaded.

Proceed with this action? [y/N]: y
[1/1] Fetching qjail-5.4.txz: 100%   73 KiB  75.2kB/s    00:01
Checking integrity... done (0 conflicting)
[1/1] Installing qjail-5.4...
[1/1] Extracting qjail-5.4: 100%
Message from qjail-5.4:

########################################################################

## bootstrap qjail

Code:

root@continuity /usr# zfs create zroot/usr/jails
root@continuity /usr# qjail install
resolving server address: ftp2.freebsd.org:80
requesting http://ftp2.freebsd.org/pub/FreeBSD/releases/amd64/amd64/11.1-RELEASE/base.txz
remote size / mtime: 104780108 / 1500603338
base.txz                                      100% of   99 MB 1727 kBps 01m00s

The RELEASE distribution files are populating template.
Estimated less than 1 minute for this to complete.
sharedfs is being populated.
Estimated less than 1 minute for this to complete.
cp: /etc/localtime: No such file or directory
cp: /etc/localtime: No such file or directory

Successfully installed qjail system.

## network config

Code:

root@continuity /usr# grep lo1 /etc/rc.conf
cloned_interfaces="${cloned_interfaces} lo1"
ifconfig_lo1_aliases="inet 10.241.0.0-15/16"

root@continuity /usr# ifconfig lo1
lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 10.241.0.0 netmask 0xffff0000
        inet 10.241.0.1 netmask 0xffffffff
        inet 10.241.0.2 netmask 0xffffffff
        inet 10.241.0.3 netmask 0xffffffff
        inet 10.241.0.4 netmask 0xffffffff
        inet 10.241.0.5 netmask 0xffffffff
        inet 10.241.0.6 netmask 0xffffffff
        inet 10.241.0.7 netmask 0xffffffff
        inet 10.241.0.8 netmask 0xffffffff
        inet 10.241.0.9 netmask 0xffffffff
        inet 10.241.0.10 netmask 0xffffffff
        inet 10.241.0.11 netmask 0xffffffff
        inet 10.241.0.12 netmask 0xffffffff
        inet 10.241.0.13 netmask 0xffffffff
        inet 10.241.0.14 netmask 0xffffffff
        inet 10.241.0.15 netmask 0xffffffff
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        groups: lo
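The way the single rc.conf range turns into the sixteen addresses above can be sketched in Python. This only mirrors what the ifconfig output shows (first address keeps the stated prefix, the rest become /32 aliases); it is not the rc.d implementation, and expand_alias_range is a hypothetical helper that takes the spec without the leading "inet".

```python
import ipaddress

def expand_alias_range(spec):
    """Expand 'A.B.C.first-last/prefix' into (address, prefix) pairs.

    Mirrors the ifconfig output above: the first address keeps the stated
    prefix, every following address is listed as a /32 host alias.
    """
    addr_range, prefix = spec.split("/")
    base, last = addr_range.rsplit("-", 1)
    head, first = base.rsplit(".", 1)
    pairs = []
    for i, octet in enumerate(range(int(first), int(last) + 1)):
        # ip_address() validates that the assembled string is a real address.
        address = ipaddress.ip_address(f"{head}.{octet}")
        pairs.append((str(address), int(prefix) if i == 0 else 32))
    return pairs

for address, prefix in expand_alias_range("10.241.0.0-15/16"):
    print(f"inet {address}/{prefix}")
```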

## create some jails

Code:

root@continuity /usr# qjail create -4 10.241.0.1 -c -n lo1 one
Successfully created  one
root@continuity /usr# qjail create -4 10.241.0.2 -c -n lo1 two
Successfully created  two
root@continuity /usr# qjail start one
Jail successfully started  one
root@continuity /usr# qjail start two
Jail successfully started  two

## look at their network config

Code:

root@continuity /u/h/dch# jls
   JID  IP Address      Hostname                      Path
     3  10.241.0.1      one                           /usr/jails/one
     4  10.241.0.2      two                           /usr/jails/two
root@continuity /u/h/dch# qjail list
 
STATUS JID  NIC    IP              Jailname
------ ---- ------ --------------- --------------------------------------------
DR     3    lo1    10.241.0.1      one
DR     4    lo1    10.241.0.2      two

## run netcat tcp listener in jail one

Leave this console open in a separate terminal so you can watch for the output.

Code:

root@continuity /u/h/dch# qjail console one
FreeBSD 11.1-RELEASE-p4 (GENERIC) #0: Tue Nov 14 06:12:40 UTC 2017
Welcome to your FreeBSD jail.
one /root >nc -kl 12345

## check from host system

Code:

root@continuity /usr# sockstat -46l -p 12345
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
root     nc         89693 3  tcp4   10.241.0.1:12345      *:*
root@continuity /usr# echo helloooo | nc 10.241.0.1 12345
^C⏎

You should see output in the qjail one console.

## try it from jail two

This should also work.

Code:

root@continuity /usr# qjail console two
FreeBSD 11.1-RELEASE-p4 (GENERIC) #0: Tue Nov 14 06:12:40 UTC 2017
Welcome to your FreeBSD jail.
two /root >echo helloooo | nc 10.241.0.1 12345
^C

I'd try this first with a cloned lo1 to confirm there are no firewalls or NAT or whatever getting in the way, and then
re-try using your em0; there's no reason why this shouldn't work. The main difference for me between a failing config
and a working one is that without specifying the interface at creation time, stuff fails. I noted that qjail seems to
need to have the IPs already available on the interface, or the creation silently fails. This could be my bad, as
I literally read the manpage for the first time today.

salimoneus · Feb 19, 2018

When I created the jails I did not use loopback in the qjail create command; I just specified the physical interface (em0) for both:

Code:

qjail create -4 192.168.1.99 -n em0 two
qjail create -4 192.168.1.98 -n em0 one

The ifconfig from the host is in the OP, and here is the jail list:

Code:

JID  IP Address      Hostname                      Path
2  192.168.1.99    two                       /usr/jails/two
4  192.168.1.98    one                       /usr/jails/one


STATUS JID  NIC    IP              Jailname
------ ---- ------ --------------- --------------------------------------------
DR     4    em0    192.168.1.98    one
DR     2    em0    192.168.1.99    two

The jails were created successfully, and all communication with the internet and other systems on that subnet works as expected. I'm also able to connect to apache running on one of the jails from the internet using pf rdr rules as expected.

It's only the communication between jails that seems to be an issue. Having a bit of a hard time understanding how or why loopback would be needed in this scenario, but it sounds like you had success doing it that way?

I will give that a try by archiving my current jails and re-doing them using the loopback. Are there any drawbacks using this method, compared to strictly using the physical interface?

chrbr · Feb 19, 2018

Dear salimoneus,
to exclude the firewall from the list of culprits
- you could increase the verbosity of the firewall and check its log
- or run a short test with firewall completely open.

dch · Feb 19, 2018

salimoneus, I only used a loopback because I didn't want to bork the machine in question (it's a remote one), and I already
had the aliases set up.

+1 to chrbr's suggestions. See https://www.freebsd.org/cgi/man.cgi?query=pf.conf and specifically add the log keyword to any
rules you have in place that are blocking packets. If you can, disable the firewall and check if the connectivity changes. Your qjail
setup seems legit and the ifconfig seems correct too.

Before re-doing your stuff (unless you try it quickly in a VM) I suggest you post your full /etc/rc.conf, any qjail related config files
or settings, your /etc/pf.conf, and netstat -rn while the jails are up. Basically with that I should be able to re-create
your config here and see what's up.

salimoneus · Feb 19, 2018

chrbr said:
Dear salimoneus,
to exclude the firewall from the list of culprits
- you could increase the verbosity of the firewall and check its log
- or run a short test with firewall completely open.

Okay interesting. Clearing the firewall rules allows the jails to talk to each other.

I have the following pf rules to allow all traffic internally:

Code:

int_if="em0"

# by default block all
block in all

<snip>

# allow traffic originating locally
pass in log (all) on {$int_if} from any to any flags S/SA modulate state
pass out log (all) on {$int_if} from any to any flags S/SA modulate state

When I ping from one of the jails to another system on the subnet, it works fine and I see it in the pflog. If I comment out the "block in all" then jail-to-jail works, which is essentially the same as clearing since it's the only active block rule.

Why would the same rules above for local traffic not get hit when trying jail-to-jail?

salimoneus · Feb 19, 2018

After doing some more digging, and having loopback in my brain, I came across a post that mentioned all jails which aren't VNET use the loopback interface even if you specify a physical interface. After a bit more digging, I found someone who added the following to their pf rules:

Code:

set skip on lo0

And that seems to be working.
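To see why this would help, here is a toy Python model of how pf picks a verdict. It is a sketch under simplifying assumptions: it ignores state, flags and the snipped rules, keeps only pf's "last matching rule wins" and "no match means pass" behaviour, and takes the claim above (jail-to-jail traffic shows up on lo0) as given. The Rule class and verdict function are made-up names, not anything pf provides.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    action: str                  # "pass" or "block"
    direction: str               # "in" or "out"
    iface: Optional[str] = None  # None means "any interface"

# Simplified version of the rules quoted earlier in the thread.
RULES = [
    Rule("block", "in"),         # block in all
    Rule("pass", "in", "em0"),   # pass in on em0
    Rule("pass", "out", "em0"),  # pass out on em0
]

def verdict(direction, iface, rules=RULES, skip=()):
    """Return 'pass' or 'block' for a packet seen on iface in a direction."""
    if iface in skip:
        return "pass"            # set skip on <iface>: no filtering there
    result = "pass"              # packets matching no rule are passed
    for rule in rules:
        if rule.direction == direction and rule.iface in (None, iface):
            result = rule.action  # last matching rule wins (no quick here)
    return result

print(verdict("in", "em0"))                  # LAN host to jail
print(verdict("in", "lo0"))                  # jail to jail, original rules
print(verdict("in", "lo0", skip=("lo0",)))   # jail to jail, set skip on lo0
```

In this model the inbound packet on lo0 only ever matches "block in all", because both pass rules are tied to em0. Skipping lo0 takes that interface out of filtering altogether; a pass rule on lo0 would be the narrower alternative.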

I've never had any rules that specify lo0 in my rulesets before, and have had no problems that I can think of.

Does this look like a proper solution to the issue? Are there other rules for lo0 that I should consider using on my gateway/router?

I saw someone suggest creating another loopback lo1 to isolate things from lo0, but would that really be worth the trouble?

rigoletto@ · Feb 19, 2018

That is more or less a basic pf entry.

EDIT: I think most people are using sysutils/iocell or sysutils/iocage now.

dch · Feb 19, 2018

I don't think those pf rules are doing what you think they're doing. Try this to see what is being blocked in realtime:

Code:

# service pflog onestart
# tcpdump -n -e -ttt -i pflog0

You can use pfctl -vvnf /etc/pf.conf to see how pf expands your supplied ruleset into specific rules. I'm
thinking this and the realtime blocklist will show you what your config is missing ;-)

salimoneus · Feb 19, 2018

lebarondemerde said:
That is more or less a basic pf entry.

Yeah, after looking at some example "starter" pf scripts, I do see lo0 accounted for in many of them. But still a fair number completely ignore it. I guess that's what I get for copying and pasting random shit off the internet.

lebarondemerde said:
EDIT: I think most people are using sysutils/iocell or sysutils/iocage now.

Dang, just built a new server and completely re-did the jails. Will look into those next time, thanks for the tip!