recently lost network access from inside jails

I have a home server that has been running FreeBSD for several years. I have some applications running in jails that I set up with ezjail-admin. I recently upgraded to 11.0-RELEASE-p2, but everything was working fine after that upgrade. This afternoon, some jobs failed, and it quickly became apparent the problem was that all my jails had lost connectivity to the Internet.

I don't think I made any changes on the server that would have affected the network configuration, and I haven't done any port or package updates in the past few days, so I'm completely baffled. I need help figuring out why my jails can't access the Internet anymore.

The main server is working just fine. Here are some key configuration bits:

My rc.conf file has a bit more than this, but it does NOT include pf_enable="YES".
This server is connected to a router that provides a caching name server, NAT, etc. I've always just used aliases of the server IP address for the jails; I don't do any additional NAT for them.

Code:
# cat /etc/rc.conf
kld_list="accf_data accf_http"

zfs_enable="YES"

hostname=myserver.myhome.net
defaultrouter="172.16.1.1"
ifconfig_bge0="inet 172.16.1.11 netmask 255.255.255.0"
ifconfig_bge1="inet 172.16.1.12 netmask 255.255.255.0"

cloned_interfaces="lo1"

gateway_enable="YES"
sshd_enable="YES"
ezjail_enable="YES"
samba_enable="YES"

The main server (and all the jails) point to the router for DNS:

Code:
# cat /etc/resolv.conf
search myhome.net
nameserver 172.16.1.1


The main server has two Ethernet ports, and I put the aliases for all the jails on one of them (bge1).

Code:
# ifconfig
bge0: flags=8c43<UP,BROADCAST,RUNNING,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=c019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
        ether 3c:a8:2a:4b:a7:a0
        inet 172.16.1.11 netmask 0xffffff00 broadcast 172.16.1.255
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
bge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=c019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
        ether 3c:a8:2a:4b:a7:a1
        inet 172.16.1.12 netmask 0xffffff00 broadcast 172.16.1.255
        inet 172.16.1.13 netmask 0xffffffff broadcast 172.16.1.13
        inet 172.16.1.19 netmask 0xffffffff broadcast 172.16.1.19
        inet 172.16.1.21 netmask 0xffffffff broadcast 172.16.1.21
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (none)
        status: no carrier
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        groups: lo
lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 127.0.2.8 netmask 0xffffffff
        inet 127.0.2.2 netmask 0xffffffff
        inet 127.0.2.1 netmask 0xffffffff
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        groups: lo

Network is fine in the main server. No problems pinging any local addresses, aliases or the router:

Code:
# ping -c3 172.16.1.11
PING 172.16.1.11 (172.16.1.11): 56 data bytes
64 bytes from 172.16.1.11: icmp_seq=0 ttl=64 time=0.028 ms
64 bytes from 172.16.1.11: icmp_seq=1 ttl=64 time=0.037 ms
64 bytes from 172.16.1.11: icmp_seq=2 ttl=64 time=0.034 ms

--- 172.16.1.11 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.028/0.033/0.037/0.004 ms

# ping -c3 172.16.1.12
PING 172.16.1.12 (172.16.1.12): 56 data bytes
64 bytes from 172.16.1.12: icmp_seq=0 ttl=64 time=0.026 ms
64 bytes from 172.16.1.12: icmp_seq=1 ttl=64 time=0.068 ms
64 bytes from 172.16.1.12: icmp_seq=2 ttl=64 time=0.008 ms

--- 172.16.1.12 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.008/0.034/0.068/0.025 ms

# ping -c3 172.16.1.21
PING 172.16.1.21 (172.16.1.21): 56 data bytes
64 bytes from 172.16.1.21: icmp_seq=0 ttl=64 time=0.028 ms
64 bytes from 172.16.1.21: icmp_seq=1 ttl=64 time=0.057 ms
64 bytes from 172.16.1.21: icmp_seq=2 ttl=64 time=0.025 ms

--- 172.16.1.21 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.025/0.037/0.057/0.014 ms

# ping -c3 172.16.1.1
PING 172.16.1.1 (172.16.1.1): 56 data bytes
64 bytes from 172.16.1.1: icmp_seq=0 ttl=64 time=0.239 ms
64 bytes from 172.16.1.1: icmp_seq=1 ttl=64 time=0.195 ms
64 bytes from 172.16.1.1: icmp_seq=2 ttl=64 time=0.214 ms

--- 172.16.1.1 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.195/0.216/0.239/0.018 ms

Also no problem looking up Internet addresses or pinging them:

Code:
# ping -c3 www.google.com
PING www.google.com (216.58.217.4): 56 data bytes
64 bytes from 216.58.217.4: icmp_seq=0 ttl=55 time=12.340 ms
64 bytes from 216.58.217.4: icmp_seq=1 ttl=55 time=14.606 ms
64 bytes from 216.58.217.4: icmp_seq=2 ttl=55 time=13.133 ms

--- www.google.com ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 12.340/13.360/14.606/0.939 ms


To make sure it wasn't something specific to the software installed in the jails, I created a clean, new jail, using

ezjail-admin create testjail 'bge1|172.16.1.21,lo1|127.0.2.8'

I started the jail, edited the resolv.conf and hosts files, and confirmed it had the same problem as my other jails. I then added allow.raw_sockets=1 to the configuration file so I could try pings from inside the new jail. The new jail's configuration file now looks like this:

Code:
# cat /usr/local/etc/ezjail/testjail
# To specify the start up order of your ezjails, use these lines to
# create a Jail dependency tree. See rcorder(8) for more details.
#
# PROVIDE: standard_ezjail
# REQUIRE:
# BEFORE:
#

export jail_testjail_hostname="testjail"
export jail_testjail_ip="bge1|172.16.1.21,lo1|127.0.2.8"
export jail_testjail_rootdir="/storage/jails/testjail"
export jail_testjail_exec_start="/bin/sh /etc/rc"
export jail_testjail_exec_stop=""
export jail_testjail_mount_enable="YES"
export jail_testjail_devfs_enable="YES"
export jail_testjail_devfs_ruleset="devfsrules_jail"
export jail_testjail_procfs_enable="YES"
export jail_testjail_fdescfs_enable="YES"
export jail_testjail_image=""
export jail_testjail_imagetype="zfs"
export jail_testjail_attachparams=""
export jail_testjail_attachblocking=""
export jail_testjail_forceblocking=""
export jail_testjail_zfs_datasets=""
export jail_testjail_cpuset=""
export jail_testjail_fib=""
export jail_testjail_parentzfs="storage/jails"
export jail_testjail_parameters=""
export jail_testjail_post_start_script=""
export jail_testjail_retention_policy=""
export jail_testjail_parameters="allow.raw_sockets=1"
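
A side note on the listing above: jail_testjail_parameters is exported twice, once empty and once with allow.raw_sockets=1. Assuming the file is read top to bottom as shell, the later assignment wins, so the repeat should be harmless here. A minimal sketch for spotting such repeats follows; the function name dup_exports is made up for illustration, and it assumes one `export NAME=value` per line as in the file above.

```shell
# Print every variable name that is exported more than once.
# Reads a config file on stdin; each name is reported a single time.
dup_exports() {
    awk '$1 == "export" {
             split($2, kv, "=")          # kv[1] is the variable name
             if (++seen[kv[1]] == 2) print kv[1]
         }'
}
```

It could be run as `dup_exports < /usr/local/etc/ezjail/testjail`.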

Now the network in the jail looks like I think it should:

Code:
# ifconfig
bge0: flags=8c43<UP,BROADCAST,RUNNING,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=c019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
        ether 3c:a8:2a:4b:a7:a0
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
bge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=c019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
        ether 3c:a8:2a:4b:a7:a1
        inet 172.16.1.21 netmask 0xffffffff broadcast 172.16.1.21
        media: Ethernet autoselect (none)
        status: no carrier
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        groups: lo
lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 127.0.2.8 netmask 0xffffffff
        groups: lo

And I can ping my own address, and local IP addresses on the server:

Code:
# ping -c3 127.0.2.8
PING 127.0.2.8 (127.0.2.8): 56 data bytes
64 bytes from 127.0.2.8: icmp_seq=0 ttl=64 time=0.014 ms
64 bytes from 127.0.2.8: icmp_seq=1 ttl=64 time=0.044 ms
64 bytes from 127.0.2.8: icmp_seq=2 ttl=64 time=0.017 ms

--- 127.0.2.8 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.014/0.025/0.044/0.013 ms

# ping -c3 172.16.1.21
PING 172.16.1.21 (172.16.1.21): 56 data bytes
64 bytes from 172.16.1.21: icmp_seq=0 ttl=64 time=0.014 ms
64 bytes from 172.16.1.21: icmp_seq=1 ttl=64 time=0.016 ms
64 bytes from 172.16.1.21: icmp_seq=2 ttl=64 time=0.079 ms

--- 172.16.1.21 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.014/0.036/0.079/0.030 ms

# ping -c3 172.16.1.11
PING 172.16.1.11 (172.16.1.11): 56 data bytes
64 bytes from 172.16.1.11: icmp_seq=0 ttl=64 time=0.016 ms
64 bytes from 172.16.1.11: icmp_seq=1 ttl=64 time=0.017 ms
64 bytes from 172.16.1.11: icmp_seq=2 ttl=64 time=0.019 ms

--- 172.16.1.11 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.016/0.017/0.019/0.001 ms

# ping -c3 172.16.1.12
PING 172.16.1.12 (172.16.1.12): 56 data bytes
64 bytes from 172.16.1.12: icmp_seq=0 ttl=64 time=0.015 ms
64 bytes from 172.16.1.12: icmp_seq=1 ttl=64 time=0.018 ms
64 bytes from 172.16.1.12: icmp_seq=2 ttl=64 time=0.018 ms

--- 172.16.1.12 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.015/0.017/0.018/0.001 ms

But I can't ping the router, any other machine on the local network, or an Internet address, so name resolution doesn't work either.

Code:
# ping -c3 172.16.1.1
PING 172.16.1.1 (172.16.1.1): 56 data bytes

--- 172.16.1.1 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss

# ping -c3 172.16.1.30
PING 172.16.1.30 (172.16.1.30): 56 data bytes

--- 172.16.1.30 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss

# ping -c3 216.58.217.4
PING 216.58.217.4 (216.58.217.4): 56 data bytes

--- 216.58.217.4 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss

The jail's routing table looks weird to me, but I'm not sure it was ever any different, and I can't add a default gateway from inside the jail.

Code:
# netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
127.0.2.8          link#4             UH          lo1
172.16.1.21        link#2             UHS         lo0

# route add default 172.16.1.1
route: writing to routing socket: Operation not permitted
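
As a rough cross-check (a sketch, not something from the original session), `netstat -rn` output can be scanned for a default route, both on the host and in the jail. The helper name default_gateway is invented here, and it assumes the column layout shown above, with the literal word `default` in the first column. As far as I understand, a jail that shares the host's network stack relies on the host's routing, so an empty result inside the jail is not by itself proof of a problem.

```shell
# Print the gateway of the first default route found in
# `netstat -rn` style output on stdin.
# Returns non-zero when no default route is listed.
default_gateway() {
    awk '$1 == "default" { print $2; found = 1; exit }
         END { exit found ? 0 : 1 }'
}
```

For example: `netstat -rn | default_gateway || echo "no default route visible"`.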
 
The problem was right there in the ifconfig output! o_O

media: Ethernet autoselect (none)

The Ethernet cable on bge1 had come loose (hence "status: no carrier")! And I was trying to turn it into a problem with jails!
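
In hindsight, a quick scan for addresses sitting on an interface without link would have pointed at bge1 right away. Here is a minimal sketch, assuming ifconfig output in the format shown earlier (inet lines before the status line within each interface stanza); the function name no_carrier_addrs is made up.

```shell
# Print "<interface> <address>" for every IPv4 address configured on an
# interface whose status line reads "no carrier".
# Reads ifconfig-style output on stdin.
no_carrier_addrs() {
    awk '
        /^[a-zA-Z]/ {                     # new stanza, e.g. "bge1: flags=..."
            iface = $1
            sub(/:$/, "", iface)          # drop the trailing colon
            n = 0                         # forget addresses of the previous stanza
        }
        $1 == "inet" { addrs[++n] = $2 }  # collect IPv4 addresses of this stanza
        $1 == "status:" && $2 == "no" && $3 == "carrier" {
            for (i = 1; i <= n; i++) print iface, addrs[i]
        }
    '
}
```

Run on the host as `ifconfig | no_carrier_addrs`. Going by the host output earlier in this post, it should list the bge1 addresses, including the test jail's 172.16.1.21.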