Jails losing their IP address overnight on EC2

I have an EC2 instance with 2 jails. Its jail.conf looks like:

Code:
jail1 {
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.consolelog = "/var/log/jail_console_${name}.log";

  allow.raw_sockets;
  exec.clean;
  mount.devfs;

  path = "/jail/${name}";

  ip4.addr = 10.0.0.11;
  interface = ena0;
  persist;
}

jail2 {
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.consolelog = "/var/log/jail_console_${name}.log";

  allow.raw_sockets;
  exec.clean;
  mount.devfs;

  path = "/jail/${name}";

  ip4.addr = 10.0.0.12;
  interface = ena0;
  persist;
}

The ena0 interface is set up for DHCP (this is the default provided by the image):

Code:
ifconfig_ena0=DHCP

I also have some PF rules to forward traffic to the jails and set up NAT:

Code:
nat on ena0 from 10.0.0.0/23 to any -> 10.0.0.10

rdr on ena0 proto tcp from any to any port 80 -> 10.0.0.11 port 8080
rdr on ena0 proto tcp from any to any port 443 -> 10.0.0.11 port 4443
rdr on ena0 proto tcp from any to any port 22 -> 10.0.0.12 port 2223

pass in on ena0 proto tcp from any to any port { 22, 80, 443 } keep state

pass out on ena0 proto { tcp udp icmp } from any to any keep state

When this is all working as expected the interface has all expected IPs:

Code:
# ifconfig ena0
ena0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 9001
        options=422<TXCSUM,JUMBO_MTU,LRO>
        ether 02:02:91:7d:93:85
        inet 10.0.0.11 netmask 0xfffffe00 broadcast 10.0.1.255
        inet 10.0.0.10 netmask 0xfffffe00 broadcast 10.0.1.255
        inet 10.0.0.12 netmask 0xfffffe00 broadcast 10.0.1.255
        media: Ethernet autoselect (Unknown <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

However, sometimes when I go to use one of the jails (often the next day), I can't access it and have found that the ena0 interface only has some of the IPs:
Code:
# yesterday, missing one jail IP
ena0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 9001
        options=422<TXCSUM,JUMBO_MTU,LRO>
        ether 02:02:91:7d:93:85
        inet 10.0.0.11 netmask 0xfffffe00 broadcast 10.0.1.255
        inet 10.0.0.10 netmask 0xfffffe00 broadcast 10.0.1.255
        media: Ethernet autoselect (Unknown <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

# today, missing both jail IPs
ena0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 9001
        options=422<TXCSUM,JUMBO_MTU,LRO>
        ether 02:02:91:7d:93:85
        inet 10.0.0.10 netmask 0xfffffe00 broadcast 10.0.1.255
        media: Ethernet autoselect (Unknown <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

Rebooting the VM or restarting whichever jails are missing their IP assignment resolves the issue and the IP gets re-added and networking works again.
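
To catch the problem before a jail becomes unreachable, the check I was doing by eye can be scripted. This is only a rough sketch: missing_ips is a made-up helper name, not an existing tool, and it assumes the "inet <address> ..." line format shown in the ifconfig output above.

```shell
#!/bin/sh
# missing_ips (hypothetical helper): print each expected IPv4 address that
# does not appear on an "inet" line of ifconfig-style output read from stdin.
missing_ips() {
    # Collect the addresses currently configured, one per line.
    present=$(awk '$1 == "inet" { print $2 }')
    for ip in "$@"; do
        # -x matches the whole line and -F disables regex matching,
        # so 10.0.0.1 is not mistaken for 10.0.0.11.
        if ! printf '%s\n' "$present" | grep -qxF "$ip"; then
            echo "$ip"
        fi
    done
}
```

Called as "ifconfig ena0 | missing_ips 10.0.0.11 10.0.0.12", it prints nothing when both jail addresses are present. A cron job could use the output to decide which jail to restart (for example with "service jail restart jail2"), although that only works around the symptom.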

I suspect this is related to using DHCP on ena0, but if I try to disable DHCP, the setting in rc.conf is reset back to DHCP on reboot (I suspect by cloud-init).
 
Is there something related in /var/log/messages?
I don't see the primary IP of ena0, the one obtained by DHCP, in your ifconfig output...

Note that if you ultimately can't avoid this problem, you can set your jails up to use VNET.
 
Is there something related in /var/log/messages?

I do see a log from yesterday where dhclient updated the address:

Code:
Mar 27 18:27:04 ip-10-0-0-10 dhclient[3449]: New IP Address (ena0): 10.0.0.10
Mar 27 18:27:04 ip-10-0-0-10 dhclient[3453]: New Subnet Mask (ena0): 255.255.254.0
Mar 27 18:27:04 ip-10-0-0-10 dhclient[3457]: New Broadcast Address (ena0): 10.0.1.255
Mar 27 18:27:04 ip-10-0-0-10 dhclient[3461]: New Routers (ena0): 10.0.0.1

Nothing stands out before/after those messages in the logs.

Note that if you ultimately can't avoid this problem, you can set your jails up to use VNET.

Would I still need to use a bridge if I switched to VNET? Perhaps not since I'm using PF now? I was initially worried I'd run into issues with promiscuous mode on EC2 but possibly I was thinking about it the wrong way.
 
Yes. You need a bridge if you want to use VNET interfaces; that shouldn't be an issue:

Code:
cloned_interfaces="bridge1"
ifconfig_bridge1="addm ena0"
 
I just disabled cloud-init and set the primary IP and defaultrouter in rc.conf.
The config seems good after a reboot so that might be all I need to do, if not I'll give VNET a try.
Either way I'll report back if that worked.
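
For anyone following along, the static setup would look roughly like the fragment below. The exact lines were not posted, so treat this as a sketch: the address, netmask and router are copied from the dhclient log earlier in the thread, and how cloud-init was disabled is not shown here.

```shell
# /etc/rc.conf fragment (a sketch, not the poster's actual file)
# Replaces ifconfig_ena0=DHCP with a static primary address.
# Values are taken from the dhclient log messages quoted above.
ifconfig_ena0="inet 10.0.0.10 netmask 255.255.254.0"
defaultrouter="10.0.0.1"
```

With dhclient no longer running on ena0, nothing rewrites the interface's addresses on lease renewal, which fits the suspicion that DHCP was behind the disappearing jail IPs.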
 
Disabling cloud-init and using a static IP seems to be working fine.
I'm sure there's some way to do that in cloud-init but will save that for another day. Thanks all for the help!
 