Amazon AWS EC2 super light configuration

I want to create small fast clone-able servers to spawn many worker systems that respond to HTTP servlet requests and to SQS messages. Yes, yes, I know that Amazon has this auto-scaling group stuff, and I will use that of course, but I don't have to use Amazon Linux for that. I want a small boot disk of the minimal 1 GB size, and then mount /usr from EFS (NFSv4).

Why do I want that? Because I want to scale up to possibly 100s of worker systems, each being a t2.micro instance, brought up on demand, possibly in a hibernated state so they come up quickly while costing nothing. Each such server needs one boot disk. But if I can do with 1 GB instead of 10 GB (the minimum size of the FreeBSD 12.0 AMI), I save on disk space. The bulk of the /usr file system would be mounted through NFS. I can make the bootable root even smaller than 1 GB (500 MB) and read-only, so I can clone it quickly, with the other 500 MB available for writable stuff.

My life has drifted away from FreeBSD since most cloud hosting providers push some CentOS Linux version on me, so I got used to that. But the issue with Amazon Linux / CentOS is that they have done away with the multi-stage boot design of original UNIX (or maybe only Ultrix and BSD ever really had it, I don't know). In this design:
  • /sbin, /bin, /lib and of course /etc were on the root disk
  • /usr could be mounted later during the boot process,
  • and we intend to mount /usr from NFS
I know that a diskless X terminal is easily in the cards. I have built diskless set-top boxes with the once famous Soekris boards (in fact I still have one of Soeren's first ever prototypes), and I have done diskless flash memory and network boot before. Now I want to make a minimal boot disk and have all instances use the same /usr world from NFS.

I have been able to push this idea up to a point. It is pretty easy to do a decent boot and mount NFS without the /usr file system tree present if I make limiting assumptions; in particular I need to configure the local IPv4 address statically in rc.conf (ifconfig_xn0="inet 172.x.y.z/24"). And still the boot is not complete, because even after /usr has finally been mounted through NFS, the AWS-specific initializations do not seem to be processed. Unfortunately it seems that for an AWS instance to be reachable through its public IP address I must have the IPv6 interfaces configured, with the dual-DHCP setup, and all of that depends on stuff installed in /usr/local and also a lot in /usr. These rc scripts make promiscuous use of /usr tools, such as /usr/bin/find and /usr/bin/sed, etc. That is not good for a minimal bootable configuration.
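For the record, this is roughly the limited static fragment in rc.conf I am talking about; the addresses are placeholders, and the default router being the subnet's .1 is my assumption about the VPC:

# /etc/rc.conf - static network setup, no DHCP (placeholder addresses)
ifconfig_xn0="inet 172.x.y.z/24"
defaultrouter="172.x.y.1"
# resolv.conf has to be written by hand too, e.g. pointing at the VPC resolver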

I wonder if there is interest in reorganizing the AWS AMI a bit and pushing the essentials down into the minimal root file system. I would also like to know how a new box reads its configuration parameters from the EC2 configuration resource. Maybe if I could just read the private IPv4 and IPv6 addresses from that resource, I wouldn't need DHCP at all.
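Part of the answer is the instance metadata service at 169.254.169.254, which does expose the private addresses. A sketch with fetch(1); the ipv6s path and the shell plumbing are assumptions I have not verified on a minimal root:

# read the private addresses from the EC2 instance metadata service
# (assumes the link-local endpoint is reachable, i.e. the interface is up)
MD="http://169.254.169.254/latest/meta-data"
IP4=$(fetch -qo - "$MD/local-ipv4")
MAC=$(fetch -qo - "$MD/network/interfaces/macs/" | head -n 1 | tr -d /)
IP6=$(fetch -qo - "$MD/network/interfaces/macs/$MAC/ipv6s")
echo "private IPv4: $IP4, IPv6: $IP6"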

To whom it may concern, here is my recipe for what I have so far:
  1. create an initial instance from the latest official FreeBSD AMI (12.0 in my case)
  2. when creating it, already add 2 more disks:
    1. one 1 GB (/dev/sdb) will show as /dev/xbd1
    2. one 4 GB (/dev/sdc) will show as /dev/xbd2
  3. boot
  4. log in as ec2-user and then switch to root
    • su - to become root
    • obviously in the end a root password needs to be set
    • FreeBSD doesn't rely so heavily on sudo
  5. geom disk list shows all disk devices
    1. ada0
    2. xbd1
    3. xbd2
  6. gpart create -s GPT xbd1 to create a GPT label on the xbd1 device
  7. gpart bootcode -b /boot/pmbr xbd1 to write the master boot record
  8. gpart add -b 40 -s 472 -t freebsd-boot -l bootfs0 xbd1 to create the 2nd stage boot partition
    • NOTICE the use of the -l option to set a label (name) for the partition, here bootfs0
      • we are recreating the setup of the AMI boot disk (compare gpart show -l ada0):
        • bootfs
        • rootfs
      • we can then refer to any of these partitions through their named handles
        • /dev/gpt/bootfs - currently booted system
        • /dev/gpt/rootfs - currently booted system
        • /dev/gpt/bootfs0 - new boot partition
        • /dev/gpt/rootfs0 - new root partition (see below)
        • /dev/gpt/usrfs - new usr file system (see below)
  9. copy the boot code onto the boot partition
    • gpart bootcode -p /boot/gptboot -i 1 xbd1 is how it's done, but I didn't know that, so
    • dd if=/dev/gpt/bootfs of=/dev/gpt/bootfs0 status=progress is what I did.
  10. If you follow man gpart, don't set up a swap partition
  11. gpart add -t freebsd-ufs -l rootfs0 xbd1 - to set up /dev/gpt/rootfs0 using the entire rest of the disk
    • newfs -U /dev/gpt/rootfs0 - to create a file system on the new partition
  12. gpart create -s GPT xbd2 - label for the usr disk
  13. gpart add -t freebsd-ufs -l usrfs xbd2 - to set up /dev/gpt/usrfs using the entire disk
    • newfs -U /dev/gpt/usrfs - to create a file system on the new partition
  14. mount /dev/gpt/rootfs0 /mnt - to mount what will be the new minimal root disk
    • echo * - list all items on /
    • cd /mnt - and go to our new root
    • (cd / ; tar cf - boot bin sbin ... all items from / except for mnt and usr) |tar xvf - - copy everything from / to /mnt, except /usr and /mnt itself
      • ls -l / . - compare
      • copy the dot-files that you didn't see initially: .snap, .profile, .cshrc
    • mkdir mnt usr - make the 2 directories not copied (relative paths; we are still inside /mnt)
    • compare once more that you have everything
  15. mount /dev/gpt/usrfs /mnt/usr - to mount what will be the /usr file system
    • (cd / ; tar cf - usr) |tar xvf - - copy the entire /usr tree on the new file system
  16. now we are almost ready
    • umount /mnt/usr - unmount the new /usr file system again
    • vi /mnt/etc/fstab - add the /usr mount: /dev/gpt/usrfs /usr ufs rw 1 1 (see the fstab sketch right after this list)
    • umount /mnt - unmount the new / root.
    • gpart modify -i 2 -l rootfs xbd1 - change label from rootfs0 to rootfs as it will be the only disk attached with that label on next boot
    • gpart modify -i 1 -l bootfs xbd1 - change label from bootfs0 to bootfs (the boot partition is index 1) as it will be the only disk attached with that label on next boot
      • not sure where it is used, but we want to mirror exactly the setup of the AMI, only using 2 different disks.
  17. shutdown -p now - shut down the system and power down
  18. set up the disk configuration with the two new disks now
    • detach all 3 disks from EC2 instance
    • attach the 1 GB root disk to EC2 instance as /dev/sda1 - will become ada0
    • attach the 4 GB usr disk to EC2 instance as /dev/sdb - will become xbd1
    • start EC2 instance
      • see it boot!
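For completeness, the fstab edited in step 16 ends up looking like this on the new root disk (same GPT labels as above):

# /etc/fstab on the new 1 GB root disk
/dev/gpt/rootfs  /     ufs  rw  1  1
/dev/gpt/usrfs   /usr  ufs  rw  1  1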
Now we have a working setup with 2 disks. On to replacing /usr with the EFS. First mount the EFS through NFSv4:

mount_nfs -o nfsv4,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-fXXXXXXX.efs.RRRRRR.amazonaws.com:/ /mnt

and check:

ls /mnt

it's empty at first.

Now fill it:

cd /mnt
(cd /usr ; tar cf - .) |dd bs=10m status=progress |tar xf -

Then I added these /etc/fstab entries:

/dev/gpt/rootfs / ufs rw 1 1
/dev/gpt/usrfs /usr ufs rw 1 1
fs-fXXXXXX.efs.RRRRRR.amazonaws.com:/ /mnt nfs rw,nfsv4,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport 0 0

and put the EFS server into /etc/hosts:

172.X.Y.Z fs-fXXXXXX fs-fXXXXXX.efs.RRRRRR.amazonaws.com

But then this never works because the network doesn't come up.

I try

netwait_enable=YES
netwait_if="xn0"
netwait_ip="172.A.B.C"

only to note that the EFS server can never be pinged. All of that is of no use, until I give:

ifconfig_xn0="inet 172.31.58.248 netmask 0xfffff000"

then it will boot, but I can only connect through that local IPv4 address, not through the global IP address. I am pretty sure AWS maps the global IPv4 address to the server via IPv6.

With the system basically up, I could now manually add the /usr/local/etc/rc.d work, or maybe just set up the IPv6 in rc.local.
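A sketch of what that rc.local variant could look like, assuming router advertisements (SLAAC) on the VPC subnet are enough; I have not verified that this matches what the AMI's dual-DHCP scripts end up doing:

#!/bin/sh
# /etc/rc.local - hand-rolled IPv6 bring-up, runs after /usr is mounted
ifconfig xn0 inet6 accept_rtadv up   # accept router advertisements on xn0
rtsold xn0                           # and solicit one right away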

But wouldn't it be much nicer if we moved the truly essential stuff into the rootfs partition, leaving behind all that Python stuff that the AWS CLI needs? Anyone interested in fleshing this out with me?
 
Well, I am going to reply to my own post. I figured out a nice solution.

UFS atime attribute saved the day!

I created a file

touch /marker

right before reboot from the stock system. After reboot I can run:

find /usr -neweram /marker > /essential.list

And then I can mount my minimal bootable system at /mnt and copy just those files:

cd /mnt
(cd / ; tar cfT - /essential.list) | tar xvf -

and that way I have everything I need to get the system up, including all the AWS EC2 setup things. I revert rc.conf to its original contents, except that I comment out everything about "firstboot".
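For what it is worth, the firstboot-related lines I comment out look like the following; the exact variable names may differ between AMI versions:

# /etc/rc.conf - disabled, these would try to reach the network and pkg
#firstboot_freebsd_update_enable="YES"
#firstboot_pkgs_enable="YES"
#firstboot_pkgs_list="awscli"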

Then I find that I hardly use 50% of half a GB. So I start over with a new 1 GB volume, which I divide into 3 partitions: the boot partition, the first half for that rootfs, and the second half for a file system I call varfs. I move /var onto that. Clearly I will have to move some of the /var stuff further on to /usr/var, especially the package installation db, etc.
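A sketch of the new 1 GB layout, reusing the gpart commands from the recipe above; the 480M split is just roughly what I mean by halves:

gpart create -s GPT xbd1
gpart bootcode -b /boot/pmbr xbd1
gpart add -b 40 -s 472 -t freebsd-boot -l bootfs xbd1
gpart bootcode -p /boot/gptboot -i 1 xbd1
gpart add -s 480M -t freebsd-ufs -l rootfs xbd1   # first half: root
gpart add -t freebsd-ufs -l varfs xbd1            # rest of the disk: /var
newfs -U /dev/gpt/rootfs
newfs -U /dev/gpt/varfs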

But I can add the NFS mount in /etc/fstab with the mount options late,bg added, and the system boots cleanly, even with services like sendmail coming up.
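Concretely, the /usr line in /etc/fstab becomes something like this (file system ID and region are placeholders as before):

fs-fXXXXXX.efs.RRRRRR.amazonaws.com:/ /usr nfs rw,nfsv4,late,bg,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport 0 0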
 
... I hit a problem: my FreeBSD-based workers consistently lock up after some 12 hours. This is very difficult to diagnose because we can't get to any serial console or some such on AWS. No errors are shown on the console picture. The servers just lock up and we can't log in any more.
 
Are you sure Amazon AWS doesn't have a virtual serial console? I know that Google Cloud has it; matter-of-fact, I've used it to log in to my externally hosted virtual FreeBSD machine. And given that the large vendors all attempt to have feature parity, it would surprise me if Amazon doesn't have it.
 
I was also looking for the virtual serial console of my AWS EC2 instance, and a few months ago the best I could find was the "Instance Screenshot", which still happens to work. Although I would love for anybody to prove me wrong. I know Google Cloud has a real interactive virtual serial console, and it comes in quite handy when you need to recover a system that otherwise would not carry on booting until sshd starts.

Perhaps gschadow referred to this "Instance Screenshot" (AWS terminology) when he mentioned the "console picture". Just in case something else was meant: the screenshot can be opened from the Actions menu of the Instance dashboard.
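If the AWS CLI is configured somewhere, the boot-time console log and that same screenshot can also be pulled without the web dashboard; a sketch with a placeholder instance ID:

# console log (boot messages, panics if any)
aws ec2 get-console-output --instance-id i-0123456789abcdef0 --output text
# the "Instance Screenshot" as a base64-encoded image
aws ec2 get-console-screenshot --instance-id i-0123456789abcdef0 \
    --query ImageData --output text | b64decode -r > screenshot.jpg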
 
Have you considered nanobsd (more information available here)? It is very solid and also enables atomic upgrades if you allocate two root filesystem partitions (as you should). Lots of features and size-related optimizations are already there.
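For anyone going that route, a minimal nanobsd configuration sketch; the values are guesses for this EC2 use case, not something I have built for Xen/EC2 myself:

# nanobsd.conf - small image with two root partitions for atomic upgrades
NANO_NAME=ec2worker
NANO_SRC=/usr/src
NANO_KERNEL=GENERIC
NANO_IMAGES=2            # two root file systems (A/B upgrade slots)
NANO_MEDIASIZE=2097152   # 1 GB in 512-byte sectors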
 
You might reconsider your architecture. If you stick to AWS, look at the Fargate service: it is made for scaling services, and according to your description it really seems tailored to what you want to achieve. Fargate takes over the burden of building and maintaining systems; you just provide the container and run it for your millions of application instances. Unfortunately it is Linux tech; we don't have application containers in FreeBSD.
 
Thanks for updating the OP. I suspect the root cause of the problem is that AWS probably uses a lot of IPv6 internally (as they have to; there aren't enough IPv4 addresses left to use in the big cloud data centers).
 