How to debug pxeboot (anybody using it right now)?

cracauer@ · Oct 16, 2024

I have the FreeBSD machines in my PXE cluster down with very weird behavior.

"pxeboot" (the bootloader you get specified from dhcpd and that is retrieved via tftp) hangs. There were significant changes in recent versions, but I tried those from FreeBSD releases 11, 14 and 15-current (and a few others I didn't mark) and they all hang, in a variety of places.

What makes it odd:
- this used to work. I have fresh PXE installs of 2 machines from spring that suddenly stopped working. Keep in mind that I still use the same pxeboot binary, so this broke without me changing any of the software on the FreeBSD side
- I thought that maybe I updated the BIOS (and hence the pxe software in the BIOS) without remembering it, but just this morning I fired up an old mainboard that I used with pxeboot for a long time and that I definitely did not update. Same behavior now
- the Linux machines in this PXE cluster (booted via LILO) work just fine

I inserted debug print statements at several points along pxeboot's startup path, but they didn't tell me anything significant either. How do I debug pxeboot other than in a fully debuggable virtual machine?

Does pxeboot work for anybody? I assume yes, because I Didn't Change Anything(tm).

So here are some remaining theories:
- something about my Ethernet changed. What and how would that affect pxeboot in FreeBSD but not Linux?
- it seems unlikely that changes in the dhcp or tftp servers are responsible. Or are they?
- I suppose I could try a diskless boot from the same dhcpd and tftpd in a virtual machine. That would determine whether the Ethernet is responsible

Opinions?

jb82 · Oct 16, 2024

cracauer@ said:
Does pxeboot work for anybody? I assume yes, because I Didn't Change Anything(tm).

I use PXE right now. With your previous help. It works just fine. The only issues I have are with the USB network card I use for booting the other machine. I needed to ditch anything Realtek-based, and I still sometimes need to unplug and re-plug the device. Besides that, dnsmasq and nfsroot seem to work properly.

cracauer@ · Oct 16, 2024

jb82 said:
I use PXE right now. With your previous help. It works just fine. The only issues I have are with the USB network card I use for booting the other machine. I needed to ditch anything Realtek-based, and I still sometimes need to unplug and re-plug the device. Besides that, dnsmasq and nfsroot seem to work properly.

Which tftpd do you use? I use tftpd-hpa, because Linux lilo PXE boot requires a daemon that announces the size of files in advance.

Thinking about all the puzzle pieces, a change in tftpd could explain my observations.
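
One way to compare tftpd implementations without a PXE client is to send a read request by hand and look at which options the server acknowledges. "Announcing the size of files in advance" corresponds to the `tsize` option from RFC 2349, negotiated through the option extension in RFC 2347. The following is a minimal sketch, not a full TFTP client: the function names, the `blksize` value of 1456 and the server address in the usage note are assumptions for illustration.

```python
import socket
import struct

OP_RRQ = 1   # read request (RFC 1350)
OP_OACK = 6  # option acknowledgment (RFC 2347)


def build_rrq(filename, options):
    """Build an octet-mode read request carrying RFC 2347 options."""
    packet = struct.pack("!H", OP_RRQ)
    packet += filename.encode("ascii") + b"\0" + b"octet\0"
    for name, value in options.items():
        packet += name.encode("ascii") + b"\0" + str(value).encode("ascii") + b"\0"
    return packet


def parse_oack(packet):
    """Return the acknowledged options as a dict, or None if this is not an OACK."""
    if len(packet) < 2:
        return None
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode != OP_OACK:
        return None
    # Options are NUL-terminated name/value pairs; drop the empty piece
    # that split() leaves after the final NUL.
    fields = packet[2:].split(b"\0")[:-1]
    return {
        fields[i].decode("ascii").lower(): fields[i + 1].decode("ascii")
        for i in range(0, len(fields) - 1, 2)
    }


def probe(server, filename, timeout=3.0):
    """Send one RRQ asking for tsize and blksize and return the server's answer.

    The transfer is deliberately abandoned after the first reply, so the
    server will retry and then give up on its own.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        request = build_rrq(filename, {"tsize": 0, "blksize": 1456})
        sock.sendto(request, (server, 69))
        reply, _addr = sock.recvfrom(65535)
    return parse_oack(reply)
```

Calling something like `probe("192.0.2.10", "boot/pxeboot")` (hypothetical address and path) against each tftpd in turn would show whether one of them acknowledges options differently from the others. A `None` result means the server answered with plain data or an error instead of an OACK.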

VladiBG · Oct 16, 2024

At work I deploy all M$ operating systems via WDS (PXE), and some time ago I hit a bug where the combination of tftp window size and MTU caused packet fragmentation and interrupted the transfer of LiteTouchPE_x64.wim during the network boot.
It may not be related to your issue, but you can check the tftp window size and blocksize, and by using a mirror port on the switch you can inspect the traffic for fragmentation.

1456 blksize and 8 window size

And if you are using jumbo frames, you can try a bigger blksize.
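
The relationship between blksize and MTU is simple arithmetic: each TFTP data packet carries the block plus a 4-byte TFTP header, an 8-byte UDP header and a 20-byte IPv4 header. A small sketch, assuming IPv4 without IP options (the function names are only for illustration):

```python
IPV4_HEADER = 20      # bytes, assuming no IP options
UDP_HEADER = 8        # bytes
TFTP_DATA_HEADER = 4  # 2-byte opcode + 2-byte block number


def max_blksize(mtu):
    """Largest TFTP blksize that still fits into one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - UDP_HEADER - TFTP_DATA_HEADER


def will_fragment(blksize, mtu):
    """True if a full data block of this size exceeds the MTU."""
    return blksize > max_blksize(mtu)


if __name__ == "__main__":
    for mtu in (1300, 1500, 9000):
        print(f"MTU {mtu}: max blksize {max_blksize(mtu)}")
    print("1456 on MTU 1500 fragments:", will_fragment(1456, 1500))
```

Under these assumptions a blksize of 1456 fits into a standard 1500-byte MTU with a few bytes to spare, while the same blksize on a path with a smaller MTU would fragment.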

jb82 · Oct 16, 2024

cracauer@ said:
Which tftpd do you use? I use tftpd-hpa, because Linux lilo PXE boot requires a daemon that announces the size of files in advance.

Thinking about all the puzzle pieces, a change in tftpd could explain my observations.

dnsmasq

tftpd section of my /usr/local/etc/dnsmasq.conf looks like:

Code:

# TFTP Configuration
enable-tftp  # Enable TFTP server functionality
tftp-root=/zroot/tftp  # Directory where TFTP boot files (e.g., pxelinux.0) are stored

#dhcp-boot=/freebsd/boot/pxeboot  # Specifies the bootloader file for PXE clients
dhcp-boot=/freebsd/boot/loader.efi
dhcp-option=17,10.0.0.1:/zroot/nfsroot

I had issues with NFS but tftpd just worked right away. I use dnsmasq for DHCP and tftp on a NATed virtual network/bridge to which I add also the USB interface ue0 (used for PXE), i.e. I bound dnsmasq to that vm-private bridge.

cracauer@ · Oct 19, 2024

I am getting very confused.

I use the "next-server" directive in dhcpd.conf to point the PXE bios thing to a different host with a different software tftp server. So far so good, it now seems to load pxeboot.

But now it is saying that it can't find /boot/lua/loader.lua. And it is a timeout, not a "not found" error.

That must be searched for via NFS, no? Because it is part of the root filesystem and my old config didn't have it in tftp. But I have not changed the root-path directive, it still points to the same NFS server I always use.

jbo@ · Oct 19, 2024

I know I'm not being helpful but... That spiderweb... please... PLEASE!!

cracauer@ · Oct 20, 2024

OK, we figured it out. There are two issues at play here:

- hpa-tftpd in Debian does something new that now makes all releases of FreeBSD's pxeboot(8) hang, even previously working ones. That change in hpa-tftpd must have been coming down the pipe when I wasn't looking. Things work fine with a FreeBSD tftpd.
- FreeBSD's pxeboot(8) interprets the "next-server" clause from dhcpd correctly as redirecting the tftpd location. But it also redirects the NFS root dir location to the same IP address, and I doubt that this is correct. It directly ignores the IP address stated in "root-path". That is why things were not loading loader.lua on the redirected tftpd server setup.

So I need to find a tftpd that announces file size (for Lilo) and is not hpa (for FreeBSD).

An unrelated but remaining issue is that on some but not all mainboards the kernel load after pxeboot is very slow. I never figured out why that is. It is not that the NIC is placed in 10 Mbit mode, it is too slow for even that.

jbo@ · Oct 20, 2024

cracauer@ said:
hpa-tftpd in Debian does something new that now makes all releases of FreeBSD's pxeboot(8) hang, even previously working ones.

To clarify: The "even previously working ones" was specifically FreeBSD 11.

jb82 · Oct 21, 2024

cracauer@ said:
An unrelated but remaining issue is that on some but not all mainboards the kernel load after pxeboot is very slow. I never figured out why that is. It is not that the NIC is placed in 10 Mbit mode, it is too slow for even that.

Everybody talks about increasing block size, but IMHO neither tftp-hpa nor dnsmasq offers such an option, do they? UDP might be an issue as well...

cracauer@ · Oct 21, 2024

Right now I am stuck in the next absurd situation. I tried a whole variety of tftp daemons on that (Linux) primary server, and none actually work. They either have the same hang in FreeBSD's pxeboot (and work for Linux clients), or they cause timeouts for the on-BIOS PXE primary loader (meaning they work for neither FreeBSD nor Linux clients). That is while transfers via a commandline tftp program work fine.

The only working tftpd I have right now is the FreeBSD one on the other machine via "next-server" in dhcpd.conf. So in desperation I took a FreeBSD VM on the Linux primary server and port-forwarded it (so that a next-server statement is not needed). It also times out.

The messages from the various tftpds that will lead to timeouts indicate that the boot file is successfully requested.

Jose · Oct 21, 2024

I think someone will be writing a new TFTP daemon soon...

cracauer@ · Oct 21, 2024

Jose said:
I think someone will be writing a new TFTP daemon soon...

I think the most constructive thing for me to do here is fix what I consider a bug in that "next-server"'s IP address overrides the NFS root address host portion even if that one is specified.

That way I can use tftpd from the working other server (FreeBSD) and keep NFS mounts on the primary server.
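
The behavior I would expect can be sketched as follows. This is only an illustration of the intended rule, not pxeboot's actual code: the function names are made up, and the parsing assumes an IPv4 address or hostname in front of the path.

```python
def split_root_path(root_path):
    """Split a root-path value of the form 'host:/path' into (host, path).

    host is None when the value is just a path with no host portion.
    """
    host, sep, path = root_path.partition(":")
    if sep and path.startswith("/"):
        return (host or None), path
    return None, root_path


def nfs_root(next_server, root_path):
    """Pick the NFS server: the host in root-path wins, next-server is only a fallback."""
    host, path = split_root_path(root_path)
    return (host if host is not None else next_server), path


if __name__ == "__main__":
    # tftp is redirected to 10.0.0.2, but root-path names its own NFS host
    print(nfs_root("10.0.0.2", "10.0.0.1:/zroot/nfsroot"))
    # no host in root-path, so falling back to next-server is reasonable
    print(nfs_root("10.0.0.2", "/zroot/nfsroot"))
```

With this rule, "next-server" only changes where tftp requests go, and the NFS root follows it only when "root-path" does not name a host itself.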

JamesElstone · Nov 25, 2024

Hi,

I have noticed that there are two distinct phases to a diskless PXE boot: first the BIOS PXE client issues a DHCP request, and then loader.efi / pxeboot issues its own.
They ask for different options to be returned; running service dnsmasq stop && dnsmasq -d was very enlightening recently.

With 14.1-p5 GENERIC, I have the following dnsmasq settings working reliably, but use vendor and client tagging to work out what to send to the relevant PXE client:

Code:

port=53

# tftp section
enable-tftp
tftp-root=/usr/local/pxe/release/
tftp-mtu=1300
tftp-no-blocksize
tftp-unique-root=mac

dhcp-name-match=set:wpad-ignore,wpad
dhcp-ignore-names=tag:wpad-ignore

log-dhcp
log-debug
dhcp-authoritative
dhcp-no-override
dhcp-leasefile=/var/db/dnsmasq/dnsmasq.leases

dhcp-vendorclass=set:PXE,PXEClient
dhcp-vendorclass=set:BIOS,PXEClient:Arch:00000
dhcp-vendorclass=set:UEFI,PXEClient:Arch:00007
dhcp-vendorclass=set:UEFI,PXEClient:Arch:00009
dhcp-userclass=set:FREEBSD,FreeBSD


dhcp-option=bge0,1,255.255.255.0   #Subnet Mask
dhcp-option=bge0,3,<x.x.x>.254   #Router
dhcp-option=bge0,6,<x.x.x>.254   #Domain Server
dhcp-option=bge0,7,<x.x.x>.253   #Log Server
dhcp-option=bge0,42,<x.x.y>.244   #NTP Server
dhcp-option=bge0,134,"test1 test2 test3 other" #sysctl kern.bootp_cookie
dhcp-option-force=bge0,15,some.domain.somewhere.example.org #domain name

dhcp-option=PXE,66,<x.x.x>.253   #Server Name
dhcp-option=PXE,26,1300    #MTU

dhcp-option=BIOS,67,/FreeBSD/14.1/root/boot/pxeboot   #BIOS Boot filename
dhcp-option=UEFI,67,/FreeBSD/14.1/root/boot/loader.efi    #UEFI Boot filename

# FreeBSD
dhcp-option=PXE,FREEBSD,17,<x.x.x>.253:/usr/local/pxe/release/FreeBSD/14.1/root/    #NFS Root path
dhcp-option-force=FREEBSD,134,"this that something else"    #sysctl kern.bootp_cookie
dhcp-option-force=FREEBSD,26,1300    #mtu

dhcp-option-force=known,119,some.domain.somewhere,some.domain.elsewhere    #domain names to search

dhcp-host=f0:1f:af:e0:87:5a,id:*,set:<hostname>,<hostname.domain.name>,<x.x.x>.1,1800
dhcp-host=f0:1f:af:e0:87:5b,id:*,set:<hostname>,<hostname.domain.name>,<x.x.x>.65,1800

dhcp-range=<x.x.x>.129,<x.x.x>.160,1h

Where <something>.INT needs to be replaced with something sensible for other networks.

The above provides 'hobnob biscuit' level of repeat reliability for me unfortunately.

Also, could DHCP option 13, "Boot File Size", which [RFC2132] defines as the "Size of boot file in 512 byte chunks", be of any help if the TFTP network service moves back to a FreeBSD box?
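
For reference, the value for option 13 would be the file size rounded up to whole 512-byte blocks. RFC 2132 specifies it as an unsigned 16-bit integer, so it cannot describe files much larger than about 32 MiB. A small sketch with a made-up function name:

```python
def boot_file_size_option(size_bytes):
    """Value for DHCP option 13: boot file size in 512-byte blocks, rounded up."""
    blocks = -(-size_bytes // 512)  # ceiling division
    if blocks > 0xFFFF:
        raise ValueError("file too large for the 16-bit option 13 field")
    return blocks
```

In practice the input would come from something like `os.path.getsize()` on the boot file in the tftp root. Whether any of the PXE clients or loaders discussed here actually consult this option is not something this sketch can answer.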

VladiBG · Nov 25, 2024

Your "tftp-no-blocksize" sets the blocksize to 512. It ensures that there will be no fragmentation due to MTU size, but it makes the transfer slower.
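
To put a rough number on the slowdown: without a window size option, TFTP sends one block and waits for its ACK before sending the next, so the transfer time is roughly the number of data packets times the round-trip time. A back-of-the-envelope sketch, where the 30 MiB file size and the 1 ms round trip are assumptions picked only for illustration:

```python
import math


def data_packets(file_size, blksize):
    """Number of DATA packets for a transfer.

    A transfer ends with a block shorter than blksize, so a file that is an
    exact multiple of blksize needs one extra, empty packet.
    """
    return file_size // blksize + 1


def lockstep_seconds(file_size, blksize, rtt, windowsize=1):
    """Rough transfer time estimate: one round trip per window of blocks."""
    return math.ceil(data_packets(file_size, blksize) / windowsize) * rtt


if __name__ == "__main__":
    size = 30 * 1024 * 1024  # assumed 30 MiB kernel
    for blksize in (512, 1456):
        packets = data_packets(size, blksize)
        seconds = lockstep_seconds(size, blksize, rtt=0.001)
        print(f"blksize {blksize}: {packets} packets, about {seconds:.1f} s")
```

This ignores retransmissions and server-side delays, so it is only a lower bound on what a real transfer would take.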