I've been chasing this all week. I've found some interesting clues, but I've about run out of ideas. I think my problem may be related to unexpected behavior of mountd and nfsd (or it may be that my expectations are wrong!).
Problem - FreeBSD 7.1 box with two NICs:
1) One set up as a DHCP client, the outgoing connection to "the world".
2) One set up as a PXE/DHCP server for an isolated subnet which will not be routed or exposed to "the world".
Since (1) is outgoing, it does not need, nor should it have, any access to NFS or the DHCP server hosted by the box.
Some PXE clients work, some do not. Of those that work, some later stop working and never work again. The failing clients connect, obtain pxeboot (default, NFS behavior), and at that point the boot process locks up, timing out after 20+ minutes:
Code:
FreeBSD/i386 bootstrap loader, Revision 1.1
...
pxe_open: server addr: 192.168.1.1
pxe_open: server path: /usr/home/nfs
pxe_open: gateway ip: 0.0.0.0
\
\
can't load 'kernel'.
It's not able to access NFS after the initial TFTP load.
After much research, I've found that if I bring the first interface down via 'ifconfig rl0 down' while the client is waiting at the (not-so-spinning) slash, then pxeboot instantly picks up and boots. I tried this after noticing that 'rpcinfo' shows mountd and nfs associated with '0.0.0.0' for all transports, while rpcbind is associated with 192.168.1.1 (the IP of the FreeBSD PXE host on the subnet interface, bge0). Given my /etc/rc.conf (below), I expected all three to be bound to 192.168.1.1.
So, am I completely misunderstanding the purpose of the -h flags for mountd and nfsd?
Clearly the presence of the active rl0 (outgoing connection to the world) is interfering with the traffic between the clients and mountd/nfsd. What must I do to ensure that mountd/nfsd are not trying to use the outgoing (rl0) interface, but only the subnet?
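For anyone who wants to double-check the bindings without rpcinfo, here is a minimal sketch of a filter for sockstat output. It assumes sockstat's usual column layout (USER COMMAND PID FD PROTO LOCAL-ADDRESS FOREIGN-ADDRESS) and that a wildcard bind is shown as "*:port"; the function name wildcard_listeners is just something made up for illustration, not a system tool.

```shell
#!/bin/sh
# Hypothetical helper: reads `sockstat -4 -l` output on stdin and prints the
# command, protocol and local address of any rpcbind/mountd/nfsd socket that
# is bound to the wildcard address ("*:port") rather than a specific IP.
# Assumed columns: USER COMMAND PID FD PROTO LOCAL-ADDRESS FOREIGN-ADDRESS
wildcard_listeners() {
    awk '$2 ~ /^(rpcbind|mountd|nfsd)$/ && $6 ~ /^\*:/ { print $2, $5, $6 }'
}

# Intended use on the server (not executed here):
#   sockstat -4 -l | wildcard_listeners
```

If the -h flags were doing what I expected, I would expect this to print nothing, since every daemon would show 192.168.1.1 as its local address.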
output of 'netstat -r' when rl0 is "up":
Code:
Routing tables
Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.0.1        UGS         0      381    rl0
localhost          localhost          UH          0      544    lo0
192.168.0.0        link#2             UC          0        0    rl0
192.168.0.1        00:18:e7:c4:88:9e  UHLW        2        0    rl0    465
192.168.0.101      00:1e:65:28:b9:6a  UHLW        1      196    rl0   1225
192.168.1.0        link#1             UC          0        0   bge0
192.168.1.241      00:0f:1f:bd:07:93  UHLW        1     9345   bge0    246
output of 'netstat -r' when rl0 is "down":
Code:
Routing tables
Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.0.1        UGS         0      416    rl0
localhost          localhost          UH          0      694    lo0
192.168.1.0        link#1             UC          0        0   bge0
192.168.1.241      00:0f:1f:bd:07:93  UHLW        1     9345   bge0     19
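To make the difference between the two routing tables easier to see, here is a small sketch that lists the destinations present in one saved 'netstat -r' capture but missing from another. The function name routes_dropped and the file names up.txt and down.txt are made up for illustration; it only compares the first column and assumes the second capture is not empty.

```shell
#!/bin/sh
# Hypothetical helper: print destinations (first column) that appear in the
# first `netstat -r` capture but not in the second.
# Usage: routes_dropped up.txt down.txt
routes_dropped() {
    # First pass reads the "down" capture and remembers its destinations;
    # second pass prints any destination from the "up" capture not seen there.
    awk 'NR == FNR { seen[$1] = 1; next }
         !($1 in seen) { print $1 }' "$2" "$1"
}

# Intended use (not executed here):
#   netstat -r > up.txt; ifconfig rl0 down; netstat -r > down.txt
#   routes_dropped up.txt down.txt
```

With the two tables above, this should list only the three 192.168.0.x entries. The default route via 192.168.0.1 on rl0 appears in both captures, so it is still in the table even while rl0 is down.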
output of 'ifconfig' (rl0 up):
Code:
bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
	ether 00:11:43:ca:54:f7
	inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
	media: Ethernet autoselect (1000baseTX <full-duplex>)
	status: active
rl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8<VLAN_MTU>
	ether 00:0a:cd:18:5e:5a
	inet 192.168.0.103 netmask 0xffffff00 broadcast 192.168.0.255
	media: Ethernet autoselect (100baseTX <full-duplex>)
	status: active
plip0: flags=108810<POINTOPOINT,SIMPLEX,MULTICAST,NEEDSGIANT> metric 0 mtu 1500
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
	inet6 ::1 prefixlen 128
	inet 127.0.0.1 netmask 0xff000000
/etc/rc.conf:
Code:
ifconfig_rl0="DHCP"
ifconfig_bge0="192.168.1.1 netmask 255.255.255.0"
dhcpd_enabled="YES"
dhcpd_ifaces="bge0"
inetd_enabled="YES"
rcpbind_enabled="YES"
rpcbind_flags="-h 192.168.1.1"
nfs_server_enabled="YES"
nfs_server_flags="-u -t -n 4 -h 192.168.1.1"
mountd_enabled="YES"
mountd_flags="-r -h 192.168.1.1"
named_enabled="YES"
/etc/dhclient.conf:
# empty file
/usr/local/etc/dhcpd.conf:
Code:
default-lease-time 600;
max-lease-time 7200;
authoritative;
ddns-update-style none;
subnet 192.168.1.0 netmask 255.255.255.0
{
    range 192.168.1.15 192.168.1.241;
    option subnet-mask 255.255.255.0;
    option broadcast-address 192.168.1.255;
    server-name "192.168.1.1";
    next-server 192.168.1.1;
    filename "pxeboot";
    option root-path "/usr/home/nfs";
}
/etc/exports:
Code:
/usr/home/nfs -ro -maproot=0 -alldirs -network=192.168.1.0/24