Hi, all. We have a simple setup: one NFS server and many clients with RW access. /etc/hostid is unique on each client, which matters for the server log message quoted at the end. Server setup is as follows:
/etc/rc.conf:
Code:
nfs_server_enable="YES"
nfsv4_server_enable="YES"
nfs_server_flags="$nfs_server_flags --maxthreads 2048"
mountd_flags="$mountd_flags -p 878"
rpcbind_enable="YES"
rpcbind_flags="-s"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
/etc/exports:
Code:
V4: /path/to/storage -network 100.64.0.0 -mask 255.255.0.0
Clients are created and destroyed dynamically, with normally around 30-40 running at any given time. Each client's /etc/rc.conf:
Code:
rpc_lockd_enable="YES"
rpcbind_enable="YES"
rpcbind_flags="-s"
rpc_lockd_flags="-h 127.0.0.1" # Flags to rpc.lockd (if enabled).
nfs_client_enable="YES"
rpc_statd_enable="YES"
The share is mounted like this when a client starts:
Code:
/sbin/mount -t nfs -o nfsv4,nosuid,acregmin=3600,acregmax=86400,acdirmin=3600,acdirmax=86400 masterdb.local:/$project_name/ /mnt
(The ac* options are my attempt to fix the problems.)
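Because a retried mount can stack on top of an existing one, a guard run before the mount command above might keep /mnt from being mounted more than once. This is only a sketch: `count_mounts` and `mount_once` are hypothetical helper names, the option list is shortened for illustration, and it assumes `mount -p` (fstab-format output) returns promptly even when the NFS mount itself is stuck.

```sh
#!/bin/sh
# Count how often a mountpoint appears in `mount -p` style output read
# from stdin (fstab format: device, mountpoint, fstype, options, dump, pass).
count_mounts() {
    awk -v mp="$1" '$2 == mp { n++ } END { print n + 0 }'
}

# Hypothetical wrapper: mount only when the mountpoint is not mounted yet.
# $1 = mountpoint, $2 = NFS source (e.g. server:/path); both are supplied
# by the caller, nothing here is taken from the real startup script.
mount_once() {
    mp="$1"
    src="$2"
    n=$(mount -p | count_mounts "$mp")
    if [ "$n" -gt 0 ]; then
        echo "$mp already mounted $n time(s), skipping" >&2
        return 0
    fi
    /sbin/mount -t nfs -o nfsv4,nosuid "$src" "$mp"
}
```

It would be called as `mount_once /mnt "masterdb.local:/$project_name/"`. The guard only avoids stacking new mounts; it does not address why the existing mount stops responding.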
Now the problem: more often than not the mount hangs, or file access hangs, or `df` hangs, or umount hangs, all of which makes my life miserable. A plain `mount` call never hangs, though, and displays the list of mounted filesystems. Sometimes the mount on /mnt is duplicated literally hundreds of times. This is probably the result of me running umount /mnt and then mounting it again whenever file access (a `cp` from the server to local disk) fails; the mount/umount probably fails but leaves the mount around.

As if this weren't enough, nfsd on the server chews up a lot of system CPU (with no meaningful disk activity at the time, according to `gstat`): literally 100% CPU for about a minute, after which it eases off for 30-40 seconds or so. I can "fix" the CPU usage by not retrying mount/umount on the clients when `cp` fails, but that leaves the clients inoperable without the files they need.

I'm not really sure how to fix this NFS setup, and I'm open to any ideas. Also, the NFS server constantly logs lines like this:
Code:
nfsrv_cache_session: no session IPaddr=100.64.87.21, check NFS clients for unique /etc/hostid's
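Since that log line points at /etc/hostid, one way to double-check uniqueness across the running clients is to gather each client's host UUID and look for repeats. The sketch below assumes the pairs have already been collected as "client uuid" lines (how they are gathered, for example by running `sysctl -n kern.hostuuid` on each client, is left open), and `find_duplicate_hostids` is a hypothetical helper name.

```sh
#!/bin/sh
# Read "client hostuuid" pairs on stdin and print every UUID that is
# shared by more than one client, followed by the clients that use it.
find_duplicate_hostids() {
    awk '{ count[$2]++; who[$2] = who[$2] " " $1 }
         END { for (u in count) if (count[u] > 1) print u ":" who[u] }'
}
```

Empty output means no two clients in the collected list reported the same UUID. It says nothing about clients that were already destroyed, which the server may still remember.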