Failed to contact local NSM - rpc error 5

I recently noticed these messages in the system logs and was wondering what they mean and whether they need fixing.

The box is a DB and NFS server.



Code:
[root@dbraid /]# uname -a
FreeBSD dbraid 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #1: ....  amd64



[root@dbraid /]# tail -10 /var/log/messages

Jul 29 04:15:49 dbraid kernel: Failed to contact local NSM - rpc error 5
Jul 29 04:16:39 dbraid last message repeated 2 times
Jul 29 04:18:20 dbraid last message repeated 4 times
Jul 29 04:28:23 dbraid last message repeated 24 times
Jul 29 04:38:26 dbraid last message repeated 24 times
Jul 29 04:40:31 dbraid last message repeated 5 times
 
Try enabling rpc.statd.

Add to /etc/rc.conf:
Code:
rpc_statd_enable="YES"
 
I enabled it on both the server and the clients. That seems to have helped: the error now only pops up about once a day.


Code:
[root@dbraid /usr/home/guy]# ps -aux | grep -i statd
root   80647  0.0  0.0 267808  1288  ??  Is   Thu11AM   0:00.41 /usr/sbin/rpc.statd


[root@dbraid /usr/home/guy]# rpcinfo -p
   program vers proto   port  service
    100000    4   tcp    111  rpcbind
    100000    3   tcp    111  rpcbind
    100000    2   tcp    111  rpcbind
    100000    4   udp    111  rpcbind
    100000    3   udp    111  rpcbind
    100000    2   udp    111  rpcbind
    100000    4 local    111  rpcbind
    100000    3 local    111  rpcbind
    100000    2 local    111  rpcbind
    100005    1   udp    630  mountd
    100005    3   udp    630  mountd
    100005    1   tcp    630  mountd
    100005    3   tcp    630  mountd
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100021    0   udp    819  nlockmgr
    100021    0   tcp    675  nlockmgr
    100021    1   udp    819  nlockmgr
    100021    1   tcp    675  nlockmgr
    100021    3   udp    819  nlockmgr
    100021    3   tcp    675  nlockmgr
    100021    4   udp    819  nlockmgr
    100021    4   tcp    675  nlockmgr
    100024    1   udp    720  status
    100024    1   tcp    832  status


[root@dbraid /usr/home/guy]# tail -100 /var/log/messages

Aug  3 07:52:03 dbraid kernel: Failed to contact local NSM - rpc error 5
Aug  3 07:52:28 dbraid kernel: Failed to contact local NSM - rpc error 5
Aug  3 07:54:34 dbraid last message repeated 5 times
Aug  3 07:55:24 dbraid last message repeated 2 times
Aug  4 03:41:15 dbraid kernel: Failed to contact local NSM - rpc error 5
Aug  4 03:42:05 dbraid last message repeated 2 times
Aug  4 03:43:46 dbraid last message repeated 4 times
Aug  4 03:53:49 dbraid last message repeated 24 times
Aug  4 04:03:52 dbraid last message repeated 24 times
Aug  4 04:07:13 dbraid last message repeated 8 times
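
To see how often the message really occurs, the "last message repeated N times" lines have to be folded into the count. A rough sketch of one way to do that with awk follows; the function name count_nsm_errors is made up for illustration, and it assumes the classic syslog layout shown above (month, day, time, host, message). A repeat line logged just after midnight is credited to the day it was logged on.

```shell
#!/bin/sh
# count_nsm_errors: hypothetical helper, not part of FreeBSD.
# Prints one "Mon D: N" line per day with the number of
# "Failed to contact local NSM" messages, including syslog repeats.
count_nsm_errors() {
    awk '
        # A repeat line only counts if the message it repeats was an NSM error.
        /last message repeated [0-9]+ times/ {
            if (prev_match) count[$1 " " $2] += $(NF - 1)
            next
        }
        {
            prev_match = ($0 ~ /Failed to contact local NSM/)
            if (prev_match) count[$1 " " $2] += 1
        }
        END {
            # Output order is whatever awk chooses; pipe through sort if needed.
            for (day in count) print day ": " count[day]
        }
    ' "$@"
}

# Usage (path as in the thread):
#   count_nsm_errors /var/log/messages
```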
 
Odd. I've been running NFS on FreeBSD for quite some years now and I've never seen that message before.

So I'm just guessing a bit. Are both client and server FreeBSD? Same version too?

Anything in /etc/exports (options for instance)?

In addition to rpc.statd I also have the rpc.lockd running (on both the server and clients). Not sure if it's strictly needed and it looks like you're already running rpc.lockd.
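
For reference, a server-side /etc/rc.conf fragment with both daemons enabled would look roughly like the following. This is a sketch of a typical setup, not a copy of anyone's config in this thread; it assumes the usual FreeBSD rc.conf knobs, and on the clients nfs_client_enable would take the place of nfs_server_enable.

```shell
# /etc/rc.conf (NFS server side; illustrative, adjust to your setup)
rpcbind_enable="YES"       # RPC port mapper, needed by the daemons below
nfs_server_enable="YES"    # nfsd and mountd
rpc_statd_enable="YES"     # rpc.statd, the status monitor (NSM)
rpc_lockd_enable="YES"     # rpc.lockd, the lock manager (NLM)
```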
 
For future reference

This error popped up out of the blue about a week ago and I found this thread (it didn't help me, but hopefully this followup will help someone). I've been using FreeBSD since its beginnings and have never seen this error before. Both statd and lockd were enabled on the server and all clients, and they had all been working fine for some time. I eventually traced the problem to a badly corrupted installation on one of the client machines (by shutting everything down and fully testing NFS after bringing each one up). The problem client was causing stale locks, and hence problems with writes from other clients.

The hardware reliability of the corrupted client is uncertain (it's newly acquired but several years old). We were also hit by multiple power outages in rapid succession recently. After bringing this client back up, it exhibited other serious issues. After 43 passes of memtest86, a BIOS flash, and a clean reinstall, everything is back to normal (for the moment, but I won't be surprised if this client machine deteriorates again).

An important lesson I've learned over the years is that if there's any inconsistency in a FreeBSD system's behavior, it's best to look for a hardware issue. FreeBSD isn't perfect, but it sure is predictable. I've never been able to conclusively trace flaky behavior back to the OS, but I have fixed a lot of flaky problems with hardware replacements, and occasionally with BIOS updates.
 
I know this thread is a bit dated, but I ran into the same issue when we moved NFS servers and started getting the same error. The fix for us was to stop the statd and lockd daemons, blow away /var/db/statd.status and then restart statd/lockd. For us, this stopped the error.
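
The steps above could be scripted roughly as follows. This is only a sketch under some assumptions: that the rc scripts live at /etc/rc.d/statd and /etc/rc.d/lockd (check your release), and that moving the status file aside is an acceptable stand-in for deleting it, which is a change from what was described above. The function and variable names are made up for illustration. It defaults to a dry run that only prints the commands; set DRY_RUN=0 and run as root to actually execute them.

```shell
#!/bin/sh
# Hypothetical helper; names and defaults are assumptions, not from the thread.
STATD_STATUS="${STATD_STATUS:-/var/db/statd.status}"
RC_DIR="${RC_DIR:-/etc/rc.d}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reset_statd_state() {
    run "$RC_DIR/lockd" stop
    run "$RC_DIR/statd" stop
    # Keep a backup instead of deleting the status file outright.
    run mv "$STATD_STATUS" "$STATD_STATUS.bak"
    run "$RC_DIR/statd" start
    run "$RC_DIR/lockd" start
}

# Usage:
#   reset_statd_state              # dry run, prints the five commands
#   DRY_RUN=0 reset_statd_state    # really do it (as root)
```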
 
Problem identified

Forgot to follow up on this before:

Our problem turned out to be an overheating issue. I replaced a bad chassis fan, and the system has been solid for several months.
 