NFS client hangs whenever I enter a large directory

Hi All,

I have been facing a strange problem on AWS, where I created a large FreeBSD NFS server with the configuration below.

My write throughput for a file is 600-800 Mbps, and read throughput is much higher. I have no issues with throughput. The problem is that accessing a folder containing several thousand files makes the NFS client hang until I force-reboot it. During that time all other clients are fine (until another client enters the folder and tries to open a file there).

File size doesn't matter; even a 0-byte file causes the problem.
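For anyone who wants to reproduce this, here is a minimal sketch of a test: fill a directory with zero-byte files, then time a listing that also stats every entry (roughly what `ls -l` does). This is not the exact procedure used above; the function names and the count of 5000 are assumptions chosen for illustration, and the temporary directory would need to be replaced with a path on the NFS mount.

```python
import os
import tempfile
import time


def make_empty_files(directory, count):
    """Create `count` zero-byte files in `directory`."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        # Open and immediately close to leave a 0-byte file behind.
        with open(os.path.join(directory, f"file_{i:07d}"), "w"):
            pass


def time_listing(directory):
    """Return (entry_count, seconds) for listing and stat-ing every entry."""
    start = time.monotonic()
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # On Unix-like systems this requests the file's attributes,
            # which over NFS means extra metadata traffic per file.
            entry.stat()
            count += 1
    return count, time.monotonic() - start


if __name__ == "__main__":
    # Hypothetical target: swap in a directory on the NFS mount under test.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "many_files")
        make_empty_files(target, 5000)
        n, secs = time_listing(target)
        print(f"listed {n} entries in {secs:.3f}s")
```

Running it with increasing counts against both servers should show whether the listing time grows smoothly or the client stalls past some directory size.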

Server config:
FreeBSD nfs-server 11.1-RELEASE-p1 FreeBSD 11.1-RELEASE-p1
128 vCPUs with 512 GB memory
50 TB ZFS dataset
ZIL and L2ARC configured on SSD.
NFSv3

Client side messages:
Code:
kernel: [  850.016074] nfs: server nfs-server not responding, timed out.
Even unmounting with the -l option and remounting doesn't work for the client; the problem doesn't go away until I stop and start the client.

I tried to reproduce the same problem Linux to Linux. Even with the smallest NFS server configuration, I don't see NFS clients having issues with Linux NFS servers when accessing thousands of files.

Is there anything else I should look at on the FreeBSD server side?
 
I would like to add one more point here: mounting with the soft and timeo options works fine, but sometimes it fails with a timeout error, and I am really concerned about data corruption from using a soft NFS mount instead of a hard one.
 
It has nothing to do with AWS, NFS, the file system you are using, or FreeBSD for that matter. It has everything to do with the fact that you have hundreds of thousands of small files. I am surprised that you didn't run out of inodes. My friend, you have so much metadata that you are bringing the system to its knees. It doesn't matter how small a file is: its metadata is no smaller than the metadata of a very large file. Now multiply that by 100 000 and you are in trouble. I have dealt with such problems at work (Carnegie Mellon University) when we were dealing with millions of tiny files. I am guessing you are in the same business of statistical data mining. Welcome to the club :)

You have to make the directories smaller. You will not be able to use NFS (neither sync nor async) to export such files to other machines. You will probably have to use a SAN.
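One common way to make directories smaller is to spread the files over subdirectories chosen by a hash of the file name. The following is only a rough sketch under that assumption, not something taken from the setup described in this thread; `shard_path` and `shard_directory` are made-up names, and the two-level, two-hex-character layout is an arbitrary choice.

```python
import hashlib
import os
import shutil


def shard_path(root, filename, levels=2, width=2):
    """Map a filename to root/ab/cd/filename using a hash of its name."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # Take `levels` consecutive slices of `width` hex characters each.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, filename)


def shard_directory(flat_dir, sharded_root):
    """Move every regular file in flat_dir into its hashed subdirectory."""
    # Collect names first so the directory is not modified while iterating.
    with os.scandir(flat_dir) as entries:
        names = [e.name for e in entries if e.is_file(follow_symlinks=False)]
    moved = 0
    for name in names:
        dest = shard_path(sharded_root, name)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.move(os.path.join(flat_dir, name), dest)
        moved += 1
    return moved
```

With two levels of two hex characters there are 65536 possible leaf directories, so even a few million files would average only a few dozen entries per directory. Applications then locate a file by calling `shard_path` with the same root and name instead of listing a directory.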
 
Hi Oka, thanks for the detailed explanation, but I don't see the same issue when I use a tiny Linux server with a Linux client. Although there is a little latency when running ls, the client is not impacted.

Is there any way I can tune FreeBSD ZFS metadata?
 
Linux is probably doing async writes, and XFS can't do COW and checksumming anyway. No question that it is faster. The real question is how much you care about the safety of your data. If the answer is "I care, but I could live with a very low probability of data loss", go with Linux. If you truly care about data loss and data integrity, you are using the right tool.
 