Recommended RAM for L2ARC?

Hi,

We're using a FreeBSD server for a video streaming site, serving videos that range from 200 MB to 4096 MB each. The server is built around 12 x 3TB SATA HDDs attached to an HBA in a RAID-10 (striped mirrors) layout, with 64GB of DDR3 RAM.

Recently we decided to add an SSD as an L2ARC device to cache video files, and we wanted to gather recommendations on how much memory is needed when adding a 500GB-1TB SSD as cache. Would 64GB work efficiently?

I tried googling it but didn't find any useful resources.

Anyway, we're currently looking to go with a 1TB Fusion-io SSD card.

Regards.
Shahzaib
 
We're using a FreeBSD server for a video streaming site, serving videos that range from 200 MB to 4096 MB each.
As these are big files with mostly sequential data I'm doubtful L2ARC is going to be beneficial. I'm not even sure ARC is useful in this case.
 
It has been my experience that you want to further characterize the workload. Are any of the video streams re-read? Do multiple users look at the same thing at the same time? Those are your sizing metrics. L2ARC can become problematic at about 10x the size of RAM. You can measure all of these things on a test platform. I prefer never to use "rules of thumb", as most of them are for one type of workload.
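A minimal sketch of how to sample that on a FreeBSD test box (assuming the standard arcstats sysctls; zfs-stats comes from the sysutils/zfs-stats port, if installed):

Code:
# Sample ARC and L2ARC hit/miss counters before and after a test run;
# the ratio of hits to misses tells you how much re-reading actually happens.
sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses
sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses

# The sysutils/zfs-stats port summarizes the same counters:
zfs-stats -A    # ARC summary
zfs-stats -L    # L2ARC summary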
 
As these are big files with mostly sequential data I'm doubtful L2ARC is going to be beneficial. I'm not even sure ARC is useful in this case.
ARC is the primary cache, which lives in RAM, and it's always useful whether it's video files or small files, since we have concurrent users on the site accessing the same video multiple times through the 'nginx' web server, just like any video sharing website out there.
Can you please let me know why you think even ARC is not going to be useful?
 
It has been my experience that you want to further characterize the workload. Are any of the video streams re-read? Do multiple users look at the same thing at the same time? Those are your sizing metrics. L2ARC can become problematic at about 10x the size of RAM. You can measure all of these things on a test platform. I prefer never to use "rules of thumb", as most of them are for one type of workload.

Yes, it's a video sharing website like any streaming site out there, where some videos go viral and get accessed multiple times by different users.
 
Your limiting factor, as has already been pointed out, is your RAM: 64GB just doesn't allow a huge L2ARC. Best case, your L2ARC is never fully used; worst case (and more likely), your performance gets even worse, because you have heavily capped your effective ARC size by sacrificing RAM for the L2ARC header mapping.
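As a rough illustration of that mapping overhead (the exact per-record L2ARC header size depends on the ZFS version; ~70 bytes per record is a commonly cited figure for recent releases, older ones used considerably more):

Code:
# Every block cached in the L2ARC needs a header that lives in the ARC (RAM).
# 1 TiB of L2ARC filled with 128 KiB records:
#   1 TiB / 128 KiB = 8,388,608 records; x ~70 bytes ~= 560 MiB of RAM
# The same 1 TiB filled with 8 KiB records:
#   1 TiB / 8 KiB = 134,217,728 records; x ~70 bytes ~= 8.75 GiB of RAM
echo "headers for 1 TiB @ 128 KiB records: $(( 1099511627776 / 131072 * 70 / 1048576 )) MiB"
echo "headers for 1 TiB @ 8 KiB records:   $(( 1099511627776 / 8192 * 70 / 1048576 )) MiB"

With big video files and a large recordsize the header overhead itself stays modest, but every byte of it comes straight out of the ARC you are already short on.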

For this use case you'd have to keep all, or at least the "hot", data on a faster pool that ideally already resides on SSDs (NVMe!). Have a look at presentations by Netflix on their Open Connect appliances. Their performance requirements might be overkill for you, but storage-wise you get a glimpse of what you need to aim for (also, IIRC, they don't use ZFS, as they only care about raw throughput, not data resiliency; the data is re-mastered every few days anyway...).
Spinning rust behind a slow SATA interface is definitely not what you want to work with if bandwidth and latency are key. If cost is a limiting factor for putting the whole pool on NVMe drives, at least go for SAS drives and create a pool that is distributed over multiple vdevs (use more, smaller disks instead of fewer, bigger ones) to spread the I/O and increase the throughput of the pool; see the sketch below.
Also use a hardware platform that lets you install (MUCH) more RAM; only then can you think about adding a huge L2ARC as a last resort.
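A minimal sketch of such a layout, with hypothetical pool and device names; the point is simply that twelve disks set up as six striped 2-way mirrors spread I/O over six vdevs:

Code:
# Six 2-way mirror vdevs; reads and writes are striped across all of them.
zpool create video \
  mirror da0 da1 \
  mirror da2 da3 \
  mirror da4 da5 \
  mirror da6 da7 \
  mirror da8 da9 \
  mirror da10 da11

# If you still want to experiment with an L2ARC later, a cache device can
# be added and removed again without rebuilding the pool:
zpool add video cache nvd0
zpool remove video nvd0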

I'd also test with varying record sizes; depending on the data, this can heavily impact I/O performance. For larger files, a larger record size is usually beneficial.
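For example (dataset names are hypothetical; recordsize only affects newly written files, and 1M needs the large_blocks pool feature):

Code:
# Create test datasets with different record sizes, copy sample videos
# onto each, then compare streaming throughput.
zfs create -o recordsize=1M   tank/videos-1m
zfs create -o recordsize=128K tank/videos-128k
zfs get recordsize tank/videos-1m tank/videos-128k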
 
ARC is the primary cache, which lives in RAM, and it's always useful whether it's video files or small files, since we have concurrent users on the site accessing the same video multiple times through the 'nginx' web server, just like any video sharing website out there.
Can you please let me know why you think even ARC is not going to be useful?
Because these are relatively large files and there is a finite amount of RAM. So it's only possible to cache a few files, and depending on the workload it could end up constantly evicting 'old' cached data to cache a newly requested file. These constant cache misses may actually be detrimental to overall performance. Caches are nice, but caching causes some overhead (cache management), and whether that overhead is significant depends on how effective the cache is. Cache the wrong things and the overhead will cause performance to degrade, which is exactly the opposite of what you're trying to achieve with caching in the first place.
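If measurements show the large video files really are thrashing the ARC, one possible mitigation (dataset name hypothetical) is to keep only metadata in the cache for that dataset and read the file data from disk every time; whether that helps or hurts depends entirely on how often the same videos are re-read, so measure before and after:

Code:
# Keep only metadata for the video dataset in the ARC so large sequential
# reads don't evict everything else.
zfs set primarycache=metadata tank/videos
zfs set secondarycache=metadata tank/videos   # same idea for an L2ARC, if one is ever added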

Some time ago I worked for a porn site; they obviously had a LOT of video files, large and small. The way they handled it was by adding a couple of fast Varnish servers in front of the file servers. The Varnish servers took care of most of the caching and had fast 10K spinning disks. The file servers themselves had massive amounts of storage with fairly basic 7200RPM disks. The web (content) servers linked back to the Varnish servers through several load-balancers for any of the image or video files.
 