You say your workload is random read. Is it really only random read? Or is it 80% random read with 20% writes, perhaps small random writes? Next question: How large are your IOs? For example, on a 20TB file system (quite realistic with 5 drives), you could have 100MB random reads or 4KB random reads, and that makes a giant difference. While 100MB reads are still "random" relative to the overall size of the file system, they are individually large enough to get near-sequential performance out of the drives, and prefetching will work well.
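To see why the IO size matters so much, here is a rough back-of-the-envelope model; the ~10ms seek and ~150MB/s transfer rate are assumed ballpark figures for a 7200rpm drive, not measurements from your system:

    # Rough model of one disk doing random reads: each IO pays a seek plus
    # rotational latency, then transfers at the sequential rate.
    # Both figures below are assumptions (typical 7200rpm ballpark).
    seek_plus_rotation_s = 0.010      # ~10 ms per random IO (assumed)
    sequential_rate_bps = 150e6       # ~150 MB/s sequential transfer (assumed)

    def effective_throughput(io_size_bytes):
        time_per_io = seek_plus_rotation_s + io_size_bytes / sequential_rate_bps
        return io_size_bytes / time_per_io   # bytes per second

    for size in (4 * 1024, 100 * 1024 * 1024):
        print(f"{size:>12} B random reads -> {effective_throughput(size) / 1e6:6.1f} MB/s per disk")

With those assumptions, 4KB random reads deliver roughly 0.4 MB/s per disk, while 100MB "random" reads come in close to 150 MB/s: more than two orders of magnitude apart under the same label.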
Next question: Are the reads whole files (but you happen to have lots of smallish files), or are they random offsets within a large file? This matters for how intense the metadata traffic will be, and random offsets within a large file also tend to defeat prefetching.
Eric's question about the working set is very important, but I fear the answer may be: there is no working set in the usual sense, and the whole big file system gets accessed. If there is no re-use, then read caching (including ARC and L2ARC) won't help, and you just need to get your backend disks to run faster.
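To put a number on that: with uniform random access over the whole pool, the best any cache can do is roughly cache_size / data_size. The sizes below are illustrative assumptions, not your configuration:

    # Rough upper bound on cache hit rate under uniform random access:
    # an access lands in the cached fraction with probability cache/data.
    data_size_tb = 20.0       # whole file system gets touched (assumed)
    cache_size_gb = 256.0     # illustrative ARC + L2ARC size (assumed)

    hit_rate = (cache_size_gb / 1024) / data_size_tb
    print(f"best-case hit rate ~ {hit_rate:.1%}")   # roughly 1.2%

A hit rate around 1% barely changes the load on the spindles, which is why the answer to the working-set question matters so much.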
And an insulting question: Have you turned off atime updates?
With a parity-based RAID, random small writes are expensive. So much so that even mixing in 20% small writes can hurt overall performance. On the other hand, parity-based RAID tends to be pretty good with random reads (because you read the data directly from disk, without having to touch parity at all); and with the rotating parity on modern RAID implementations, a random read workload will keep all disks busy. That's why I asked about atime: atime updates can turn a read-only workload into one with a significant metadata update component (in particular if you are reading many small files, meaning many atimes need to be updated).
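To make the small-write cost concrete, here is the textbook single-parity accounting (read old data, read old parity, write new data, write new parity: four disk IOs per small write). The per-disk IOPS figure is an assumption, and implementations differ (RAID-Z, for instance, writes full variable-width stripes rather than doing an in-place read-modify-write), so treat this as an illustration of the shape of the problem:

    # Back-of-the-envelope: what a 20% small-write mix costs on a
    # single-parity array with the classic read-modify-write penalty.
    disks = 5
    iops_per_disk = 150           # assumed random IOPS for one 7200rpm drive
    write_fraction = 0.20
    ios_per_read = 1              # a random read touches one disk
    ios_per_small_write = 4       # read-modify-write on single parity

    total_disk_iops = disks * iops_per_disk
    cost_per_app_io = (1 - write_fraction) * ios_per_read + write_fraction * ios_per_small_write
    print(f"pure random read : {total_disk_iops / ios_per_read:.0f} application IOPS")
    print(f"80/20 read/write : {total_disk_iops / cost_per_app_io:.0f} application IOPS")

Under these assumptions the 20% write mix costs more than a third of the achievable IOPS, which is why it matters whether the workload really is read-only.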
If you have small random writes, one great (but difficult) strategy is to change the application to get rid of them. Maybe implement your own log, where the writes are temporarily stored in a sequentially written file, and then have a separate process that asynchronously writes that log back to the correct location.
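A minimal sketch of that idea, assuming Python and hypothetical file names; a real version also needs crash recovery, locking, and a way for readers to see not-yet-replayed writes:

    import os
    import struct

    LOG_PATH = "write.log"      # hypothetical sequential log file
    DATA_PATH = "data.bin"      # hypothetical large data file

    def log_write(offset, payload):
        """Append an (offset, length, payload) record to the sequential log
        instead of issuing a small random write into the big data file."""
        with open(LOG_PATH, "ab") as log:
            log.write(struct.pack("<QI", offset, len(payload)) + payload)
            log.flush()
            os.fsync(log.fileno())    # durable once the log record is on disk

    def replay_log():
        """Separate, asynchronous pass: apply the logged writes to their real
        locations in the data file, then truncate the log."""
        with open(LOG_PATH, "rb") as log, open(DATA_PATH, "r+b") as data:
            while True:
                header = log.read(12)
                if len(header) < 12:
                    break
                offset, length = struct.unpack("<QI", header)
                data.seek(offset)
                data.write(log.read(length))
            data.flush()
            os.fsync(data.fileno())
        os.truncate(LOG_PATH, 0)

The point is that the latency-critical path only ever appends to one file, which disks and parity RAID handle well, while the expensive random writes happen later, in bulk, off the critical path.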
If your workload is metadata read intensive, it could be that metadata is the real bottleneck. In that case, moving all the metadata to SSD would help (but the reliability problem you point out is real).
Lastly, if you use the 2.5" slot to handle the OS, then maybe you don't need to actually use an SSD, but you can use a SFF (laptop-sized) spinning disk. Those are obviously much cheaper than SSDs and have better endurance.