Solved (Solved: BEWARE of ZFS bug!) AData Legend 960: problems as L2ARC - does somebody use this NVMe SSD?

blacklion

Developer
After upgrading the platform of my home storage server to a motherboard with an M.2 socket and enough PCIe lanes (though only 3.0), I added a new (unused) AData Legend 960 NVMe SSD as L2ARC (this SSD has best-in-class sustained write performance, when it is full).

Almost immediately I got a livelock related to ZFS (with a "z_rd_int_%d" thread consuming 100% of one CPU thread). It has happened twice.

No nvme-related errors are shown, and `smartctl` shows no errors. It looks like a rather unique problem, though, as I cannot find anything like it on Google.

Does anybody know about problems with this SSD on FreeBSD?

Does anybody know about a ZFS problem like this?

My system is FreeBSD 13.2-STABLE stable/13-n256849-05c55eed44e5
 
The only reports I've found are that it is unstable in a PS5. The SM2264F is also not a very common controller overall, so you might be looking at some odd controller/firmware issue. People have reported that some WD drives also do not play nicely, but other than that I haven't heard of any ZFS-related issues.
 
T500 looks very strange: in "write when full" mode its performance looks unpredictable, and it requires very high over-provisioning to be comparable with P5 Plus and FireCuda.

The P5 Plus and FireCuda 530 look comparable; which one is more performant depends on the benchmark methodology (different sites show different relative results), but neither is ever bad.
 
But anyway, it would be better to fix the AData Legend. The problem is that I cannot reproduce the livelock without ZFS (all low-level torture scenarios work as expected!), and with ZFS it is almost impossible to diagnose: ZFS people point toward hardware people and vice versa.
 
(this SSD has best-in-class sustained write performance, when it is full).
I have never heard of half the brands recommended here. Adata I would never consider in my life.
I want a storage company that has been around a while.

You want solid, get Samsung or Intel enterprise drives.

You seem to be tasking a consumer NVMe with a server task (L2ARC).
Server NVMe drives are not that much more.

Heck Optane seems perfectly suited and they are dirt cheap now.
 
Actually the Optane drive I was thinking of has terrible performance.

It seems to me L2ARC and SLOG really don't need 1TB sized drives.
These Intel Optane are only 58GB which seems ideal for L2ARC or SLOG.

890MB/s write speed. Pathetic.

Sorry for the noise.

Intel consumer 670p M.2 would be on my test list. They are a sleeper.
 
I have never heard of half the brands recommended here. Adata I would never consider in my life.
I want a storage company that has been around a while.

You want solid, get Samsung or Intel enterprise drives.

You seem to be tasking a consumer NVMe with a server task (L2ARC).
Server NVMe drives are not that much more.

Heck Optane seems perfectly suited and they are dirt cheap now.
I don't see "dirt cheap" Optanes on the market. And I don't see any point in having a second-level cache the same size as the first-level one: I have 128G of RAM on this server, so ARC could easily be 120G. The Intel 670p uses QLC flash, not TLC. Sorry, it is a device in another league (and not a higher one): it has terrible performance when the SLC cache is full, like any other QLC drive.

Affordable server SSDs are the same hardware as consumer ones, but with a higher price tag (and slightly tweaked firmware). True server SSDs cost five to ten times more per byte; I cannot afford that for a home NAS :)

Actually the Optane drive I was thinking of has terrible performance.

It seems to me L2ARC and SLOG really don't need 1TB sized drives.
These Intel Optane are only 58GB which seems ideal for L2ARC or SLOG.

890MB/s write speed. Pathetic.

Sorry for the noise.

Intel consumer 670p M.2 would be on my test list. They are a sleeper.
58GB of L2 cache is pathetic when you have 120GB of L1 cache :)

I see IX has this to say
It is interesting, thank you.
 
I have never heard of half the brands recommended here. Adata I would never consider in my life.
I want a storage company that has been around a while.

You want solid, get Samsung or Intel enterprise drives.

You seem to be tasking a consumer NVMe with a server task (L2ARC).
Server NVMe drives are not that much more.

Heck Optane seems perfectly suited and they are dirt cheap now.
You mean drives with half-working firmware and unresolved bugs (Samsung)? :)
 
I mean Samsung PM983A which are purring along nicely. I got eight and want more.

Same as consumer drives except firmware?
Are you really saying that? Notice all the pretty PLP chips....
 
PM983 are not special, just OK. PM17xx is full enterprise. PM9A3 is the mid-line PCIe4 replacement for PM983. Faster.

I did not mean to come across as so brand exclusive. But for storage I am choosy.

I have never considered Adata and have bias against them. Corsair too.

I did dip my toes into Micron Enterprise grade NVMe.

Kioxia is a wildcard. They don't make AIC but their U.2 and U.3 offerings are impressive. Rulers too.

Seagate is off the list for buying HGST and ruining that line. I own some drives but no NVMe except a 2230 on a RockPi4a.
 
PM983 are not special, just OK. PM17xx is full enterprise. PM9A3 is the mid-line PCIe4 replacement for PM983. Faster.

I did not mean to come across as so brand exclusive. But for storage I am choosy.

I have never considered Adata and have bias against them. Corsair too.

I did dip my toes into Micron Enterprise grade NVMe.

Kioxia is a wildcard. They don't make AIC but their U.2 and U.3 offerings are impressive. Rulers too.

Seagate is off the list for buying HGST and ruining that line. I own some drives but no NVMe except a 2230 on a RockPi4a.
You see, this is not primary storage, it is a volatile cache. Yes, I would never buy AData for primary storage (Crucial is effectively Micron, BTW), but a cache needs the best write-on-full-drive performance, nothing more :) Even the Samsung 990 Pro (which I use as primary storage on my desktop and laptop) is surprisingly bad at this task. And AData is not cheaper than Samsung, BTW, so it is not a cheap-out choice.
Of course, all this is valid only if it works :)
and it looks like it didn't :(
 
So your custom sysctl is what's causing the issue?
vfs.zfs.compressed_arc_enabled=0

Thanks so much for the detailed report.
I am building a similar rig with 4x12TB drives. Mirror and striped.
Limited RAM on C246 means 64GB.

What do you think of FreeBSD OS on same NVMe as L2ARC???
Bad idea? Samsung PM9A1 M.2 module... 2280 only on this board.

What about NVMe namespaces. Are they only available on cabinets?
All my drives only offer single namespace.(That I am aware of)
 
So your custom sysctl is what's causing the issue?
vfs.zfs.compressed_arc_enabled=0
Yes.
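
For anyone who wants to check their own box for the same setting, here is a minimal Python sketch, assuming the tunable was set in a sysctl.conf-style file. The `find_setting` helper and the sample file contents are hypothetical, made up for illustration; the code only parses text and does not query the running kernel.

```python
def find_setting(conf_text, name):
    """Return the last value assigned to `name` in sysctl.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, val = line.split("=", 1)
        if key.strip() == name:
            # Later assignments override earlier ones.
            value = val.strip().strip('"')
    return value


# Hypothetical file contents, for illustration only.
sample = """
# local tuning
kern.ipc.somaxconn=1024
vfs.zfs.compressed_arc_enabled=0   # the setting discussed in this thread
"""

value = find_setting(sample, "vfs.zfs.compressed_arc_enabled")
if value == "0":
    print("compressed ARC is disabled; reconsider this before adding an L2ARC device")
else:
    print("compressed ARC is not disabled in this file")
```

On a real system you would feed it the contents of whatever file holds your custom tunables, or simply compare against the live value reported by `sysctl vfs.zfs.compressed_arc_enabled`.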

Thanks so much for the detailed report.
I am building a similar rig with 4x12TB drives. Mirror and striped.
Limited RAM on C246 means 64GB.

What do you think of FreeBSD OS on same NVMe as L2ARC???
Bad idea? Samsung PM9A1 M.2 module... 2280 only on this board.
I think that, as the system drive is mostly R/O, it will be OK. Putting /var somewhere else may be a good idea... but that depends mostly on activity in /var/log.

What about NVMe namespaces. Are they only available on cabinets?
All my drives only offer single namespace.(That I am aware of)
I have never seen an NVMe with multiple namespaces, but I have seen only "end-user" SSDs: no U.2 format, nothing for full-size PCIe slots, nothing fancy.
 
"PM983a"+"samsung.com" Mention on quite a few pages?
Yes, they are not Amazon drives; that was BS. Samsung Enterprise models use these suffixes regularly,
e.g. PM1725a/PM1725b and PM1733a/PM1735a.
Probably just revisions.

What is nice is M.2 PM983 only has half the PLP capacitor chips populated while M.2 PM983a has them all.

I also found out why I had much higher rates with one PM1725 AIC card versus another.
The 3.2TB model has double the write speed of 1.6TB model.

put /var somewhere else is good idea
I agree, this is a frequently written directory.
I have experimented with mounting it on a memdisk from /etc/fstab.
The problem is that it contains /var/db/pkg/ too, and that must be preserved.
So you need to be more granular with offloading.

What are your thoughts on swap and Solid State Drives? I worry about writes with my UFS systems.

The Klara article indicates you can tune for prescribed life.
An unthrottled L2ARC feed thread might be capable of utterly wrecking its CACHE vdevs in a matter of months.

So it seems a great torture test for an enterprise NVMe with 5DWPD
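
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. All figures (a 1 TB drive, 5 DWPD, a 5-year rating, a 500 MB/s sustained feed) are illustrative assumptions, not measurements or specs of any drive mentioned in this thread, and the function names are made up for the example.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12
MB = 10**6


def rated_endurance_bytes(capacity_bytes, dwpd, warranty_years):
    """Total rated writes: capacity * drive-writes-per-day * days of warranty."""
    return capacity_bytes * dwpd * warranty_years * 365


def days_until_worn(endurance_bytes, feed_bytes_per_sec):
    """Days of continuous writing at the given rate until the rating is used up."""
    return endurance_bytes / (feed_bytes_per_sec * SECONDS_PER_DAY)


def max_feed_rate(endurance_bytes, target_years):
    """Highest sustained write rate (bytes/s) that still lasts target_years."""
    return endurance_bytes / (target_years * 365 * SECONDS_PER_DAY)


# Assumed drive: 1 TB, 5 DWPD, rated for 5 years.
endurance = rated_endurance_bytes(1 * TB, dwpd=5, warranty_years=5)

print(f"days at 500 MB/s: {days_until_worn(endurance, 500 * MB):.0f}")
print(f"rate for a 5-year life: {max_feed_rate(endurance, 5) / MB:.1f} MB/s")
```

With these assumed numbers the rated endurance is gone in about 211 days of a sustained 500 MB/s feed, while capping the feed near 58 MB/s would stretch it to the full 5 years. In practice the feed rate is bounded by tunables such as `l2arc_write_max`, so the real figure depends on how those are set and on how busy the pool is.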
 
I have never seen an NVMe with multiple namespaces, but I have seen only "end-user" SSDs: no U.2 format, nothing for full-size PCIe slots, nothing fancy.
Not sure if "Intel DC P3100 Series" can be called "end-user" (it's 2280 format), but it does support namespace management while not supporting reservations, so I wouldn't really call it server-grade, as compared to e.g. "WD/HGST Ultrastar SN200 1.6TB NVMe U.2" supporting both namespace management and reservations (just looking at the lab equipment).

And, FWIW, I'm using ADATA Legend 960 (4TB model) on my win11 desktop, don't have any issues with it and the temperatures are about 5-10 degrees lower than Samsung NVMe (same cooling plate above both).
 