Solved OpenZFS: graphing ARC and L2ARC

grahamperrin

Son of Beastie

Reaction score: 830
Messages: 2,681

Please: can anyone tell, or guess, what's used for the graphs at and under <https://github.com/openzfs/zfs/issues/10508#issuecomment-940713843>?

As far as I can tell, the graphs there are ARC-focused, although the comment does mention "… 512 GB available in the L2ARC. …".

If not difficult, I'd like to produce graphs with some focus on L2ARC. Context:




From 2013 (the image is no longer available):

You will get a significant performance boost. The below graph is from a very busy web server. I usually get 60-70 % hit ratio. The system was rebooted on the 10th of May. Look how quickly L2ARC catches up.

<http://www.aisecure.net/wp-content/uploads/2013/05/zfs_stats_l2efficiency-week.png>

More recent (2019):

… That machine also has L2ARC, but the hit rate isn't that great. Even after the L2ARC is 'warm', it's only about 25%.
 

SirDice

Administrator
Staff member
Moderator

Reaction score: 12,744
Messages: 39,332

If you look at the code of sysutils/zfs-stats, you'll see that it gets most of its information from sysctl(8). After that, it's just a matter of graphing the values with MRTG, Cacti or Zabbix.
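As a rough illustration of the sysctl(8) approach, here is a minimal sketch that computes an L2ARC hit ratio from the arcstats counters. It assumes the FreeBSD names kstat.zfs.misc.arcstats.l2_hits and kstat.zfs.misc.arcstats.l2_misses; the helper name l2_hit_ratio is made up for this example and is not taken from zfs-stats.

```shell
#!/bin/sh
# Sketch only: L2ARC hit ratio from the arcstats counters.
# l2_hit_ratio is a hypothetical helper, not part of sysutils/zfs-stats.

# l2_hit_ratio HITS MISSES
# Prints the hit percentage with two decimals, or "n/a" if both are zero.
l2_hit_ratio() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        if (h + m == 0) { print "n/a"; exit }
        printf "%.2f\n", 100 * h / (h + m)
    }'
}

# Query the kernel only when the counters exist (FreeBSD with ZFS loaded).
if hits=$(sysctl -n kstat.zfs.misc.arcstats.l2_hits 2>/dev/null) &&
   misses=$(sysctl -n kstat.zfs.misc.arcstats.l2_misses 2>/dev/null); then
    echo "L2ARC hit ratio since boot: $(l2_hit_ratio "$hits" "$misses") %"
fi
```

The counters are cumulative since boot, so this single figure is an average over the whole uptime. A graphing tool would normally sample the counters repeatedly and plot the differences.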
 
OP
grahamperrin


I forgot that net-mgmt/netdata is already installed.
Under efficiencies, for L2, there's a hidden anchor <http://localhost:19999/#chart_zfs_l2hits> that seems to be slightly broken by the page header. In some cases there's not only overlap but also wobble (maybe caused by the overlap).

A manually edited URL: <http://localhost:19999/#chart_zfs_l2hits;after=-86400;before=0;theme=slate> then, after scrolling a fraction to avoid the overlap:

1637726580540.png

zfs-mon (not a graph) shortly afterwards:

1637726957314.png
 

Beastie7

Aspiring Daemon

Reaction score: 616
Messages: 727

Does KDE system monitor support ARC and L2ARC? That'd be a really neat addition for users and sys admins.
 
OP
grahamperrin


For now, I'll mark this topic Solved without doing what I originally intended:

… produce graphs with some focus on L2ARC. …

Results over time might not be suitably meaningful, for two reasons:
  1. zpool-iostat(8): unreasonably/impossibly high alloc and free measurements for two cache devices (simple USB thumb drives) · Issue #12779 · openzfs/zfs
  2. occasional sudden chilling, which (for now) I'll attribute to inadequacies involving USB:

1637887237430.png


– <http://localhost:19999/#menu_zfs_submenu_size;after=-410000;before=0>

Code:
% grep -e "<BOOT>" -e suspend /var/log/messages
Nov 22 16:13:41 mowa219-gjp4-8570p-freebsd acpi[50039]: suspend at 20211122 16:13:41
Nov 23 02:24:06 mowa219-gjp4-8570p-freebsd acpi[50562]: suspend at 20211123 02:24:06
Nov 23 07:21:21 mowa219-gjp4-8570p-freebsd acpi[58693]: suspend at 20211123 07:21:21
Nov 23 16:06:44 mowa219-gjp4-8570p-freebsd acpi[88629]: suspend at 20211123 16:06:44
Nov 24 08:31:49 mowa219-gjp4-8570p-freebsd acpi[17820]: suspend at 20211124 08:31:49
Nov 24 16:24:47 mowa219-gjp4-8570p-freebsd acpi[43156]: suspend at 20211124 16:24:47
Nov 25 07:20:43 mowa219-gjp4-8570p-freebsd kernel: ---<<BOOT>>---
Nov 25 07:27:12 mowa219-gjp4-8570p-freebsd kernel: ---<<BOOT>>---
Nov 25 07:44:41 mowa219-gjp4-8570p-freebsd acpi[3921]: suspend at 20211125 07:44:41
Nov 25 16:11:52 mowa219-gjp4-8570p-freebsd acpi[32194]: suspend at 20211125 16:11:52
%
 

Alain De Vos

Son of Beastie

Reaction score: 784
Messages: 2,566

You could write a script that regularly checks the values and writes them to a time-series database.
Grafana can then produce nice graphs from that data.
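A minimal sketch of such a script follows, under stated assumptions. It uses the FreeBSD sysctl names kstat.zfs.misc.arcstats.l2_hits, l2_misses and l2_size, and it appends to a plain CSV file as a stand-in for a real time-series database. The function names and the output path are hypothetical.

```shell
#!/bin/sh
# Sketch only: sample L2ARC counters and log per-interval hit ratios.
# A CSV file stands in for a time-series database; all names are hypothetical.

# interval_ratio PREV_HITS PREV_MISSES CUR_HITS CUR_MISSES
# Prints the hit percentage for the interval, or "n/a" when there was no
# L2ARC traffic or a counter went backwards (for example after a reboot).
interval_ratio() {
    awk -v ph="$1" -v pm="$2" -v ch="$3" -v cm="$4" 'BEGIN {
        dh = ch - ph; dm = cm - pm
        if (dh < 0 || dm < 0 || dh + dm == 0) { print "n/a"; exit }
        printf "%.2f\n", 100 * dh / (dh + dm)
    }'
}

# sample_loop OUTFILE INTERVAL_SECONDS
# Appends "epoch,l2_size_bytes,interval_hit_percent" once per interval.
sample_loop() {
    out=$1; interval=$2
    ph=$(sysctl -n kstat.zfs.misc.arcstats.l2_hits) || return 1
    pm=$(sysctl -n kstat.zfs.misc.arcstats.l2_misses) || return 1
    while sleep "$interval"; do
        ch=$(sysctl -n kstat.zfs.misc.arcstats.l2_hits)
        cm=$(sysctl -n kstat.zfs.misc.arcstats.l2_misses)
        size=$(sysctl -n kstat.zfs.misc.arcstats.l2_size)
        echo "$(date +%s),$size,$(interval_ratio "$ph" "$pm" "$ch" "$cm")" >> "$out"
        ph=$ch; pm=$cm
    done
}

# Example (hypothetical path): sample_loop /var/tmp/l2arc.csv 60
```

Logging l2_size alongside the ratio should make a sudden drop in cached data visible in the same series. In a real setup the echo line would be replaced by whatever write command the chosen database provides.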
 
OP
grahamperrin


… occasional sudden chilling, which (for now) I'll attribute to inadequacies involving USB: …

For reference:

… I think there are 2 ways to expire the cache unless there is a bug: stream of new l2arc writes and deletions of files/snapshots/datasets. …

… your speculation that a small set of files (=1 in this case) get constantly overwritten is exactly what happened here, probably filling up the L2ARC with blocks that soon after get deleted. …

… The L2ARC is implemented as an on-disk ring buffer which is why eventually all of the block were overwritten. This is at the heart of the design …

– <https://github.com/openzfs/zfs/discussions/11209#discussioncomment-232924> under Almost empty L2ARC on raidz2 (bleeding out after reboot).

Ignoring (for a moment) the RAIDZ2 aspect: I suspect that my current case is unrelated. Sudden chills are sometimes noticeable after boot, but there might be a different explanation.
 