L2ARC degraded

I have a new ZFS deployment running FreeBSD 9.3 with an Intel DC S3500 as L2ARC. SMART monitoring tools show no errors on the drive, but zfs-stats shows the following:

Code:
L2 ARC Summary: (DEGRADED)
        Passed Headroom:                        23.31m
        Tried Lock Failures:                    4.28m
        IO In Progress:                         13.87k
        Low Memory Aborts:                      103
        Free on Write:                          1.44m
        Writes While Full:                      2.40m
        R/W Clashes:                            1.74k
        Bad Checksums:                          1.15m
        IO Errors:                              501.55k
        SPA Mismatch:                           11

Code:
kstat.zfs.misc.arcstats.l2_compress_successes: 51418193
kstat.zfs.misc.arcstats.l2_compress_zeros: 0
kstat.zfs.misc.arcstats.l2_compress_failures: 108469007
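
For reference, the checksum and I/O error counters behind that DEGRADED summary can be read directly with sysctl(8):

Code:
sysctl kstat.zfs.misc.arcstats.l2_cksum_bad
sysctl kstat.zfs.misc.arcstats.l2_io_error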

I have identical hardware deployed on FreeBSD 9.2 without this problem.
It looks like this may be related to L2ARC compression, according to the URLs below:


https://bugs.freenas.org/projects/f...ions/ededc82f03e5dd5a518ccda3b0aae506fbfd8fc8

http://svnweb.freebsd.org/base?view=revision&sortby=file&revision=256889

https://bugs.freenas.org/issues/3418

http://lists.freebsd.org/pipermail/freebsd-current/2013-October/045088.html

Until the above fixes show up in release versions, is there anything that can be done now to solve this issue? I would prefer to stick with release versions. The server is otherwise working fine. Thanks.
 
Thanks. I looked at arc.c and 9.3 does indeed include that patch, but it didn't fix the issue. It would have been nice if L2ARC compression could simply be disabled via sysctl for now.
 
I can confirm, running 9.3-RELEASE, that this is happening. This is on Intel 320 120GB SSDs attached to the motherboard ICH9 controller ports with AHCI enabled, on a SuperMicro X7SBE board. I've been using this set of hardware since 8.0 and never saw the problem until 9.3-RELEASE introduced L2ARC compression. I tend to trust Intel to report brokenness via SMART. The ashift on my root pool and 15k RPM stripe/mirror pool is 9; on my 7200 RPM raidz3 pool it's 12.
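
For reference, the ashift values can be confirmed from the pool configuration with zdb(8), something like:

Code:
zdb -C rpool | grep ashift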

Any updates or any way to disable L2ARC compression?
 
ZFSZealot said:
Any updates or any way to disable L2ARC compression?
Not sure if it's going to help but have you tried removing and re-adding the L2ARC?
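
Removing a cache vdev is non-destructive, so it's a cheap experiment; roughly (pool and device names are only examples):

Code:
zpool remove tank gpt/l2arc0
zpool add tank cache gpt/l2arc0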
 
@SirDice, yes, as an experiment, I removed the L2ARC from all of the pools and then added them back to just rpool and my 15k RPM pool, as both of them are ashift=9. I'm waiting for those to fill up before declaring that the errors aren't increasing (sysctl of course still shows the same number of errors as I haven't rebooted). Right now I'm seeing no increase in errors (kstat.zfs.misc.arcstats.l2_cksum_bad and kstat.zfs.misc.arcstats.l2_io_error) with almost 60GB of L2ARC written. It would be nice to be able to isolate this to just situations where ashift=12, as is the case with my bulk storage pool.
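
A simple loop is enough to watch those counters over time while the devices fill, something like:

Code:
# Log the L2ARC error counters every 10 minutes
while :; do
    date
    sysctl kstat.zfs.misc.arcstats.l2_cksum_bad \
        kstat.zfs.misc.arcstats.l2_io_error
    sleep 600
done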
 
ZFSZealot said:
@SirDice, yes, as an experiment, I removed the L2ARC from all of the pools and then added them back to just rpool and my 15k RPM pool, as both of them are ashift=9. I'm waiting for those to fill up before declaring that the errors aren't increasing (sysctl of course still shows the same number of errors as I haven't rebooted). Right now I'm seeing no increase in errors (kstat.zfs.misc.arcstats.l2_cksum_bad and kstat.zfs.misc.arcstats.l2_io_error) with almost 60GB of L2ARC written. It would be nice to be able to isolate this to just situations where ashift=12, as is the case with my bulk storage pool.

Since you never posted an update to this, I'm asking: how did it go?

/Sebulon
 
Thanks for checking back. Yes, unfortunately I do see "degraded" even with the L2ARC devices on only the ashift=9 pools. It seems to show up when the L2ARC devices fill completely. The other strange thing is that far more cache appears to be allocated than there is space for (zfs-stats -L):

Code:
L2 ARC Summary: (DEGRADED)
  Passed Headroom:  87.71m
  Tried Lock Failures:  102.49m
  IO In Progress:  14.96k
  Low Memory Aborts:  194.42k
  Free on Write:  338.78k
  Writes While Full:  57.84k
  R/W Clashes:  7.84k
  Bad Checksums:  11.20m
  IO Errors:  956.77k
  SPA Mismatch:  142.81b

L2 ARC Size: (Adaptive)  222.59  GiB
  Header Size:  2.00% 4.45  GiB

L2 ARC Evicts:
  Lock Retries:  291
  Upon Reading:  0

L2 ARC Breakdown:  251.27m
  Hit Ratio:  16.09% 40.42m
  Miss Ratio:  83.91% 210.85m
  Feeds:  4.02m

L2 ARC Buffer:
  Bytes Scanned:  2.17 PiB
  Buffer Iterations:  4.02m
  List Iterations:  256.91m
  NULL List Iterations:  69.43m

L2 ARC Writes:
  Writes Sent:  100.00%  1.85m

Output of zpool list -v:

Code:
NAME                 SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool                 85G  48.2G  36.8G    56%  1.00x  ONLINE  -
  mirror              85G  48.2G  36.8G      -
    gpt/disk1           -      -      -      -
    gpt/disk2           -      -      -      -
    gpt/disk0           -      -      -      -
cache                   -      -      -      -      -       -
  ada3              14.9G  14.9G  8.00M      -
sas15k              1.09T   683G   429G    61%  1.00x  ONLINE  -
  mirror             278G   171G   107G      -
    da9                 -      -      -      -
    da13                -      -      -      -
  mirror             278G   171G   107G      -
    da10                -      -      -      -
    da14                -      -      -      -
  mirror             278G   171G   107G      -
    da11                -      -      -      -
    da15                -      -      -      -
  mirror             278G   171G   107G      -
    da12                -      -      -      -
    da16                -      -      -      -
cache                   -      -      -      -      -       -
  gpt/sas15kl2arc   32.0G  32.0G  6.95M      -
sata7k                20T  14.0T  6.02T    69%  1.00x  ONLINE  -
  raidz3              20T  14.0T  6.02T      -
    da8                 -      -      -      -
    da3                 -      -      -      -
    da4                 -      -      -      -
    da7                 -      -      -      -
    da22                -      -      -      -
    da20                -      -      -      -
    da5                 -      -      -      -
    da21                -      -      -      -
    da6                 -      -      -      -
    da0                 -      -      -      -
    da2                 -      -      -      -
cache                   -      -      -      -      -       -
  gpt/sata7kl2arc   32.0G  32.0G     8M      -
scratch             79.5G  2.00G  77.5G     2%  1.00x  ONLINE  -
  gpt/scratch       79.5G  2.00G  77.5G      -

Notice I have a total of 15+32+32 = 79GB of L2ARC devices, but zfs-stats -L says the L2ARC size is 222GB.

If this isn't a problem anyone else is seeing, I'm willing to blame hardware. The L2ARC devices are Intel 320 series SSDs on a SuperMicro X7SBE motherboard SATA controller (Intel ICH9). That hardware has been quite stable, though, and I don't see any SMART errors, and it's odd that "degraded" doesn't show up until the L2ARC devices are full. Also, any hardware failure would have had to coincide exactly with my update to 9.3.

Is there any way to see which one of the L2ARC devices ZFS thinks is reading bad data?
 
And now I've moved to 10.1-RELEASE on a different server (a PowerEdge 2950) with different SSDs (but the same spinning disks and external enclosures), and I've found it's happening there also. Another strange artifact: notice it says 16.0E (exabytes?!) are free on my "sata7kl2arc" device:

Code:
root@cadence:/ # zpool list -v
NAME                                     SIZE  ALLOC   FREE   FRAG  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
rpool                                    247G  11.6G   235G      -         -     4%  1.00x  ONLINE  -
  mirror                                 247G  11.6G   235G      -         -
    da1p3                                   -      -      -      -         -
    da0p3                                   -      -      -      -         -
cache                                       -      -      -      -      -      -
  gpt/rpooll2arc                        16.0G  1.15G  14.8G     0%         -
sas15k                                  1.09T   678G   434G      -         -    60%  1.00x  ONLINE  -
  mirror                                 278G   170G   108G      -         -
    diskid/DISK-BJ00PA1041UG                -      -      -      -         -
    diskid/DISK-BJ00P86011L5                -      -      -      -         -
  mirror                                 278G   170G   108G      -         -
    diskid/DISK-BJ00PA1040D3                -      -      -      -         -
    diskid/DISK-BJ00P86011F0                -      -      -      -         -
  mirror                                 278G   169G   109G      -         -
    diskid/DISK-BJ00PA1041YB                -      -      -      -         -
    diskid/DISK-BJ00PA1041P5                -      -      -      -         -
  mirror                                 278G   170G   108G      -         -
    diskid/DISK-BJ00PA1041H9                -      -      -      -         -
    diskid/DISK-BJ00P86011FU                -      -      -      -         -
cache                                       -      -      -      -      -      -
  gpt/sas15kl2arc                       32.0G  17.8G  14.2G     0%         -
sata7k                                    30T  14.5T  15.5T      -         -    48%  1.00x  ONLINE  -
  raidz3                                  30T  14.5T  15.5T      -         -
    diskid/DISK-%20%20%20%20%20%20MK0371YHK8475A      -      -      -      -         -
    diskid/DISK-%20%20%20%20%20%20MK0371YHK524RG      -      -      -      -         -
    diskid/DISK-%20%20%20%20%20%20MK0371YHK9K5YA      -      -      -      -         -
    diskid/DISK-%20%20%20%20%20%20MK0371YHK9DPNA      -      -      -      -         -
    diskid/DISK-%20%20%20%20%20%20MK0351YHGEW0DA      -      -      -      -         -
    diskid/DISK-%20%20%20%20%20%20PN2234P8KKP4WY      -      -      -      -         -
    diskid/DISK-%20%20%20%20%20%20MK0371YHK9EATA      -      -      -      -         -
    diskid/DISK-%20%20%20%20%20%20MK0371YHHRUSMA      -      -      -      -         -
    diskid/DISK-%20%20%20%20%20%20MK0371YHJZB13G      -      -      -      -         -
    diskid/DISK-%20%20%20%20%20%20MK0371YHK9K6TA      -      -      -      -         -
    diskid/DISK-%20%20%20%20%20%20MK0371YHK9DX3A      -      -      -      -         -
cache                                       -      -      -      -      -      -
  gpt/sata7kl2arc                       32.0G  41.3G  16.0E     0%         -

I've tested, wiped and tested the SSDs again and again. SMART reports are clean. Pool status looks OK also:

Code:
root@cadence:/ # zpool status
  pool: rpool
state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on software that does not support feature
        flags.
  scan: scrub repaired 0 in 0h7m with 0 errors on Mon Feb  9 20:56:59 2015
config:

        NAME              STATE     READ WRITE CKSUM
        rpool             ONLINE       0     0     0
          mirror-0        ONLINE       0     0     0
            da1p3         ONLINE       0     0     0
            da0p3         ONLINE       0     0     0
        cache
          gpt/rpooll2arc  ONLINE       0     0     0

errors: No known data errors

  pool: sas15k
state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on software that does not support feature
        flags.
  scan: scrub repaired 0 in 1h23m with 0 errors on Sun Feb  8 12:26:21 2015
config:

        NAME                          STATE     READ WRITE CKSUM
        sas15k                        ONLINE       0     0     0
          mirror-0                    ONLINE       0     0     0
            diskid/DISK-BJ00PA1041UG  ONLINE       0     0     0
            diskid/DISK-BJ00P86011L5  ONLINE       0     0     0
          mirror-1                    ONLINE       0     0     0
            diskid/DISK-BJ00PA1040D3  ONLINE       0     0     0
            diskid/DISK-BJ00P86011F0  ONLINE       0     0     0
          mirror-2                    ONLINE       0     0     0
            diskid/DISK-BJ00PA1041YB  ONLINE       0     0     0
            diskid/DISK-BJ00PA1041P5  ONLINE       0     0     0
          mirror-3                    ONLINE       0     0     0
            diskid/DISK-BJ00PA1041H9  ONLINE       0     0     0
            diskid/DISK-BJ00P86011FU  ONLINE       0     0     0
        cache
          gpt/sas15kl2arc             ONLINE       0     0     0

errors: No known data errors

  pool: sata7k
state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on software that does not support feature
        flags.
  scan: scrub repaired 0 in 7h42m with 0 errors on Sat Feb  7 00:32:55 2015
config:

        NAME                                              STATE     READ WRITE CKSUM
        sata7k                                            ONLINE       0     0     0
          raidz3-0                                        ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20MK0371YHK8475A  ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20MK0371YHK524RG  ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20MK0371YHK9K5YA  ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20MK0371YHK9DPNA  ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20MK0351YHGEW0DA  ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20PN2234P8KKP4WY  ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20MK0371YHK9EATA  ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20MK0371YHHRUSMA  ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20MK0371YHJZB13G  ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20MK0371YHK9K6TA  ONLINE       0     0     0
            diskid/DISK-%20%20%20%20%20%20MK0371YHK9DX3A  ONLINE       0     0     0
        cache
          gpt/sata7kl2arc                                 ONLINE       0     0     0

errors: No known data errors

The pools are all version 28, and I've hesitated to upgrade them as I don't need any of the newer features, and this way they'll still import into Solaris if need be.

Should I post a PR per the handbook? Is there any other info I can provide?
 
Hey man!

Try going up to 10.1-STABLE instead. It seems a lot was fixed after 10.1-RELEASE:
FreeBSD 10.1-STABLE #0 r277949
Code:
# zpool list -v
NAME            SIZE  ALLOC   FREE  EXPANDSZ  FRAG   CAP  DEDUP  HEALTH  ALTROOT
pool1          3.97G   639M  3.34G         -   11%   15%  1.00x  ONLINE  -
  mirror       3.97G   639M  3.34G         -   11%   15%
    gpt/disk1      -      -      -         -     -     -
    gpt/disk2      -      -      -         -     -     -
pool2          59.5G  26.2G  33.3G         -   40%   44%  1.00x  ONLINE  -
  raidz1       29.8G  13.1G  16.7G         -   41%   43%
    gpt/disk3      -      -      -         -     -     -
    gpt/disk4      -      -      -         -     -     -
    gpt/disk5      -      -      -         -     -     -
  raidz1       29.8G  13.1G  16.7G         -   40%   44%
    gpt/disk6      -      -      -         -     -     -
    gpt/disk7      -      -      -         -     -     -
    gpt/disk10     -      -      -         -     -     -
cache              -      -      -         -     -     -
  gpt/disk9    9.99G  9.19G   820M         -    0%   91%

/Sebulon
 
I can give it a shot, as I can always get back through the magic of snapshots, but this is production rather than a test box, so it may be a few days... In the meantime, is it worth filing a PR?
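
The "magic of snapshots" here is just a recursive snapshot before the update, roughly (the boot dataset name is an example):

Code:
# Before tracking 10.1-STABLE
zfs snapshot -r rpool@pre-stable
# If it misbehaves, roll the affected datasets back, e.g.:
# zfs rollback rpool/ROOT/default@pre-stable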

At the risk of possibly conflating different problems, and acknowledging this was on 9.2, I'd say this sounds exactly like https://bugs.freenas.org/issues/5347, but that one was determined by the reporter to be a hardware problem. As with the cited FreeNAS bug, my problem doesn't show until one of the L2ARC devices gets full. Again, I never saw this problem until L2ARC compression was committed.

I'm using Intel 120GB 320 and 330 SSDs (da2, da3, da5) in a PowerEdge 2950, connected to the LSI 1068e-based controller in the storage slot, so very different hardware. I've partitioned smaller 32GB and 16GB slices to constrain the L2ARC size, as the server only has 32GB of RAM.
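
Constraining the size is just a matter of handing ZFS a fixed-size partition instead of the whole SSD, roughly (size and label are examples):

Code:
gpart create -s gpt da2
gpart add -t freebsd-zfs -s 32G -l sas15kl2arc da2
zpool add sas15k cache gpt/sas15kl2arc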
 
I have the same problem, but only on one server with FreeBSD 10.1. Unfortunately the server needed a reboot two days ago, so everything looks fine again.
Before the reboot it showed around 100k+ IO errors on 2x http://ark.intel.com/products/56601/Intel-SSD-X25-M-Series-80GB-2_5in-SATA-3Gbs-34nm-MLC

At the moment:
Code:
L2 ARC Summary: (HEALTHY)
    Passed Headroom:            3.34m
    Tried Lock Failures:            11.80m
    IO In Progress:                92.44k
    Low Memory Aborts:            44
    Free on Write:                35.06k
    Writes While Full:            3.75k
    R/W Clashes:                1.71k
    Bad Checksums:                0
    IO Errors:                0
    SPA Mismatch:                23.74k

L2 ARC Size: (Adaptive)                130.00    GiB
    Header Size:            2.09%    2.72    GiB

L2 ARC Breakdown:                111.71m
    Hit Ratio:            11.92%    13.31m
    Miss Ratio:            88.08%    98.39m
    Feeds:                    191.58k

L2 ARC Buffer:
    Bytes Scanned:                104.72    TiB
    Buffer Iterations:            191.58k
    List Iterations:            12.11m
    NULL List Iterations:            5.87m

L2 ARC Writes:
    Writes Sent:            100.00%    128.13k

I think it's worth filing a PR.
 

I had a problem similar if not identical to the one described in that bug report, on the original hardware, which had only 8GB of RAM. I was using 120GB + 120GB + 16GB SSD L2ARC devices, which was already far more than the recommended 5x RAM, and when L2ARC compression came along in 9.3-RELEASE it pushed the RAM usage for L2ARC headers to the point where the system was actively swapping and then hung as described. I caught the system before it completely hung a couple of times and removed the L2ARC devices, freeing up a large amount of RAM and restoring the system to a working state. Then I started constraining the size of the L2ARC devices using partitions. I had analyzed it as a misconfiguration, not a bug.
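
For a sense of scale, the header overhead from the zfs-stats output earlier in the thread is easy to sanity-check; roughly 2% of the L2ARC data size ends up as headers in RAM:

Code:
# 2.00% of the 222.59 GiB adaptive L2ARC size reported earlier
echo "scale=2; 222.59 * 0.02" | bc
# 4.45 (GiB) of headers -- crippling on a box with only 8GB of RAM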

Unfortunately, I'm not convinced that PR 197164 has anything to do with the L2ARC IO errors and degraded state, although that also showed up at the same time as L2ARC compression. 197164 seems more like what Karl was describing here: http://lists.freebsd.org/pipermail/freebsd-bugs/2014-March/055604.html.

Someone did already file a PR for the problem described here though: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195746.

I will try updating to 10.1-STABLE in a few days if possible.
 
@ZFSZealot
Now I see what you mean, no that's not fixed:
Code:
cache             -     -      -  -   -     -
  gpt/cache1   238G  306G  16.0E  -  0%  128%
  gpt/cache2   238G  303G  16.0E  -  0%  127%
I'm thinking it's purely cosmetic though, since everything else seems OK. It looks like the "FREE" calculation just freaks out since "ALLOC" > "SIZE", which is possible now due to compression in the L2ARC that didn't exist back when the guys at Sun wrote it.
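
If that's the case, the 16.0E figure fits: assuming FREE is computed as an unsigned 64-bit SIZE - ALLOC, the subtraction wraps to just under 2^64 bytes = 16 EiB whenever ALLOC > SIZE. A quick check with bc(1):

Code:
# SIZE - ALLOC wraps modulo 2^64 when ALLOC (306G) > SIZE (238G)
echo "(2^64 - (306 - 238) * 2^30) / 2^60" | bc -l
# 15.99999993... EiB, which zpool rounds to 16.0E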

/Sebulon
 
And any IO Errors in the L2 ARC Summary?

Yes there are, but they may also just be "normal" if kstat.zfs.misc.arcstats.l2_compress_failures counts as errors? I don't think a failure to compress something is an error; it did its best and just couldn't, and the L2ARC shouldn't be negatively affected by it, in my opinion.

/Sebulon
 
Well, the "degraded L2ARC problem" can't be only compression related. Compression is not enabled on my datasets, and the L2ARC was degraded after 60-70 days of uptime. Before 10.1 the server needed to be restarted at least once a month.
I'll stay with 10.1-RELEASE until this happens again and test 10.1-STABLE after that.
 
Sebulon said:
@ZFSZealot
Now I see what you mean, no that's not fixed:
Code:
cache             -     -      -  -   -     -
  gpt/cache1   238G  306G  16.0E  -  0%  128%
  gpt/cache2   238G  303G  16.0E  -  0%  127%
I'm thinking it's purely cosmetic though, since everything else seems OK. It looks like the "FREE" calculation just freaks out since "ALLOC" > "SIZE", which is possible now due to compression in the L2ARC that didn't exist back when the guys at Sun wrote it.

/Sebulon

Right, that's one of the issues I pointed out, but yes, it's probably cosmetic. What I'm concerned about is zfs-stats -L showing DEGRADED, with IO errors and bad checksums. Again, this didn't happen until I upgraded to 9.3-RELEASE, and it continues to show up on different hardware with 10.1-RELEASE. I've extensively tested the hardware and there is no indication of any hardware problem that would explain the I/O errors. I also see exactly 0 I/O errors until one of the L2ARC devices fills completely with cache data.

Code:
root@cadence:/ # zfs-stats -L

------------------------------------------------------------------------
ZFS Subsystem Report                            Fri Feb 13 10:06:58 2015
------------------------------------------------------------------------

L2 ARC Summary: (DEGRADED)
        Passed Headroom:                        30.19m
        Tried Lock Failures:                    24.75m
        IO In Progress:                         247
        Low Memory Aborts:                      103
        Free on Write:                          54.95k
        Writes While Full:                      10.62k
        R/W Clashes:                            562
        Bad Checksums:                          1.29m
        IO Errors:                              128.28k
        SPA Mismatch:                           48.35b

L2 ARC Size: (Adaptive)                         33.47   GiB
        Header Size:                    2.24%   767.31  MiB

L2 ARC Evicts:
        Lock Retries:                           18
        Upon Reading:                           0

L2 ARC Breakdown:                               35.45m
        Hit Ratio:                      26.64%  9.45m
        Miss Ratio:                     73.36%  26.01m
        Feeds:                                  567.04k

L2 ARC Buffer:
        Bytes Scanned:                          528.84  TiB
        Buffer Iterations:                      567.04k
        List Iterations:                        36.05m
        NULL List Iterations:                   961.29k

L2 ARC Writes:
        Writes Sent:                    100.00% 135.47k

------------------------------------------------------------------------

And here:

Code:
root@cadence:/ # sysctl kstat.zfs.misc.arcstats.l2_io_error
kstat.zfs.misc.arcstats.l2_io_error: 128275
root@cadence:/ # sysctl kstat.zfs.misc.arcstats.l2_cksum_bad
kstat.zfs.misc.arcstats.l2_cksum_bad: 1290021
 
Well, the "degraded L2ARC problem" can't be only compression related. Compression is not enabled on my datasets, and the L2ARC was degraded after 60-70 days of uptime. Before 10.1 the server needed to be restarted at least once a month.
I'll stay with 10.1-RELEASE until this happens again and test 10.1-STABLE after that.

I'm not convinced that not having compression enabled on any filesystem means the L2ARC isn't compressed anyway. Ever since 9.3-RELEASE it compresses, and there doesn't seem to be any way to turn it off. This whole thread is worth a read, and I think this problem has been around for a while and has never been fixed:

http://lists.freebsd.org/pipermail/freebsd-current/2013-October/045695.html

You can infer that compression is enabled thus:

Code:
root@cadence:/ # sysctl kstat.zfs.misc.arcstats.l2_compress_successes
kstat.zfs.misc.arcstats.l2_compress_successes: 1352012
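
Assuming kstat.zfs.misc.arcstats.l2_compress_failures counts buffers that compression couldn't shrink, the counters from the first post also suggest it fails to help most of the time:

Code:
# Failure share from the first post's counters
echo "scale=1; 108469007 * 100 / (51418193 + 108469007)" | bc
# ~67.8% of L2ARC writes saw no benefit from compression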

What I have not looked for is whether or not this problem exists only on FreeBSD or if it's present on other operating systems that use OpenZFS.
 