ZFS zpool not degraded, but certain files cause transfers to hang

Avery Freeman · May 16, 2018

Hello,

I have a pool on a general file server with 6 x 2TB drives, all very old, configured as three mirrored pairs. The system, for all intents and purposes, appears to be working normally. The pool does not report as degraded.

However, I have been trying to move certain files using rsync, and every time it gets to a certain folder it hangs. It is always transferring the same file, or one in the same folder, when it hangs.

The du command also hangs when it gets to that folder.

I have scrubbed the pool several times now and it does not appear to improve the behavior. I also have tested the drives individually using smartmontools and also taken them all out and individually tested each surface using HDDtools on a wintel machine. They all come back basically OK (just old).

Does anyone have any ideas for what I should do?

SirDice · May 16, 2018

Avery Freeman said:
I also have tested the drives individually using smartmontools and also taken them all out and individually tested each surface using HDDtools on a wintel machine. They all come back basically OK (just old).

Basically OK? Any "Offline uncorrectable" or "Pending sectors"? Did you run the short and long tests? Can you post the output from smartctl(8) of one of the disks?

Avery Freeman · May 16, 2018

SirDice said:
Basically OK? Any "Offline uncorrectable" or "Pending sectors"? Did you run the short and long tests? Can you post the output from smartctl(8) of one of the disks?

I only ran short tests. Unfortunately, I didn't save any of the information.

I will start running them again after I get most of the stuff backed up off of it. I have another drive to replace the culprit left, I just wish I could figure out which one it was...

ralphbsz · May 17, 2018

When it hangs, what states are the processes in?
While it is hung, can you read from the disks? Try a quick "dd if=/dev/adaX... of=/dev/null bs=4096 count=4096" or something like that.

Note that all the discussion so far has been about a problem with the disks. What if it is instead a problem with the file system, and you found a bug? Are you running a recent version?
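If dd is awkward to run mid-hang, the same kind of read check can be sketched in Python. This is only a rough equivalent under assumptions: the function name timed_read is made up for illustration, the device path is whatever your disk actually is, and reading a raw device normally needs root.

```python
import time


def timed_read(path, block_size=4096, count=4096):
    """Read up to `count` blocks of `block_size` bytes from `path`.

    Returns (bytes_read, elapsed_seconds). The data is thrown away,
    much like sending dd output to /dev/null.
    """
    total = 0
    start = time.monotonic()
    # buffering=0 so Python does not add its own buffering on top of the OS.
    with open(path, "rb", buffering=0) as dev:
        for _ in range(count):
            chunk = dev.read(block_size)
            if not chunk:  # reached the end of the device or file
                break
            total += len(chunk)
    return total, time.monotonic() - start
```

Calling timed_read() on each disk in turn while the transfer is hung, and comparing the elapsed times, should show whether one disk has stopped answering or is much slower than its siblings. The defaults read 4096 blocks of 4096 bytes, i.e. 16 MiB, the same amount as the dd line above.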

Avery Freeman · May 17, 2018

ralphbsz said:
When it hangs, what states are the processes in?
While it is hung, can you read from the disks? Try a quick "dd if=/dev/adaX... of=/dev/null bs=4096 count=4096" or something like that.

Note that all the discussion so far has been about a problem with the disks. What if it is instead a problem with the file system, and you found a bug? Are you running a recent version?

Hi,

Well, I doubt it's a problem with the file system since that is located on another drive, not the array. Plus, it's a brand new VM - the pool is an ancient FreeNAS pool I imported to FreeBSD when I started having trouble loading it in FreeNAS (I use FreeBSD for a couple other VMs so I was planning on switching permanently).

I am still waiting for my long smartctl tests to finish but they should be done pretty soon now... will report back...

Edit: Somehow the first three drives tested but the last three were aborted, so here's the data from the first three while I wait another few hours for the rest to scan:

https://paste2.org/FyHnnBx2

I don't see any uncorrectable errors or pending sectors. Anxious to get the results from the rest of them, but still confused about what the issue is.

Does anyone think just destroying the pool would be of any benefit? I have all the files I need backed up and could just copy them back... I was kind of thinking of switching from a 3-mirror to a 4+2 configuration anyway...

SirDice · May 17, 2018

No need to run any new tests, just post the output of smartctl -a /dev/<disk>. That should provide plenty of information for us to look at.

phoenix · May 18, 2018

Start top on a console and leave it running. Next time the system hangs, check the output of top. Pay attention to the Wired, Free, and ARC entries. Post them here.

Avery Freeman · May 18, 2018

phoenix said:
Start top on a console and leave it running. Next time the system hangs, check the output of top. Pay attention to the Wired, Free, and ARC entries. Post them here.

That's a great idea - running top in a separate console and seeing what happens - I wish I had thought of that. I do have a feeling it would just be the rsync process, though.

Edit: What would I be looking for re: ARC, wired, free memory levels?

Here's the output from the last 3 drives:

https://paste2.org/y2H2mXzC

It's included with the first three drives' data for convenience.

I gave up early and destroyed the pool - thankfully when I copied the data back from my backup, I never had any issues. We'll see how long that lasts.

It's two drives each from three different lines of WD Greens that have spindown turned off for longevity. I'm surprised how long they've lasted. A couple I bought back in 2011 were even refurbished when I got them. When I took them out and tested them I noticed two things: 1) the oldest ones were the heaviest (perhaps more platters?) and 2) the oldest ones (the EARS) were the only ones that could handle the Butterfly test in HDDscan. They don't make them like they used to (?)

They are kind of crap drives, although I do think they're similar physically to the WD Reds just with different firmware. I usually stick to HGST SAS drives now after reading several Backblaze reports.

SirDice · May 18, 2018

Avery Freeman said:
Here's the output from the last 3 drives:

https://paste2.org/y2H2mXzC

Yep, good info. And I would agree, they're a bit old but other than that they should still be fine to use.

For comparison and everybody else's amusement, here's one of my old drives: https://paste2.org/dhACXI7N
Yes, it seriously needs to be replaced. I'm currently copying data I can still access as this disk is causing a lot of problems now. It's a single disk pool and the errors are causing any and all zpool or zfs commands (even to other pools/datasets) to hang for an unspecified amount of time. Quite annoying.

Avery Freeman · May 19, 2018

That's great, I'm glad to hear it!

I never expect drives to last very long because I had a bad experience with 4K sector drives when they first came out - In 2010 I got four of these EARS and put them in a Mediasonic RAID enclosure for use with a Mac Mini (was doing a lot of audio work back then). They were the first widely available 4k/512e drives AFAIK, so there was very little info on these at the time and I learned the hard way that they do not like to emulate 512e in RAID-5 - the array took a dump, I replaced two drives under warranty after paying $500 to a data recovery outfit in San Francisco where I was living at the time. It traumatized me, and ever since then I just kind of expect that drives are "going to die any day now" and am astonished whenever they keep working without issue.

I bought 4x HDS722020ALA330 drives as per a report I read from Backblaze a few years ago: https://www.backblaze.com/blog/best-hard-drive-q4-2014/

...as I wanted to just get the cheapest possible decent drives I could find for recording TV on a ReFS array with Windows 8.1/WMC - $50/ea "refurbished" (i.e. beat to hell in a data warehousing environment) shipped to me from goharddrive (don't recommend) in the worst drive packaging I have seen in my life.
They were doomed to failure, right? Apparently not ... I record the news on two channels all day every day so I can skip commercials and I think it's been 3 years now they've been going without a hitch. They were underneath my TV in an inadequately vented cabinet with a case that has awful fans, not getting nearly enough airflow for most of this time.

Now they're recording HDHomeRun TV in a more adequate case with a Rosewill RSV enclosure using FreeBSD/ZFS, but I noticed yesterday that they were running hot when doing a random smartctl scan - 65 deg for the hottest one at the time - thankfully I checked my case and, sure enough, a cable had fallen into the fan on the back of the drive rack and stopped it dead.

It's quite possible they were like this for weeks recording TV 18-20 hrs/day at 60+ deg temperatures and they're still chugging along without a hitch. I bought a spare in case one actually dies someday but I just have this feeling I'm not going to need it.

I have the data here: https://paste2.org/4Wwf0eJf omfg top-endured temperature for two of them is 87 deg (!!)

What's with the attributes on your smartctl scan? Is that due to being from a different MFG, or is it a different version of smartmontools?

I've never seen errors like that before but they do look awfully scary (something about hexadecimal seems ominous).

One thing about the Backblaze report was that they said Seagates were having unusually high failure rates relative to other MFGs.

ralphbsz · May 19, 2018

65deg is very high. 87deg is insane. You probably shortened the life of your disks significantly. On the other hand, they aren't dead yet, and you have RAID, and I don't think your data is vitally important. I used to think that 45deg was getting unpleasantly warm, and at 48 degrees I would start making phone calls. But that was on systems that cost millions, the data is worth a lot more than millions, and the fans are monitored and redundant.

Yes, on SATA SMART, every vendor has a different set of parameters. That makes working with SMART very not fun. The SCSI (SAS) version of SMART is much easier to deal with; still not completely standardized, but mostly consistent.

Avery Freeman · May 19, 2018

ralphbsz said:
65deg is very high. 87deg is insane. You probably shortened the life of your disks significantly. On the other hand, they aren't dead yet, and you have RAID, and I don't think your data is vitally important. I used to think that 45deg was getting unpleasantly warm, and at 48 degrees I would start making phone calls. But that was on systems that cost millions, the data is worth a lot more than millions, and the fans are monitored and redundant.

Yes, on SATA SMART, every vendor has a different set of parameters. That makes working with SMART very not fun. The SCSI (SAS) version of SMART is much easier to deal with; still not completely standardized, but mostly consistent.

Yeah, no, it's not vitally important. That's what the other two ESXi boxes are for.

The motherboard doesn't even support ECC (still works great, though). Those hard drives SHOULD be on their way out any day, and I think the fact that they didn't die ages ago is a testament to what a feat of engineering they are.

Interesting, did not know that about SMART. Hopefully won't have to deal with it too much since I am extremely brand loyal when it comes to hard drives... (did I mention my HGST drives that just won't die?)

phoenix · May 19, 2018

Avery Freeman said:
That's a great idea - running top in a separate console and seeing what happens - I wish I had thought of that. I do have a feeling it would just be the rsync process, though.

Edit: What would I be looking for re: ARC, wired, free memory levels?

Free memory very low, Wired memory covering most of RAM, and ARC being most of Wired. That indicates a memory exhaustion lockup as ZFS has requested too much memory for ARC and isn't releasing it properly for other things to run.

It's a common deadlock on ZFS, although it's not nearly as common as it used to be in the pre-10 days.

If that's what's happening, limiting the ARC or adding RAM will fix it.
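To make that pattern concrete, here is a minimal sketch of the check in Python. The function name and the three threshold fractions are hypothetical, picked only to illustrate "Free very low, Wired most of RAM, ARC most of Wired"; they are not FreeBSD defaults or official guidance, and the byte values have to be copied by hand from the header lines of top.

```python
def looks_like_arc_exhaustion(total, free, wired, arc,
                              free_frac=0.05, wired_frac=0.80, arc_frac=0.50):
    """Return True if the memory figures match the pattern described above.

    All sizes are in bytes. The *_frac thresholds are illustrative
    assumptions, not tuned or documented values.
    """
    if total <= 0 or wired <= 0:
        return False
    free_is_low = free / total < free_frac          # almost nothing Free
    wired_dominates = wired / total > wired_frac    # Wired covers most of RAM
    arc_dominates = arc / wired > arc_frac          # ARC is most of Wired
    return free_is_low and wired_dominates and arc_dominates


GIB = 1024 ** 3
MIB = 1024 ** 2

# Example figures for an 8 GiB machine (made up for illustration).
print(looks_like_arc_exhaustion(8 * GIB, 100 * MIB, 7 * GIB, 6 * GIB))
print(looks_like_arc_exhaustion(8 * GIB, 4 * GIB, 3 * GIB, 2 * GIB))
```

If the first kind of picture shows up during a hang, the usual way to limit the ARC on FreeBSD is the vfs.zfs.arc_max tunable in /boot/loader.conf; what value makes sense depends on how much RAM the box has.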

Terri_Kennedy · May 20, 2018

Avery Freeman said:
It's two drives each from three different lines of WD Greens that have spindown turned off for longevity.

Are you sure about that? From your first 3 drives:

Code:

  9 Power_On_Hours          0x0032   062   062   000    Old_age   Always       -       27744
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       616

Good

Code:

  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       26486
193 Load_Cycle_Count        0x0032   166   166   000    Old_age   Always       -       104361

Not good - 4 loads per hour over the life of the drive

Code:

  9 Power_On_Hours          0x0032   038   038   000    Old_age   Always       -       45280
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       1226472

Also not good.
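For anyone who wants to run the same arithmetic on their own disks, here is a small Python sketch. It assumes the attribute lines are pasted in the smartctl -A layout shown above, with the raw value as a plain integer in the last column (true for these two attributes here, not for every attribute on every drive). The function names are made up for illustration.

```python
def raw_value(smart_line):
    """Return (attribute_name, raw_value) from one smartctl attribute line.

    Assumes the raw value is a plain integer in the last column.
    """
    fields = smart_line.split()
    return fields[1], int(fields[-1])


def load_cycles_per_hour(power_on_hours, load_cycle_count):
    """Average head load/unload cycles per powered-on hour."""
    if power_on_hours <= 0:
        raise ValueError("power_on_hours must be positive")
    return load_cycle_count / power_on_hours


# (Power_On_Hours, Load_Cycle_Count) raw values of the three drives above.
drives = [(27744, 616), (26486, 104361), (45280, 1226472)]

for hours, cycles in drives:
    rate = load_cycles_per_hour(hours, cycles)
    print(f"{hours:>6} h  {cycles:>8} cycles  {rate:6.2f} per hour")
```

That works out to roughly 0.02, 3.94 and 27.09 load cycles per hour for the three drives, which is where the "4 loads per hour" figure comes from.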

Avery Freeman · May 20, 2018

phoenix said:
Free memory very low, Wired memory covering most of RAM, and ARC being most of Wired. That indicates a memory exhaustion lockup as ZFS has requested too much memory for ARC and isn't releasing it properly for other things to run.

It's a common deadlock on ZFS, although it's not nearly as common as it used to be in the pre-10 days.

If that's what's happening, limiting the ARC or adding RAM will fix it.

Awesome, good to know!

Although, I copied the exact same files back onto the drives - would it make a difference being copied to instead of from?

Avery Freeman · May 20, 2018

Terri_Kennedy said:

Are you sure about that? From your first 3 drives:

Code:

  9 Power_On_Hours          0x0032   062   062   000    Old_age   Always       -       27744
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       616

Good

Code:

  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       26486
193 Load_Cycle_Count        0x0032   166   166   000    Old_age   Always       -       104361

Not good - 4 loads per hour over the life of the drive

Code:

  9 Power_On_Hours          0x0032   038   038   000    Old_age   Always       -       45280
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       1226472

Also not good.

That's a good point - I remember going through and doing it for these drives, but perhaps some of the newer ones have firmware that doesn't accept the changes, or the settings are reset after reboot (similar to how TLER can be turned on with some consumer drives but gets reset after a power cycle). I am not sure. Did you notice which model those ones were? I will go back and look eventually but I work all weekend, it's 3:30 am and I have to go to sleep soon.

Thanks for pointing that out. I am apparently not that good at reading SMART test data, and your comments are really helpful.

phoenix · May 22, 2018

Later versions of the WD Greens don't allow you to turn off the head-parking feature. They really want the Greens to be used in systems that see very little use. Like a desktop that is powered up, used for a bit, then put to sleep. And where the usage pattern is stop-n-go (load a web page, read for a few minutes, load another page, read for a few minutes, etc).

They didn't want the Greens eating into their margins on the Reds and Blacks.

Avery Freeman · Jun 14, 2018

phoenix said:
Later versions of the WD Greens don't allow you to turn off the head-parking feature. They really want the Greens to be used in systems that see very little use. Like a desktop that is powered up, used for a bit, then put to sleep. And where the usage pattern is stop-n-go (load a web page, read for a few minutes, load another page, read for a few minutes, etc).

They didn't want the Greens eating into their margins on the Reds and Blacks.

That's crap! These were refurb replacements for the original WD20EARS I RMAed after they took a shit in an external RAID enclosure.

Now one of the replacements is dying apparently, and it IS one of the newer ones. I think that's what's causing my whole problem:

https://forums.freebsd.org/threads/dd-if-dev-zero-one-hard-drive-much-slower-than-others.66265/

Thanks for your help.

Edit:

Upon further inspection, the only one out of 6 drives that has a low load_cycle_count is one of the newest drives. So I think it worked on that one.

I have seen posts of people claiming the power has to be turned off for the WDIDLE.EXE settings to take effect. I have also seen posts of people saying the BIOS has to have the SATA port set to ATA - which I imagine on other BIOSes might be IDE or 'legacy', 'compatible', etc. (i.e. not AHCI).

The power explanation would fit my case: when I discovered the process, I went through all the drives changing the WDIDLE setting, rebooting after each one and only turning the computer off after the last one - leaving only one drive with persistent settings.

One thing is for sure, I'm not going to be buying any more WD Green drives after this, what a pain this has been (not to mention, they also have crap transfer speeds).

This is one of the longest threads I've ever seen on the FreeNAS forum - it has quite a bit of information and different people weighing in anecdotally:

https://forums.freenas.org/index.php?threads/hacking-wd-greens-and-reds-with-wdidle3-exe.18171

Unfortunately, there's not a lot of completely definitive information, but it still seems useful since that's really all we have to go by - I could not find any official information from WD besides a description of what the utility is supposed to do.
