ZFS: identifying disk serial numbers on RAID0 (P410i)

Hi all,

In my server, each disk is configured as a single-disk RAID0 array on the P410i hardware RAID controller, and each of these RAID0 arrays is presented to ZFS as a separate disk.

When I try to get the disk serial numbers (so I can identify a drive when it fails), all the disks show the same serial.

smartctl -a /dev/da1 (or da2, and so on)
shows the same serial number for every disk, and: SMART support is: Unavailable - device lacks SMART capability.

1- How can I get each disk's serial number so I can replace the right drive when it fails?

2- Is SMART working properly here for failure detection? If not, what is the alternative?

3- With my setup, how can I find which drive has failed?
 
As written above, the ciss trick might work. But you are making your life artificially difficult. With ZFS, it is just a bad idea to use hardware RAID. ZFS has a built-in RAID system, which works better than having hardware RAID underneath (better in several ways). You should instead reformat your system to remove the hardware RAID functionality, and let ZFS handle the disks directly.

By the way, you are aware that RAID0 does not give you any redundancy, and no protection against disk errors? It is designed to give you more disk capacity.

And above you ask whether SMART works for failure detection. The answer is that while SMART helps, it is far from perfect. In my previous job we had a joke (which is actually very close to the truth): Half the time when SMART predicts a disk failure, the disk will actually fail; the other half of the time the disk will continue to function perfectly for a long time. And half the time when a disk actually fails, SMART did predict it; the other half, the disk fails without warning. So while installing SMART and monitoring it is definitely a good idea, you also have to be prepared for disk failures to occur without SMART telling you first.
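If you want to automate that monitoring with smartd from sysutils/smartmontools, entries along these lines might work for drives hidden behind a P410i (just a sketch; the cciss drive numbers and the mail target are assumptions you would adjust to your box):

# /usr/local/etc/smartd.conf (hypothetical entries)
# one directive per physical drive behind the Smart Array controller
/dev/ciss0 -d cciss,0 -a -m root
/dev/ciss0 -d cciss,1 -a -m root

Run smartd once in debug mode (smartd -d) to confirm the pass-through actually works with your controller and firmware before relying on it.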
 
You are right, but I don't want to use the hardware RAID protection; I use ZFS RAID.
Is it really such a bad thing to run ZFS on top of HW RAID?

As I have read, if I use ZFS on top of HW RAID, I lose some features, like disk failure prediction.
 
And you lose the ability to identify disk drives easily, as you have found out.

Side remark: SirDice said that "Hardware RAID cards don't do error correction at all". I think that statement is correct for RAID that is sold in the form of a physical add-on card that gets plugged into a slot on the motherboard. But there are commercial RAID implementations that implement checksums, and can perform error correction. Personally, I only know one intimately, and it is not available for FreeBSD, nor is it cheap (prices start at a quarter million).
 
prices start at a quarter million
That's the reason why I have never seen such a card. The most expensive cards I've worked with, on FreeBSD and others, were usually in the 2,000-3,000 euro range. Still quite expensive.
 
Everyone is picking on the wrong detail. Their setup has each individual disk as a single-disk RAID0 array. Each of these RAID0 arrays is then passed to ZFS to create a pool, with ZFS in control of redundancy.

That's not the issue. The pool is redundant.

The issue is that they can't figure out how to map a "RAID array X has failed" message to the unique ID of the actual drive that has failed.

They're looking for the right smartctl options to pass the commands through to the disks (RAID controllers intercept SMART commands), or for alternative ways of identifying drives.
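For a P410i that pass-through is usually smartmontools' cciss device type, roughly like this (a sketch; ciss0 and the drive numbers are assumptions, and the N in cciss,N is the controller's physical drive number, not the da number):

smartctl -a -d cciss,0 /dev/ciss0
smartctl -a -d cciss,1 /dev/ciss0

If the controller and firmware cooperate, each of those should report the physical drive's own model and serial number instead of the logical volume's.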

This is where gpart, GPT, and labels come in handy. :) Partition your RAID arrays with GPT, and give each partition a label that describes where the drive is located physically. Then use the labels (/dev/gpt/labelname) to create the pool. If you added bare partitions to the pool, you can offline an array, label the partition, then online it via the label, and the resilver will take seconds/minutes to complete, and you'll be better off than before. :)
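A rough sketch of that workflow (the pool name, label names, and raidz level are made up; adjust them to your layout):

# partition each single-disk RAID0 array and label it with its physical location
gpart create -s gpt da1
gpart add -t freebsd-zfs -l bay02 da1
# repeat for the other arrays, then build the pool from the labels
zpool create tank raidz2 gpt/bay01 gpt/bay02 gpt/bay03 gpt/bay04

From then on zpool status reports gpt/bay02 and friends, so a failure message tells you directly which bay to pull.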
 