ZFS and a dead disk

Good afternoon!

First of all, I apologize for my poor English.

We have a server with a ZFS pool.

One of its disks, in the seventh slot, has failed.

The server is an IBM 3630 with an LSI MegaRAID controller.

When the disk failed, the operating system hung and stopped responding to commands.
The server was hard-reset, and the disk in the seventh slot was replaced with a disk of the same capacity.
After powering the server back on, we see the following picture:
Code:
zpool status -x
 pool: tank
state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
       the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
  see: [URL]http://illumos.org/msg/ZFS-8000-2Q[/URL]
 scan: resilvered 371G in 57h1m with 0 errors on Fri Jul 14 21:59:44 2017
config:

       NAME                                            STATE     READ WRITE CKSUM
       tank                                            DEGRADED     0     0     0
         raidz2-0                                      DEGRADED     0     0     0
           mfid0p2                                     ONLINE       0     0     0
           mfid1p2                                     ONLINE       0     0     0
           gptid/79862951-872a-11e3-a225-5cf3fca61284  ONLINE       0     0     0
           gptid/79d13754-872a-11e3-a225-5cf3fca61284  ONLINE       0     0     0
           gptid/a546e108-66e7-11e7-b776-5cf3fca61284  ONLINE       0     0     0
           gptid/7a667f7e-872a-11e3-a225-5cf3fca61284  ONLINE       0     0     0
           mfid2p2                                     ONLINE       0     0     0
           mfid3p2                                     ONLINE       0     0     0
           mfid4p2                                     ONLINE       0     0     0
           mfid5p2                                     ONLINE       0     0     0
           mfid6p2                                     ONLINE       0     0     0
           17544707893867162089                        UNAVAIL      0     0     0  was /dev/mfid7p2
           gptid/7c75ce8e-872a-11e3-a225-5cf3fca61284  ONLINE       0     0     0
           gptid/7cc2cea5-872a-11e3-a225-5cf3fca61284  ONLINE       0     0     0

errors: No known data errors
Code:
ls -l /dev/mfi*
crw-r-----  1 root  operator  0x2d Nov 18 11:52 /dev/mfi0
crw-r-----  1 root  operator  0x64 Nov 18 11:52 /dev/mfid0
crw-r-----  1 root  operator  0x65 Nov 18 11:52 /dev/mfid0p1
crw-r-----  1 root  operator  0x66 Nov 18 11:52 /dev/mfid0p2
crw-r-----  1 root  operator  0x6b Nov 18 11:52 /dev/mfid1
crw-r-----  1 root  operator  0xad Nov 18 11:52 /dev/mfid10
crw-r-----  1 root  operator  0xae Nov 18 11:52 /dev/mfid10p1
crw-r-----  1 root  operator  0xaf Nov 18 11:52 /dev/mfid10p2
crw-r-----  1 root  operator  0xc0 Nov 18 11:52 /dev/mfid11
crw-r-----  1 root  operator  0xc1 Nov 18 11:52 /dev/mfid11p1
crw-r-----  1 root  operator  0xc2 Nov 18 11:52 /dev/mfid11p2
crw-r-----  1 root  operator  0xc3 Nov 18 11:52 /dev/mfid12
crw-r-----  1 root  operator  0xc8 Nov 18 11:52 /dev/mfid12p1
crw-r-----  1 root  operator  0xc9 Nov 18 11:52 /dev/mfid12p2
crw-r-----  1 root  operator  0x6c Nov 18 11:52 /dev/mfid1p1
crw-r-----  1 root  operator  0x6f Nov 18 11:52 /dev/mfid1p2
crw-r-----  1 root  operator  0x75 Nov 18 11:52 /dev/mfid2
crw-r-----  1 root  operator  0x76 Nov 18 11:52 /dev/mfid2p1
crw-r-----  1 root  operator  0x77 Nov 18 11:52 /dev/mfid2p2
crw-r-----  1 root  operator  0x7c Nov 18 11:52 /dev/mfid3
crw-r-----  1 root  operator  0x7d Nov 18 11:52 /dev/mfid3p1
crw-r-----  1 root  operator  0x7e Nov 18 11:52 /dev/mfid3p2
crw-r-----  1 root  operator  0x83 Nov 18 11:52 /dev/mfid4
crw-r-----  1 root  operator  0x84 Nov 18 11:52 /dev/mfid4p1
crw-r-----  1 root  operator  0x85 Nov 18 11:52 /dev/mfid4p2
crw-r-----  1 root  operator  0x8a Nov 18 11:52 /dev/mfid5
crw-r-----  1 root  operator  0x8b Nov 18 11:52 /dev/mfid5p1
crw-r-----  1 root  operator  0x8c Nov 18 11:52 /dev/mfid5p2
crw-r-----  1 root  operator  0x8d Nov 18 11:52 /dev/mfid6
crw-r-----  1 root  operator  0x92 Nov 18 11:52 /dev/mfid6p1
crw-r-----  1 root  operator  0x93 Nov 18 11:52 /dev/mfid6p2
crw-r-----  1 root  operator  0x94 Nov 18 11:52 /dev/mfid7
crw-r-----  1 root  operator  0x99 Nov 18 11:52 /dev/mfid7p1
crw-r-----  1 root  operator  0x9a Nov 18 11:52 /dev/mfid7p2
crw-r-----  1 root  operator  0x9b Nov 18 11:52 /dev/mfid8
crw-r-----  1 root  operator  0xa0 Nov 18 11:52 /dev/mfid8p1
crw-r-----  1 root  operator  0xa1 Nov 18 11:52 /dev/mfid8p2
crw-r-----  1 root  operator  0xa2 Nov 18 11:52 /dev/mfid9
crw-r-----  1 root  operator  0xa7 Nov 18 11:52 /dev/mfid9p1
crw-r-----  1 root  operator  0xa8 Nov 18 11:52 /dev/mfid9p2
Command:
# zpool online tank 17544707893867162089

Accordingly, nothing changed:
Code:
warning: device '17544707893867162089' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present
Code:
# zpool detach tank 17544707893867162089
cannot detach 17544707893867162089: only applicable to mirror and replacing vdevs

I am also posting the output of the mfiutil utility:
Code:
# mfiutil show config
mfi0 Configuration: 13 arrays, 13 volumes, 0 spares
    array 0 of 1 drives:
        drive 10 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYKGT7JVCXSA3C0> SCSI-6
    array 1 of 1 drives:
        drive 11 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYMK88LVCXSA3C0> SCSI-6
    array 2 of 1 drives:
        drive 12 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYMHMNLVCXSA3C0> SCSI-6
    array 3 of 1 drives:
        drive 13 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYLZE6LVCXSA3C0> SCSI-6
    array 4 of 1 drives:
        drive 14 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYKH3EJVCXSA3C0> SCSI-6
    array 5 of 1 drives:
        drive 15 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYLZZULVCXSA3C0> SCSI-6
    array 6 of 1 drives:
        drive 16 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYMSLZLVCXSA3C0> SCSI-6
    array 7 of 1 drives:
        drive 18 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYMZ1YLVCXSA3C0> SCSI-6
    array 8 of 1 drives:
        drive 19 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYKGALJVCXSA3C0> SCSI-6
    array 9 of 1 drives:
        drive 20 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYKH77JVCXSA3C0> SCSI-6
    array 10 of 1 drives:
        drive 21 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYLZPALVCXSA3C0> SCSI-6
    array 11 of 1 drives:
        drive 23 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYLVT5LVCXSA3C0> SCSI-6
    array 12 of 1 drives:
        drive 24 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYLVRRLVCXSA3C0> SCSI-6
    volume mfid0 (558G) RAID-0 64K OPTIMAL spans:
        array 0
    volume mfid1 (558G) RAID-0 64K OPTIMAL spans:
        array 1
    volume mfid2 (558G) RAID-0 64K OPTIMAL spans:
        array 2
    volume mfid3 (558G) RAID-0 64K OPTIMAL spans:
        array 3
    volume mfid4 (558G) RAID-0 64K OPTIMAL spans:
        array 4
    volume mfid5 (558G) RAID-0 64K OPTIMAL spans:
        array 5
    volume mfid6 (558G) RAID-0 64K OPTIMAL spans:
        array 6
    volume mfid7 (558G) RAID-0 64K OPTIMAL spans:
        array 7
    volume mfid8 (558G) RAID-0 64K OPTIMAL spans:
        array 8
    volume mfid9 (558G) RAID-0 64K OPTIMAL spans:
        array 9
    volume mfid10 (558G) RAID-0 64K OPTIMAL spans:
        array 10
    volume mfid11 (558G) RAID-0 64K OPTIMAL spans:
        array 11
    volume mfid12 (558G) RAID-0 64K OPTIMAL spans:
        array 12

After the reboot, disk number 25 was in the "unconfigured good" state; we switched it to online, and that is all we have managed to do:
Code:
# mfiutil show drives
mfi0 Physical Drives:
10 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYKGT7JVCXSA3C0> SCSI-6 E1:S0
11 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYMK88LVCXSA3C0> SCSI-6 E1:S1
12 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYMHMNLVCXSA3C0> SCSI-6 E1:S2
13 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYLZE6LVCXSA3C0> SCSI-6 E1:S3
14 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYKH3EJVCXSA3C0> SCSI-6 E1:S4
15 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYLZZULVCXSA3C0> SCSI-6 E1:S5
16 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYMSLZLVCXSA3C0> SCSI-6 E1:S6
18 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYMZ1YLVCXSA3C0> SCSI-6 E1:S8
19 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYKGALJVCXSA3C0> SCSI-6 E1:S9
20 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYKH77JVCXSA3C0> SCSI-6 E1:S10
21 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYLZPALVCXSA3C0> SCSI-6 E1:S11
23 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYLVT5LVCXSA3C0> SCSI-6 E1:S13
24 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYLVRRLVCXSA3C0> SCSI-6 E1:S12
25 (  559G) ONLINE <IBM-ESXS VPCA600900EST1 N A3C0 serial=JZYH0SMLVCXSA3C0> SCSI-6 E1:S7


# mfiutil show volumes
mfi0 Volumes:
  Id     Size    Level   Stripe  State   Cache   Name
 mfid0 (  558G) RAID-0      64K OPTIMAL Enabled
 mfid1 (  558G) RAID-0      64K OPTIMAL Enabled
 mfid2 (  558G) RAID-0      64K OPTIMAL Enabled
 mfid3 (  558G) RAID-0      64K OPTIMAL Enabled
 mfid4 (  558G) RAID-0      64K OPTIMAL Enabled
 mfid5 (  558G) RAID-0      64K OPTIMAL Enabled
 mfid6 (  558G) RAID-0      64K OPTIMAL Enabled
 mfid7 (  558G) RAID-0      64K OPTIMAL Enabled
 mfid8 (  558G) RAID-0      64K OPTIMAL Enabled
 mfid9 (  558G) RAID-0      64K OPTIMAL Enabled
mfid10 (  558G) RAID-0      64K OPTIMAL Enabled
mfid11 (  558G) RAID-0      64K OPTIMAL Enabled
mfid12 (  558G) RAID-0      64K OPTIMAL Enabled

We are not experts in this area; could you tell us what to do next?
 
Which LSI controller? Wasn't it possible to use a JBOD setting? Those single disk RAID0 volumes are such a horrid kludge. But since everything else uses it I would suggest keeping everything the same. So, you're going to need to configure that new "unconfigured, good" disk and create a single disk RAID0 volume from it. Once that's done you need to copy the partition table from one of the other disks. Then use zpool replace to replace the UNAVAIL disk in the pool with the new one. That should automatically trigger a resilver.
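Roughly, the whole sequence would be something like the sketch below. This is untested and from memory; drive number 25 and volume name mfid7 are only what your listings suggest, so double-check with mfiutil show drives, mfiutil show volumes and gpart show before running anything:
Code:
# mfiutil create raid0 25                          # single-drive RAID-0 volume from the replacement disk
# mfiutil show volumes                             # see which mfidN device the new volume became
# gpart backup mfid0 | gpart restore -F mfid7      # copy the GPT layout from a healthy pool member
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 mfid7    # put the boot code back on the new disk
# zpool replace tank 17544707893867162089 /dev/mfid7p2           # replace the UNAVAIL vdev; a resilver should start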
 
All 14 disk slots in the server were occupied.

So when the disk in the seventh slot died, the server was shut down, the failed disk was pulled from the seventh slot, and a new disk was inserted into that same slot.

If there had been an empty slot, it would be clear how to use the command zpool replace <pool> <olddiskname> <newdiskname>.

But when the new disk has been inserted into the same seventh slot, how should the command be used?
 
We tried to take the disk offline:
# zpool offline tank 17544707893867162089

# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 248K in 32h10m with 0 errors on Wed Nov 20 00:49:38 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            mfid0p2                                     ONLINE       0     0     0
            mfid1p2                                     ONLINE       0     0     0
            gptid/79862951-872a-11e3-a225-5cf3fca61284  ONLINE       0     0     0
            gptid/79d13754-872a-11e3-a225-5cf3fca61284  ONLINE       0     0     0
            gptid/a546e108-66e7-11e7-b776-5cf3fca61284  ONLINE       0     0     0
            gptid/7a667f7e-872a-11e3-a225-5cf3fca61284  ONLINE       0     0     0
            mfid2p2                                     ONLINE       0     0     0
            mfid3p2                                     ONLINE       0     0     0
            mfid4p2                                     ONLINE       0     0     0
            mfid5p2                                     ONLINE       0     0     0
            mfid6p2                                     ONLINE       0     0     0
            17544707893867162089                        OFFLINE      0     0     0  was /dev/mfid7p2
            gptid/7c75ce8e-872a-11e3-a225-5cf3fca61284  ONLINE       0     0     0
            gptid/7cc2cea5-872a-11e3-a225-5cf3fca61284  ONLINE       0     0     0

errors: No known data errors



We tried to remove the disk:
# zpool remove tank 17544707893867162089
cannot remove 17544707893867162089: root pool can not have removed devices, because GRUB does not understand them


In theory, as we understand it, the procedure should be:
1. zpool offline <poolname> <diskname>
2. Physically remove the drive from the system
3. Physically add the new drive to the system
4. Partition the drive as needed; label the partitions as needed
5. zpool replace <poolname> <olddiskname> <newdiskname>

And here, at the fifth step, it is not clear what to specify, since we inserted the new disk in place of the old one, in the same slot (a rough sketch of how we imagine it is below).
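For steps 4 and 5 we imagine commands roughly like the following, but this is only our guess (mfidX stands for whatever device name the new disk actually gets):
Code:
# gpart backup mfid0 | gpart restore -F mfidX     # step 4: copy the partition layout from a healthy disk
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 mfidX
# zpool replace tank <olddiskname> /dev/mfidXp2   # step 5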
 
If we understand correctly, the new disk shows up as the old mfid7...

# dmesg | grep mfid
mfid0 on mfi0
mfid0: ses0 at ahciem0 bus 0 scbus6 target 0 lun 0
mfid1 on mfi0
mfid1: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid2 on mfi0
mfid2: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid3 on mfi0
mfid3: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid4 on mfi0
mfid4: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid5 on mfi0
mfid5: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid6 on mfi0
mfid6: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid7 on mfi0
mfid7: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid8 on mfi0
mfid8: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid9 on mfi0
mfid9: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid10 on mfi0
mfid10: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid11 on mfi0
mfid11: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid12 on mfi0
mfid12: 571250MB (1169920000 sectors) RAID volume (no label) is optimal
mfid9: hard error cmd=read 637549794-637549953
 
If we understand correctly
I was actually wondering what type/model the LSI card was. Most of the newer cards allow you to set the disks as JBOD, so you don't need that single disk RAID0 volume kludge. But we're past that already so it's moot.
And here, at the fifth step, it is not clear what to specify, since we inserted the new disk in place of the old one, in the same slot
Use the ID to refer to the "old" disk, i.e. zpool replace tank 17544707893867162089 /dev/mfid7p2. Double-check that the partition has been created properly before you do: gpart show /dev/mfid7.
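Put together, and assuming the new volume really did come back as mfid7, something like:
Code:
# gpart show /dev/mfid7                                   # expect a freebsd-boot and a freebsd-zfs partition
# zpool replace tank 17544707893867162089 /dev/mfid7p2    # the vdev should switch to 'replacing'
# zpool status tank                                       # and a resilver should start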
 
Unfortunately, it doesn't work that way.

The partition layout on the disk is correct, yet the replace command returns an error:


# gpart show /dev/mfid7
=>        34  1169919933  mfid7  GPT  (558G)
          34        1024      1  freebsd-boot  (512K)
        1058  1169918909      2  freebsd-zfs  (558G)


# zpool replace -f tank 17544707893867162089 /dev/mfid7p2
invalid vdev specification
the following errors must be manually repaired:
/dev/mfid7p2 is part of active pool 'tank'
 