Solved: FreeBSD 10.1-p6 with ZFS on root on raidz, one disk member was created oddly

Here is what I have:
  • 5 SATA III magnetic disks, all the same model and size
    • ada1..5
  • 1 PCIe SSD
    • ada0
  • 1 SSD
    • ada6
Here is what I did:
  • Installed FreeBSD 10.1 on real hardware
  • Selected the guided root-on-ZFS option
  • Selected GPT
  • Added the 5 identical SATA disks to a raidz
  • Changed swap to 6 GB per disk
  • Created a log (ZIL) mirror from 2 partitions, one on the SSD and the other on the PCIe SSD
  • Created an L2ARC cache on the remaining space of both SSDs
Here is what I got:
  • I have a functioning raidz with a log (ZIL) mirror and L2ARC cache, but one of the 5 magnetic drives is an anomalous member of the raidz pool
  • Each of the five members except one appears as /dev/gpt/zfs<#>
  • However, one of them was set up as /dev/diskid/DISK-YFG89PPAp3
  • Additionally, each of the 4 disks that are GPT has 3 partitions
    • For example
      • ada1 - the disk
        • ada1p1
        • ada1p2
        • ada1p3
  • For unknown reasons the odd-ball disk, ada2, has no partitions
  • Besides making the labeling look out of place and setting off my OCD, its swap partition is missing, causing an error when the OS attempts to mount the swap partition the installer added to /etc/fstab.
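Because ada2 has no p2 partition, that swap entry has nothing to mount. Purely for illustration (I have not copied the exact line, and the real entry may use a different device name or a label), the failing /etc/fstab line would be something like:
Code:
/dev/ada2p2   none   swap   sw   0   0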
The pool looks like this:

Code:
# zpool status -v
  pool: Datastore
  state: ONLINE
config:
    NAME                        STATE   READ WRITE CKSUM
    Datastore                   ONLINE     0     0     0
      raidz1-0                  ONLINE     0     0     0
        gpt/zfs0                ONLINE     0     0     0
        diskid/DISK-YFG89PPAp3  ONLINE     0     0     0
        gpt/zfs2                ONLINE     0     0     0
        gpt/zfs3                ONLINE     0     0     0
        gpt/zfs4                ONLINE     0     0     0
    logs
      mirror-1                  ONLINE     0     0     0
        gpt/log0                ONLINE     0     0     0
        gpt/log1                ONLINE     0     0     0
    cache
      gpt/cache0                ONLINE     0     0     0
      gpt/cache1                ONLINE     0     0     0

errors: No known data errors

Code:
ll /dev | grep ada | awk '{print $9}' | sort | uniq | grep ada

ada0
ada0p1
ada0p2
ada1
ada1p1
ada1p2
ada1p3
ada2       -  This is the odd ball ???
ada3
ada3p1
ada3p2
ada3p3
ada4
ada4p1
ada4p2
ada4p3
ada5
ada5p1
ada5p2
ada5p3
ada6
ada6p1
ada6p2
Here is what I would like to accomplish:
  • I would like to remove this disk and add it back so that it is configured as GPT with 4K alignment, labeled as /dev/gpt/zfs1, and has the same partition layout as the other members of the zpool "Datastore"
Code:
# ll | grep zfs | awk '{print $9}'
zfs0
zfs2
zfs3
zfs4

You will notice above that zfs1 is missing.
This anomaly occurred following a very vanilla install of the OS, so I expect it would likely happen again if I reinstalled. Additionally, I have put substantial and quite complex work into the system as it is, which is why I don't want to attempt a reinstall.

Any help here would be appreciated.
 
You can "zpool offline" the offending disk, which will mark the pool as degraded:
# zpool offline Datastore diskid/DISK-YFG89PPAp3

Then properly partition the disk, label it, etc. And then "zpool replace" the "old" disk with the new partition.
# zpool replace Datastore diskid/DISK-YFG89PPAp3 gpt/zfs1

You may want to remove all traces of ZFS and GPT from the disk first, before re-partitioning it:
# zpool labelclear diskid/DISK-YFG89PPAp3
# gpart destroy -F ada2

Note: because you are using raidz1, if any other disk dies during the resilver, you will lose the whole pool. As these are new disks and a new pool with little data on it, the resilver shouldn't take long, and you shouldn't have any issues.
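Roughly, the whole sequence might look like this (only a sketch; the device name, sizes, and label are copied from your other members, so adjust as needed):
Code:
zpool offline Datastore diskid/DISK-YFG89PPAp3
zpool labelclear -f diskid/DISK-YFG89PPAp3   # -f in case the old label still claims membership in Datastore
gpart destroy -F ada2
gpart create -s gpt ada2
gpart add -s 512K -t freebsd-boot ada2
gpart add -s 6G -t freebsd-swap ada2
gpart add -t freebsd-zfs -l zfs1 ada2
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada2   # keep this disk bootable like the others
zpool replace Datastore diskid/DISK-YFG89PPAp3 gpt/zfs1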
 
If you performed the installation using bsdinstall then I am afraid that your disk partitions are not 4K aligned. Can you also post the output of gpart show?
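You can also check which ashift the pool was created with; something along these lines should show it (if zdb cannot find the pool, point it at the cache file with -U /boot/zfs/zpool.cache):
Code:
zdb -C Datastore | grep ashift   # ashift: 9 means 512-byte sectors, ashift: 12 means 4K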
 
This is probably a bug in bsdinstall(8) that should be fixed. What probably happened is that the problematic disk had some leftover metadata from a RAID array or something else that prevented bsdinstall(8) from creating the proper partitioning on it. Instead of checking for errors from gpart(8) and halting the install, bsdinstall(8) just went ahead with the pool creation and installation of the system, and that resulted in the pool you got.
 
Here is the output of gpart show.

I will attempt Phoenix's recommendation shortly after dinner, but I should probably first confirm whether I have 4K geometry. Judging from the output below, I am thinking I don't.

Code:
# gpart show
=>  34  250069613  ada0  GPT  (119G)
  34  2014  - free -  (1.0M)
  2048  16777216  1  freebsd-zfs  (8.0G)
  16779264  233290376  2  freebsd-zfs  (111G)
  250069640  7  - free -  (3.5K)

=>  34  3907029101  ada1  GPT  (1.8T)
  34  1024  1  freebsd-boot  (512K)
  1058  12582912  2  freebsd-swap  (6.0G)
  12583970  3894445165  3  freebsd-zfs  (1.8T)

=>  34  3907029101  ada3  GPT  (1.8T)
  34  1024  1  freebsd-boot  (512K)
  1058  12582912  2  freebsd-swap  (6.0G)
  12583970  3894445165  3  freebsd-zfs  (1.8T)

=>  34  3907029101  ada4  GPT  (1.8T)
  34  1024  1  freebsd-boot  (512K)
  1058  12582912  2  freebsd-swap  (6.0G)
  12583970  3894445165  3  freebsd-zfs  (1.8T)

=>  34  3907029101  ada5  GPT  (1.8T)
  34  1024  1  freebsd-boot  (512K)
  1058  12582912  2  freebsd-swap  (6.0G)
  12583970  3894445165  3  freebsd-zfs  (1.8T)

=>  34  234441581  ada6  GPT  (112G)
  34  2014  - free -  (1.0M)
  2048  16777216  1  freebsd-zfs  (8.0G)
  16779264  217662344  2  freebsd-zfs  (104G)
  234441608  7  - free -  (3.5K)

=>  34  3907029101  diskid/DISK-YFG89PPA  GPT  (1.8T)
  34  1024  1  freebsd-boot  (512K)
  1058  12582912  2  freebsd-swap  (6.0G)
  12583970  3894445165  3  freebsd-zfs  (1.8T)

=>  34  15633341  da0  GPT  (7.5G)
  34  1024  1  bios-boot  (512K)
  1058  6  - free -  (3.0K)
  1064  15632304  2  freebsd-zfs  (7.5G)
  15633368  7  - free -  (3.5K)

=>  34  15633341  diskid/DISK-4C530013510724112284  GPT  (7.5G)
  34  1024  1  bios-boot  (512K)
  1058  6  - free -  (3.0K)
  1064  15632304  2  freebsd-zfs  (7.5G)
  15633368  7  - free -  (3.5K)

=>  34  3907029097  da1  GPT  (1.8T)
  34  6  - free -  (3.0K)
  40  409600  1  efi  (200M)
  409640  3906357344  2  apple-hfs  (1.8T)
  3906766984  262147  - free -  (128M)

=>  34  3907029097  diskid/DISK-000000000024  GPT  (1.8T)
  34  6  - free -  (3.0K)
  40  409600  1  efi  (200M)
  409640  3906357344  2  apple-hfs  (1.8T)
  3906766984  262147  - free -  (128M)
 
So far I have offlined the disk, cleared it, and created the partitions.

I will post the steps when I am done.

I am still looking, but I have not yet found how to create the label where I want it.

The rest of the ZFS partitions show up under /dev/gpt/.

I have tried something like
glabel label gpt/zfs1 /dev/ada2
but this created the label at /dev/label/gpt/zfs1.

So I removed it until I can get it to match the rest.
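I suspect the names under /dev/gpt/ come from the GPT partition label itself rather than from glabel(8), so something like the following (untested by me at this point) may be what I actually need:
Code:
gpart modify -i 3 -l zfs1 ada2   # set the GPT label on partition 3 so it appears as /dev/gpt/zfs1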
 
Tobik thanks for the reply. I am still learning about labeling in FreeBSD, so I thank you for your patience in advance.

The man page says:
gpart modify -i index [-l label] [-t type] [-f flags] geom
So I have tried:
gpart modify -i 3 -l zfs1 ada2
Code:
gpart: table 'ada2' is corrupt: Operation not permitted
Here is the current output of gpart show ada2:
Code:
# gpart show ada2
=>  34  3907029101  ada2  GPT  (1.8T) [CORRUPT]
  34  1024  1  freebsd-boot  (512K)
  1058  12582912  2  freebsd-swap  (6.0G)
  12583970  3894445165  3  freebsd-zfs  (1.8T)
So with my current understanding, I believe I want a label for ada2p3, which is the ZFS partition, so that it shows up at /dev/gpt/zfs1.

If you could provide a more specific command I will reverse engineer how it works and better understand what is happening here. :)
 
Here is the output of gpart show.
I will attempt Phoenix's recommendation shortly after dinner, but I should probably first confirm whether I have 4K geometry. Judging from the output below, I am thinking I don't.

Just as I thought. Your drives don't seem to be 4K partitioned.
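You can see it from the start sectors in the gpart show output: with 512-byte sectors a partition start has to be a multiple of 8 sectors (4096 bytes) to be 4K aligned. A quick check in sh, using the offsets from your output:
Code:
echo $(( (12583970 * 512) % 4096 ))   # freebsd-zfs start on ada1..ada5 -> 1024, not aligned
echo $(( (1064 * 512) % 4096 ))       # the da0 zfs partition at sector 1064 -> 0, aligned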
 
Yes. Right now I am trying to get the label issue fixed and get the drive back in the pool.

While I was researching the drive label I thought I would put the drive back into the pool in the interim, but ran into this issue.
Code:
# zpool replace Datastore /dev/diskid/DISK-YFG89PPAp3 /dev/ada2p3
invalid vdev specification
use '-f' to override the following errors:
/dev/ada2p3 is part of active pool 'Datastore'

# zpool replace -f Datastore /dev/diskid/DISK-YFG89PPAp3 /dev/ada2p3
invalid vdev specification
the following errors must be manually repaired:
/dev/ada2p3 is part of active pool 'Datastore'

After that, if it is possible, I will see whether the drives can be converted over to 4K without a reinstall, but that would probably be out of scope for this thread.

I will start a new one for that to make it more useful to others.
 
Well, just to keep things up to date, here are the steps I have taken thus far:

zpool offline Datastore diskid/DISK-YFG89PPAp3

zpool labelclear diskid/DISK-YFG89PPAp3
zpool labelclear /dev/ada2
Both of these "labelclear" commands had nothing to clear.

gpart destroy -F /dev/ada2
This command also had nothing to destroy.

Code:
gpart create -s gpt ada2
gpart add -s 512K -t freebsd-boot ada2
gpart add -s 6G -t freebsd-swap ada2
gpart add  -t freebsd-zfs ada2
newfs /dev/ada2p1
newfs /dev/ada2p2
newfs /dev/ada2p3
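# note: the three newfs(8) calls above create UFS file systems and are not
# actually needed; freebsd-boot, freebsd-swap and freebsd-zfs partitions are used raw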

When I find out how to label it and get it back into the pool, I will post that here as well and mark this solved.

Thanks to everyone who has contributed.
 
Ah, just noticed it is corrupt; it told me; I just wasn't listening...
Code:
# gpart show ada2
=>  34  3907029101  ada2  GPT  (1.8T) [CORRUPT]
  34  1024  1  freebsd-boot  (512K)
  1058  12582912  2  freebsd-swap  (6.0G)
  12583970  3894445165  3  freebsd-zfs  (1.8T)
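(From what I have read since, the [CORRUPT] flag usually means one copy of the GPT, typically the backup at the end of the disk, is damaged or missing, and gpart can repair it in place with something like the following:)
Code:
gpart recover ada2   # rebuild the damaged copy of the GPT; untested here, I started over instead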

I have pretty much got this sorted out now.

Here are the steps to resolve from the beginning:

Code:
zpool offline Datastore diskid/DISK-YFG89PPAp3
gpart create -s gpt ada2
gpart add -s 512K -t freebsd-boot ada2
gpart add -s 6G -t freebsd-swap ada2
gpart add -t freebsd-zfs ada2
gpart modify -l zfs1 -i 3 ada2

The above all worked, partitioned my drive like the rest, and created the label where it belongs.

Now I am just left with getting it back into the pool:
Code:
# zpool replace Datastore /dev/diskid/DISK-YFG89PPAp3 /dev/gpt/zfs1
invalid vdev specification
use '-f' to override the following errors:
/dev/gpt/zfs1 is part of active pool 'Datastore'

I think it would add back in on a reboot and resilver, but I would like to get it back in without bringing the server down.
 
Tried a reboot. It didn't pick it up, and it still would not replace.

Code:
root@rc-prod-infra-nas01.dresdencraft.com:/root
# zpool replace Datastore diskid/DISK-YFG89PPAp3 gpt/zfs1
invalid vdev specification
use '-f' to override the following errors:
/dev/gpt/zfs1 is part of active pool 'Datastore'

root@rc-prod-infra-nas01.dresdencraft.com:/root
# zpool status
  pool: Datastore
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
   Sufficient replicas exist for the pool to continue functioning in a
   degraded state.
action: Online the device using 'zpool online' or replace the device with
   'zpool replace'.
  scan: resilvered 180K in 0h0m with 0 errors on Mon Apr 27 19:51:16 2015
config:

    NAME                     STATE     READ WRITE CKSUM
    Datastore                DEGRADED     0     0     0
      raidz1-0               DEGRADED     0     0     0
        gpt/zfs0             ONLINE       0     0     0
        6091044272226804680  OFFLINE      0     0     0  was /dev/diskid/DISK-YFG89PPAp3
        gpt/zfs2             ONLINE       0     0     0
        gpt/zfs3             ONLINE       0     0     0
        gpt/zfs4             ONLINE       0     0     0
    logs
      mirror-1               ONLINE       0     0     0
        gpt/log0             ONLINE       0     0     0
        gpt/log1             ONLINE       0     0     0
    cache
      gpt/cache0             ONLINE       0     0     0
      gpt/cache1             ONLINE       0     0     0

errors: No known data errors

Code:
# ll | grep zfs
crw-r-----  1 root  operator  0x8e Apr 28 19:35 zfs0
crw-r-----  1 root  operator  0xb5 Apr 28 19:35 zfs1
crw-r-----  1 root  operator  0xbb Apr 28 19:35 zfs2
crw-r-----  1 root  operator  0xc1 Apr 28 19:35 zfs3
crw-r-----  1 root  operator  0xc7 Apr 28 19:35 zfs4
 
Solved !!!

So the final issue with replacing the drive was that it still had metadata on the disk.

I destroyed it with dd < /dev/zero > /dev/ada2 bs=16777216

Then I followed all my steps above, and this time the replace command worked.
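(In retrospect the leftover metadata was almost certainly the old ZFS vdev label still sitting inside the partition, since the new layout put p3 at exactly the same offset. A more surgical alternative to wiping the whole disk would probably have been something like:)
Code:
zpool labelclear -f /dev/ada2p3   # clear the stale label that still claims membership in Datastore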
 
Code:
# zpool status
  pool: Datastore
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
   continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Apr 28 20:33:17 2015
  63.7G scanned out of 625G at 350M/s, 0h27m to go
  12.6G resilvered, 10.18% done
config:

    NAME                        STATE     READ WRITE CKSUM
    Datastore                   DEGRADED     0     0     0
      raidz1-0                  DEGRADED     0     0     0
        gpt/zfs0                ONLINE       0     0     0
        replacing-1             OFFLINE      0     0     0
          6091044272226804680   OFFLINE      0     0     0  was /dev/diskid/DISK-YFG89PPAp3
          gpt/zfs1              ONLINE       0     0     0  (resilvering)
        gpt/zfs2                ONLINE       0     0     0
        gpt/zfs3                ONLINE       0     0     0
        gpt/zfs4                ONLINE       0     0     0
    logs
      mirror-1                  ONLINE       0     0     0
        gpt/log0                ONLINE       0     0     0
        gpt/log1                ONLINE       0     0     0
    cache
      gpt/cache0                ONLINE       0     0     0
      gpt/cache1                ONLINE       0     0     0

errors: No known data errors
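Once the resilver finishes, the temporary replacing-1 vdev and the old 6091044272226804680 entry should drop out on their own and the pool should return to ONLINE.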
 
Depending on the drives, you might experience performance problems from the wrong alignment as the pool fills up with data.
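If you ever rebuild the pool, the usual way to get 4K-aligned partitions and an ashift=12 pool is roughly this (a sketch only; label names are just examples, and if your 10.1 doesn't have the vfs.zfs.min_auto_ashift sysctl, the older gnop(8) trick is needed instead):
Code:
sysctl vfs.zfs.min_auto_ashift=12        # make newly created vdevs use 4K sectors
gpart add -a 4k -s 512K -t freebsd-boot ada1
gpart add -a 4k -s 6G -t freebsd-swap ada1
gpart add -a 4k -t freebsd-zfs -l zfs0 ada1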
 