Labels "disappear" after zpool import

Sebulon · Jun 26, 2011

Hi,

On June 21st I upgraded to 8-STABLE and ZFS V28. Since then, I've been having issues with my labels. It doesn't matter whether I use glabel or GPT labels created with gpart. It's like this:

glabel

Code:

# zpool export pool2
# zpool import
  pool: pool2
    id: 6066349100353182436
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

	pool2       ONLINE
	  raidz2-0  ONLINE
	    ada1    ONLINE
	    ada2    ONLINE
	    ada3    ONLINE
	    ada4    ONLINE
	    ada5    ONLINE
	    ada6    ONLINE
	    ada7    ONLINE
	    ada8    ONLINE
# ls -la /dev/label/
total 2
dr-xr-xr-x   2 root  wheel  512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel  512 Jun 26 19:34 ..
# glabel label rack1-2 ada1
# ls -la /dev/label/
total 2
dr-xr-xr-x   2 root  wheel          512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel          512 Jun 26 19:34 ..
crw-r-----   1 root  operator    0, 191 Jun 26 20:11 rack1-2
# zpool import pool2
# ls -la /dev/label/
total 2
dr-xr-xr-x   2 root  wheel  512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel  512 Jun 26 19:34 ..
# zpool export pool2
# ls -la /dev/label/
total 2
dr-xr-xr-x   2 root  wheel          512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel          512 Jun 26 19:34 ..
crw-r-----   1 root  operator    0, 191 Jun 26 20:11 rack1-2

gpart

Code:

# zpool export pool2
# zpool import
  pool: pool2
    id: 6066349100353182436
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

	pool2       ONLINE
	  raidz2-0  ONLINE
	    ada1    ONLINE
	    ada2    ONLINE
	    ada3    ONLINE
	    ada4    ONLINE
	    ada5    ONLINE
	    ada6    ONLINE
	    ada7    ONLINE
	    ada8    ONLINE
# ls -la /dev/gpt/
total 2
dr-xr-xr-x   2 root  wheel          512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel          512 Jun 26 19:34 ..
# gpart create -s gpt ada1
ada1 created
# gpart add -t freebsd-zfs -l rack1-2 ada1
ada1p1 added
# ls -la /dev/gpt/
total 2
dr-xr-xr-x   2 root  wheel          512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel          512 Jun 26 19:34 ..
crw-r-----   1 root  operator    0, 192 Jun 26 20:18 rack1-2
# zpool import pool2
# ls -la /dev/gpt/
total 2
dr-xr-xr-x   2 root  wheel          512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel          512 Jun 26 19:34 ..
# zpool export pool2
# gpart show ada1
=>        34  1953525101  ada1  GPT  (931G) [CORRUPT]
          34  1953525101     1  freebsd-zfs  (931G)
# ls -la /dev/gpt/
total 2
dr-xr-xr-x   2 root  wheel          512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel          512 Jun 26 19:34 ..
crw-r-----   1 root  operator    0, 192 Jun 26 20:23 rack1-2

It's like magic =)

Is this a known bug, a regression, expected behaviour in certain circumstances, or am I doing something wrong? I'd really hate to have to stop using labels; I think they are super otherwise.

/Sebulon

AndyUKG · Jun 27, 2011

Sebulon said:

Code:

# zpool export pool2
# zpool import
  pool: pool2
    id: 6066349100353182436
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

	pool2       ONLINE
	  raidz2-0  ONLINE
	    ada1    ONLINE
	    ada2    ONLINE
	    ada3    ONLINE
	    ada4    ONLINE
	    ada5    ONLINE
	    ada6    ONLINE
	    ada7    ONLINE
	    ada8    ONLINE

Hi,

Looks to me like your RAIDZ2 vdev is using whole disks, and then you are trying to use glabel or gpart to add a label to the physical disks. AFAIK both of these write data to the disk (gpart to the first 34 blocks and glabel to the last block, or something like that), so you risk damaging your ZFS data, as ZFS thinks the whole disk belongs to it. My advice is to stop trying to label these disks, as you risk corrupting the pool.

cheers Andy.
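
To put rough numbers on the "first 34 blocks / last block" point, here is a small sketch in plain sh arithmetic. It assumes 512-byte sectors and the usual GPT layout with 128 partition entries (primary GPT in LBA 0-33, backup GPT in the last 33 sectors), and that glabel keeps its metadata in the provider's last sector. The total sector count is not quoted anywhere in the thread; it is inferred from the gpart show output in the first post (start 34, size 1953525101, plus 33 backup sectors). The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: which sectors gpart and glabel write on a raw disk.
# Assumptions (not from the thread): 512-byte sectors, 128-entry GPT.

GPT_PRIMARY_SECTORS=34   # protective MBR + header + 32 sectors of entries
GPT_BACKUP_SECTORS=33    # 32 sectors of entries + backup header

# Total sectors inferred from "gpart show": $1 = start, $2 = size.
total_sectors() {
    echo $(( $1 + $2 + GPT_BACKUP_SECTORS ))
}

# First sector of the backup GPT on a disk with $1 sectors.
backup_gpt_start() {
    echo $(( $1 - GPT_BACKUP_SECTORS ))
}

# Sector holding glabel metadata (the last one) on a disk with $1 sectors.
glabel_sector() {
    echo $(( $1 - 1 ))
}

total=$(total_sectors 34 1953525101)
echo "disk sectors:  $total"
echo "primary GPT:   0-$(( GPT_PRIMARY_SECTORS - 1 ))"
echo "backup GPT:    $(backup_gpt_start "$total")-$(( total - 1 ))"
echo "glabel sector: $(glabel_sector "$total")"
```

ZFS keeps copies of its own vdev label at both the start and the end of a device, so on a whole-disk vdev these ranges overlap areas that ZFS treats as its own.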

gkontos · Jun 27, 2011

IMHO the corruption has already occurred. I would recommend a restore from backup after proper labeling. That is, remove all gpart info from disks and label accordingly with glabel.

# glabel label -v rack1-1 /dev/ad0
# glabel label -v rack1-2 /dev/ad1
...

AndyUKG · Jun 27, 2011

gkontos said:
IMHO the corruption has already occurred. I would recommend a restore from backup after proper labeling.

In theory a scrub should be able to confirm whether corruption has occurred. I've read that ZFS creates some kind of GPT partition if asked to use a whole disk, although on FreeBSD you can't tell via gpart show adX. If that is true, then gpart create may not be corrupting the actual data, and glabel writes at the end of the disk, so again there's a good chance he hasn't done any damage. Anyway, testing with a scrub is a good first step...

gkontos · Jun 27, 2011

That is correct but he still won't be able to use labels with this setup.

Sebulon · Jun 27, 2011

Hi,

I've already tried a complete restart: cleared all metadata from the drives, destroyed the pool, and created it with labels from scratch. Then, when I export (or reboot) and import again, the labels just vanish.

/Sebulon

gkontos · Jun 27, 2011

Sebulon said:
Hi,

I've already tried a complete restart: cleared all metadata from the drives, destroyed the pool, and created it with labels from scratch. Then, when I export (or reboot) and import again, the labels just vanish.

/Sebulon

Could you describe the procedure you follow? Also, sorry if this sounds silly, but did you update world when you built your new kernel?

Sebulon · Jun 27, 2011

@gkontos
No, not at all, it is a very reasonable question. However, I did update world so that shouldn't be an issue.

This is how I clear metadata:

Code:

# zpool destroy pool2

gpart
# gpart delete -i 1 ada1(2,3,4,5,6,7,8)
ada1p1 deleted
# gpart destroy ada1(etc)
ada1 destroyed
# ls -la /dev/gpt/
total 2
dr-xr-xr-x   2 root  wheel          512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel          512 Jun 26 19:34 ..

glabel
# glabel clear -v ada1(etc)
Metadata cleared on ada1.
Done.
# glabel list ada1(etc)
glabel: No such geom: ada1.
# ls -la /dev/label/
total 2
dr-xr-xr-x   2 root  wheel  512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel  512 Jun 26 19:34 ..

Then, starting from scratch:

Code:

# glabel label -v rack1-2 ada1(etc)
Metadata value stored on ada1.
Done.
# zpool create pool2 raidz2 label/rack1-{2,3,4,5} label/rack2-{1,2,3,4}
# ls -la /dev/label/
total 2
dr-xr-xr-x   2 root  wheel          512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel          512 Jun 26 19:34 ..
crw-r-----   1 root  operator    0, 192 Jun 27 17:12 rack1-2
crw-r-----   1 root  operator    0, 193 Jun 27 17:12 rack1-3
crw-r-----   1 root  operator    0, 194 Jun 27 17:12 rack1-4
crw-r-----   1 root  operator    0, 195 Jun 27 17:12 rack1-5
crw-r-----   1 root  operator    0, 196 Jun 27 17:12 rack2-1
crw-r-----   1 root  operator    0, 197 Jun 27 17:12 rack2-2
crw-r-----   1 root  operator    0, 198 Jun 27 17:12 rack2-3
crw-r-----   1 root  operator    0, 199 Jun 27 17:12 rack2-4
# zpool status pool2
  pool: pool2
 state: ONLINE
 scan: none requested
config:

	NAME               STATE     READ WRITE CKSUM
	pool2              ONLINE       0     0     0
	  raidz2-0         ONLINE       0     0     0
	    label/rack1-2  ONLINE       0     0     0
	    label/rack1-3  ONLINE       0     0     0
	    label/rack1-4  ONLINE       0     0     0
	    label/rack1-5  ONLINE       0     0     0
	    label/rack2-1  ONLINE       0     0     0
	    label/rack2-2  ONLINE       0     0     0
	    label/rack2-3  ONLINE       0     0     0
	    label/rack2-4  ONLINE       0     0     0

errors: No known data errors
# zpool export pool2
# zpool import pool2
# ls -la /dev/label/
total 2
dr-xr-xr-x   2 root  wheel  512 Jun 26 17:34 .
dr-xr-xr-x  10 root  wheel  512 Jun 26 19:34 ..
# zpool status pool2
  pool: pool2
 state: ONLINE
 scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	pool2       ONLINE       0     0     0
	  raidz2-0  ONLINE       0     0     0
	    ada1    ONLINE       0     0     0
	    ada2    ONLINE       0     0     0
	    ada3    ONLINE       0     0     0
	    ada4    ONLINE       0     0     0
	    ada5    ONLINE       0     0     0
	    ada6    ONLINE       0     0     0
	    ada7    ONLINE       0     0     0
	    ada8    ONLINE       0     0     0

errors: No known data errors

Before import:

Code:

# zpool export pool2
# zdb -e pool2
pool2
        vdev_children: 1
        version: 28
        pool_guid: 8210453133321466902
        name: 'pool2'
        state: 1
        hostid: 416064828
        hostname: 'main.inparadise.dontexist.com'
        vdev_tree:
            type: 'root'
            id: 0
            guid: 8210453133321466902
            children[0]:
                type: 'raidz'
                id: 0
                guid: 5230556295749594025
                nparity: 2
                metaslab_array: 30
                metaslab_shift: 36
                ashift: 9
                asize: 8001599569920
                is_log: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 7511156879401974014
                    phys_path: '/dev/label/rack1-2'
                    whole_disk: 1
                    create_txg: 4
                    path: '/dev/dsk/label/rack1-2'
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 10579738102838987115
                    phys_path: '/dev/label/rack1-3'
                    whole_disk: 1
                    create_txg: 4
                    path: '/dev/dsk/label/rack1-3'
                children[2]:
                    type: 'disk'
                    id: 2
                    guid: 6878640907136507935
                    phys_path: '/dev/label/rack1-4'
                    whole_disk: 1
                    create_txg: 4
                    path: '/dev/dsk/label/rack1-4'
                children[3]:
                    type: 'disk'
                    id: 3
                    guid: 4853194485766963248
                    phys_path: '/dev/label/rack1-5'
                    whole_disk: 1
                    create_txg: 4
                    path: '/dev/dsk/label/rack1-5'
                children[4]:
                    type: 'disk'
                    id: 4
                    guid: 12770906492587502244
                    phys_path: '/dev/label/rack2-1'
                    whole_disk: 1
                    create_txg: 4
                    path: '/dev/dsk/label/rack2-1'
                children[5]:
                    type: 'disk'
                    id: 5
                    guid: 13880953260692839486
                    phys_path: '/dev/label/rack2-2'
                    whole_disk: 1
                    create_txg: 4
                    path: '/dev/dsk/label/rack2-2'
                children[6]:
                    type: 'disk'
                    id: 6
                    guid: 6121608347280253910
                    phys_path: '/dev/label/rack2-3'
                    whole_disk: 1
                    create_txg: 4
                    path: '/dev/dsk/label/rack2-3'
                children[7]:
                    type: 'disk'
                    id: 7
                    guid: 14732735682576851146
                    phys_path: '/dev/label/rack2-4'
                    whole_disk: 1
                    create_txg: 4
                    path: '/dev/dsk/label/rack2-4'

Then, after import:

Code:

# zpool import pool2
# zdb -C pool2

MOS Configuration:
        version: 28
        name: 'pool2'
        state: 0
        txg: 31
        pool_guid: 8210453133321466902
        hostid: 416064828
        hostname: 'main.inparadise.dontexist.com'
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 8210453133321466902
            children[0]:
                type: 'raidz'
                id: 0
                guid: 5230556295749594025
                nparity: 2
                metaslab_array: 30
                metaslab_shift: 36
                ashift: 9
                asize: 8001599569920
                is_log: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 7511156879401974014
                    path: '/dev/ada1'
                    phys_path: '/dev/ada1'
                    whole_disk: 1
                    create_txg: 4
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 10579738102838987115
                    path: '/dev/ada2'
                    phys_path: '/dev/ada2'
                    whole_disk: 1
                    create_txg: 4
                children[2]:
                    type: 'disk'
                    id: 2
                    guid: 6878640907136507935
                    path: '/dev/ada3'
                    phys_path: '/dev/ada3'
                    whole_disk: 1
                    create_txg: 4
                children[3]:
                    type: 'disk'
                    id: 3
                    guid: 4853194485766963248
                    path: '/dev/ada4'
                    phys_path: '/dev/ada4'
                    whole_disk: 1
                    create_txg: 4
                children[4]:
                    type: 'disk'
                    id: 4
                    guid: 12770906492587502244
                    path: '/dev/ada5'
                    phys_path: '/dev/ada5'
                    whole_disk: 1
                    create_txg: 4
                children[5]:
                    type: 'disk'
                    id: 5
                    guid: 13880953260692839486
                    path: '/dev/ada6'
                    phys_path: '/dev/ada6'
                    whole_disk: 1
                    create_txg: 4
                children[6]:
                    type: 'disk'
                    id: 6
                    guid: 6121608347280253910
                    path: '/dev/ada7'
                    phys_path: '/dev/ada7'
                    whole_disk: 1
                    create_txg: 4
                children[7]:
                    type: 'disk'
                    id: 7
                    guid: 14732735682576851146
                    path: '/dev/ada8'
                    phys_path: '/dev/ada8'
                    whole_disk: 1
                    create_txg: 4
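
One way to compare the two dumps without reading every field is to pull out just the path: lines. The sketch below is plain sh and awk; the helper name vdev_paths and the abbreviated sample input are illustrative only. On the real system you would presumably pipe the output of zdb -e pool2 (exported) or zdb -C pool2 (imported) into it instead.

```shell
#!/bin/sh
# Print the value of every "path:" field found in zdb-style output on stdin.
# phys_path lines are skipped because the first field must be exactly "path:".
vdev_paths() {
    awk '$1 == "path:" { print $2 }' | tr -d "'"
}

# Abbreviated sample in the same layout as the zdb dumps in this post.
sample="
                children[0]:
                    type: 'disk'
                    guid: 7511156879401974014
                    path: '/dev/ada1'
                    phys_path: '/dev/ada1'
                    whole_disk: 1
                children[1]:
                    type: 'disk'
                    guid: 10579738102838987115
                    path: '/dev/ada2'
                    phys_path: '/dev/ada2'
"
printf '%s\n' "$sample" | vdev_paths
```

Going by the dumps, the exported pool should list the /dev/dsk/label/rack* entries and the imported pool the /dev/adaN entries, which is the change in question.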

Ideas?

/Sebulon

gkontos · Jun 27, 2011

Sebulon said:
Ideas?

/Sebulon

I'm sorry, but no!

If I were you I would update my sources again and rebuild world and kernel once more. After that I would ask for help on the stable and current mailing lists.

There is definitely nothing wrong with your procedure.

George

AndyUKG · Jun 27, 2011

Personally I'd use GPT and GPT labels; it's the new cross-platform standard, so why not use it? I know lots of people use glabel, but I don't like it. From what I understand it writes data to the end of the disk, which FreeBSD magically won't overwrite; however, if you glabel a disk with existing data you will overwrite your data without any warning. Maybe I've not got that correct, feel free to correct me, but if so it just seems quite messy and dangerous to me compared with GPT, which seems more precise.

Andy.

Sebulon · Jun 28, 2011

@gkontos
I updated my sources again yesterday and rebuilt; no cigar.
Wow, this'll be the first time I've had to ask "the lists", how do I do that?

@AndyUKG
It's the same result, no matter if I've used glabel or gpart; after I've exported and imported again, the labels disappear.

Oh, yeah, one more thing! If I then export the pool, the labels magically appear again. Also true for both glabel and gpart.
It's as if zfs lies down over the labels like a cover and when you export the pool, you pull away the cover and you can see the labels again. It didn't do that before.

/Sebulon

AndyUKG · Jun 28, 2011

Can you show your steps used to test gpart GPT labels? The steps in the very first post are bad so hopefully you have tried again with corrected steps?

graudeejs · Jun 28, 2011

I switched to glabel from gpt labels.

A few days ago I used gpt labels, but they did "disappear" (more like they didn't appear at all when I booted from the labeled disks; I have a complex setup of gpt/geli/zfs).

Today, I rebuilt my desktop PC relying on glabel instead of gpt labels. Everything works fine now.

toddhpoole · Jun 29, 2011

Sebulon, I am seeing the exact same behavior here - perhaps this is a bug?

I created a pool named "mediatank" several months ago with a RAIDZ vdev comprised of 4 2TB "advanced format" hard drives from WD. During the creation process, the drives were properly gnop'd (so that my pool's ashift would equal 12) and glabel'd (so that I could easily locate them in the server rack). I have had 0 problems with this vdev since then.

Earlier this afternoon, I tried adding 4 more drives to the same rack with the intention of creating a 2nd RAIDZ vdev with the same ashift value and same naming scheme as the one I used in the first vdev. Everything went smoothly until I rebooted. After the server came back up, the labels were gone from /dev/label. When I exported the pool, the labels were back in /dev/label. When I imported the pool, the labels were gone again from /dev/label.

As you can imagine, playing hide and seek with my disks isn't my idea of fun. Sooner or later, you just want the damn things to stay put.

My procedure was as follows:
Insert a new disk in bay X
grep 'da' /var/log/messages
dd if=/dev/zero of=/dev/daX bs=1m count=1 #(to clear off old partition information)
glabel label WD.5TB-r1-cX /dev/daX
gnop create -S 4096 /dev/label/WD.5TB-r1-cX
Do this for all 4 disks, replacing X in the above commands with a number {0..3}.

When finished, I ran:
zpool add mediatank raidz /dev/label/WD.5TB-r1-c{0..3}.nop
shutdown -p now
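
The per-disk steps can also be written as a loop. This is only a dry-run sketch: it prints the commands for the four disks instead of running them, and the function name print_label_plan is made up here. It uses an explicit list rather than {0..3} because that form of brace expansion is a bash/zsh feature and not part of POSIX sh. It also assumes the WD.5TB-r1-cX spelling for every label.

```shell
#!/bin/sh
# Dry run: print (do not execute) the labeling commands for disks da0..da3.
print_label_plan() {
    for i in 0 1 2 3; do
        label="WD.5TB-r1-c$i"
        echo "dd if=/dev/zero of=/dev/da$i bs=1m count=1"
        echo "glabel label $label /dev/da$i"
        echo "gnop create -S 4096 /dev/label/$label"
    done
}

print_label_plan
```

Reviewing the printed commands before pasting them makes a mismatch between a label name and its device easier to spot.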

And bam. After the reboot, my labels are gone, and zpool status gives me the device names (like daX, daY, etc.) instead of the labels I created earlier.

Anyone have any ideas?

toddhpoole · Jun 29, 2011

For what it's worth, here's the output of a zpool status command:

Code:

[root@mediaserver3 /dev/label]# zpool status
  pool: mediatank
 state: ONLINE
 scrub: none requested
config:

	NAME                       STATE     READ WRITE CKSUM
	mediatank                  ONLINE       0     0     0
	  raidz1                   ONLINE       0     0     0
	    label/WD2TB-row0-col0  ONLINE       0     0     0
	    label/WD2TB-row0-col1  ONLINE       0     0     0
	    label/WD2TB-row0-col2  ONLINE       0     0     0
	    label/WD2TB-row0-col3  ONLINE       0     0     0
	  raidz1                   ONLINE       0     0     0
	    da3                    ONLINE       0     0     0
	    da2                    ONLINE       0     0     0
	    da1                    ONLINE       0     0     0
	    da0                    ONLINE       0     0     0

You'll notice how the first RAIDZ vdev has all of its labels properly displayed, but the second is still running off of device names.

Since my previous post, I've tried the following:
zpool offline mediatank da0
glabel label WD.5TB-r1-c0 /dev/da0
zpool replace mediatank da0 /dev/label/WD.5TB-r1-c0

Code:

invalid vdev specification
use '-f' to override the following errors:
/dev/label/WD.5TB-r1-c0 is part of active pool 'mediatank'

zpool replace -f mediatank da0 /dev/label/WD.5TB-r1-c0

Code:

invalid vdev specification
the following errors must be manually repaired:
/dev/label/WD.5TB-r1-c0 is part of active pool 'mediatank'

No dice.

toddhpoole · Jun 29, 2011

Alright, I spent a few hours doing some research and experimenting, and I think I found a solution. Well, more like a workaround. It's not at all elegant, but it works, and my labels now persist across reboots. Here's what I did:

First, I wanted to make sure I was starting from a clean slate.
zpool status

Code:

[root@mediaserver3 /dev/label]# zpool status        
  pool: mediatank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Tue Jun 28 20:13:23 2011
config:

	NAME                       STATE     READ WRITE CKSUM
	mediatank                  ONLINE       0     0     0
	  raidz1                   ONLINE       0     0     0
	    label/WD2TB-row0-col0  ONLINE       0     0     0
	    label/WD2TB-row0-col1  ONLINE       0     0     0
	    label/WD2TB-row0-col2  ONLINE       0     0     0
	    label/WD2TB-row0-col3  ONLINE       0     0     0
	  raidz1                   ONLINE       0     0     0
	    da3                    ONLINE       0     0     0
	    da2                    ONLINE       0     0     0
	    da1                    ONLINE       0     0     0
	    da0                    ONLINE       0     0     0

errors: No known data errors

Next, I offline'd one of the drives, deciding that the vdev da0 entry was as good a place as any to start:
zpool offline mediatank da0

I then found out how many sectors the disk /dev/da0 had:
grep 'da0' /var/log/messages

Code:

Jun 28 18:29:56 mediaserver3 kernel: da0 at mps2 bus 0 scbus2 target 0 lun 0
Jun 28 18:29:56 mediaserver3 kernel: da0: <ATA WDC WD5000YS-01M 2E07> Fixed Direct Access SCSI-5 device 
Jun 28 18:29:56 mediaserver3 kernel: da0: 300.000MB/s transfers
Jun 28 18:29:56 mediaserver3 kernel: da0: Command Queueing enabled
Jun 28 18:29:56 mediaserver3 kernel: da0: 476940MB (976773168 512 byte sectors: 255H 63S/T 60801C)

And overwrote the last 1 MiB of the disk with zeros (note that I seeked to the sector 2048 sectors, of 512 bytes each, before the end of the disk):
dd if=/dev/zero of=/dev/da0 seek=976771120 bs=512

Code:

[root@mediaserver3 /dev/label]# dd if=/dev/zero of=/dev/da0 seek=976771120 bs=512
dd: /dev/da0: end of device
2049+0 records in
2048+0 records out
1048576 bytes transferred in 0.409449 secs (2560943 bytes/sec)
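
The seek value comes straight from the sector count in the kernel message. A minimal sketch of that arithmetic in sh, with a made-up helper name; it assumes 512-byte sectors, as reported for da0:

```shell
#!/bin/sh
# First sector of the final N MiB on a disk, assuming 512-byte sectors.
# $1 = total sectors on the disk, $2 = MiB to wipe at the end
tail_seek() {
    sectors_per_mib=$(( 1024 * 1024 / 512 ))   # 2048
    echo $(( $1 - $2 * sectors_per_mib ))
}

# da0 reports 976773168 sectors; wiping the last 1 MiB starts here:
tail_seek 976773168 1   # prints 976771120
```

dd has no count here, so it keeps writing until it hits the end of the device, which is why the output reports 2049 records in but only 2048 out.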

I then relabeled the disk /dev/da0
glabel label WD.5TB-r1-c0 /dev/da0

And used this newly relabeled /dev/da0 to replace the RAIDZ vdev da1 entry:
zpool replace mediatank da1 /dev/label/WD.5TB-r1-c0

A quick check confirmed the pool was resilvering:
zpool status

Code:

[root@mediaserver3 /dev/label]# zpool status
  pool: mediatank
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
	Sufficient replicas exist for the pool to continue functioning in a
	degraded state.
action: Online the device using 'zpool online' or replace the device with
	'zpool replace'.
 scrub: resilver in progress for 0h0m, 0.90% done, 0h31m to go
config:

	NAME                       STATE     READ WRITE CKSUM
	mediatank                  DEGRADED     0     0     0
	  raidz1                   ONLINE       0     0     0
	    label/WD2TB-row0-col0  ONLINE       0     0     0
	    label/WD2TB-row0-col1  ONLINE       0     0     0
	    label/WD2TB-row0-col2  ONLINE       0     0     0
	    label/WD2TB-row0-col3  ONLINE       0     0     0
	  raidz1                   DEGRADED     0     0     0
	    da3                    ONLINE       0     0     0
	    da2                    ONLINE       0     0     0
	    replacing              ONLINE       0     0     0
	      da1                  ONLINE       0     0     0
	      label/WD.5TB-r1-c0   ONLINE       0     0     0  56K resilvered
	    da0                    OFFLINE      0     0     0

errors: No known data errors

When the resilvering finished, I exported and then imported the pool:
zpool export mediatank
zpool import mediatank

I then online'd /dev/da0, but since /dev/da0 was already standing in for the RAIDZ vdev's da1 entry, ZFS complained and marked da0 as UNAVAIL (cannot open).
zpool online mediatank da0

Code:

[root@mediaserver3 /dev/label]# zpool online mediatank da0
warning: device 'da0' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present
[root@mediaserver3 /dev/label]# zpool status
  pool: mediatank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
	the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

	NAME                       STATE     READ WRITE CKSUM
	mediatank                  DEGRADED     0     0     0
	  raidz1                   ONLINE       0     0     0
	    label/WD2TB-row0-col0  ONLINE       0     0     0
	    label/WD2TB-row0-col1  ONLINE       0     0     0
	    label/WD2TB-row0-col2  ONLINE       0     0     0
	    label/WD2TB-row0-col3  ONLINE       0     0     0
	  raidz1                   DEGRADED     0     0     0
	    da3                    ONLINE       0     0     0
	    da2                    ONLINE       0     0     0
	    label/WD.5TB-r1-c0     ONLINE       0     0     0
	    da0                    UNAVAIL      0     0     0  cannot open

errors: No known data errors

Ignoring that for the time being, I then used glabel to relabel the disk /dev/da1:
glabel label WD.5TB-r1-c1 /dev/da1

And used this newly relabeled /dev/da1 to replace the da0 entry in the RAIDZ vdev:
zpool replace mediatank da0 /dev/label/WD.5TB-r1-c1

I checked to make sure the resilvering process had begun:

Code:

[root@mediaserver3 /dev/label]# zpool status
  pool: mediatank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h1m, 6.48% done, 0h26m to go
config:

	NAME                       STATE     READ WRITE CKSUM
	mediatank                  DEGRADED     0     0     0
	  raidz1                   ONLINE       0     0     0
	    label/WD2TB-row0-col0  ONLINE       0     0     0
	    label/WD2TB-row0-col1  ONLINE       0     0     0
	    label/WD2TB-row0-col2  ONLINE       0     0     0
	    label/WD2TB-row0-col3  ONLINE       0     0     0
	  raidz1                   DEGRADED     0     0     0
	    da3                    ONLINE       0     0     0
	    da2                    ONLINE       0     0     0
	    label/WD.5TB-r1-c0     ONLINE       0     0     0
	    replacing              DEGRADED     0     0     0
	      da0                  UNAVAIL      0     0     0  cannot open
	      label/WD.5TB-r1-c1   ONLINE       0     0     0  96K resilvered

errors: No known data errors

And when that was done, the entire pool was back up and running with 2 properly labeled drives.

I then began the somewhat tedious process of repeating all above commands to swap and relabel /dev/da2 and /dev/da3 in the exact same way I swapped and relabeled /dev/da0 and /dev/da1.

Sebulon · Jun 29, 2011

Hi,

Since I started troubleshooting this issue, I have done another buildworld, set sysctl kern.geom.debugflags=16, cleared the disks of glabel and gpart metadata, ran dd if=/dev/zero on ada devices 1-8 at once (so cool to see ~800MB/s disk IO), removed /boot/zfs/zpool.cache, rebooted, created gpt labels, set sysctl kern.geom.debugflags=0, created the pool, exported, imported, and rebooted. It's still there =)

Code:

# zpool status pool2
  pool: pool2
 state: ONLINE
 scan: none requested
config:

	NAME             STATE     READ WRITE CKSUM
	pool2            ONLINE       0     0     0
	  raidz2-0       ONLINE       0     0     0
	    gpt/rack1-2  ONLINE       0     0     0
	    gpt/rack1-3  ONLINE       0     0     0
	    gpt/rack1-4  ONLINE       0     0     0
	    gpt/rack1-5  ONLINE       0     0     0
	    gpt/rack2-1  ONLINE       0     0     0
	    gpt/rack2-2  ONLINE       0     0     0
	    gpt/rack2-3  ONLINE       0     0     0
	    gpt/rack2-4  ONLINE       0     0     0

errors: No known data errors

I'm not sure exactly which of these steps solved my problem; it may well have been a mix of underlying issues causing this. I did try the same procedure with glabel instead, and it was the same as before: the labels vanished after import.

It does seem inconsistent that gpart was the best solution for me, while for others it was the opposite: glabel.

No matter what, I have been fortunate enough to have had a backup-pool to run on while this was going on. The moral of this story: backups, backups, backups!

/Sebulon

gkontos · Jun 29, 2011

sysctl kern.geom.debugflags=16 is only needed if the disk is currently mounted in read/write mode. The restriction it lifts is just there to protect you from accidentally messing things up. So, you used that setting and then successfully cleared all gpart/glabel data. Then you returned to 0, which is the default value, and everything worked fine.

I am glad you solved it!

toddhpoole · Jun 29, 2011

Just wanted to post an update on what happened after I tried to swap and relabel /dev/da2 and /dev/da3. I would have posted this sooner but my original post was held for moderation, probably because the board thought I was trying to spam the thread.

In summary, the method I posted above worked for /dev/da0 and /dev/da1, but using the same set of commands (literally copied and pasted right from my terminal's history, modifying only the device names) it did not work for /dev/da2 and /dev/da3.

I have no idea why. I don't understand how I can successfully swap and relabel one pair of drives, but not another. I kept on getting blocked at the replace command:
zpool replace mediatank /dev/da2 /dev/label/WD.5TB-r1-c2

Code:

[root@mediaserver3 /dev/label]# zpool replace mediatank /dev/da2 /dev/label/WD.5TB-r1-c2
invalid vdev specification
use '-f' to override the following errors:
/dev/label/WD1TB-r1-c2 is part of active pool 'mediatank'

Adding in the -f flag, as always, gets you nowhere:
zpool replace -f mediatank /dev/da2 /dev/label/WD1TB-r1-c2

Code:

[root@mediaserver3 /dev/label]# zpool replace -f mediatank /dev/da2 /dev/label/WD1TB-r1-c2
invalid vdev specification
the following errors must be manually repaired:
/dev/label/WD1TB-r1-c2 is part of active pool 'mediatank'

Side note: What a useless flag. What good does a "force" option do if it never actually forces anything?

After trying every solution I could think of, I just gave up, got another hard drive, and added it to the rack with the others. FreeBSD then assigned that drive a name that wasn't in use in either of my vdevs (I think it was da8 since da0 - da7 were all part of the two RAIDZ vdevs), and one by one, I swapped each drive I wanted to label with the drive I just added. I used the following pattern:
1) Replace da2 with da8.
2) Let the pool resilver. When done, da2 will no longer be part of the RAIDZ vdev and da8 will have taken its place.
3) Relabel da2 using glabel.
4) Replace da8 with the relabeled da2.

Repeat the procedure for da3.
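
The rotation pattern, written out as a dry-run sketch in sh. It only prints the commands for one disk; nothing is executed, and the resilver waits stay manual (check zpool status between steps). The function name and the label argument are illustrative, and da8 is the spare drive described in this post.

```shell
#!/bin/sh
# Dry run: print the rotate-and-relabel steps for one disk.
# $1 = pool, $2 = disk to relabel, $3 = spare disk, $4 = new glabel name
print_rotation_plan() {
    echo "zpool replace $1 $2 $3"
    echo "# wait for the resilver to finish (zpool status $1)"
    echo "glabel label $4 /dev/$2"
    echo "zpool replace $1 $3 /dev/label/$4"
    echo "# wait for the resilver to finish (zpool status $1)"
}

print_rotation_plan mediatank da2 da8 WD.5TB-r1-c2
```

Each disk costs two resilvers this way, which matches the 4 resilvers needed for da2 and da3 together.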

You'll ultimately have to go through 4 resilvers (which is so very annoying), but it works. The downside is that you need space to add another drive. If your rack is filled to capacity and you run into this problem, I don't know what you're going to do, since you won't be able to add a "rotation drive" like I added da8.

I've got 3 more sets of 4 hard drives to add to this server, so I'll keep on experimenting and seeing if I can't find another way. Finding a way to do the above without needing an extra drive to rotate in and out would be better, but having a command (like zpool replace -f, but actually useful) would be ideal.

PS: Sebulon, it's good to hear that you found another possible solution. One question though: if we have other vdevs in our pool, will any of these actions affect their data integrity? Can anyone comment on this?

Any solution that requires erasing everything and restoring from backups is going to be a last resort for me since that whole process is a _huge_ hassle and takes _days_ of transfer time. I'll do it if there's no other way, but I'd hate to waste 5 or 6 days waiting for backups to re-transfer.

Sebulon · Jun 30, 2011

God damn it, blasted, I spoke too soon! I had just finished replicating everything onto the pool, then rebooted into my boot environment (a bootable USB key) and did:

Code:

# zpool export pool1
# zpool export pool2
# zpool import -f pool2 pool1
# zpool import -f pool1 pool2

I then rebooted back to normal and now it looks like this:

Code:

# zpool status
  pool: pool1
 state: ONLINE
 scan: none requested
config:

	NAME                                            STATE     READ WRITE CKSUM
	pool1                                           ONLINE       0     0     0
	  raidz2-0                                      ONLINE       0     0     0
	    gptid/3c61783c-a21c-11e0-b623-002590231060  ONLINE       0     0     0
	    gptid/3f8235c7-a21c-11e0-b623-002590231060  ONLINE       0     0     0
	    gptid/4186aec7-a21c-11e0-b623-002590231060  ONLINE       0     0     0
	    gptid/43a9945c-a21c-11e0-b623-002590231060  ONLINE       0     0     0
	    gptid/468fb888-a21c-11e0-b623-002590231060  ONLINE       0     0     0
	    gptid/496e8e29-a21c-11e0-b623-002590231060  ONLINE       0     0     0
	    gptid/4bebf963-a21c-11e0-b623-002590231060  ONLINE       0     0     0
	    gptid/4e1fc498-a21c-11e0-b623-002590231060  ONLINE       0     0     0

errors: No known data errors

  pool: pool2
 state: ONLINE
 scan: scrub repaired 0 in 7h55m with 0 errors on Wed Jun 22 00:40:08 2011
config:

	NAME                                          STATE     READ WRITE CKSUM
	pool2                                         ONLINE       0     0     0
	  gptid/9bd6d340-7b92-11e0-b3d6-002590231060  ONLINE       0     0     0

errors: No known data errors

And /dev/gpt/ is empty yet again.

I also noticed this in /var/log/messages:

Code:

ZFS WARNING: Unable to attach to ada1
ZFS WARNING: Unable to attach to ada2
ZFS WARNING: Unable to attach to ada3
ZFS WARNING: Unable to attach to ada4
ZFS WARNING: Unable to attach to ada5
ZFS WARNING: Unable to attach to ada6
ZFS WARNING: Unable to attach to ada7
ZFS WARNING: Unable to attach to ada8

There's nothing wrong with the partitions really, it's just that zfs has "covered" them:

Code:

# gpart show
=>        34  5860533101  ada0  GPT  (2.7T)
          34        2014        - free -  (1M)
        2048  5860531087     1  freebsd-zfs  (2.7T)

=>        34  1953525101  ada1  GPT  (931G)
          34  1953525101     1  freebsd-zfs  (931G)

=>        34  1953525101  ada2  GPT  (931G)
          34  1953525101     1  freebsd-zfs  (931G)

=>        34  1953525101  ada3  GPT  (931G)
          34  1953525101     1  freebsd-zfs  (931G)

=>        34  1953525101  ada4  GPT  (931G)
          34  1953525101     1  freebsd-zfs  (931G)

=>        34  1953525101  ada5  GPT  (931G)
          34  1953525101     1  freebsd-zfs  (931G)

=>        34  1953525101  ada6  GPT  (931G)
          34  1953525101     1  freebsd-zfs  (931G)

=>        34  1953525101  ada7  GPT  (931G)
          34  1953525101     1  freebsd-zfs  (931G)

=>        34  1953525101  ada8  GPT  (931G)
          34  1953525101     1  freebsd-zfs  (931G)

=>     34  7815101  da0  GPT  (3.7G)
       34       30    1  freebsd-boot  (15k)
       64   524288    2  freebsd-swap  (256M)
   524352  7290783    3  freebsd-ufs  (3.5G)

If no one has any good advice, I have to give up labeling and just assign the devices raw. At least ada1 makes more sense to have than gptid/3c61783c-a21c-11e0-b623-002590231060.

/Sebulon

AndyUKG · Jun 30, 2011

You can get rid of the long gptid's by setting

Code:

kern.geom.label.gptid.enable=0

in /boot/loader.conf, although it would seem to me this is a workaround for a ZFS issue.

Sebulon said:
Code:

ZFS WARNING: Unable to attach to ada1

This, I don't know what this relates to...

Andy.

toddhpoole · Jul 1, 2011

Well, I think I've found a better solution. Still not as great as using a single zpool blah blah command, but it gets the job done without needing an extra rotational drive.

Basically, if this is your problem:
You glabel a bunch of drives, gnop them, add them to a vdev, and then reboot, and your *.nop names are gone (which was to be expected) but your labels are also gone/not showing up when the pool that the vdev they're in is imported...

Then this is your solution:
After inserting, glabel'ing and gnop'ing and making a vdev out of all 4 drives, and then adding that vdev to the zpool, do not reboot.

That is, I'm already assuming you got to this point by running the following two commands for all 4 drives (which were da8, da9, da10, and da11 for me):
glabel label WD1TB-r2-c0 /dev/da8
gnop create -S 4096 /dev/label/WD1TB-r2-c0

And then added all of them to the zpool like so:
zpool add mediatank raidz /dev/label/WD1TB-r2-c{0..3}.nop

If you did not get to this point by using the above commands, then I don't know if this solution is right for you. ZFS's behavior hasn't exactly been logical on this issue, so I'm not sure if this will get your labels back if you used some other method.
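
One thing to watch: the {0..3} in the zpool add line is brace expansion from shells like bash, and FreeBSD's plain /bin/sh will not expand it. Here is a rough dry-run sketch that builds the same glabel, gnop, and zpool add commands with an ordinary loop. It only prints them. The function name setup_plan and the label-prefix argument are my own inventions for illustration.

```sh
#!/bin/sh
# Dry-run sketch: prints the setup commands, it does not run them.
# Usage: setup_plan POOL LABEL_PREFIX DRIVE...  (hypothetical helper)
setup_plan() {
    pool=$1
    prefix=$2
    shift 2
    i=0
    vdevs=""
    for drive in "$@"; do
        # Label the raw drive, then put a 4096-byte-sector gnop on the label.
        echo "glabel label ${prefix}${i} /dev/${drive}"
        echo "gnop create -S 4096 /dev/label/${prefix}${i}"
        # Collect the .nop providers for the final zpool add.
        vdevs="${vdevs} /dev/label/${prefix}${i}.nop"
        i=$((i + 1))
    done
    echo "zpool add ${pool} raidz${vdevs}"
}
```

For example, setup_plan mediatank WD1TB-r2-c da8 da9 da10 da11 prints two lines per drive followed by a single zpool add line naming all four .nop providers.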

Now, once you've created your new un-deletable and un-expandable-in-terms-of-number-of-disks-in-the-vdev RAIDZ vdev (two ridiculously dimwitted design decisions from Sun - don't even get me started on the explanation "that's not a priority for our corporate customers") do not reboot. One by one, we're going to go through each drive, offline it, destroy the gnop, dd over the entire surface, and then relabel it, and replace it with itself in the vdev. It's a tedious process, but it works.

(Just a sidenote: I thought I might explain my drive naming scheme. When you see "WD1TB-r2-c0" or something similar, that means:
WD1TB = a hard drive 1 TB in size from Western Digital
r2 = located in Row 2 of the server rack (rows start counting at 0)
c0 = and Column 0 of the server rack (columns start counting at 0))

First, offline the first gnop. In my case, that's WD1TB-r2-c0.nop
zpool offline mediatank /dev/label/WD1TB-r2-c0.nop

Then export the pool:
zpool export mediatank

And destroy the gnop:
gnop destroy /dev/label/WD1TB-r2-c0.nop

Reimport the pool:
zpool import mediatank

Check the status of your pool:

Code:

[root@mediaserver3 /dev]# zpool status
  pool: mediatank
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
	Sufficient replicas exist for the pool to continue functioning in a
	degraded state.
action: Online the device using 'zpool online' or replace the device with
	'zpool replace'.
 scrub: none requested
config:

	NAME                       STATE     READ WRITE CKSUM
	mediatank                  DEGRADED     0     0     0
	  raidz1                   ONLINE       0     0     0
	    label/WD2TB-row0-col0  ONLINE       0     0     0
	    label/WD2TB-row0-col1  ONLINE       0     0     0
	    label/WD2TB-row0-col2  ONLINE       0     0     0
	    label/WD2TB-row0-col3  ONLINE       0     0     0
	  raidz1                   ONLINE       0     0     0
	    label/WD.5TB-r1-c2     ONLINE       0     0     0
	    label/WD.5TB-r1-c3     ONLINE       0     0     0
	    label/WD.5TB-r1-c0     ONLINE       0     0     0
	    label/WD.5TB-r1-c1     ONLINE       0     0     0
	  raidz1                   DEGRADED     0     0     0
	    da8                    OFFLINE      0     0     0
	    label/WD1TB-r2-c1.nop  ONLINE       0     0     0
	    label/WD1TB-r2-c2.nop  ONLINE       0     0     0
	    label/WD1TB-r2-c3.nop  ONLINE       0     0     0

errors: No known data errors

You should notice that the disk we offlined earlier is now showing up as da_whatever (in my case, that's da8).

dd over the entire surface of the drive that's lost its label:
dd if=/dev/zero of=/dev/da8 bs=1m

(This could take 3 - 24 hours, depending on the size of the drive and the speed of your equipment. Frustrating, I know.)

Then, once it's been zeroed out, relabel the drive:
glabel label WD1TB-r2-c0 /dev/da8

And replace the newly zeroed and relabeled drive with that drive's old entry in the zpool:
zpool replace -f mediatank da8 /dev/label/WD1TB-r2-c0

ZFS will begin to resilver the drive, which you can confirm with a status command:

Code:

[root@mediaserver3 /dev]# zpool status
  pool: mediatank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 1.32% done, 0h29m to go
config:

	NAME                       STATE     READ WRITE CKSUM
	mediatank                  DEGRADED     0     0     0
	  raidz1                   ONLINE       0     0     0
	    label/WD2TB-row0-col0  ONLINE       0     0     0
	    label/WD2TB-row0-col1  ONLINE       0     0     0
	    label/WD2TB-row0-col2  ONLINE       0     0     0
	    label/WD2TB-row0-col3  ONLINE       0     0     0
	  raidz1                   ONLINE       0     0     0
	    label/WD.5TB-r1-c2     ONLINE       0     0     0
	    label/WD.5TB-r1-c3     ONLINE       0     0     0
	    label/WD.5TB-r1-c0     ONLINE       0     0     0
	    label/WD.5TB-r1-c1     ONLINE       0     0     0
	  raidz1                   DEGRADED     0     0     0
	    replacing              DEGRADED     0     0     0
	      da8                  OFFLINE      0     0     0
	      label/WD1TB-r2-c0    ONLINE       0     0     0  52K resilvered
	    label/WD1TB-r2-c1.nop  ONLINE       0     0     0
	    label/WD1TB-r2-c2.nop  ONLINE       0     0     0
	    label/WD1TB-r2-c3.nop  ONLINE       0     0     0

errors: No known data errors

When done, repeat the same procedure for da9, da10, and da11, or whatever your drive names are.
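
Since the same steps get repeated for every drive, here is a dry-run sketch of the per-drive sequence described above. It prints the commands in order and runs nothing. The helper name relabel_plan and its arguments are hypothetical; check each printed line against your own pool before typing it in, and wait for the resilver to finish before starting on the next drive.

```sh
#!/bin/sh
# Dry-run sketch: prints the per-drive commands, it does not run them.
# Usage: relabel_plan POOL DRIVE LABEL  (hypothetical helper)
relabel_plan() {
    pool=$1
    drive=$2
    label=$3
    # Take the gnop provider out of service and tear it down.
    echo "zpool offline ${pool} /dev/label/${label}.nop"
    echo "zpool export ${pool}"
    echo "gnop destroy /dev/label/${label}.nop"
    echo "zpool import ${pool}"
    # Wipe the old vdev data (this is the slow, full-surface version).
    echo "dd if=/dev/zero of=/dev/${drive} bs=1m"
    # Relabel the drive and put it back under its label.
    echo "glabel label ${label} /dev/${drive}"
    echo "zpool replace -f ${pool} ${drive} /dev/label/${label}"
}
```

relabel_plan mediatank da8 WD1TB-r2-c0 prints the seven commands used for the first drive in this post.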

And then run one last export/import before restarting:
zpool export mediatank
zpool import mediatank

You're done.

This works for me on the following system:

Code:

[root@mediaserver3 /home/mediaserver]# uname -a
FreeBSD mediaserver3 8.2-STABLE FreeBSD 8.2-STABLE #0: Thu May 12 10:04:15 CDT 2011     root@mediaserver:/usr/obj/usr/src/sys/GENERIC  amd64

toddhpoole · Jul 5, 2011

I've created 6 more 4-disk RAIDZ vdev's since my last post, and I've made some improvements to the above method that should drastically reduce the time required to relabel a disk. I'll post my improvements here with the hope that someone else might benefit from my work.

So, you know that step where I told you to dd over the entire drive in the above post? This one here:
dd if=/dev/zero of=/dev/da8 bs=1m

Well, you don't need to spend 3 - 24 hours wiping the entire disk. It took some serious research and a lot of patience to confirm, but it turns out one only needs to zero over two specific locations of the disk to achieve the same results.

Our goal is to wipe the vdev information off of each disk, so instead of zeroing the entire drive, all we need to do is wipe out the front and the back of it. How much of the front and how much of the back I can't say: I tried zeroing out the first and the last MiB of the disk. It took that dd step in the previously posted guide down from 3 - 24 hours to less than 1 second which is good enough for me.

The improvements are as follows:

Instead of wiping the entire drive, just zero out the first MiB:
dd if=/dev/zero of=/dev/da12 bs=1m count=1

Then, find out how many sectors your drive has:
dmesg | grep "da12"

Code:

[root@mediaserver3 /dev/label]# dmesg | grep "da12"
da12 at mps1 bus 0 scbus1 target 3 lun 0
da12: <ATA WDC WD20EARS-00M AB50> Fixed Direct Access SCSI-5 device 
da12: 300.000MB/s transfers
da12: Command Queueing enabled
da12: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
GEOM: da12: partition 1 does not start on a track boundary.
GEOM: da12: partition 1 does not end on a track boundary.

Note how this particular drive has 3907029168 sectors that are 512 bytes in size? If I want to zero over the last mebibyte of the drive, that means I'm going to need to seek to sector 3907029168 - 2048 = 3907027120. (The 2048 came from the fact that 1 mebibyte = 1048576 bytes. 1048576 bytes/512 bytes = 2048.)
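
The arithmetic above can be sketched as a small shell function, so you don't have to redo it by hand for every drive. The name last_mib_seek is made up for illustration. It takes the sector count from dmesg and, optionally, the sector size, which I assume to be 512 bytes when left out.

```sh
#!/bin/sh
# Sketch: compute the sector where the last MiB of a drive begins.
# Usage: last_mib_seek SECTORS [SECTOR_SIZE]  (hypothetical helper)
last_mib_seek() {
    sectors=$1
    sector_size=${2:-512}   # assumption: 512-byte sectors unless told otherwise
    # 1 MiB = 1048576 bytes, expressed here as a number of sectors.
    mib_in_sectors=$((1048576 / sector_size))
    echo $((sectors - mib_in_sectors))
}
```

For the drive shown, last_mib_seek 3907029168 prints 3907027120, the same value worked out above.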

Next, zero out the last mebibyte of the drive using the adjusted sector count for the seek argument:
dd if=/dev/zero of=/dev/da12 seek=3907027120

You've now just saved yourself anywhere from 12 hours to 4 days of unnecessary waiting. Continue on your merry way following my previous post. (Your next step from here will be the glabel command.)

Good luck.

Sebulon · Sep 9, 2011

OK, so I let this simmer for a while. Been tinkering on this for a bit and I am now ready to present to everyone:

cleandrives:

Code:

#!/bin/sh

if [ -z "$1" ]
  then
    echo "Usage: `basename $0` drive1 drive2 ..."
    exit 1
fi

drives="$*"

verifydrives()
  {
    for drive in $drives
      do
        if [ `ls -l /dev/ | grep -w $drive | wc -l` = "0" ]
          then
            echo "Drive $drive does not exist. Aborting."
            exit 1
          else
            echo "Drive $drive verified."
        fi
      done
  }

seeksector()
  {
    blocksize=`dmesg | grep -w $drive | grep -oe '[0-9]\{8,\}'`
    mbsize=`echo "$blocksize / 2048" | bc`
    echo "$mbsize - 10" | bc
  }

cleandrives()
  {
    for drive in $drives
      do
        dd if=/dev/zero of=/dev/$drive bs=1M count=10 >/dev/null 2>&1
        dd if=/dev/zero of=/dev/$drive bs=1M count=10 seek=`seeksector $drive` >/dev/null 2>&1
    done
  }

verifydrives $drives

echo ""
echo "This will irreversibly destroy partition- and filesystem data on drive(s):"
echo "$drives"
echo ""
echo "USE WITH EXTREME CAUTION!"
read -r -p 'Do you confirm "yes/no": ' choice
  case "$choice" in
    yes) cleandrives $drives
         echo ""
         echo "Drive(s) cleaned."  ;;
     no) echo ""
         echo "Cleaning cancelled." ;;
      *) echo ""
         echo "Cleaning cancelled." ;;
  esac

This script verifies and then erases the first and last 10MB of every hard drive you've confirmed. If you find yourself in a situation like mine or toddhpoole's, or if you want to move a hard drive from one system to another and don't want the new system to be "confused" by old partitioning and filesystem data, then this command makes that cleaning easy and fast. Best part is that it works on any *nix system out there, not just FreeBSD. Enjoy!

/Sebulon

HarryE · Jun 6, 2012

GPT labels can be kept using:
# zpool import -d /dev/gpt/ yourpool
HTH
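
A minimal dry-run sketch of that tip, in case the order of operations isn't obvious: export the pool first, then import it while pointing zpool at the directory holding the names you want to see. The helper name reimport_plan is hypothetical and it only prints the commands. Passing /dev/label instead of /dev/gpt for glabel labels is my assumption and is not something confirmed in this thread.

```sh
#!/bin/sh
# Dry-run sketch: prints the export/import pair, it does not run it.
# Usage: reimport_plan POOL DEVICE_DIR  (hypothetical helper)
reimport_plan() {
    pool=$1
    dir=$2
    # The pool has to be exported before it can be imported again.
    echo "zpool export ${pool}"
    # -d tells zpool import which directory to search for devices.
    echo "zpool import -d ${dir} ${pool}"
}
```

reimport_plan yourpool /dev/gpt/ prints the export followed by the import line given above.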