Where is ZFS data stored?

I apologize if this has been asked before; I did look. I still have some questions about ZFS.

Let's say you have a boot drive with UFS and three other drives in a RAID-Z pool. When you boot up, what tells FreeBSD that these three drives are in a RAID-Z pool? Is this information all stored on the three drives themselves? If not, where on the boot drive is it stored? The /boot/zfs/zpool.cache file is only used at run-time, correct?

Thanks,

Bill
 
Each device that is part of a pool, whether a full disk or a partition, carries what are called ZFS labels: four copies of the label on each device. These are scanned by the ZFS kernel code to auto-detect the disks that make up the pools. The function of zpool.cache, as far as I know, is to provide a quick cache of the pools that are available and can be auto-imported at boot. On a recent enough version of FreeBSD, I think 9.2 or later, it is no longer needed for booting because the kernel can probe geom(4) providers for the disks that make up ZFS pools.
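
If you want to see how a pool relates to the cache file, the cachefile pool property controls it (the pool name tank below is just a placeholder):
Code:
# zpool get cachefile tank
# zpool set cachefile=none tank
The first command reports which cache file the pool records itself in (on FreeBSD the default is /boot/zfs/zpool.cache), and the second tells the pool not to use a cache file at all, leaving it to be found purely by label scanning.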
 
OK, so zpool.cache is non-volatile but not strictly necessary. Well, it might have been necessary before 9.2, but no longer.

I have read about the four labels, so everything the OS would need is stored in these labels. Does this sound about right?

Bill
 
Yes, the pools are very much self-contained. In fact there's no other way to specify the pools than through the on-disk labels.
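
A quick way to convince yourself of that (using a hypothetical pool named tank): export the pool, which drops it from zpool.cache, and then let zpool import rediscover it from the labels alone:
Code:
# zpool export tank
# zpool import
# zpool import tank
The bare zpool import scans the available devices for ZFS labels and lists the pools it finds; the last command then imports the pool it found that way.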
 
kpa said:
Yes, the pools are very much self-contained. In fact there's no other way to specify the pools than through the on-disk labels.

OK, good, this simplifies things for me.

Bill
 
Just for anyone interested in a bit more information.

The labels on each disk contain information about the devices in the same vdev, as well as information about their parent container. So in a simple three-disk RAID-Z, each disk will contain information about itself and its two 'neighbors', as well as about the pool itself (their parent is the pool). You can see this clearly by running zdb -l /dev/somezfsdisk, which prints the labels from the disk. Two labels sit at the beginning of the disk and two at the end; during updates, labels 1 and 3 are written first (one at the beginning and one at the end), then labels 2 and 4, to minimize the chance of a problem leaving all the labels unusable.

Because a device only contains information about the disks in its own vdev, if you have a multi-vdev pool and lose all the disks in one vdev, ZFS can't work out the original pool layout. A simple test shows this:
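
If you want to reproduce the test yourself, a pool like the one below can be built on md(4) memory disks; this is only a sketch, the size is an assumption and the device names just have to match the zpool create line:
Code:
# mdconfig -a -t swap -s 128m
# mdconfig -a -t swap -s 128m
# mdconfig -a -t swap -s 128m
# mdconfig -a -t swap -s 128m
# zpool create test mirror md0 md1 mirror md2 md3
Each mdconfig call prints the name of the device it created (md0 through md3 if none existed before), and the zpool create line builds the two-mirror layout shown next.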

Pool with two mirror vdevs:
Code:
# zpool status test
  pool: test
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            md0     ONLINE       0     0     0
            md1     ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            md2     ONLINE       0     0     0
            md3     ONLINE       0     0     0

errors: No known data errors

The label from one device shows all the pool-level info, but device details only for itself and for the other disk in its mirror (this actually prints all four labels, but they're all the same):
Code:
# zdb -l /dev/md0
--------------------------------------------
LABEL 0
--------------------------------------------
    version: 28
    name: 'test'
    state: 1
    txg: 9
    pool_guid: 1748106379667507035
    hostid: 3533697201
    hostname: 'host'
    top_guid: 7204611594387503165
    guid: 15665966071343249231
    vdev_children: 2
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 7204611594387503165
        metaslab_array: 33
        metaslab_shift: 20
        ashift: 9
        asize: 129499136
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 15665966071343249231
            path: '/dev/md0'
            phys_path: '/dev/md0'
            whole_disk: 1
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 18376971707402152791
            path: '/dev/md1'
            phys_path: '/dev/md1'
            whole_disk: 1
            create_txg: 4

If you destroy one entire mirror, ZFS can't figure out what's missing:
Code:
# zpool export test
# mdconfig -d -u 3
# mdconfig -d -u 2
# zpool import
   pool: test
     id: 1748106379667507035
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://illumos.org/msg/ZFS-8000-6X
 config:

        test         UNAVAIL  missing device
          mirror-0   ONLINE
            md0      ONLINE
            md1      ONLINE

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.

I suspect it knows that there are other devices because of the vdev_children value at the top of the tree, which in this case says that there should be two top-level vdevs.
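
You can check that field directly without reading the whole label dump, for example:
Code:
# zdb -l /dev/md0 | grep vdev_children
That just filters the vdev_children line out of each label, which for this pool should still read 2 even though only one vdev's worth of disks is present.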
 