Solved: FreeBSD 10.1: How to fix a mirror zpool with the system on it?

Hi,
Today I noticed that one of the disks in a server is broken. The server itself has a very basic setup: i7, 2 drives, clean FreeBSD 10.1 install. bsdinstall(8) created the ZFS mirror pool and installed the system on it. Now I've seen in the logs that one of the drives has gone south and needs to be replaced. During my time with FreeBSD I have had to repair some failed RAID-Z pools, but never a broken mirror containing the system. What really puzzles me is that it only shows one drive, with no alias for the broken one. Here is the output of zpool status:

Code:
  pool: systemPool
state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
   attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
   using 'zpool clear' or replace the device with 'zpool replace'.
  see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 704M in 0h0m with 0 errors on Mon Aug  3 20:27:25 2015
config:
   NAME                        STATE     READ WRITE CKSUM
   systemPool                  ONLINE       0     0     0
     mirror-0                  ONLINE       0     0     0
       gpt/zfs0                ONLINE       0     0     0
       diskid/DISK-Z1F0SEY6p3  ONLINE       0     0     7
errors: No known data errors
The pool shows only the running drive ada0 but no hint of the missing ada1. Here is the output of camcontrol devlist:
Code:
<WDC WD3000FYYZ-01UL1B2 01.01K03>  at scbus0 target 0 lun 0 (ada0,pass0)
<ST3000DM001-1CH166 CC43>  at scbus1 target 0 lun 0 (ada1,pass1)
<AHCI SGPIO Enclosure 1.00 0001>  at scbus2 target 0 lun 0 (ses0,pass2)

Yes, I know you shouldn't mix different drive models, and to make things even more complicated, the server is hosted at a remote location...

So how can I fix the pool?

Best regards,

Mike
 
You should use code tags to show the zpool status output so it keeps the whitespace. It's hard to interpret otherwise.

By the look of it, it is showing two disks:
Code:
mirror-0                 ONLINE 0 0 0
  gpt/zfs0               ONLINE 0 0 0
  diskid/DISK-Z1F0SEY6p3 ONLINE 0 0 7

The first disk has been picked up using the GPT partition label zfs0.
The second disk has been picked up using the disk ID, and ZFS is using the third GPT partition.

Going by the checksum error count, I suspect the second disk is the one with the problem.
You could confirm by running diskinfo -v /dev/{baddisk} to see if its ID is Z1F0SEY6.
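For example, something like this (just a sketch; I'm assuming from your camcontrol output that the suspect Seagate is ada1, so adjust the device name if not):
Code:
glabel status                        # maps the gpt/ and diskid/ names onto the adaX devices
diskinfo -v /dev/ada1 | grep ident   # prints the disk ident (serial), e.g. Z1F0SEY6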

Then just offline the bad disk, replace it physically, and replace it in the pool:
Code:
zpool offline systemPool diskid/DISK-Z1F0SEY6p3
zpool replace systemPool diskid/DISK-Z1F0SEY6p3 newDiskpN
.. or if you labelled the ZFS partition on the new disk "zfs1" with gpart ..
zpool replace systemPool diskid/DISK-Z1F0SEY6p3 gpt/zfs1

Remember to partition the new disk and add bootcode.
Then make sure you give ZFS the correct partition on the new drive, and not the whole disk.
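In case it helps, here's a rough sketch of the partition/bootcode step, assuming the replacement also shows up as ada1 and you want the usual boot/swap/ZFS layout; the sizes and labels (gptboot1, swap1, zfs1) are only examples, so compare against gpart show -l ada0 on the surviving disk first:
Code:
gpart create -s gpt ada1
gpart add -a 4k -s 512k -t freebsd-boot -l gptboot1 ada1     # boot partition
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1   # ZFS-aware boot blocks
gpart add -a 4k -s 2g -t freebsd-swap -l swap1 ada1          # swap (size is just an example)
gpart add -a 4k -t freebsd-zfs -l zfs1 ada1                  # rest of the disk for ZFS
zpool replace systemPool diskid/DISK-Z1F0SEY6p3 gpt/zfs1     # resilver onto the new partition
Handing zpool replace the gpt/zfs1 label rather than the raw ada1 device keeps the new disk laid out the same way as the first one.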
 
Hi usdmatt,
Thanks for the fast reply, and sorry for my bad posting style; I promise to do better next time.

I took a look at the diskinfo output and you're right, the bad disk is Z1F0SEY6. I'll call the hosting provider tomorrow to plan the replacement of the disk and will let you know afterwards how it went.

Best regards,

Mike
 
If you have room in the case for adding a third drive, the following is a much nicer/safer way to replace a dying disk in a mirror vdev:
Code:
    <add new disk to system>
    <partition new disk, labelling partitions as needed>
# zpool attach systemPool gpt/zfs0 gpt/newDiskLabel
    <wait for the resilver to complete>
# zpool detach systemPool diskid/DISK-Z1F0SEY6p3
    <remove dead disk from system>
The beauty of doing it this way is that you go from a 2-disk mirror, to a 3-disk mirror while resilvering, back to a 2-disk mirror. You never lose redundancy in the vdev.
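Something like this, assuming the third disk shows up as ada2, is partitioned the same way as the others, and its ZFS partition is labelled zfs2 (names are only examples):
Code:
zpool attach systemPool gpt/zfs0 gpt/zfs2        # mirror-0 becomes a 3-way mirror
zpool status systemPool                          # wait for the resilver to finish
zpool detach systemPool diskid/DISK-Z1F0SEY6p3   # then drop the dying disk
Remember the bootcode on the new disk here as well, so whichever disk remains is still bootable.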
 
Hi,
Thank you for all your help, the disk has been replaced. I followed scottro's Howto, which still works fine (though I added a swap partition ;)). Thanks also for your tip, phoenix, but in this case it's not so easy to do because the server is a few hundred miles away in a datacenter.

Best regards,

Mike
 