ZFS pool status is unknown

Hi,
We have a TrueNAS (12) storage system with 4 pools. Two of these pools (TPM_STORAGE_01, TPM_STORAGE_02) use RAIDZ2 vdevs for data, plus mirrored drives for cache and log. These pools were operating without issues, but after a routine system restart the status of the 2 pools using log drives became unknown. The other two pools continued to function normally.

We attempted to import the pools that showed an unknown status due to the log drive issue, but were unsuccessful. Trying to import each pool separately also failed, consistently returning an error stating the pool was not available. We were also unable to execute any tasks at the pool level due to the same error. SMART checks on the 4 log drives (2 for each affected pool) failed with errors as well, preventing further diagnosis.

We suspect that all 4 log drives were corrupted simultaneously, though we're unsure what caused this. We urgently need assistance restoring the pools without losing any data. Below are the results of the import attempts; the 'zpool status' command shows only the healthy pools, not the ones marked as unknown.

Code:
root@SRV3:~ # zpool import
   pool: TPM_STORAGE_02
     id: 9038427024247942209
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://illumos.org/msg/ZFS-8000-6X
 config:

        TPM_STORAGE_02                                  UNAVAIL  missing device
          raidz2-0                                      ONLINE
            gptid/459e3901-60bf-11e8-afbd-0cc47a1e183a  ONLINE
            gptid/46816177-60bf-11e8-afbd-0cc47a1e183a  ONLINE
            gptid/47599791-60bf-11e8-afbd-0cc47a1e183a  ONLINE
            gptid/4839f3eb-60bf-11e8-afbd-0cc47a1e183a  ONLINE
            gptid/4913cb3f-60bf-11e8-afbd-0cc47a1e183a  ONLINE
            gptid/4cb6e973-60bf-11e8-afbd-0cc47a1e183a  ONLINE
          raidz2-1                                      ONLINE
            gptid/466c7202-b8ce-11ed-9cd5-0cc47a1e183a  ONLINE
            gptid/4ee15f58-60bf-11e8-afbd-0cc47a1e183a  ONLINE
            gptid/4fc29728-60bf-11e8-afbd-0cc47a1e183a  ONLINE
            gptid/50a1a8ce-60bf-11e8-afbd-0cc47a1e183a  ONLINE
            gptid/5188ef3a-60bf-11e8-afbd-0cc47a1e183a  ONLINE
            gptid/526ebd59-60bf-11e8-afbd-0cc47a1e183a  ONLINE
          raidz2-3                                      ONLINE
            gptid/3c0ab4d8-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
            gptid/3ce59879-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
            gptid/3dbdf5d1-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
            gptid/3e9b5caf-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
            gptid/3f7351c2-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
            gptid/40513800-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
          raidz2-4                                      ONLINE
            gptid/529ceb90-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
            gptid/537df4e5-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
            gptid/545609eb-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
            gptid/553c5094-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
            gptid/563258fb-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
            gptid/5724479d-01ea-11e9-a2e3-0cc47a1e183a  ONLINE
          raidz2-5                                      ONLINE
            da64p2                                      ONLINE
            da63p2                                      ONLINE
            da61p2                                      ONLINE
            da62p2                                      ONLINE
            da55p2                                      ONLINE
            da56p2                                      ONLINE
          raidz2-6                                      ONLINE
            gptid/36673d6c-de51-11eb-bee9-0cc47a1e183a  ONLINE
            da77p2                                      ONLINE
            gptid/37ee0a14-de51-11eb-bee9-0cc47a1e183a  ONLINE
            gptid/3905a0ef-de51-11eb-bee9-0cc47a1e183a  ONLINE
            gptid/3923492c-de51-11eb-bee9-0cc47a1e183a  ONLINE
            gptid/393a13c7-de51-11eb-bee9-0cc47a1e183a  ONLINE
        logs
          mirror-2                                      UNAVAIL  insufficient replicas
            8072931786312405605                         UNAVAIL  cannot open
            3539032398906333351                         UNAVAIL  cannot open

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.

   pool: TPM_STORAGE_01
     id: 11666779640591141191
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://illumos.org/msg/ZFS-8000-6X
 config:

        TPM_STORAGE_01                                  UNAVAIL  missing device
          raidz2-1                                      ONLINE
            gptid/de38dbe6-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/df41449e-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/e044613e-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/e169721f-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/e274d902-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/e36e4f1d-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/e47aab42-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/e579457d-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/e67ec5c2-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/e783d559-9e58-11e5-a7af-0cc47a1e183a  ONLINE
          raidz2-2                                      ONLINE
            gptid/e8ad867f-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/e9b8f9ae-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/eab65a7a-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/ebcc997d-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/ecd6fc0f-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/ede678dd-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/eee1ed71-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/efdc04a8-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/f0d92f56-9e58-11e5-a7af-0cc47a1e183a  ONLINE
            gptid/f1e707a6-9e58-11e5-a7af-0cc47a1e183a  ONLINE
        logs
          mirror-0                                      UNAVAIL  insufficient replicas
            14766580126774858799                        UNAVAIL  cannot open
            9531077834738806110                         UNAVAIL  cannot open
 
This is a question about TrueNAS, and you should ask it there. But since it (as far as I can see) concerns only ZFS, and a catastrophic failure at that, we all might benefit from solving it here.
 
You did not say how you tried to import the pools. According to the documentation, to import a pool with a log mirror, at least one of its replicas must be available.

zpool-import(8)

-m
Allows a pool to import when there is a missing log device. Recent transactions can be lost because the log device will be discarded.

Did you try to import the pool with this parameter? Keep in mind the warning it gives you; on the other hand, it might be advisable to first import the pool read-only.
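As a sketch (pool names taken from your output above; run at your own risk, and note that -m discards any transactions still in the lost log):

```shell
# First try a read-only import, ignoring the missing log devices;
# read-only means nothing is written to the pool while you assess it.
zpool import -m -o readonly=on TPM_STORAGE_01

# If the datasets look intact, export and re-import read-write.
zpool export TPM_STORAGE_01
zpool import -m TPM_STORAGE_01
```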

Thanks.
 
Smart checks on the 4 drives (2 for each affected pool) also failed with errors, preventing us from diagnosing further.
What is the error? Just telling us "with error" doesn't help us debug.

But the much bigger question is this: you can see how many disk drives are attached, for example by doing "ls" on the /dev directory, although it might be clearer to use /dev/gptid, since there the entries will match the output you see from "zpool status". Looking there: are the log drives actually connected to the server and accessible? If yes, what happens if you try to read them (with a simple tool like dd)? What happens if you inspect their partition tables with gpart, or use a tool like "file" to identify what the content of the disks is? Did you use partition tables and human-readable partition names on the disk drives, so you can identify them? Did you write down the hardware serial numbers of the disks (readable with camcontrol), and can you match those against what you see, what you expect to see, and the current output of zpool status?
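The checks above might look like this on FreeBSD/TrueNAS (da90 is a placeholder name for a suspect log drive; substitute your own device):

```shell
# List attached disks and their GPT labels; compare against zpool status
ls /dev/da* /dev/gptid

# Enumerate the disks the kernel sees, with model and serial numbers
camcontrol devlist

# Inspect the suspect drive's partition table
gpart show da90

# Try a raw read from the device to confirm it responds at all
dd if=/dev/da90 of=/dev/null bs=1m count=10
```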

In a nutshell: Disks don't just vanish. It is very unlikely that 4 drives "get corrupted" simultaneously. I suspect that some sort of error condition is making the OS (or BIOS or hardware or HBA or ... anything else in the stack) not be able to communicate with the disks at all. Your job is to find out why the disks don't show up.

Joke: If it was me, I would go all paranoid, and go open the server and check that the log drives are physically present. If they aren't, you can start looking for fingerprints, and call the police, since someone stole them. End joke.
 
pool: TPM_STORAGE_02
...
state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing devices and try again.
see: http://illumos.org/msg/ZFS-8000-6X
If the message matches the situation, then it doesn't look good for importing the pools without the log devices:

http://illumos.org/msg/ZFS-8000-6X
Code:
illumos Fault Management Architecture (FMA) Message Registry
Message: ZFS-8000-6X

title           Missing top level device
description     One or more top level devices are missing.
severity        critical
...

action          ...
                The pool cannot be imported until the unknown missing device is attached to the system.
                If the device has been made available in an alternate location, use the '-d' option to 'zpool
                import' to search for devices in a different directory. If the missing device is unavailable,
                then the pool cannot be imported.

In case the log devices are lost, hopefully there are backups of those pools.
 
If all the log devices are connected to the same adapter card, that card may have given up the ghost. You could connect one of the log devices to a data drive's port; the log would then be available and the pool should come up degraded.

ZFS does not care about the IO channel as long as it can see the drive somewhere.
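If a log drive does come back on another port, a plain import scan should find it again by its GPT id, regardless of which controller it now sits behind (a sketch, using the pool name from this thread):

```shell
# Re-scan importable pools; the log mirror should now list its surviving device
zpool import

# Import the pool, then check its health; with only one log replica
# reattached, the pool is expected to come up DEGRADED, not UNAVAIL
zpool import TPM_STORAGE_02
zpool status TPM_STORAGE_02
```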
 