gmirror works when booting from mfsBSD but not at boot

Hi,

I have a server which is running FreeBSD 9.0-RELEASE. It has one provider gm0 which consists of ada0 and ada1.

When I boot to mfsBSD I can load gmirror and gm0 works fine.

When I reboot from the hard disk, it fails with error 19 and drops to the mountroot prompt. This error seems to be common after upgrading from FreeBSD 8.x to 9.x, which is not the case here. It was working fine, but now...

Any idea what I could do? It seems the geom_mirror.ko module can't be loaded at boot. How can I fix this?

Some output:
Code:
rescue-bsd# gpart show
=>       63  234441584  mirror/gm0  MBR  (111G)
         63  234436482           1  freebsd  [active]  (111G)
  234436545       5102              - free -  (2.5M)

=>        0  234436482  mirror/gm0s1  BSD  (111G)
          0  213954560             1  freebsd-ufs  (102G)
  213954560    1024000             2  freebsd-ufs  (500M)
  214978560   19457922             4  freebsd-swap  (9.3G)

rescue-bsd# gmirror list
Geom name: gm0
State: COMPLETE
Components: 2
Balance: round-robin
Slice: 4096
Flags: NONE
GenID: 0
SyncID: 1
ID: 3106185129
Providers:
1. Name: mirror/gm0
   Mediasize: 120034123264 (111G)
   Sectorsize: 512
   Mode: r0w0e0
Consumers:
1. Name: ada1
   Mediasize: 120034123776 (111G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 1
   Flags: (null)
   GenID: 0
   SyncID: 1
   ID: 3005574804
2. Name: ada0
   Mediasize: 120034123776 (111G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: (null)
   GenID: 0
   SyncID: 1
   ID: 3658582627

This is when I boot from mfsBSD.

/etc/fstab:
Code:
/dev/mirror/gm0s1a              /               ufs             rw      1       1
/dev/mirror/gm0s1b              /root           ufs             rw      2       2
/dev/mirror/gm0s1d              none            swap            sw      0       0

/boot/loader.conf
Code:
geom_mirror_enable="YES"
 
After booting with 9.1, press Scroll Lock and scroll back through the kernel messages looking for errors. This could be the graid(8) problem, could be more strict checking of MBR correctness, or something else.
 
Okay, so it was not due to an upgrade but one day it was working and the next it was not? What changed?
 
Ah ok, here is the error:

GEOM: ada0s1: geometry does not match label (255,63s != 16h,63s).

This remote console thing is driving me nuts ;)

UPDATE: I don't know what changed. I normally don't take care of this system. The guy said "I just mounted a partition and maybe did mount -u instead of umount"
 
That's a normal GEOM warning which can be ignored. Keep looking. If you can, boot in single user mode, make certain that the gmirror(8) module is loaded (gmirror load) and then manually check the mirror status and mount the filesystems read-only. If that works, run fsck(8).
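In case it helps, those steps might look roughly like this from the single-user prompt. This is only a sketch: the device name is taken from the fstab posted above, /mnt as a scratch mount point is my assumption, and `run` is a hypothetical helper that just prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. Prints the commands by default; set DRY_RUN=0 to really run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command, execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] || "$@"
}

single_user_checks() {
    run gmirror load                          # load geom_mirror if it is not loaded yet
    run gmirror status                        # should list mirror/gm0 and its components
    run mount -o ro /dev/mirror/gm0s1a /mnt   # read-only test mount (/mnt is an assumption)
    run umount /mnt
    run fsck -t ufs /dev/mirror/gm0s1a        # check the filesystem while it is unmounted
}

single_user_checks
```

If `gmirror status` shows nothing or the mount fails, the kernel messages printed at that moment are the interesting part.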
 
"Cannot add disk ada1 to gm0 (error=22). Device gm0 destroyed."

I ran fsck already. It just seems that the gmirror module is not loaded at boot. If I boot from mfsBSD (rescue console) gmirror is loaded and the raid is just fine.

UPDATE: When I mount /dev/ada0s1a and run kldstat, it shows that geom_mirror.ko is loaded.
 
That's after booting in single user mode and loading the module?
# gmirror load
or
# kldload geom_mirror

Is the module still present in /boot/kernel/geom_mirror.ko?
 
Right after single user it's already loaded. I did not load it manually at that moment.
 
During boot it says
"Cannot add disk ada0 to gm0 (error=22). Device gm0 destroyed."
After it says
"Cannot add disk ada1 to gm0 (error=22). Device gm0 destroyed."
My assumption that gmirror is not loaded is wrong. It is loaded but adding the drives fails. The rescue system is a 9.1 btw. Is this important here?
 
It sounds like the mirror is out of sync, or possibly the mirror metadata has been corrupted on ada1. Since you've already mounted one disk separately, the data on the two drives is already different. So at this point, I'd remove ada1 from the mirror, reboot so the system is running off the one-drive mirror, then add back ada1. (Back up first, of course.)
# gmirror remove gm0 ada1
(reboot)
# gmirror insert gm0 ada1

But looking at those messages now, you say it can't add either drive to the mirror. I have no idea what would do that. Metadata corruption on both?
 
Strangely it works when I mount it from the rescue system. Then gm0 works, I can mount it, read it, write etc.

Do you think a "rebuild" would also repair corrupted metadata?
 
I removed ada1 from the gm0 provider and rebooted, but it still says it can't add ada0 to gm0.

Any idea? Otherwise I need to destroy gm0 completely and recreate it which hopefully works...
 
A rebuild would just copy the data. The metadata would be created from scratch when the drive was added to the mirror.

"The guy" somehow changed/corrupted that system. If you could identify exactly where... it's worth looking at /etc/rc.conf to see if there have been surprise settings added that somehow prevent gmirror(8) from starting. I don't know what that would be. If it were me, I'd be running a backup from the mfsBSD-mounted mirror right now.

Maybe he ran freebsd-update(8)?
 
I checked the "potential spots" but nothing found. Everything looks normal. I will boot now without gm0 and start from scratch.
 
I destroyed it, relabeled it and inserted it again, but no success. Same error as before.
 
Yes.
# uname -a
FreeBSD HOSTNAME 9.0-RELEASE-p3 FreeBSD 9.0-RELEASE-p3 #0: Tue Jun 12 02:52:29 UTC 2012 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

I have now booted from each of the SSDs on its own (/dev/ada0 and /dev/ada1, after changing fstab) and both work.

I really don't know what's going on here.
 
I ran gmirror clear on both drives once again and will create the mirror a 2nd and last time.
 
The loader.conf only shows
geom_mirror_load="YES"

Nothing else. Should there be something more?
 
Not unless there was something else there before. But there would be a message in the kernel startup complaining about partition integrity, and running gpart(8) from mfsBSD would show it also.
 
No, nothing, but now maybe I'm a step closer. I cleared the gmirror metadata once more, rebooted from ada0, created gm0 by labeling ada0 and rebooted again. This time it worked (ada1 not yet added). Now I have added ada1 and it's rebuilding.
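For reference, the sequence described above might look roughly like this. It is a sketch, not a record of the exact commands typed: the mirror and disk names come from this thread, `run` is a hypothetical helper that only prints each command unless DRY_RUN=0 is set, and labeling a disk that is currently mounted may need additional steps that the thread doesn't record.

```shell
#!/bin/sh
# Sketch only. Prints the commands by default; set DRY_RUN=0 to really run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command, execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] || "$@"
}

# Step 1: wipe the old mirror metadata and create a one-disk mirror.
recreate_mirror() {
    run gmirror clear ada0 ada1      # erase gmirror metadata from both disks
    run gmirror label -v gm0 ada0    # new mirror gm0 with ada0 as its only component
}

# Step 2 (after rebooting onto mirror/gm0): add the second disk, watch the rebuild.
add_second_disk() {
    run gmirror insert gm0 ada1
    run gmirror status
}

# The two steps are separated by a reboot, so pick one per invocation.
case "${1:-}" in
    create) recreate_mirror ;;
    add)    add_second_disk ;;
    *)      echo "usage: $0 create|add" ;;
esac
```

Remember that fstab has to point at /dev/mirror/gm0s1a again before the reboot in between.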

When it's finished I will reboot once more. If it works then I guess I made it.

Anyway, thanks so much for your help and input!!
 
I wish we could have found a definite cause for that. The error message comes from sys/geom/mirror/g_mirror.c, line 3072 in 9-STABLE. At first glance, it can give an error if metadata is corrupt. Could there be a RAID controller in that system that was switched to RAID mode?
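One thing that can be checked from the listing earlier in the thread: gmirror keeps its metadata in the last sector of each component, which is why mirror/gm0 comes out one sector smaller than ada0 and ada1. Below is a quick sanity check of the posted Mediasize values, purely as arithmetic (the variable names are mine). On the live system, `gmirror dump ada0` would print the stored metadata itself.

```shell
#!/bin/sh
# Values copied from the "gmirror list" output posted earlier in the thread.
consumer=120034123776   # Mediasize of ada0 and ada1
provider=120034123264   # Mediasize of mirror/gm0
sector=512              # Sectorsize

# The mirror should be smaller by exactly the sector that holds the metadata.
diff=$((consumer - provider))
echo "difference: $diff bytes ($((diff / sector)) sector(s))"
# prints: difference: 512 bytes (1 sector(s))
```

The sizes in that listing are at least consistent, so if the metadata was damaged, the listing taken from mfsBSD doesn't show it.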