The machine worked correctly with two disks until I added a third one. The new disk appears as ada0, and the old ones have been renamed (from ada0 and ada1 to ada1 and ada2; the necessary adjustments in /etc/fstab etc. have been made).
The new ada0 disk is not to be used; it was added only for a burn-in test, and it has to be on a lower SATA port because it can do SATA 600.
But now the following problem appears: when booting singleuser, everything is fine and as expected:
Code:
root@# ls /dev/ada*
/dev/ada0 /dev/ada1s1a /dev/ada1s1e /dev/ada1s2 /dev/ada2s1a /dev/ada2s1e /dev/ada2s2
/dev/ada1 /dev/ada1s1b /dev/ada1s1f /dev/ada2 /dev/ada2s1b /dev/ada2s1f /dev/ada2s2a
/dev/ada1s1 /dev/ada1s1d /dev/ada1s1g /dev/ada2s1 /dev/ada2s1d /dev/ada2s1g
root@# gvinum ld
2 drives:
D a10 State: up /dev/ada1s1f A: 38068/59592 MB (63%)
D a11 State: up /dev/ada2s1f A: 38068/59592 MB (63%)
But after going multiuser, the slices and partitions of ada1 are lost (only the bare /dev/ada1 node remains):
Code:
root@# ls /dev/ada*
/dev/ada0 /dev/ada2 /dev/ada2s1a /dev/ada2s1d /dev/ada2s1f /dev/ada2s2
/dev/ada1 /dev/ada2s1 /dev/ada2s1b /dev/ada2s1e /dev/ada2s1g
root@# gvinum ld
2 drives:
D a11 State: up /dev/ada2s1f A: 38068/59592 MB (63%)
D a10 State: down /dev/??? A: 0/0 MB (0%)
I have not yet found a way to get that disk back in multiuser mode, so gvinum is currently running on broken mirrors.
The ada1 disk itself works and is accessible with dd.
The trouble is triggered by /etc/rc.d/zvol, or by any zfs command. During the first zfs command, a number of errors from "g_access" appear, with error code 6 (presumably ENXIO).
ZFS itself still finds its devices, but now uses a different path for one of them:
Code:
NAME                                  STATE   READ WRITE CKSUM
build                                 ONLINE     0     0     0
  mirror-0                            ONLINE     0     0     0
    diskid/DISK-WD-WCASY7821919s1g    ONLINE     0     0     0
    ada2s1g                           ONLINE     0     0     0
Investigating further, I found odd things in the output of "gpart show".
In singleuser mode it shows:
Code:
root@# gpart show
=> 63 976773105 ada1 MBR (466G)
63 242769933 1 freebsd (116G)
242769996 734003172 2 !191 (350G)
=> 0 242769933 ada1s1 BSD (116G)
0 16 - free - (8.0K)
16 1200000 1 freebsd-ufs (586M)
1200016 2000000 4 freebsd-ufs (977M)
3200016 200000 5 freebsd-ufs (98M)
3400016 122045271 6 freebsd-vinum (58G)
125445287 10485760 2 freebsd-swap (5.0G)
135931047 106838886 7 !10 (51G)
=> 63 976773105 diskid/DISK-WD-WCASY7821919 MBR (466G)
63 242769933 1 freebsd (116G)
242769996 734003172 2 !191 (350G)
=> 63 976773105 ada2 MBR (466G)
63 242769933 1 freebsd [active] (116G)
242769996 734003172 2 !191 (350G)
=> 0 242769933 diskid/DISK-WD-WCASY7821919s1 BSD (116G)
0 16 - free - (8.0K)
16 1200000 1 freebsd-ufs (586M)
1200016 2000000 4 freebsd-ufs (977M)
3200016 200000 5 freebsd-ufs (98M)
3400016 122045271 6 freebsd-vinum (58G)
125445287 10485760 2 freebsd-swap (5.0G)
135931047 106838886 7 !10 (51G)
=> 0 242769933 ada2s1 BSD (116G)
0 16 - free - (8.0K)
16 1200000 1 freebsd-ufs (586M)
1200016 2000000 4 freebsd-ufs (977M)
3200016 200000 5 freebsd-ufs (98M)
3400016 122045271 6 freebsd-vinum (58G)
125445287 10485760 2 freebsd-swap (5.0G)
135931047 106838886 7 !10 (51G)
=> 0 734003172 ada2s2 BSD (350G)
0 16 - free - (8.0K)
16 734003156 1 !0 (350G)
After going multiuser (or after any zfs command) this changes to:
Code:
root@# gpart show
=> 63 976773105 diskid/DISK-WD-WCASY7821919 MBR (466G)
63 242769933 1 freebsd (116G)
242769996 734003172 2 !191 (350G)
=> 63 976773105 ada2 MBR (466G)
63 242769933 1 freebsd [active] (116G)
242769996 734003172 2 !191 (350G)
=> 0 242769933 diskid/DISK-WD-WCASY7821919s1 BSD (116G)
0 16 - free - (8.0K)
16 1200000 1 freebsd-ufs (586M)
1200016 2000000 4 freebsd-ufs (977M)
3200016 200000 5 freebsd-ufs (98M)
3400016 122045271 6 freebsd-vinum (58G)
125445287 10485760 2 freebsd-swap (5.0G)
135931047 106838886 7 !10 (51G)
=> 0 242769933 ada2s1 BSD (116G)
0 16 - free - (8.0K)
16 1200000 1 freebsd-ufs (586M)
1200016 2000000 4 freebsd-ufs (977M)
3200016 200000 5 freebsd-ufs (98M)
3400016 122045271 6 freebsd-vinum (58G)
125445287 10485760 2 freebsd-swap (5.0G)
135931047 106838886 7 !10 (51G)
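To make the difference between the two listings easier to see, here is a minimal sketch that pulls the provider names out of "gpart show" output and reports which ones have vanished. The function names providers and missing are made up for illustration, and the two variables contain only the "=>" header lines copied from the listings above; on the real machine one would feed in the saved output of "gpart show" instead.

```shell
#!/bin/sh
# Header lines of `gpart show` start with "=>" and carry the provider
# name in the fourth field; print those names, sorted.
providers() {
    awk '$1 == "=>" { print $4 }' | LC_ALL=C sort
}

# Print every provider that occurs in listing $1 but not in listing $2.
# (Hypothetical helper, not part of any FreeBSD tool.)
missing() {
    later=$(printf '%s\n' "$2" | providers)
    printf '%s\n' "$1" | providers | while read -r p; do
        printf '%s\n' "$later" | grep -qxF -- "$p" || echo "$p"
    done
}

# Header lines copied from the singleuser listing.
singleuser='=> 63 976773105 ada1 MBR (466G)
=> 0 242769933 ada1s1 BSD (116G)
=> 63 976773105 diskid/DISK-WD-WCASY7821919 MBR (466G)
=> 63 976773105 ada2 MBR (466G)
=> 0 242769933 diskid/DISK-WD-WCASY7821919s1 BSD (116G)
=> 0 242769933 ada2s1 BSD (116G)
=> 0 734003172 ada2s2 BSD (350G)'

# Header lines copied from the multiuser listing.
multiuser='=> 63 976773105 diskid/DISK-WD-WCASY7821919 MBR (466G)
=> 63 976773105 ada2 MBR (466G)
=> 0 242769933 diskid/DISK-WD-WCASY7821919s1 BSD (116G)
=> 0 242769933 ada2s1 BSD (116G)'

missing "$singleuser" "$multiuser"
```

For the two listings above this reports ada1, ada1s1 and ada2s2: besides the ada1 entries, the ada2s2 label is also gone after going multiuser, while both diskid entries remain.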
It seems that disks can appear by name, by diskid, or by both, but I do not currently understand what each of these means.
Probably the most effective approach would be to zero out the disks completely and rebuild the partitioning scheme from scratch. But I do not like that approach; I would prefer to understand what is wrong and to fix precisely the offending bytes.
The system is currently running Release 11.1, but was originally installed with Release 2.1 and has been upgraded piecewise ever since.
Is there some kind of paper or documentation that would help in understanding the diskid scheme and how it is supposed to work?