Hi Guys,
I'm running a FreeBSD 8.0 system with a mirrored "OS" drive and three other hard drives in a RAID3 array via GEOM. One of the drives has gone bad, and I'm waiting on a delivery of disks (only a few hours out) so I can swap the disk out and rebuild the array.
As this is the first time I've had to do this, I'm simply looking for confirmation and/or advice so that I don't mess the array up when I swap the disks.
First, however, some details. The array consists of 3 x 500 GB disks: ad8, ad14 and ad16.
dmesg kicks out this set of messages:
Code:
ad8: FAILURE - READ_DMA48 status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=965661086
GEOM_RAID3: Request failed (error=5). ad8[READ(offset=494418476032, length=1024)]ad8:
GEOM_RAID3FAILURE - READ_DMA48 status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED>: LBA=931254672Device dataarray: provider ad8 disconnected.
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=0
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=0
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=1
ad8: FAILURE - READ_DMA48 status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=976773167
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=128
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=16
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=0
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=512
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=128
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=16
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=0
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=512
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=64
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=0
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=2
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=16
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=128
ad8: FAILURE - READ_DMA status=71<READY,DMA_READY,DSC,ERROR> error=4<ABORTED> LBA=0
Based on those messages, ad8 is the failed disk.
A quick look at the status of the array confirms that it is degraded:
Code:
test# graid3 status
Name             Status    Components
raid3/dataarray  DEGRADED  ad14
                           ad16
Listing some more details:
Code:
test# graid3 list
Geom name: dataarray
State: DEGRADED
Components: 3
Flags: NONE
GenID: 1
SyncID: 1
ID: 1006246370
Zone64kFailed: 16281
Zone64kRequested: 183522
Zone16kFailed: 269778
Zone16kRequested: 1892123
Zone4kFailed: 25205
Zone4kRequested: 576030
Providers:
1. Name: raid3/dataarray
Mediasize: 1000215723008 (932G)
Sectorsize: 1024
Mode: r1w1e1
Consumers:
1. Name: ad14
Mediasize: 500107862016 (466G)
Sectorsize: 512
Mode: r1w1e1
State: ACTIVE
Flags: DIRTY
GenID: 1
SyncID: 1
Number: 1
Type: DATA
2. Name: ad16
Mediasize: 500107862016 (466G)
Sectorsize: 512
Mode: r1w1e1
State: ACTIVE
Flags: DIRTY
GenID: 1
SyncID: 1
Number: 2
Type: PARITY
This confirms that ad8 is the failed disk.
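For what it's worth, the listing above reports "Components: 3" but shows "Number:" fields only for 1 and 2. Here is a minimal sketch of checking that mechanically. It assumes components are numbered from 0 up to one less than the "Components:" value, which is my reading of the output rather than something I've confirmed in the docs. The function name missing_components is made up for illustration.

```shell
#!/bin/sh
# Print the component numbers that are absent from `graid3 list` output.
# Reads the listing on stdin. ASSUMPTION: components are numbered
# 0..N-1, where N is the value on the "Components:" line.
missing_components() {
    awk '
        $1 == "Components:" { total = $2 }
        $1 == "Number:"     { seen[$2] = 1 }
        END {
            for (i = 0; i < total; i++)
                if (!(i in seen))
                    print i
        }
    '
}
```

On the live system this would be run as "graid3 list dataarray | missing_components". If the numbering assumption holds, it prints the slot number of the disconnected disk.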
Once the drives get here I will be swapping out the disk on ata4, which is where ad8 maps hardware-wise. As I've never done an array disk swap and rebuild, I wanted to confirm the following steps and/or get corrections/advice in advance.
- unmount the array, and hot-swap the disk.
- execute "graid3 remove -n 0 dataarray"
based on ad8 mapping to number 0 in the array (is there any way I can confirm that beforehand? It's possibly number 3, but "graid3 list" only shows numbers for the two remaining components).
- execute "graid3 insert -n 0 dataarray ad8" (assuming the new disk is mapped to ad8 again)
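To make the plan above concrete, here it is as a small script sketch. It only covers the software side (the physical swap happens by hand), and it defaults to a dry run that just prints the commands. The mount point /data and the function name swap_component are placeholders, not my real setup. I'm also not sure the "remove" step is needed, or will even succeed, for a component that is already disconnected, so the chain stops at the first command that fails.

```shell
#!/bin/sh
# Sketch of the planned swap sequence. RUN=echo (the default here) only
# prints each command; set RUN= (empty) to actually execute them.
RUN="${RUN-echo}"

# swap_component ARRAY NUMBER DISK MOUNTPOINT
# MOUNTPOINT is a placeholder; substitute wherever the array is mounted.
swap_component() {
    array="$1"; number="$2"; disk="$3"; mountpoint="$4"
    $RUN umount "$mountpoint" &&
    $RUN graid3 remove -n "$number" "$array" &&
    $RUN graid3 insert -n "$number" "$array" "$disk" &&
    $RUN graid3 status "$array"
}
```

A dry run would be "swap_component dataarray 0 ad8 /data", which prints the four commands in order without touching the array.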
Do I then need to execute rebuild before remounting, or does the insert automatically do that for me?
At this point is it safe to remount the array, or should I leave it unmounted while the array rebuilds?
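If the answer turns out to be "leave it unmounted until the rebuild finishes", something like the following could poll for that. It is a rough sketch that assumes "graid3 status" shows COMPLETE in the Status column once the array is healthy again. INTERVAL and wait_for_complete are names I made up.

```shell
#!/bin/sh
# Poll `graid3 status` until the array reports COMPLETE.
# INTERVAL (seconds between polls) is a made-up knob for this sketch.
INTERVAL="${INTERVAL:-60}"

# wait_for_complete ARRAY
# ASSUMPTION: a healthy array shows "COMPLETE" in the status output.
wait_for_complete() {
    array="$1"
    until graid3 status "$array" | grep -q 'COMPLETE'; do
        sleep "$INTERVAL"
    done
    echo "$array is COMPLETE"
}
```

Running "wait_for_complete dataarray" followed by a mount command would remount only once the array is no longer degraded.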
Do I need to "pre-label" the new drive before inserting it into the array?
Should I load/unload the array at any point? Thus far I've not found a crisp, clean how-to or doc online that goes through this in detail (it could be my search terms: "geom raid3 freebsd rebuild array" gave me clues, but nothing concise).
I apologize in advance if this is not the right forum, but if anyone can clear up my "list of operations", I would be very grateful.