Need Help Recovering Damaged Slice or Partition

Short version: I think I messed up the slice or partition on one of my storage drives and I'd like to fix it or at least recover the data.

Long version:

I'll preface this by saying I'm pretty much a Unix newbie, so I may have missed something obvious and, if you have any help to offer, I'll probably need you to be as explicit as possible.

I have a file server with multiple RAID 5 arrays that was running FreeBSD 6.0 for years without issue. Recently, however, I attempted to set up several different Linux distros on a spare drive. I couldn't get the drivers for one of my two RAID controllers to work under any of the various distros and versions I tried and, in the process of one of the installs, I managed to write GRUB to my FreeBSD system drive. This meant I could no longer boot the old system, so I decided I might as well install FreeBSD 7.2 on the spare drive.

The install went fine and I was pleasantly surprised to see that 7.2 supported both of my RAID controllers without any additional configuration. However, the older controller, a RocketRAID 464, had two RAID 5 arrays connected to it and I was only able to mount one of them.

Upon investigating /dev, I found that the array having the issue showed up only as ar0 while, for the working array on the same card, /dev contained ar1, ar1s1, ar1s1c and ar1s1d. This leads me to believe the slice was somehow damaged. The array was formatted with ufs2, as was everything else.

I can think of two possible causes. The first would be that something occurred during one of the various installs. The other possibility is that the array was somehow damaged when I rebuilt it: when rebooting to perform the first Linux install, I got an error message from the RR 464's BIOS saying the array I'm currently having an issue with was missing a drive. The missing drive was still detected by the card's BIOS so, after the first Linux distro I tried wouldn't support the card, I had it rebuild the array overnight. The card indicated the process was successful, but I was unable to verify this in any OS until I installed FreeBSD 7.2 and found the array wouldn't mount.

So, with the history out of the way, these are the diagnostics I've run thus far:

fdisk /dev/ar0 yields:
Code:
******* Working on device /dev/ar0 *******
parameters extracted from in-core disklabel are:
cylinders=61031 heads=255 sectors/track=63 (16065 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=61031 heads=255 sectors/track=63 (16065 blks/cyl)

fdisk: invalid fdisk partition table found
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 63, size 980462952 (478741 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 614/ head 254/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
The data for partition 1 seems correct, but the message about "invalid fdisk partition table" is worrying.
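
To back up the claim that partition 1 looks right, here is a quick sanity check of the fdisk numbers in Python. It is only a sketch: the variable names are mine, and it assumes the slice was meant to start at sector 63 and run to the end of the last full cylinder.

```python
# Values copied from the fdisk output above.
cylinders = 61031
heads = 255
sectors_per_track = 63
sector_size = 512          # "Media sector size is 512"
start = 63                 # slice 1 start sector
size = 980462952           # slice 1 size in sectors

blocks_per_cyl = heads * sectors_per_track
total_sectors = cylinders * blocks_per_cyl

# Assumption: a slice that begins at sector 63 and ends on the last
# cylinder boundary should be exactly total_sectors - start sectors long.
expected_size = total_sectors - start

# Size in binary megabytes, the unit fdisk prints as "Meg".
size_meg = size * sector_size // (1024 * 1024)

# The CHS cylinder field is only 10 bits wide, so a large end cylinder
# would wrap modulo 1024 if it simply overflowed.
end_cyl_wrapped = (cylinders - 1) % 1024

print(blocks_per_cyl)          # 16065
print(expected_size == size)   # True
print(size_meg)                # 478741
print(end_cyl_wrapped)         # 614
```

All four values agree with what fdisk printed, so the entry itself appears self-consistent even though fdisk rejects the table as a whole.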

Based on this thread, I also tried scan_ffs /dev/ar0 which gave this output:
Code:
ufs2 at 63 size 245115738 mount /storage time Thu Mar 16 22:14:48 2006

ufs1 at 907144059 size 2880 mount /mnt time Thu Nov  3 03:49:18 2005

ufs1 at 907146947 size 2880 mount /mnt time Thu Nov  3 03:49:19 2005

ufs1 at 907149827 size 2880 mount /mnt time Thu Nov  3 03:49:17 2005

scan_ffs: read: Input/output error

Right now I'm at a loss as to how to proceed, and any help or suggestions you can provide would be greatly appreciated. Some folks in the thread linked above recommended testdisk, which sounds applicable, but one poster mentioned it may have issues with ufs, which makes me hesitant to try it.
 
According to scan_ffs, /storage was only 120 gig. Does that sound right? What about those possible /mnt file systems?

Well, it looks like you just need to create a new partition table with fdisk. That shouldn't be destructive to the data on the drive, and when you get the boundaries right you might find your disklabel still intact.
 
aragon said:
According to scan_ffs, /storage was only 120 gig. Does that sound right? What about those possible /mnt file systems?

No, /storage should be close to 500 gigs, which is what fdisk gives. I have no idea what the file systems listed by scan_ffs as /mnt would be as the disk should just be one partition filling all available space.

As far as creating a new partition table goes, I should use fdisk -f, correct? Where then should I get the info for the config file from? Should I base it off the info given by fdisk without any arguments? The data it gives for partition 1 appears correct as it at least gets the size right.
 
For comparison, running scan_ffs /dev/ar1 (the working 500 GB array on the same card) yields:
Code:
ufs2 at 63 size 244196016 mount /storage/software time Fri Feb 16 20:34:21 2007

ufs1 at 270045659 size 2880 mount /mnt time Thu Nov  3 03:49:18 2005

ufs1 at 270048547 size 2880 mount /mnt time Thu Nov  3 03:49:19 2005

ufs1 at 270051427 size 2880 mount /mnt time Thu Nov  3 03:49:17 2005

ufs1 at 270365123 size 2880 mount /mnt time Thu Nov  3 03:49:17 2005

scan_ffs: read: Input/output error

I know this array is working and is 500 GB in size and yet the size field for the first partition is similar to that on the broken array. I'm not sure what the unit is for size in scan_ffs, but it doesn't look like it indicates a problem in ar0.
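
One possible reading of the size column, offered as a guess rather than something confirmed from the scan_ffs source: if the size is counted in 2048-byte file system fragments (an assumed fragment size, i.e. four 512-byte sectors each), then the broken array's first entry matches the slice size fdisk reported exactly. A quick check in Python, with names of my own choosing:

```python
SECTOR_SIZE = 512
ASSUMED_FRAG_SIZE = 2048   # assumption: bytes per fragment, not shown in the output

fdisk_slice_sectors = 980462952   # slice 1 size reported by fdisk for ar0
scan_size_ar0 = 245115738         # "size" of the ufs2 entry on the broken array
scan_size_ar1 = 244196016         # "size" of the ufs2 entry on the working array

def frags_to_sectors(frags, frag_size=ASSUMED_FRAG_SIZE):
    """Convert a fragment count to 512-byte sectors."""
    return frags * frag_size // SECTOR_SIZE

def frags_to_gb(frags, frag_size=ASSUMED_FRAG_SIZE):
    """Convert a fragment count to decimal gigabytes."""
    return frags * frag_size / 1e9

print(frags_to_sectors(scan_size_ar0) == fdisk_slice_sectors)  # True
print(round(frags_to_gb(scan_size_ar0)))  # 502
print(round(frags_to_gb(scan_size_ar1)))  # 500
```

Reading the same numbers as 512-byte sectors instead gives only a quarter of that, which would explain the much smaller estimate earlier in the thread.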
 
I don't think the faulty array has a valid partition table. The fact that there is no /dev/ar0s1, etc. and that fdisk emits an error kinda confirms this.

As far as creating a new partition table, easiest is probably to just create a single slice spanning the whole disk: fdisk -I. Add -B to install boot code.
 
Thank you very much. fdisk -I yielded a usable slice and I can now mount the array. Unfortunately, the problem seems worse than I had hoped. This first became apparent when I tried to list the contents of some of the directories in /storage and got a "Bad file descriptor" error. Other directories worked better, but when I tried to copy some files to another PC, only some of them would copy; others couldn't be read.

Running fsck on the array gives a huge number of errors and it ultimately terminates with the message "fsck_ufs: bad inode number 21926912 to nextinode" after 15 min or so. Most of the errors it gives me are of these types:
Code:
UNKNOWN FILE TYPE I=21926911
UNEXPECTED SOFT UPDATE INCONSISTENCY

PARTIALLY ALLOCATED INODE I=21926909
UNEXPECTED SOFT UPDATE INCONSISTENCY

4616413357413495652 BAD I=21809164
UNEXPECTED SOFT UPDATE INCONSISTENCY

EXCESSIVE BAD BLKS I=21809164

INCORRECT BLOCK COUNT I=21809164 (40416 should be 2784)

PARTIALLY TRUNCATED INODE I=21691492

EXCESSIVE DUP BLKS I=21691491
Most of these errors are repeated numerous times.

I found a thread here, posted by someone who appears to have been in nearly my exact situation, which suggests that the file system is likely a lost cause and that the best I can hope for is to recover individual files. I'd heard mention of magicrescue before, but I'd appreciate any recommendations. If there's something else I'm missing or could try, please let me know.
 
So, I investigated magicrescue and gave it a shot using the flac recipe as there should only be a few flac files on the array. Unfortunately, magicrescue -d ~/rescue -r flac /dev/ar0s1 yields:
Code:
Read error on /dev/ar0s1 at 102400 bytes: Invalid argument

The result is exactly the same whether the array is mounted or not, or even if I use ar0 instead of ar0s1. I've seen others post the same error, complete with the same byte value, but I have had no luck figuring out what it means.

Magicrescue is less than ideal in any case as I'd really like to preserve as much of the file structure as possible (identifying 300 some GB of files without directory names would be more than a little painful). Does anyone know of a utility to suggest? Even one that would allow me to attempt to recover files from a specific directory on the array would be great as, when it is mounted, I can browse most of the directories but some of the files within them appear corrupt. Can anyone suggest a way to copy all the files and directories that are readable while automatically skipping those that aren't?
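
For the last question, a minimal sketch of the kind of copy I have in mind, written in Python. It is untested against a damaged file system; the function name copy_readable is my own, and it assumes that unreadable files and directories surface as ordinary OSError exceptions.

```python
import os
import shutil

def copy_readable(src_root, dst_root):
    """Copy everything readable under src_root into dst_root.

    Returns a list of (path, error) pairs for whatever was skipped.
    """
    skipped = []

    def on_walk_error(err):
        # Called by os.walk when a directory cannot be listed.
        skipped.append((err.filename, err))

    for dirpath, dirnames, filenames in os.walk(src_root, onerror=on_walk_error):
        # Recreate the same relative directory under the destination.
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            try:
                shutil.copy2(src, dst)
            except OSError as err:
                # Unreadable file: record it, drop any partial copy, carry on.
                skipped.append((src, err))
                if os.path.lexists(dst):
                    os.remove(dst)
    return skipped
```

Calling it as copy_readable("/storage", "/mnt/rescue"), where /mnt/rescue is a hypothetical destination on a healthy disk, would keep the directory layout and hand back a list of the paths it had to skip. Deleting partial copies is a design choice; keeping them might be preferable if a truncated file is still useful.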
 