dump causes seg fault, fsck reports many errors. Maybe my USB drive is going bad

I am running FreeBSD 9-STABLE in my home NAS using a pair of 8 GB USB flash drives in a gmirror array (gm0). Only the root partition is running from the flash drives anymore as the other system partitions are mounted from ZFS.

I was about to update the system, so I wanted to back up the root partition before starting. I use dump to back up the partition to a file. When I try to do that, this is the response:
Code:
$ sudo dump -C16 -b64 -0uanL -h0 -f /storage/dumps/2013-07-08/root.dump /
  DUMP: Date of this level 0 dump: Mon Jul  8 23:06:51 2013
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping snapshot of /dev/mirror/gm0s1a (/) to /storage/dumps/2013-07-08/root.dump
  DUMP: mapping (Pass I) [regular files]
  DUMP: Cache 16 MB, blocksize = 65536
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 548458 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: SIGSEGV: ABORTING!
  DUMP: SIGSEGV: ABORTING!
  DUMP:   DUMP: SIGSEGV: ABORTING!
SIGSEGV: ABORTING!
Segmentation fault: 11
  DUMP: SIGSEGV: ABORTING!

According to some other threads out there, dump sometimes throws a seg fault when the filesystem is corrupt. So I ran fsck on the root partition:
Code:
$ sudo fsck /
** /dev/mirror/gm0s1a (NO WRITE)
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
PARTIALLY TRUNCATED INODE I=65851
SALVAGE? no

1236264795632723608 BAD I=65851
UNEXPECTED SOFT UPDATE INCONSISTENCY

360287970189788832 BAD I=65851
UNEXPECTED SOFT UPDATE INCONSISTENCY

PARTIALLY TRUNCATED INODE I=65865
SALVAGE? no

PARTIALLY TRUNCATED INODE I=65973
SALVAGE? no

4650529635141047456 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

1166749237749415116 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

4510269711728104 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

146437369587973241 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

1162069441433969920 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

725097209535798568 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

38301092416731416 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

18349182581852 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

180291319938411812 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

-9205076122431751160 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

1308317690573782384 BAD I=65973
UNEXPECTED SOFT UPDATE INCONSISTENCY

EXCESSIVE BAD BLKS I=65973
CONTINUE? [yn] y

INCORRECT BLOCK COUNT I=65973 (2048 should be 1472)
CORRECT? no

PARTIALLY TRUNCATED INODE I=65974
SALVAGE? no

PARTIALLY TRUNCATED INODE I=65976
SALVAGE? no

140737488518864 BAD I=65976
UNEXPECTED SOFT UPDATE INCONSISTENCY

140771848779992 BAD I=65976
UNEXPECTED SOFT UPDATE INCONSISTENCY

17180032736 BAD I=65976
UNEXPECTED SOFT UPDATE INCONSISTENCY

281509336608488 BAD I=65976
UNEXPECTED SOFT UPDATE INCONSISTENCY

2252968044953328 BAD I=65976
UNEXPECTED SOFT UPDATE INCONSISTENCY

2260730 BAD I=65976
UNEXPECTED SOFT UPDATE INCONSISTENCY

288230444871351552 BAD I=65976
UNEXPECTED SOFT UPDATE INCONSISTENCY

4755803405526527816 BAD I=65976
UNEXPECTED SOFT UPDATE INCONSISTENCY

49539596035440400 BAD I=65976
UNEXPECTED SOFT UPDATE INCONSISTENCY

PARTIALLY TRUNCATED INODE I=65977
SALVAGE? no

5333537402025694976 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

324371325504351618 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

2018880919725301664 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

3458840380744691642 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

14878907985889 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

1297036732746856297 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

5791629155191908792 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

86839842023046104 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

522606672775106376 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

-9205335570266261048 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

-9184491092512703024 BAD I=65977
UNEXPECTED SOFT UPDATE INCONSISTENCY

EXCESSIVE BAD BLKS I=65977
CONTINUE? [yn] n

So I rebooted into single-user mode and re-ran fsck -y / in an attempt to repair the errors. I can't copy and paste that text, but the same errors appear and fsck works through it, at the end marking the filesystem as clean. But after rebooting and running fsck again, the same errors recurred.

I have tried to find out what those fsck errors mean, but my searches have come up short. I am at a bit of a loss. My only guess is that perhaps one or both of the USB flash drives is failing. But I would think that would cause the gmirror to degrade, which it hasn't. Things seem to be working just fine otherwise, too.

So is it possible to test each individual flash drive for errors? Or any other ideas? Thanks.
 
In an attempt to figure out if only one drive was the problem, I loaded a live USB version of GhostBSD and ran fsck on each of the USB flash drives. As it turns out, one (a Transcend model) was throwing many errors repeatedly while the other (a SanDisk model) seemed to be fixed right away.
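
For anyone wanting to repeat that per-drive check, it could be scripted roughly like this. It is only a sketch under assumptions: the disk names `da0` and `da1` are placeholders, the `s1a` suffix is taken from the `gm0s1a` name in the fsck output above, and it is meant to be run from a live system where the mirror is not in use. A non-zero exit status is treated here as a hint to look closer, not as a verdict, so read the actual output as well.

```shell
#!/bin/sh
# Sketch: check one mirror component at a time from a live system.
# Assumption: disk names such as da0/da1 are placeholders for the real devices.

check_drive() {
    disk=$1
    part="/dev/${disk}s1a"   # root partition, matching gm0s1a in the thread

    echo "== ${disk}: sequential read test"
    # Read the whole device and discard the data; I/O errors make dd fail.
    dd if="/dev/${disk}" of=/dev/null bs=1m || echo "${disk}: read test failed"

    echo "== ${disk}: fsck without writing"
    # -n answers "no" to every repair prompt, so nothing is modified.
    fsck_ffs -n "${part}" || echo "${disk}: fsck reported problems"
}

# Example (run as root):
#   check_drive da0
#   check_drive da1
```

Checking each drive separately matters because the mirror itself will not say which component holds the bad data.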

Since I had an extra 8 GB flash drive lying around, I replaced the faulty one with it. But I was unable to insert it into the gmirror array because it was ever-so-slightly smaller.

It seems like I might have made an error when I set up the array. I mirrored the entire drives instead of individual partitions. I'm not particularly worried about rebuild times, but it looks like size mismatch is another reason to mirror partitions.

So I suppose I might redo the mirror setup. My plan would be to destroy the gmirror array, dd the working flash drive to the new one, then set up an array for each of the partitions (there are three, though only the root partition is used anymore). I'll report back with results.
 
There are problems with mirroring partitions. If multiple partitions start to resync at the same time--and they will if a drive fails--it will get ugly.

Don't use dd(8) to copy filesystems. It's very slow and will not correctly resize a filesystem. Use dump(8) and restore(8). See Backup Options For FreeBSD. But reconsider using USB memory for this. Many people try, and surely some succeed, but there are many horror stories. CF media interfaced through an IDE or SATA adapter are probably more trustworthy, but quality ones will cost more than a couple of low-power notebook drives.
 
wblock@ said:
There are problems with mirroring partitions. If multiple partitions start to resync at the same time--and they will if a drive fails--it will get ugly.

So is mirroring the entire drive the way to go, then? I was actually going by your how-to as a reason to rethink my approach.

wblock@ said:
Don't use dd(8) to copy filesystems. It's very slow and will not correctly resize a filesystem. Use dump(8) and restore(8). See Backup Options For FreeBSD.

Hmm, so if I use dump(8) and restore(8), I need to first create the filesystems on the new drive, right?

wblock@ said:
But reconsider using USB memory for this. Many people try, and surely some succeed, but there are many horror stories. CF media interfaced through an IDE or SATA adapter are probably more trustworthy, but quality ones will cost more than a couple of low-power notebook drives.

I'm curious why a CF card would be more trustworthy. My previous NAS ran off a mirrored pair of USB flash drives for years without a hiccup. I would imagine that CF cards would be just as susceptible to errors as USB flash, but maybe I'm missing something. Once the system is set up, I switch the root partition to read-only mode. So that should guard against too many writes. As for the problem in this case, my guess is that this particular flash drive was bad. I hadn't ever bought a Transcend flash device before and I don't think I will in the future.
 
cbunn said:
So is mirroring the entire drive the way to go, then? I was actually going by your how-to as a reason to rethink my approach.

Yes, if you can mirror the whole drive it is preferable. With the whole drive mirrored, changes to the bootcode or partition table will be mirrored. If individual partitions are mirrored, only changes to things inside those partitions will be mirrored.

That article was specifically a workaround to get GPT and gmirror(8) to work together. I should edit it to add more warnings or notes or something.

Hmm, so if I use dump(8) and restore(8), I need to first create the filesystems on the new drive, right?

Yes. restore(8) writes to a filesystem. I should remember to mention sysutils/clone, which could make the process faster and easier.
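
As a rough illustration of that sequence (create the filesystem, mount it, then pipe dump(8) into restore(8)), something like the following should work. It is a sketch under assumptions: `NEW_PART` and `MNT` are hypothetical names, the partition is assumed to exist already, and the dump flags are borrowed from the command earlier in the thread, minus `-u` since this is a copy and not a recorded backup.

```shell
#!/bin/sh
# Sketch: copy the running root filesystem onto a freshly created one.
# Assumptions: NEW_PART and MNT are hypothetical; the partition already exists.
NEW_PART="${NEW_PART:-/dev/da1s1a}"
MNT="${MNT:-/mnt}"

copy_root() {
    # Create an empty UFS filesystem with soft updates on the new partition.
    newfs -U "${NEW_PART}" || return 1
    mount "${NEW_PART}" "${MNT}" || return 1
    # Stream a level 0 dump of / straight into restore on the new filesystem.
    dump -C16 -b64 -0aL -f - / | (cd "${MNT}" && restore -rf -)
}

# Example (run as root, after double-checking NEW_PART):
#   copy_root
```

Unlike dd(8), this copies only files, so the target filesystem can be a different size from the source.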

I'm curious why a CF card would be more trustworthy. My previous NAS ran off a mirrored pair of USB flash drives for years without a hiccup. I would imagine that CF cards would be just as susceptible to errors as USB flash, but maybe I'm missing something.

The flash is the same, but the CF interface is more drive-like than USB. (Actually, it's a mini IDE.) This assumes the use of a CF to IDE adapter, avoiding USB problems. Those adapters are fairly rare. For that matter, CF is fairly rare any more.

Once the system is set up, I switch the root partition to read-only mode. So that should guard against too many writes. As for the problem in this case, my guess is that this particular flash drive was bad. I hadn't ever bought a Transcend flash device before and I don't think I will in the future.

My concern is not with wearing out the flash with too many writes, but just general reliability with USB communications for system drives. But if it has been working for you until now, it should continue.

Regardless of the system drive type, it's handy to use sysutils/rsnapshot to keep copies of the system config files on the ZFS array, or even a full dump(8) backup.
 
wblock@ said:
Yes, if you can mirror the whole drive it is preferable. With the whole drive mirrored, changes to the bootcode or partition table will be mirrored. If individual partitions are mirrored, only changes to things inside those partitions will be mirrored.

That article was specifically a workaround to get GPT and gmirror(8) to work together. I should edit it to add more warnings or notes or something.

Ah, I see. Good to know. I had seen the same advice in a couple of other tutorials, but I imagine it is for the same use case.

wblock@ said:
Yes. restore(8) writes to a filesystem. I should remember to mention sysutils/clone, which could make the process faster and easier.

That program looks interesting. I'll have to read up on it. I wonder if just inserting a blank drive into the gmirror array and rebuilding might not be the easiest (if not the fastest) method, though.

wblock@ said:
The flash is the same, but the CF interface is more drive-like than USB. (Actually, it's a mini IDE.) This assumes the use of a CF to IDE adapter, avoiding USB problems. Those adapters are fairly rare. For that matter, CF is fairly rare any more.

I see what you mean now. My board has no IDE ports, only SATA. I've found a few SATA->CF adapters. CF cards themselves aren't so rare. I still use them for my DSLR. But they are not cheap.

wblock@ said:
Regardless of the system drive type, it's handy to use sysutils/rsnapshot to keep copies of the system config files on the ZFS array, or even a full dump(8) backup.

I do regular backups of the config files and occasional dumps of the root filesystem. That utility looks like it might be a bit slicker than my usual routine, though. So thanks.
 
cbunn said:
That program looks interesting. I'll have to read up on it. I wonder if just inserting a blank drive into the gmirror array and rebuilding might not be the easiest (if not the fastest) method, though.

The RAID1 section in the Handbook shows how to create a mirror with a single new drive. The mirror is created on the new drive, the old drive copied to it with dump(8)/restore(8), and then the original drive is added as a component of the mirror. That's the second half of that section.

I'd rather have a full backup and create the mirror with two drives from the start, as shown in the first section.
 
Well, if I'm going to start over with a new array, I think I will use a tip from the section on creating a mirror with an existing disk by using gnop to create a mirror of less than 8 GB. Perhaps 7 GB. That way I can later swap in any 8 GB drive without running into similar problems. The only issue there would be that at least one partition wouldn't be able to be restored from a full dump this time around. But that's not a big deal.
 
It's not necessary to go that much smaller; a few megabytes is probably enough. Alternatively, double the difference in actual size between the two USB drives should be safe.
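
The "double the difference" rule of thumb can be worked out with a few lines of shell arithmetic. This is a sketch of one reading of it: take the smaller drive, subtract twice the size difference, and round down to a whole sector. The function name and the 512-byte default sector size are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: pick a mirror size slightly below the smaller of two drives.
# Usage: safe_mirror_size SIZE_A SIZE_B [SECTOR_SIZE]   (sizes in bytes)
safe_mirror_size() {
    a=$1; b=$2; sector=${3:-512}
    # Work out which drive is smaller, regardless of argument order.
    if [ "$a" -lt "$b" ]; then small=$a; big=$b; else small=$b; big=$a; fi
    # Safety margin: twice the difference between the two drives.
    margin=$(( 2 * (big - small) ))
    size=$(( small - margin ))
    # Round down to a whole number of sectors.
    echo $(( size / sector * sector ))
}

# Example with made-up sizes:
#   safe_mirror_size 10240 9216 512
```

On FreeBSD the media size in bytes should be the third field of plain `diskinfo` output, but check diskinfo(8) before relying on that.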
 
I bought two new 8 GB flash drives, one SanDisk and one Kingston. Oddly, the Kingston is only 7.2 GB. Anywho, still plenty of room for my purposes.

I used the gnop(8) tip from the Handbook to create a fake disk to limit the size used by gmirror(8) along with some dump(8)/restore(8) action to get a functioning mirror back.
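
For reference, the Handbook's gnop(8) approach looks roughly like the sketch below. The exact commands and sizes used here were not posted, so treat this as an assumption-laden outline: `gm0` is the mirror name from earlier in the thread, the disk names are placeholders, and the size is whatever limit was chosen.

```shell
#!/bin/sh
# Sketch of the Handbook's gnop(8) trick for a size-limited mirror.
# Assumptions: mirror name gm0 (from the thread); disk names are placeholders.

create_limited_mirror() {
    size_bytes=$1   # desired mirror size in bytes
    new_disk=$2     # e.g. da1 (hypothetical)

    # gzero is a dummy provider; gnop presents it with the chosen size.
    geom zero load || return 1
    gnop create -s "${size_bytes}" gzero || return 1
    # Include the fake provider so the mirror is no larger than size_bytes.
    gmirror label -v gm0 gzero.nop "${new_disk}" || return 1
    # Drop the placeholder component again, leaving only the real disk.
    gmirror forget gm0
}

add_second_disk() {
    # After data has been restored onto the mirror, add the other drive.
    gmirror insert gm0 "$1"
}

# Example (run as root):
#   create_limited_mirror 7516192768 da1
#   ... partition, newfs, dump/restore onto /dev/mirror/gm0 ...
#   add_second_disk da0
```

Because the mirror is capped below the physical size, any replacement drive that is at least that large can be inserted later.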

Thanks to wblock@ for all the help!
 