Possible disk problems: hptrr messages in syslog?

Could somebody please help me understand some messages I have recently been getting in my system log? They seem to appear during or just after a local rsync operation (run from cron), but I'm not sure if this is relevant.

The messages look like this:

Code:
May 2 10:32:19 	server1 kernel: hptrr: channel [0,2] started successfully
May 2 10:32:19 	server1 kernel: hptrr: start channel [0,2]
May 2 10:32:19 	server1 kernel: hptrr: ATA regs: error 40, sector count 32, LBA low 644e, LBA mid d8, LBA high c9, device 0, status 51

May 2 10:32:19 	server1 kernel: hptrr: [0 2 0] Command completion error, flag(84)

Clearly the error is of concern. The error number is always the same, but the sector count and LBA values change, of course. I should add that I had a problem a little while back, which I cleared by running fsck; it came back reporting a clean disk. The filesystem is built on a RAID array.
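For reference, the error and status values in the "ATA regs" line can be read against the classic ATA register bit names. The following is only a minimal sketch: it assumes the hptrr driver prints both registers in hexadecimal (the log does not say so explicitly), and the helper names `decode` and `decode_ata_regs` are made up for illustration.

```python
import re

# Classic ATA register bit names. Treating the logged values as hex is an assumption.
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
               0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC/BBK", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
              0x08: "MCR", 0x04: "ABRT", 0x02: "TK0NF/NM", 0x01: "AMNF"}

def decode(value, names):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True) if value & bit]

def decode_ata_regs(line):
    """Extract and decode the error and status registers from an 'ATA regs' line."""
    m = re.search(r"error ([0-9a-f]+),.*status ([0-9a-f]+)", line)
    if m is None:
        return None
    return {"error": decode(int(m.group(1), 16), ERROR_BITS),
            "status": decode(int(m.group(2), 16), STATUS_BITS)}

line = ("hptrr: ATA regs: error 40, sector count 32, LBA low 644e, "
        "LBA mid d8, LBA high c9, device 0, status 51")
print(decode_ata_regs(line))
```

Under that assumption, status 51 has the ERR bit set and error 40 corresponds to UNC (uncorrectable data error), which would point at an unreadable sector on one of the disks behind the controller rather than at the filesystem.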

I would like to understand what is going on and find out how to stop these messages cropping up in my syslog, i.e. by properly fixing any underlying problems.

Thanks in advance.
 
Hi, could you please post the output of your dmesg?
Also, some more information about your RAID setup would be useful.

George
 
Thanks for the quick reply.

Due to the length of the dmesg output I have attached it here. I hope the zipped format is OK; unzipped it's too big.

The small RAID setup uses a hardware RAID card with 5 disks (1 redundant), giving 2TB of storage (RAID 5). The RAID volume is recognized as da0.

Is there anything further you need to know?

Thanks,

ALEX
 


It looks like you might have some problems with a drive:
Code:
ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=64800
ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=64800
GEOM_LABEL: Label for provider ad0a is ufsid/4bc097de4f45f42b.
GEOM_LABEL: Label ufsid/4bc097de4f45f42b removed.
GEOM_LABEL: Label for provider ad0a is ufsid/4bc097de4f45f42b.
You might want to replace that with the spare and see if your problems continue.
 
Yeah the messages about ad0 have been there from the very start and to be honest I have given up worrying about them. They relate to the boot drive which is a flash drive. It has always reported things but I have never had a serious problem.

The hptrr messages are far more recent and I can't tell which drive they relate to. The word ATA makes me think they may relate to the native ATA devices, but I think I only have a CD drive attached there, no disks.
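One way to narrow down which drive is involved is to tally the hptrr errors by the bracketed numbers in each "Command completion error" line. This is a rough sketch only: it assumes the triple such as [0 2 0] identifies a controller, channel and device, which nothing in this thread confirms, and the function name `count_errors` is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "hptrr: [0 2 0] Command completion error, flag(84)".
# Reading the triple as (controller, channel, device) is an assumption.
CHANNEL_RE = re.compile(r"hptrr: \[(\d+) (\d+) (\d+)\] Command completion error")

def count_errors(lines):
    """Count hptrr command completion errors per bracketed triple."""
    counts = Counter()
    for line in lines:
        m = CHANNEL_RE.search(line)
        if m:
            counts[tuple(int(g) for g in m.groups())] += 1
    return counts

# Sample lines copied from the log excerpt earlier in this thread.
sample = [
    "May 2 10:32:19 server1 kernel: hptrr: channel [0,2] started successfully",
    "May 2 10:32:19 server1 kernel: hptrr: start channel [0,2]",
    "May 2 10:32:19 server1 kernel: hptrr: [0 2 0] Command completion error, flag(84)",
]
for key, n in count_errors(sample).items():
    print(key, n)
```

Feeding it the lines of the full system log instead of the sample would show whether the errors always come from the same triple. If they do, the disk on that port of the RAID card would be the first one to check.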

I would like to know exactly what these messages mean and what they relate to.

Thanks again...
 
Coincidentally, I just read a post on another forum within the last 24 hours which attested to the unreliability of flash drives ("I never buy greater than 4G because twenty percent suddenly die, I've a box full of non-working flash drives") or something to that effect. Not my experience, though. Howsoever, given that news, maybe a CF rather than flash (unless it is CF already) might be more reliable? Just guessing that that relates somehow to this problem... and guessing that CF may be more reliable than flash drives.
 
Thanks for that... it was my mistake really; it is a CF card I am using. It has always given me error messages which, with time, I have learned to ignore (is that good or bad?). As you say, it seems fairly robust.

I still have a sneaking suspicion that the error message relates to something else, perhaps another drive. I had a problem with an attached external USB drive: on rebooting, it was being incorrectly recognised as my main drive, or rather was assigned the wrong device name. I have found that I need to remove the USB drive and re-attach it AFTER the system has booted. As I mentioned, having done this, I ran fsck on my main RAID array, which came back with a clean disk.

In a similar way, the GEOM_LABEL: messages pop up often but have never given rise to any serious problem while I have been running the machine. The hptrr messages are a recent addition.

ALEX
 