UFS CAM status: ATA Status Error when I try to boot FreeBSD in multi-user mode.

Hello.

Something bad happened to the disk I was working on. I have never seen this error before and I don't know what to do. I tried to fix it with fsck -y /dev/ada2p2 (the main partition) in single-user mode, but it didn't work. It's a very odd error. I can boot FreeBSD in single-user mode but not in multi-user mode. I ran the check several times; only the first run cleaned the disk, and on the later runs it was already marked clean. The error is still there, so fsck is not able to fix this kind of error. I suspect that there is a bug behind it.

What can I do?

[Attached photo: WhatsApp Image 2021-09-27 at 20.43.52.jpeg]
 
That disk isn't happy. Back up as soon as possible, check the SMART status (smartmontools / smartctl) and run a long self-test.
 
… I can boot FreeBSD in single user mode but I can't boot it multi user mode. …

Single user mode requires reading from a subset of the file system.

An exit (from single user mode) to multi-user mode will require reading from a larger set, and some writes. If it's a hard disk, there might be a problem with an area of the disk that's occupied by all or part of a file in the larger set.

If I'm not mistaken, your photograph shows failure before multi-user mode. Do check the disk but also, check cabling and other hardware.

If you temporarily disconnect:
  • the other two or more internal disks
  • all non-essential peripherals (leaving only the keyboard, mouse and display)
– then can the computer boot in multi-user mode from the suspect disk alone?
 
… then can the computer boot in multi-user mode from the suspect disk alone? …

No. The other disks are good. The only damaged disk is the one where I have installed FreeBSD. If I were on Linux, I would have used a live USB. But it seems that there isn't any live CD for FreeBSD. (By live CD I mean the full OS running from a USB stick.) Someone should create it. It's useful :p
 
So, the idea is to install the SMART tools in live mode and then try to fix the error on the disk where I have installed FreeBSD. Are the SMART tools similar to fsck?
 
Are the SMART tools similar to fsck?
No, modern drives have something called S.M.A.R.T. The smartctl(8) tool can access that information and start a self-test. If SMART reports issues with the drive, then it's likely the drive itself is dead or dying. One of the self-tests it can do is a surface scan, which will find "bad" spots on the disk. If there are any, it's usually time to replace the drive. Normally those bad spots are automatically remapped to a bit of spare space on the disk itself, but once that spare space is full the bad spots will cause read and/or write errors on your data. Then it's definitely time to replace the disk (some people replace the disk long before that spare space is full).
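
As a rough sketch of the checks described above, something like the following could be used. The device name /dev/ada2 is an assumption (the thread only mentions the partition /dev/ada2p2), and the RUN variable and the smart_check function are hypothetical helpers, not part of smartmontools. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only. RUN defaults to "echo", so the commands are printed, not run.
# Set RUN to an empty value (as root) to really execute them.
smart_check() {
    disk="$1"
    run="${RUN-echo}"
    $run smartctl -H "$disk"            # overall health verdict
    $run smartctl -A "$disk"            # attributes, e.g. reallocated/pending sectors
    $run smartctl -t long "$disk"       # start the long (surface scan) self-test
    $run smartctl -l selftest "$disk"   # self-test log, to be read once the test is done
}

# /dev/ada2 is an assumed device name; override it with DISK=...
smart_check "${DISK:-/dev/ada2}"
```

The long self-test runs in the background on the drive itself and can take hours on a disk of this size, so the self-test log only shows the result after it has finished.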
 
In the end I reinstalled FreeBSD on the same "damaged" disk because I wanted to be sure that it was damaged. For the moment the installation is clean and the disk does not seem to be damaged. I suspect that I'd found a bug that caused all the trouble. But it's just a hunch. In any case, do you know if there is a technique to instantly mirror to another disk what happens to my FreeBSD installation while I use it?
 
A suspicion is not a fact supported by reasonable arguments. Sometimes a suspicion is a simple way to explain the complex world around us, including possible misguidance.
 
I have no idea what happened. Mine is just a suspicion, not supported by adequate knowledge or research. One thing is sure: I don't want to lose all my data again at the next disk failure. I want to mirror the disk's content to another disk and I want it to be updated in real time. Anyway, if the disk were really damaged at the hardware level, I couldn't have done another installation on it. Is this true or not?
 
The answer is rather easy. After installation, set up a backup strategy. I back up to a separate disk.
If my main OS crashes, I can quickly recover 99% of my data.
Every 15 minutes all my working data is saved from the SSD to an internal HD drive using an incremental ZFS snapshot.
In fact, to be perfectly safe I should also make a monthly backup from my internal HD drive to a slow external USB drive.
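
One incremental step of such a scheme might look roughly like the sketch below. It assumes that both the source and the backup disk use ZFS; the dataset names zroot/work and backup/work and the snapshot names are made-up placeholders, and zfs_incremental_cmds is a hypothetical helper that only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Prints (does not run) the commands for one incremental ZFS backup step.
# All dataset and snapshot names passed in are hypothetical placeholders.
zfs_incremental_cmds() {
    src="$1"; dst="$2"; prev="$3"; new="$4"
    # 1. take a new snapshot of the working dataset
    echo "zfs snapshot ${src}@${new}"
    # 2. send only the changes since the previous snapshot to the backup pool
    echo "zfs send -i ${src}@${prev} ${src}@${new} | zfs receive -F ${dst}"
}

zfs_incremental_cmds zroot/work backup/work 2021-10-03_1200 2021-10-03_1215
```

The very first transfer has no previous snapshot, so it would be a full `zfs send` without `-i`. Running the step from cron every 15 minutes would give the interval mentioned above.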
 
I found this tutorial: https://xai.sh/2018/08/27/zfs-incremental-backups.html. It might explain how to apply your suggestion (ZFS incremental backup, right?). Can you tell me if I'm on the right track? Thanks. Also take into consideration that I haven't installed FreeBSD on ZFS. Can I still do it?
 
Some time ago (before the crash) I'd created an image of the FreeBSD installation using this command :
Code:
dd if=/dev/sdc of=freebsd13R-new.img

Where sdc is something like this :

Code:
Disk /dev/sdc: 298.09 GiB, 320072933376 bytes, 625142448 sectors
Disk model: WDC WD3200AAJS-0
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 6248EBF5-2313-11EC-BA9F-E0D55EE21F22

Device         Start       End   Sectors   Size Type
/dev/sdc1         40    532519    532480   260M EFI System
/dev/sdc2     532520 616562727 616030208 293.7G FreeBSD UFS
/dev/sdc3  616562728 624951335   8388608     4G FreeBSD swap

I tried to restore the img file to the same disk from which I had created it but, to my surprise, after the restore I saw the same error that had caused the disk failure. Later, though, I was able to make a fresh installation of FreeBSD using the graphical installer. I then tried to create the image again and this is what happened:

Code:
@Z390-AORUS-PRO:/media/ziomario/Android21/OS/BSD/FreeBSD13R# dd if=/dev/sdc of=freebsd13R-new.img

dd: error reading '/dev/sdc': Input/output error
466616064+0 records in
466616064+0 records out
238907424768 bytes (239 GB, 222 GiB) copied, 2715.15 s, 88.0 MB/s

@Z390-AORUS-PRO:/media/ziomario/Android21/OS/BSD/FreeBSD13R# kpartx freebsd13R-new.img
Alternate GPT is invalid, using primary GPT.
loop29p1 : 0 532480 /dev/loop29 40

loop29p2 : 0 616030208 /dev/loop29 532520
loop29p3 : 0 8388608 /dev/loop29 616562728
loop deleted : /dev/loop29
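
The numbers in that output can be cross-checked with a little arithmetic. The sketch below uses only values copied from the dd and partition-table output in this thread, and it assumes dd was using its default 512-byte block size (no bs= was given). The helper name first_bad_lba is hypothetical.

```shell
#!/bin/sh
# Numbers are copied from the dd and partition-table output in this thread.
RECORDS=466616064        # records dd copied before the I/O error
BS=512                   # dd default block size (assumed: no bs= was given)
P2_START=532520          # /dev/sdc2 (FreeBSD UFS) start sector
P2_END=616562727         # /dev/sdc2 end sector

# Convert "records copied" into the 512-byte sector where the read failed.
first_bad_lba() { echo $(( $1 * $2 / 512 )); }

LBA=$(first_bad_lba "$RECORDS" "$BS")
echo "bytes copied:     $(( RECORDS * BS ))"
echo "first unread LBA: $LBA"
if [ "$LBA" -ge "$P2_START" ] && [ "$LBA" -le "$P2_END" ]; then
    echo "inside sdc2, $(( LBA - P2_START )) sectors from its start"
fi
```

The byte count matches the 238907424768 bytes that dd reported, and the failing sector falls inside the UFS partition. Because dd stops at the first read error unless conv=noerror is given, the image ends there, which would also explain the "Alternate GPT is invalid" message: the backup GPT lives at the end of the disk and never made it into the image.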
 
Once you have FreeBSD correctly installed, you can use a simple cp -axvfR to back up all files.
You can use dd to back up a master boot record or a boot partition, but for that you would normally use gpart from the installation media.
 
In the end I reinstalled FreeBSD on the same "damaged" disk because I wanted to be sure that it was damaged. For the moment the installation is clean and the disk does not seem to be damaged. I suspect that I'd found a bug that caused all the trouble. But it's just a hunch. In any case, do you know if there is a technique to instantly mirror to another disk what happens to my FreeBSD installation while I use it?

Yes, you can use gmirror(8).
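
A minimal sketch of what that could look like for a fresh mirror built from two empty partitions is shown below. The mirror name gm0 and the provider names /dev/ada3p1 and /dev/ada4p1 are hypothetical placeholders, and gmirror_cmds is a made-up helper that only prints the commands. Converting a disk that already holds a running system is more involved and is better taken from the FreeBSD Handbook.

```shell
#!/bin/sh
# Prints (does not run) the commands for building a two-provider gmirror.
# The mirror name and both provider names are hypothetical placeholders.
gmirror_cmds() {
    name="$1"; first="$2"; second="$3"
    echo "gmirror load"                                 # load the geom_mirror module
    echo "gmirror label -v ${name} ${first} ${second}"  # create the mirror
    echo "gmirror status ${name}"                       # check the mirror state
}

gmirror_cmds gm0 /dev/ada3p1 /dev/ada4p1
```

The new device would then appear as /dev/mirror/gm0 and can be given a file system with newfs. Note that gmirror keeps its metadata in the last sector of each provider, so the providers should not hold anything that needs that sector.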
 
Once you have FreeBSD correctly installed, you can use a simple cp -axvfR to back up all files.
You can use dd to back up a master boot record or a boot partition, but for that you would normally use gpart from the installation media.

Does your command make an incremental copy? I'm not sure, since I changed some configuration files on FreeBSD, ran your command again, and it started copying everything from scratch. Maybe this is better:

Code:
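# Use "/" as the source instead of "cd /" plus "*": the glob skips dotfiles in /
# and passes mount points such as /dev as separate sources, which defeats -x.
rsync -avxHAX / /mnt/da4p2/home/marietto/Desktop/freebsd-wdc-3-10-2021/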
 
ziomario are you certain that all cabling and connections are good? Other hardware?

Yes, I checked; I have no hardware or power problems. At the moment the disk does not show any error. Maybe I've understood the reason for that error. I created a bhyve virtual machine with NetBSD as the guest OS and gave bhyve direct access to the disk where NetBSD is installed, like this:

Code:
bhyve -S -c 8 -m 8G -w -H \
    -s 0,hostbridge \
    -s 1,virtio-blk,/dev/ada1 \
    -s 6,virtio-net,tap0 \
    -s 29,fbuf,tcp=0.0.0.0:5900,w=1440,h=900 \
    -s 30,xhci,tablet \
    -s 31,lpc -l com1,/dev/nmdm1A \
    -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \

Maybe FreeBSD and/or bhyve don't support the NetBSD file system and, with the help of some obscure bugs related to bhyve and to some file system driver, the disk gets corrupted after some time. Although in that case I would have expected the NetBSD file system to be the one that gets corrupted.
 