[Solved] Constant stream of "zio_read error: 5" messages on boot (please help)

For no particular reason that I can identify, my FreeBSD desktop PC (13.0) has started printing an unending stream of zio_read error: 5 messages on boot. The boot still finishes correctly, and the messages stop once the actual kernel starts up (i.e. the start of the dmesg history), but before that, while I'm in the bootloader, the stream of error messages makes it basically impossible to see the bootloader menu (I see it fly by, but it gets pushed off the screen by the error messages in under a second). Does anybody recognize those symptoms, or have a suggestion for how to fix this?

I could post a video of the problem, but as no sign of the error seems to exist once the kernel actually starts, I'm not sure what other details I can offer at this time. The installation is zfs-on-root, though I don't think there's anything out of the ordinary about that.
 
Error 5 is EIO, a disk read/write (I/O) error. Most likely you have a problem with your disk or its interface.
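If you want to confirm that mapping yourself, the errno values are listed in the system headers on FreeBSD:

  # errno 5 is EIO ("Input/output error")
  grep -w EIO /usr/include/sys/errno.h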

Start with two things: run smartctl -a and see whether the disk reports any I/O errors, and check and reseat the data and power cables to the disk.
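For example (the device name ada0 is just a guess; substitute your actual disk):

  # full SMART report; look at the error log and the raw attribute values
  smartctl -a /dev/ada0
  # optionally kick off a short self-test, then check the result afterwards
  smartctl -t short /dev/ada0
  smartctl -l selftest /dev/ada0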
 
Also, please share the outputs of the following commands (a quick way to collect them all is sketched after the list):
  1. freebsd-version -kru
  2. uname -aKU
  3. sysrc -f /boot/loader.conf opensolaris_load openzfs_load zfs_load
  4. zfs --version
  5. zpool status -x
  6. zpool list
  7. geom part show
  8. geom disk list
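A rough sh loop that captures everything into one file (the output file name is arbitrary; these are all simple commands, so the unquoted word splitting is safe here):

  for cmd in "freebsd-version -kru" "uname -aKU" \
      "sysrc -f /boot/loader.conf opensolaris_load openzfs_load zfs_load" \
      "zfs --version" "zpool status -x" "zpool list" \
      "geom part show" "geom disk list"; do
      echo "== $cmd =="; $cmd
  done > /tmp/zfs-diag.txt 2>&1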
 
I tried reseating all the SATA cables first, and that did fix it. I'm still rather confused, because none of the drives report any errors with smartctl, and the filesystem worked fine once the operating system had actually booted, but I guess the problem is solved...

Thank you both very much.
 
because none of the drives report any errors with smartctl,
That's good; it means the drives themselves are fine.

and the filesystem worked fine once the operating system had actually booted
The connection may have been dodgy. Behind the scenes the system will retry a couple of times; if those retries succeed, you won't notice anything. It would still show a bunch of errors in the logs, though, and performance takes a big hit.
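For instance, retried transfers on a flaky SATA link typically leave CAM/AHCI traces behind; something like this should surface them:

  # retried or failed commands show up as CAM / ahcich messages
  egrep -i 'ahcich|cam status|retrying' /var/log/messages
  dmesg | egrep -i 'ahcich|cam status|retrying'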
 
because none of the drives report any errors with smartctl,
Did you rely on the "health status" or check the actual attributes (especially error rates and reallocated sectors)? Disk firmware ALWAYS lies. I've actually never seen anything other than "OK" for the health status, even on drives that were obviously dying and were logging massively increasing error rates and/or failed sectors.
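For example, to look at the raw counters and the logged errors directly (ada0 is an assumed device name):

  # attribute table; watch Reallocated_Sector_Ct, Current_Pending_Sector,
  # Offline_Uncorrectable and UDMA_CRC_Error_Count (the last one points at cabling)
  smartctl -A /dev/ada0
  # the drive's own error log, which records actual failed commands
  smartctl -l error /dev/ada0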
 
Did you rely on the "health status" or check the actual attributes (especially error rates and reallocated sectors)? Disk firmware ALWAYS lies. I've actually never seen anything other than "OK" for the health status, even on drives that were obviously dying and were logging massively increasing error rates and/or failed sectors.
I looked at the error rates and checked for the presence of any logged errors. I'm suspicious of the "health status" as well, but I'm glad to hear someone more experienced confirm that my suspicion is justified.
 
Can you recreate the errors after boot?
For example, if you dd the boot partition to /dev/null, or read boot files like the loader or the kernel?
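Something along these lines, for example (the partition index is an assumption; check gpart show for the real layout):

  # read the freebsd-boot partition end to end
  dd if=/dev/ada0p1 of=/dev/null bs=1m
  # read the files the loader touches during boot
  dd if=/boot/loader of=/dev/null bs=1m
  dd if=/boot/kernel/kernel of=/dev/null bs=1m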
 
If reseating the SATA cables fixed it, then it is extremely likely that the problem was communication between the drive and the host. Those communication errors would not (and should not) show up in SMART. And, as SirDice said, it is very possible that those errors only happened in the boot loader and not once the OS itself was loaded.
 
My system is FreeBSD 12.3, ZFS on root with a mirror of two disks.

I have the exact same problem, except my system does not boot up. What do you mean when you write "reset sata cables"?
If I boot a live CD I can access all the data, but I can't boot from the disks. If I reinstall FreeBSD on a new disk and then roll back a snapshot from the old disk, I get the same error on the new disk.
 

Attachments

  • received_1617162772000981.jpeg (121 KB)
What do you mean when you write "reset sata cables"?
Reseat, not reset. That means making sure the cables are connected properly; cables can sometimes come loose a bit, causing transfer issues.
 
Is there any backplane/hot-swap enclosure (e.g. Icy Dock) involved?
I've seen those errors once with an older Icy Dock 5x3.5" SAS3 enclosure that didn't play nice with some SATA drives (SAS drives worked fine).
SAS expanders combined with SATA drives can also show "unexpected behaviour" at times.
 
Your ZFS GPT boot code (gptzfsboot) may be out of date (if the system was originally installed with an older FreeBSD version).
That's a good one. A boot loader from 12.x cannot boot an OpenZFS pool made with 13.x.
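If that is the cause here, the boot blocks can be refreshed from the newer system; a sketch, assuming GPT, a BIOS (non-UEFI) machine, disk ada0, and the freebsd-boot partition at index 1 (verify all of that with gpart show first):

  # reinstall the protective MBR and the ZFS-aware GPT boot block
  gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0

On a UEFI system you would instead copy the new /boot/loader.efi onto the EFI system partition.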
 
Yes, they are good and new, as I have migrated from the old 2 TB disks to 6 TB disks. I first migrated to two 6 TB disks, but they were loud, so I got two new ones. It was when I changed to the last 6 TB disk that the disks just refused to boot with that error; only the 2 TB disks still boot. So I am moving the newer data from the 6 TB disks to the 2 TB disks and will try to migrate again from the 2 TB pool to the 6 TB disks.
 
Yes, they are good and new, as I have migrated from the old 2 TB disks to 6 TB disks. I first migrated to two 6 TB disks, but they were loud, so I got two new ones. It was when I changed to the last 6 TB disk that the disks just refused to boot with that error; only the 2 TB disks still boot. So I am moving the newer data from the 6 TB disks to the 2 TB disks and will try to migrate again from the 2 TB pool to the 6 TB disks.
Just to be sure: how old is that system?
There were numerous old controllers based on 16-bit (or some weird 24-bit) chipsets that can't handle disks larger than ~3 TB. They might fail to discover larger drives at all, only show the maximum supported size, wrap around at their maximum addressable size, or behave in countless other unexpected ways...
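One quick sanity check for that failure mode is whether the controller even reports the drive's full capacity (the device name is an assumption):

  # compare the reported media size against the drive's label
  diskinfo -v /dev/ada0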
 
Hi everyone, here comes an update:
First of all, I want to thank everyone who has tried to help.
The system I had problems with is a Dell Precision T3600. It seems that it can't boot from drives larger than 2 TB. If I move the drives to another system, they boot fine.
I have bought a RAID card in the hope that it will solve the issue. In the meantime, I have moved the drives to another system.
 