ZFS preventing system boot

Hello,

I have set up a FreeBSD 10.3 system with ZFS (well, I am attempting to, anyway). I can boot the system without issue as long as ZFS is not enabled in rc.conf, and after booting I can start the ZFS service without any problems. If I boot the system with ZFS enabled, I get what I think is a two-line error message that repeats too fast for me to read. It looks something like:
Code:
vnode_pager_get_read : I/O read error
  default pages  error, pid 1 (init)
The lines are on top of each other so I can't really read them. I can't see anything like this error in any of the files under /var/log.
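For clarity, "enabled in rc.conf" and the manual start are just the usual bits (sysrc here is simply one way to edit /etc/rc.conf):
Code:
# enable the ZFS service at boot (writes zfs_enable="YES" to /etc/rc.conf)
sysrc zfs_enable="YES"
# or, with it disabled, start the service by hand after boot
service zfs onestart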

The system has 5 drives. One 480GB SSD and four 4TB HDDs. The SSD is partitioned with GPT with the following partitions:
Code:
1  freebsd-boot (512K)
2  freebsd-ufs (40G) - root
3  freebsd-swap (2.0G)
4  freebsd-zfs (360G)
  -free space- (45G)
The 4 HDDs are directly formatted with zfs without a partition table.

Under ZFS there is one pool made up of the 4 HDDs in raidz, with the ZFS partition on the SSD as cache (L2ARC).
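Roughly how the pool was created (a sketch from memory; the pool name and raw device nodes are the ones that appear later in this thread):
Code:
# raidz across the four raw HDDs, with partition 4 of the SSD as L2ARC
zpool create bigpool raidz ada1 ada2 ada3 ada4 cache ada0p4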

Any thoughts or ideas on what the issue might be, or on how to begin to diagnose this problem, would be greatly appreciated.

Thanks,
Aaron.
 
What happens when you boot the system without ZFS and load it afterwards?
 
It loads normally as far as I can tell. The pool, various datasets, and all files are all there.

Using service zfs start (or service zfs onestart if zfs is not enabled in /etc/rc.conf) produces no output on the command line, which I take to mean the service started fine.

Lines from /var/log/messages when the zfs service is started:
Code:
Aug 18 20:53:16 adsSERVER kernel: ZFS filesystem version: 5
Aug 18 20:53:16 adsSERVER kernel: ZFS storage pool version: features support (5000)
Aug 18 20:53:16 adsSERVER devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=8376669012341517020 vdev_guid=12155484232612741330''
Aug 18 20:53:16 adsSERVER ZFS: vdev state changed, pool_guid=8376669012341517020 vdev_guid=12155484232612741330
Aug 18 20:53:16 adsSERVER devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=8376669012341517020 vdev_guid=13694934710861323402''
Aug 18 20:53:16 adsSERVER ZFS: vdev state changed, pool_guid=8376669012341517020 vdev_guid=13694934710861323402
Aug 18 20:53:16 adsSERVER devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=8376669012341517020 vdev_guid=9211866907286234278''
Aug 18 20:53:16 adsSERVER ZFS: vdev state changed, pool_guid=8376669012341517020 vdev_guid=9211866907286234278
Aug 18 20:53:16 adsSERVER devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=8376669012341517020 vdev_guid=14311969669329886361''
Aug 18 20:53:16 adsSERVER ZFS: vdev state changed, pool_guid=8376669012341517020 vdev_guid=14311969669329886361
Aug 18 20:53:17 adsSERVER devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=8376669012341517020 vdev_guid=12155484232612741330''
Aug 18 20:53:17 adsSERVER ZFS: vdev state changed, pool_guid=8376669012341517020 vdev_guid=12155484232612741330
Aug 18 20:53:17 adsSERVER devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=8376669012341517020 vdev_guid=13694934710861323402''
Aug 18 20:53:17 adsSERVER ZFS: vdev state changed, pool_guid=8376669012341517020 vdev_guid=13694934710861323402
Aug 18 20:53:17 adsSERVER devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=8376669012341517020 vdev_guid=9211866907286234278''
Aug 18 20:53:17 adsSERVER ZFS: vdev state changed, pool_guid=8376669012341517020 vdev_guid=9211866907286234278
Aug 18 20:53:17 adsSERVER devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=8376669012341517020 vdev_guid=14311969669329886361''
Aug 18 20:53:17 adsSERVER ZFS: vdev state changed, pool_guid=8376669012341517020 vdev_guid=14311969669329886361
Aug 18 20:53:17 adsSERVER devd: Executing 'logger -p kern.notice -t ZFS 'vdev state changed, pool_guid=8376669012341517020 vdev_guid=8557648133257797728''
Aug 18 20:53:17 adsSERVER ZFS: vdev state changed, pool_guid=8376669012341517020 vdev_guid=8557648133257797728
 
I did more testing. First I destroyed the pool I had made (described in my initial post) and tried booting the system with ZFS enabled. The system came up without issue. I then recreated the pool with just the 4 HDDs in raidz and rebooted; again the system came up without issue. Next I added the L2ARC cache, partition 4 on the SSD (ada0p4). Upon rebooting, the problem from my initial post returned. I then removed the cache partition from the pool and rebooted one last time, and the system came up fine.

So ZFS will not start properly during boot with the cache device in the pool, even though it runs fine with it once the system is up: I can start ZFS with the cache device in the pool, just not during system boot. While I am glad to be farther along, it would be nice to have the extra speed which, in theory, an SSD cache for the pool should provide. Any ideas on what could be causing this issue?
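The test sequence, roughly (a sketch; same pool name and devices as before):
Code:
# tear down the original pool
zpool destroy bigpool
# recreate it with only the four HDDs in raidz -- boots fine
zpool create bigpool raidz ada1 ada2 ada3 ada4
# add the SSD partition as L2ARC -- the boot error returns
zpool add bigpool cache ada0p4
# remove the cache device again -- boots fine once more
zpool remove bigpool ada0p4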

Thanks,
Aaron.
 
Can you try the following: add a label to the ZFS partition on the SSD drive, then use its device node (/dev/gpt/yourlabel) when you add the L2ARC. Also make sure that the "kern.geom.label.gpt.enable" tunable is set to 1, which means the kernel will not withhold the GPT labels from /dev.
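A rough sketch of what I mean (the label and pool name are just examples; adjust them to your setup):
Code:
# drop the cache vdev that was added by raw device node
zpool remove bigpool ada0p4
# put a GPT label on partition 4 of the SSD
gpart modify -i 4 -l yourlabel ada0
# re-add the L2ARC by its label
zpool add bigpool cache gpt/yourlabel
# and in /boot/loader.conf (it should already be the default):
# kern.geom.label.gpt.enable="1"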

Also, as a piece of advice, using raw disks is not a good idea (due to disk size differences, alignment, etc.); a single GPT partition of the same size, aligned to 4k, on each drive is the better option.
Example (supposing you have 4 drives of the same manufacturer, model, and size):
Code:
# gpart add -t freebsd-zfs -l yourlabel -a 1m adaX
 
Hello tsarya,

Thank you for the suggestion. I do have labels for the partitions on the SSD:
glabel status
Code:
                                      Name  Status  Components
                              gpt/bsd-boot     N/A  ada0p1
gptid/d57dc6c6-33e7-11e6-a758-d05099a7839a     N/A  ada0p1
                             gpt/zfs-cache     N/A  ada0p4
gptid/8145f41c-3423-11e6-b96b-d05099a7839a     N/A  ada0p4

I enabled gpt labels on boot:
cat /boot/loader.conf
Code:
kern.geom.label.gpt.enable=1

I added the cache partition to the zfs pool:
zpool add bigpool cache gpt/zfs-cache
zpool status
Code:
  pool: bigpool
state: ONLINE
  scan: none requested
config:

  NAME                 STATE   READ WRITE CKSUM
  bigpool              ONLINE     0     0     0
     raidz1-0          ONLINE     0     0     0
       ada1            ONLINE     0     0     0
       ada2            ONLINE     0     0     0
       ada3            ONLINE     0     0     0
       ada4            ONLINE     0     0     0
  cache
     gpt/zfs-cache     ONLINE     0     0     0

errors: No known data errors

Upon rebooting, I got the same error message as in my first post.
 
Hi adsrc,

Since you're setting the pool up, I suppose you don't have any data on the 4 drives.
Before you do what I suggest, please ensure that all your drives are the same; I have had identical drives (same manufacturer/model) with different total sector counts!
Run diskinfo -v ada1 through ada4 and look for the '# mediasize in bytes' value.
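Something along these lines will show the sizes side by side (just a convenience loop):
Code:
# print the media size of each candidate drive
for d in ada1 ada2 ada3 ada4; do echo "$d:"; diskinfo -v $d | grep mediasize; done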

Now try the following: recreate the pool, using GPT partitions instead of raw disks:
Code:
# gpart destroy -F ada1
# gpart destroy -F ada2
# gpart destroy -F ada3
# gpart destroy -F ada4

# gpart create -s gpt ada1
# gpart create -s gpt ada2
# gpart create -s gpt ada3
# gpart create -s gpt ada4

# gpart add -t freebsd-zfs -a 1m -l zdisk1 ada1
# gpart add -t freebsd-zfs -a 1m -l zdisk2 ada2
# gpart add -t freebsd-zfs -a 1m -l zdisk3 ada3
# gpart add -t freebsd-zfs -a 1m -l zdisk4 ada4

# sysctl vfs.zfs.min_auto_ashift=12

# zpool create mypoolname raidz gpt/zdisk1 gpt/zdisk2 gpt/zdisk3 gpt/zdisk4 cache gpt/zfs-cache
Note that after the pool is created, the /mypoolname dataset is mounted; adjust atime, compression, exec, and setuid based on your needs.
Remove kern.geom.label.gpt.enable=1 from loader.conf; it should be '1' by default.
zfs_enable="YES" in rc.conf will mount the filesystems under /mypoolname (unless you specify a different mountpoint property) on boot; it executes zfs mount -a.
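For example, property adjustments on the new pool might look like this (the values are only illustrative; pick what suits your workload):
Code:
# example property tweaks on the top-level dataset
zfs set atime=off mypoolname
zfs set compression=lz4 mypoolname
# and make sure the service is enabled at boot
sysrc zfs_enable="YES"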
 
You said that partition 4 on the SSD is for L2ARC. How is the rest of that drive partitioned?
 
Hello,

First, to answer wblock@: the SSD is partitioned as follows:
Code:
ada0p1  freebsd-boot (512K)
ada0p2  freebsd-ufs (40G) - root
ada0p3  freebsd-swap (2.0G)
ada0p4  freebsd-zfs (360G) - L2ARC
        -free space- (45G)

And now on to tsarya's instructions:
kern.geom.label.gpt.enable=1 was removed from loader.conf and zfs_enable="YES" was already set in rc.conf.

Drive Info:
diskinfo -v ada1
Code:
ada1
  512  # sectorsize
  4000787030016  # mediasize in bytes (3.6T)
  7814037168  # mediasize in sectors
  4096  # stripesize
  0  # stripeoffset
  7752021  # Cylinders according to firmware.
  16  # Heads according to firmware.
  63  # Sectors according to firmware.
  Z3051RJG  # Disk ident.
diskinfo -v ada2
Code:
ada2
  512  # sectorsize
  4000787030016  # mediasize in bytes (3.6T)
  7814037168  # mediasize in sectors
  4096  # stripesize
  0  # stripeoffset
  7752021  # Cylinders according to firmware.
  16  # Heads according to firmware.
  63  # Sectors according to firmware.
  W301B145  # Disk ident.
diskinfo -v ada3
Code:
ada3
  512  # sectorsize
  4000787030016  # mediasize in bytes (3.6T)
  7814037168  # mediasize in sectors
  4096  # stripesize
  0  # stripeoffset
  7752021  # Cylinders according to firmware.
  16  # Heads according to firmware.
  63  # Sectors according to firmware.
  W301B2L0  # Disk ident.
diskinfo -v ada4
Code:
ada4
  512  # sectorsize
  4000787030016  # mediasize in bytes (3.6T)
  7814037168  # mediasize in sectors
  4096  # stripesize
  0  # stripeoffset
  7752021  # Cylinders according to firmware.
  16  # Heads according to firmware.
  63  # Sectors according to firmware.
  W301C0MW  # Disk ident.
diskinfo -v ada0
Code:
ada0
  512  # sectorsize
  480103981056  # mediasize in bytes (447G)
  937703088  # mediasize in sectors
  0  # stripesize
  0  # stripeoffset
  930261  # Cylinders according to firmware.
  16  # Heads according to firmware.
  63  # Sectors according to firmware.
  ME1604181001C646B  # Disk ident.

Creating the partitions and zfs pool:
gpart create -s gpt ada1
Code:
ada1 created
gpart create -s gpt ada2
Code:
ada2 created
gpart create -s gpt ada3
Code:
ada3 created
gpart create -s gpt ada4
Code:
ada4 created
gpart add -t freebsd-zfs -a 1m -l raidzDisk1 ada1
Code:
ada1p1 added
gpart add -t freebsd-zfs -a 1m -l raidzDisk2 ada2
Code:
ada2p1 added
gpart add -t freebsd-zfs -a 1m -l raidzDisk3 ada3
Code:
ada3p1 added
gpart add -t freebsd-zfs -a 1m -l raidzDisk4 ada4
Code:
ada4p1 added
sysctl vfs.zfs.min_auto_ashift=12
Code:
vfs.zfs.min_auto_ashift: 9 -> 12
zpool create storage raidz /dev/gpt/raidzDisk{1,2,3,4} cache /dev/gpt/zfs-cache
zpool status
Code:
  pool: storage
state: ONLINE
  scan: none requested
config:

  NAME                STATE  READ WRITE CKSUM
  storage             ONLINE    0     0     0
    raidz1-0          ONLINE    0     0     0
      gpt/raidzDisk1  ONLINE    0     0     0
      gpt/raidzDisk2  ONLINE    0     0     0
      gpt/raidzDisk3  ONLINE    0     0     0
      gpt/raidzDisk4  ONLINE    0     0     0
    cache
      gpt/zfs-cache   ONLINE    0     0     0

errors: No known data errors

Upon rebooting, I got the same error message as in my first post.
Thank you for your continued help. At the very least, I'm learning and becoming more familiar with FreeBSD.
 
If the cache device is removed from the pool, the system boots without issue.

If the cache device is left in the pool and zfs_enable="YES" is commented out in rc.conf (i.e. #zfs_enable="YES") so the ZFS service does not start during system boot, then the system will boot without issue. Once the system is up, the ZFS service can be started without issue.

Having the ZFS service start during boot (from rc.conf) with the cache device as part of the pool causes the endlessly repeating error message mentioned in the initial post of this thread.

System Hardware Configuration:
  • AMD A8-7600 APU AD7600YBJABOX
  • ASRock A68M-ITX AMD A68H (Bolton D2H) Mini ITX Motherboard
  • G.SKILL Ares Series 16GB (2 x 8GB) DDR3 2133 (PC3 17000) F3-2133C10D-16GAB
  • Mushkin Enhanced ECO3 2.5" 480GB SATA III TLC Internal SSD MKNSSDE3480GB
  • Seagate NAS HDD ST4000VN000 4TB 64MB Cache SATA 6.0Gb/s (Quantity - 4)
  • IO Crest 2 Port SATA III RAID PCIe 2.0 x 1 Card SI-PEX40098
The 4 HDDs are connected to the SATA ports on the motherboard. The SSD is connected to the IO Crest SATA controller card. The system boots from the SSD.
 
OK, let's try this, just for the experiment:
1) Connect the SSD to the first SATA port.
2) Create a pool with only 3 drives (raidz, GPT labels), the same way as you did with 4.

My point is to see if the IO Crest SATA controller is causing the issue.
 
So here is an update on what I have attempted. (tsarya, I did try the experiment; it's just further down in this post.)

First, on one of the occasions that I booted into single-user mode, I forgot to make root writeable (mount -uw /) before trying to start the ZFS service. The error message I received was:
Code:
vnode_pager_getpages: I/O read error
vm_fault: pager read error, pid 24 (zfs)
I believe this is the same error message that I have been getting when booting the system with ZFS enabled and the cache partition in a zpool, except with pid 1 (init) instead of pid 24 (zfs), and endlessly repeating:
Code:
vnode_pager_getpages: I/O read error
vm_fault: pager read error, pid 1 (init)
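For reference, the correct single-user-mode sequence is just the following (the mount step is the one I had forgotten, which is what produced the pid 24 error):
Code:
# make the root filesystem writable first
mount -uw /
# then start the ZFS service by hand
service zfs onestart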

I made a zpool with 3 of the HDDs in raidz and the fourth as the cache. This booted without issue.
zpool create storage raidz gpt/raidzDisk{1,2,3} cache gpt/raidzDisk4

I made a zpool with just the single zfs SSD partition. This also booted without issue.
zpool create storage gpt/zfs-cache (remember zfs-cache is the partition label of ada0p4, the fourth partition of the SSD.)

Next, I tried tsarya's experiment from the last post. I removed one of the HDDs from the motherboard's controller and plugged the SSD into the vacant port, leaving nothing on the IO Crest controller.
zpool create storage raidz gpt/raidzDisk{2,3,4} cache gpt/zfs-cache
The usual error occurred during boot.

Finally, I thought maybe there was something wrong with the zfs partition of the SSD, so I deleted it and added it again:
gpart delete -i 4 ada0
gpart add -t freebsd-zfs -s 360G -l zfs-cache ada0
Rebuilt the zpool:
zpool create storage raidz gpt/raidzDisk{1,2,3,4} cache gpt/zfs-cache
And rebooted the system back to the usual error.

So for whatever reason, ZFS does not like that partition of that SSD being used as an L2ARC cache during system boot; otherwise it is perfectly fine with it. I should also add that if I boot from a FreeBSD install CD with that zpool already created on my HDDs/SSD, the aforementioned error occurs.
 
Yeah, this is really weird.

Well, in this case, may I make a suggestion: instead of using raidz with 4 drives and an L2ARC on an SSD to improve read speed, have you considered configuring the pool in RAID-10 style?
In RAID-10 you will get massive read speed from these 4 drives, probably comparable to the SSD. It is true, though, that you will sacrifice more disk space (50% compared to 25% with raidz).
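A sketch of what that would look like, reusing GPT labels like the ones from my earlier example (the pool name is only a placeholder):
Code:
# two mirrored pairs striped together (RAID-10 style)
zpool create mypoolname mirror gpt/zdisk1 gpt/zdisk2 mirror gpt/zdisk3 gpt/zdisk4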
 
Hello all,

Here is another update and kind of a solution.

I thought that perhaps there might be a conflict with one of the other services, so I configured rc.conf to disable every service I could except zfs. Rebooting the system produced the usual error.

I found, searching the web, that some people were getting a similar error from init when running FreeBSD 10.3 as a VirtualBox guest. My FreeBSD system is running on bare metal, but I figured I would give the workaround a try and added vfs.unmapped_buf_allowed=0 to /boot/loader.conf. Again I rebooted and got the usual error.

My next idea, and the solution to all of this, was to have cron start the ZFS service at boot. I added
Code:
@reboot     root     service zfs onestart
to /etc/crontab. The system boots successfully with this configuration. Does anyone know if this is an okay solution or am I asking for difficulties down the road? Are there any drawbacks to using cron to start a service?

tsarya, thank you for the suggestion of skipping the SSD cache and just using the HDDs in RAID-10 instead. While this would give a very nice maximum sequential throughput, with HDDs you are not going to get the low latency or the high number of I/Os per second of an SSD. This is just a home server for myself and a way to get some practice with FreeBSD, but I really would like to use the SSD as an L2ARC cache if possible.

gkontos, that is an excellent thought, trying a different SSD. I do not have a spare SSD, nor do I know anyone who has one, but depending on how realistic (or should I say how good an idea) my solution winds up being, I may try buying another SSD to test with.

Another possible thought would be to reinstall FreeBSD on this system but with root on ZFS. If that works, I imagine it may be a better solution than using cron to start ZFS. (Or perhaps I should just stick with what is currently working.)

I would love to hear any thoughts or comments, especially on if my solution is going to be trouble. Thanks for everyone's help.
 
Another possible thought would be to reinstall FreeBSD on this system but with root on ZFS

I would try that right away! Also, you could use FreeBSD 11.0-RC2. I imagine you will dedicate the entire SSD to L2ARC? Please keep us posted on the outcome.
As for using cron to start the service, I have no experience; I have never tried that with ZFS before.
 
Hello,

Sorry for taking nearly a month to get back to work on this system. Today I reinstalled FreeBSD 10.3 on the system using root on ZFS, with the 4 HDDs in raidz. After installation was complete and the system rebooted, I added the SSD partition as an L2ARC cache to the zpool and rebooted the system. The system started without issue. I guess that more or less completes this thread, even though I do not know what caused the issue with my initial setup/configuration.
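For anyone following along, adding the cache after the install was just this (assuming the installer's default pool name, zroot, and the gpt/zfs-cache label from earlier):
Code:
# attach the SSD partition as L2ARC to the root pool
zpool add zroot cache gpt/zfs-cache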

Thanks everyone for your help.
 