FreeBSD 8.0-RC2 and ZFS - slow writes

Hi.
I'm running 8.0-RC2 but I'm getting slow writes to disk:
Code:
DSP         15.8T   488G    697      2  15.7M   244K
DSP         15.8T   488G    499     33  9.91M   798K
DSP         15.8T   488G     64     43  3.89M   646K
DSP         15.8T   488G     85      1  2.86M   174K
This is a really critical problem.
Code:
8.0-RC2 FreeBSD 8.0-RC2 #0: Thu Nov  5 14:01:00 MSK 2009     drug@asrv31.qwarta.ru:/usr/obj/usr/src/sys/MYKERN  amd64
zpool version 13
# cat /boot/loader.conf
Code:
isp_load="YES"
isp_2400_load="YES"
kern.maxusers=2048
zfs_load="YES"
#vm.kmem_size_max="999M"   # these parameters were set under 7.2
#vm.kmem_size="999M"
#vfs.zfs.arc_max="448M"
#vfs.zfs.prefetch_disable=1
#vfs.zfs.zil_disable=1
#vfs.zfs.cache_flush_disable=1
vm.kmem_size_max="1024M"
vm.kmem_size="1024M"
vfs.zfs.arc_max="100M"
# cat /etc/sysctl.conf
Code:
security.bsd.see_other_uids=0
kern.maxvnodes=400000
 
Did you use 8.0-RC1 before, and did you have any trouble with it? I am using 8.0-RC1 (without FC) with ZFS filesystem version 13 and ZFS storage pool version 13, and I haven't seen any bad behaviour with it. The fact that the ZFS filesystem and pool versions did not change between RC1 and RC2 leads me to believe that we have to look for the problem somewhere else.

First, I would try booting without ACPI (oh, I hate it ^^).
Second, diff the RC1 and RC2 versions of the drivers for your FC controller to find out whether something changed.
 
How is your pool configured? What kinds of vdevs are in use (mirror, raidz1, raidz2, etc)? How many disks are in each vdev?

For raidz vdevs, you will get the write performance of a single drive, per vdev. Thus, if your pool has 1 raidz1 vdev of 5 disks, you will get the write performance of 1 disk. If your pool has 2 raidz1 vdevs of 5 disks, you will get the write performance of 2 disks. And so on.
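Purely to illustrate the layouts being described (pool and device names here are made up, not from this thread):
Code:
# one raidz1 vdev of 5 disks -- per the above, writes scale like a single disk
zpool create tank raidz1 da1 da2 da3 da4 da5

# two raidz1 vdevs of 5 disks each -- writes scale like two disks
zpool create tank raidz1 da1 da2 da3 da4 da5 raidz1 da6 da7 da8 da9 da10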

If using raidz vdevs, do *NOT* use more than 9 disks per vdevs. You will get horrible write performance, and you will most likely be unable to replace any drive in it. Scrub and resilver operations will never finish due to disk thrashing.

The sweet spot for raidz2 vdevs seems to be 6 disks. The sweet spot for raidz1 seems to be 4 disks.

And, why are you limiting the ARC to 100 MB? Do you only have 1 GB of RAM? Depending on your workload, you should give up to 1/2 of your RAM to the ARC.
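As a rough sketch only, assuming for the sake of example a machine with 8 GB of RAM (the actual amount isn't stated in this thread), the loader.conf tunables would look more like:
Code:
# hypothetical values for a machine with 8 GB of RAM
vm.kmem_size="8192M"
vm.kmem_size_max="8192M"
# give roughly half of the RAM to the ARC, as suggested above
vfs.zfs.arc_max="4096M"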
 
No raidz at all, just 1 disk in the zpool:
Code:
        NAME        STATE     READ WRITE CKSUM
        DSR         ONLINE       0     0     0
          da1       ONLINE       0     0     0
 
User23 said:
Did you use 8.0-RC1 before, and did you have any trouble with it? I am using 8.0-RC1 (without FC) with ZFS filesystem version 13 and ZFS storage pool version 13, and I haven't seen any bad behaviour with it. The fact that the ZFS filesystem and pool versions did not change between RC1 and RC2 leads me to believe that we have to look for the problem somewhere else.

First, I would try booting without ACPI (oh, I hate it ^^).
Second, diff the RC1 and RC2 versions of the drivers for your FC controller to find out whether something changed.
No, before 8.0-RC2 it was 7.2.
 
Uhm, let me ask one stupid question...

How did you figure out that your write speeds are "low"?

Code:
DSP         15.8T   488G    697      2  15.7M   244K 
DSP         15.8T   488G    499     33  9.91M   798K 
DSP         15.8T   488G     64     43  3.89M   646K 
DSP         15.8T   488G     85      1  2.86M   174K

This is just the output from:

Code:
zpool iostat

like

Code:
zpool iostat
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
backup     2.61T  2.83T    166     33  2.61M  2.09M

right?

This output says nothing about max read/write performance! It just tells you about the usage over some unspecified amount of time. Please use iozone or something like that to figure out your write speeds. And please post the results :)
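If iozone isn't installed, even a crude dd run says more about sequential write speed than zpool iostat does (the path is a placeholder, assuming the pool is mounted at /DSP; note that /dev/zero gives inflated numbers if compression is enabled):
Code:
# write an 8 GB file and time it
dd if=/dev/zero of=/DSP/testfile bs=1m count=8192
# read it back
dd if=/DSP/testfile of=/dev/null bs=1m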
 
My ZFS pool is 97% full.
Is this true for FreeBSD as well:
Keep pool space under 80% utilization to maintain pool performance. Currently, pool performance can degrade when a pool is very full and file systems are updated frequently, such as on a busy mail server. Full pools might cause a performance penalty, but no other issues. If the primary workload is immutable files (write once, never remove), then you can keep a pool in the 95-98% utilization range. Keep in mind that even with mostly static content in the 95-98% range, write, read, and resilvering performance might suffer.
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
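For reference (the output below is illustrative, reconstructed from the zpool iostat numbers earlier in the thread, not pasted from the system), the utilization the guide talks about shows up in the CAP column of zpool list:
Code:
# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
DSP   16.3T  15.8T   488G    97%  ONLINE  -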
 
Of course, it is true for any filesystem on any OS, as far as I know.

And I've noticed ZFS fragmentation myself; it causes a lot of seeks, which lowers the speed of spindle drives. The solution is to keep enough free space to prevent heavy fragmentation. ZFS could use some online defrag as well.
 
sub_mesa, AFAIK ZFS deliberately writes to other blocks when a disk write occurs (copy-on-write). This is what allows snapshot data to be kept in the previously used blocks.
 
"Block pointer rewrite" is the golden key that will unlock a lot of long-desired features for ZFS (raidz migration from X-disks to X+Y disks; online defragmentation; removal of top-level vdevs; etc).

Unfortunately, it's been "in development" and "almost ready" for a couple of years now. :( Progress is ongoing ... it's just not ready for consumption yet.
 
ZFS write speed patch

You seem to have less than 30% free space on your pool.

Please read the following mailing list posting for more information:
http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019346.html

And you can try the following patch:
http://people.freebsd.org/~mm/patches/zfs/zfs_metaslab.patch

If you are using 9-CURRENT, just apply the patch:
Code:
cd /usr/src
patch -p0 < /path/to/zfs_metaslab.patch

If you are using 8-STABLE, you also need:
http://people.freebsd.org/~mm/patches/zfs/v15/stable-8-v15.patch

Code:
cd /usr/src
patch -p0 < /path/to/stable-8-v15.patch
patch -p0 < /path/to/zfs_metaslab.patch

If you are using 8.1-RELEASE, you need the metaslab patch and:
http://people.freebsd.org/~mm/patches/zfs/v15/releng-8.1-zfsv15.patch

Code:
cd /usr/src
patch -p0 < /path/to/releng-8.1-zfsv15.patch
patch -p0 < /path/to/zfs_metaslab.patch

Then rebuild and reinstall your kernel and userland:
Code:
cd /usr/src
make buildkernel
make buildworld
make installkernel
make installworld
Don't upgrade your pools to v15, otherwise there is no way back to running without the v15 patch. Also, if you want to boot from a ZFS v15 pool, you have to upgrade the boot code.
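To see which on-disk version a pool is currently at, and what the installed ZFS code supports, before deciding anything (DSP being the pool name from the first post):
Code:
zpool upgrade          # lists the supported version and any pools running older versions
zpool get version DSP  # shows the on-disk version of this pool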
 
First things first: this thread is a year old, so pursuing it doesn't seem all that relevant.

Second, if you don't have an advanced understanding of the topic, perhaps try not making authoritative statements like this:
eyellowbus said:
Turning the compression on will cause a performance penalty.
For a reasonably powered system, that statement is not only false; the exact opposite is true.

A CPU is orders of magnitude faster than disks, even SSDs. You want to put your load where you have excess capacity, not where the greatest bottleneck is. For many systems under heavy write queues, ZFS compression will improve performance because it reduces some of the I/O load at the cost of some CPU, where you likely have plenty to spare.

There are many benchmarks proving this, but you really need to use lzjb compression to see the best results.

One such example:
http://don.blogs.smugmug.com/2008/10/13/zfs-mysqlinnodb-compression-update/
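For anyone who wants to try it, compression is a per-dataset property; the dataset name below is just a placeholder:
Code:
# enable lzjb compression; only data written from now on gets compressed
zfs set compression=lzjb tank/data
# check the setting and how much space it actually saves
zfs get compression,compressratio tank/data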
 
phoenix said:
For raidz vdevs, you will get the write performance of a single drive, per vdev. Thus, if your pool has 1 raidz1 vdev of 5 disks, you will get the write performance of 1 disk.

Hi, what's your basis for this assertion? I just ran a little test on a 4-disk raidz1 pool and I see writes evenly distributed across all physical drives (as you would expect on a standard RAID5 system). Additionally, I see significantly improved performance over a 2-disk mirror (admittedly not with identical disks, but a similar spec).
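(For what it's worth, the per-disk write distribution can be watched while a test runs; both commands below are standard FreeBSD tools:)
Code:
zpool iostat -v 1   # per-vdev and per-disk statistics, refreshed every second
gstat               # GEOM-level I/O statistics per provider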

Thanks, Andy.
 
I take my info from the ZFS devs. :) Go read through the zfs-discuss mailing lists. This comes up several times a year. raidz is good for bulk storage as it "wastes" the least amount of raw storage; but the dynamic striping used in raidz limits the IOps of the vdev to that of the slowest drive in the vdev.
 
I still have problems understanding this.
IOPS aside, is the read/write speed (measured in bytes transferred) the same for a single disk as for a 5-disk raidz?
 
phoenix said:
I take my info from the ZFS devs. :) Go read through the zfs-discuss mailing lists. This comes up several times a year. raidz is good for bulk storage as it "wastes" the least amount of raw storage; but the dynamic striping used in raidz limits the IOps of the vdev to that of the slowest drive in the vdev.

The IO being limited by the slowest drive makes sense, and I think it would normally be true for any RAID5 implementation. That, though, is not the same as the overall IO performance of the RAID set being equal to a single drive. I had a quick search on zfs-discuss and found this quote:

"Raidz is definitely made for sequential IO patterns not random. To get good random IO with raidz you need a zpool with X raidz vdevs where X = desired IOPS/IOPS of single drive."

Which is basically saying that random IO is poor, but it isn't saying that, generally speaking, the IO of a RAIDZ set is equal to a single disk. I didn't find anything specifically talking about write performance. I would guess that with ZFS COW, performance will decrease over time as IO becomes more random to avoid overwriting old data, but that would also impact mirrored vdevs.

Ta, Andy.
 
If you have, say, 4 drives, you may configure these either as striped mirrors (raid10) or as raidz (raid5). With the raidz configuration, you will have both better 'general io' performance and more disk space. With striped mirrors, you get worse 'general io' and less disk space.
However, with the mirrors you get twice the write IOPs :)

Most people benchmark their filesystem by copying files over. This is considered 'general io' and is essentially sequential single-threaded IO. raidz will always perform better than a mirror in this scenario. When you hit the IOPs limits of the drives, things get different. Due to the variable stripe size, the performance is also very unpredictable. Fragmentation also plays a role here.
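To actually see the difference described above, something like iozone can separate the sequential and random cases (the file name and sizes here are arbitrary):
Code:
# sequential write/read
iozone -i 0 -i 1 -r 128k -s 4g -f /tank/iozone.tmp
# random read/write, where the per-vdev IOPs limit of raidz starts to show
iozone -i 0 -i 2 -r 8k -s 4g -f /tank/iozone.tmp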

Mirrors, on the other hand, are very predictable in performance. Each mirror (of the same disks) gives you XX IOPs; adding another mirror vdev adds another XX IOPs. This holds for reading and writing alike.

With ZFS you may also add the L2ARC and SLOG vdevs, that will change the picture dramatically.
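(Device names below are hypothetical; a fast SSD is the usual choice for both:)
Code:
# add an L2ARC (read cache) device to pool "tank"
zpool add tank cache ada2
# add a separate log (SLOG) device for the ZIL
zpool add tank log ada3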

PS: At one point I believed that mirrors are always faster, but real-life experiments proved me wrong. Still, it all depends on the intended usage. For mostly storage-oriented applications, raidz would be better. For database-type applications, mirrors are definitely what you want, unless the required IOPs are low.
Given enough disks, ZFS lets you have the best of all worlds :)
 
Galactic_Dominator said:
Second, if you don't have an advanced understanding of the topic, perhaps try not making authoritative statements like this:

I think I should clarify the statement I made: enabling compression may cause a performance penalty. I meant on a general, non-high-end CPU. A lot of people set up NAS+ZFS using years-old machines. In that case, enabling compression will definitely cause performance issues because the CPU is not that powerful.

For example, my ZFS system is based on an AMD64 4600 CPU (dual core). After I disabled compression, I saw a huge improvement in I/O speed (25 MB/s -> 55 MB/s).
 