ZFS / Samba: performance issue

Arrggg...

Added the external log device... later I rebooted... Exact same thing as earlier (except no error this time): trying to access the ZFS disks with zpool just hangs forever...

What a piece of @#$@#$%@# that thing is...
 
jyavenard said:
Ok, here I'm using a SSD (Intel X25-M drive)

So I would assume that writing speed on those is much greater than the WD 2TB Green RE3

Absolutely -- in the blog they seemed to be using the same 500GB disks for the benchmarking, so that's what I was referring to.

jyavenard said:
Well, here I have a 40GB SSD drive, my plan was to use 8GB for the ZIL partition, and 32GB for the cache..

Well, that's my point, I can either create two slices, or a slice with two partitions.

Now, if Solaris isn't going to be able to read my two slices, I have a problem. Considering the nightmare I went through yesterday when FreeBSD failed miserably after I removed the log device (the only thing that saved me was booting OpenIndiana and re-importing the pool there)... I definitely want my ZFS setup to work with Solaris/OI *just in case*.

So yeah, partitioning an SSD isn't such a performance no-no, because you don't get into a mess with random IOs the way you do with a rotating hard disk.
So the obvious thing is to partition it using GPT partitions, isn't it? I haven't tested that on Solaris so I'm not sure, but it sounds like you already know the answer??
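For reference, something like this should do it with gpart on FreeBSD (just a sketch -- ada1 and the 8G split are assumptions for a 40GB SSD like yours, and I haven't checked how Solaris/OI handles the GPT labels):

Code:
# create a GPT scheme on the SSD and carve out the two partitions
gpart create -s gpt ada1
gpart add -t freebsd-zfs -s 8G -l slog ada1    # ~8GB for the ZIL/slog
gpart add -t freebsd-zfs -l l2arc ada1         # the rest for the L2ARC
# then attach them to the pool by GPT label
zpool add pool log gpt/slog
zpool add pool cache gpt/l2arc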


jyavenard said:
I did, I can't say I saw much difference, timing varied so much with v14, from 30s to 55s so it's hard to say.

Mmm, ok. So you still have something pretty weird going on. You could also try reducing the value of vfs.zfs.txg.timeout (i.e. to 5 or 10 maybe).
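Something along these lines, assuming your build exposes it as a writable sysctl (otherwise set it as a loader tunable):

Code:
# check the current txg timeout (seconds between forced transaction group commits)
sysctl vfs.zfs.txg.timeout
# try a shorter interval
sysctl vfs.zfs.txg.timeout=5
# or persistently, in /boot/loader.conf:
#   vfs.zfs.txg.timeout="5"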
 
I don't know the answer about using slices instead of partitions yet. I've used slices so far; will test soon.

I did follow the instructions someone wrote in reply to my issue

1) Set vfs.zfs.recover=1 at the loader prompt (OK set vfs.zfs.recover=1)
2) Boot into single user mode without opensolaris.ko and zfs.ko loaded
3) ( mount -w / ) to make sure you can remove and also write a new zpool.cache as needed.
4) Remove /boot/zfs/zpool.cache
5) kldload both zfs and opensolaris, i.e. ( kldload zfs ) should do the trick
6) Verify that vfs.zfs.recover=1 is set, then ( zpool import pool )
7) Give it a little while; monitor activity using Ctrl+T.
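In terms of actual commands, the whole thing looks roughly like this (a sketch of my session; 'pool' is obviously my pool name):

Code:
(at the loader prompt)
OK set vfs.zfs.recover=1
(then boot to single user mode, with zfs.ko/opensolaris.ko not loaded)
# mount -w /
# rm /boot/zfs/zpool.cache
# kldload zfs
# sysctl vfs.zfs.recover      (should report 1)
# zpool import pool           (press Ctrl+T while it runs to see activity)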

I followed those instructions, and sure enough, I could import my pool once again..

However, doing:
Code:
zpool export pool
followed by:
Code:
zpool import pool
once again resulted in zpool hanging and requiring the reboot of the server...

Something is just not quite right with the new v28 code :(
 
Just another note on the ZIL. The ZIL is used ONLY for synchronous IO, that is, at the times when the OS or the application wants to be sure things are 'on safe media' (not in memory or caches). Also, the ZIL is a write-specific feature; it doesn't impact reads (*)

Therefore, the ZIL has a relatively small impact on sequential writes, such as large file copies -- unless you do these over NFS, for example: NFS will make sure data is "safe" before telling the client that it was written (unless you use async mounts, but those are a bad idea). When writing to the filesystem over NFS there are lots of SYNC writes -- all of these go via the ZIL. In normal operation, sync writes are typically file creation and other directory operations, metadata updates, etc. Another heavy user of sync writes is database software.

All this means that the bulk of the data does not go via the ZIL, so the write performance of the ZIL has nothing to do with typical sequential IO -- only if your workload is writing over NFS or to database storage will the ZIL be loaded significantly.

(*) However, if you have such a load, writes to the in-pool ZIL will consume IOPs that could be used for reading, and they will also cause the heads to move off to new locations, further slowing things down. So if you have a mixed read/write load, a separate ZIL (slog) will help greatly with read performance as well.
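A simple way to see how much of a given workload actually hits the slog is to watch the log vdev while the workload runs, e.g. (pool name is just an example):

Code:
# per-vdev IO statistics, refreshed every second;
# the "logs" section only shows traffic while sync writes are being issued
zpool iostat -v pool 1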

About the portability of partitions with regard to SLOG/L2ARC: with ZFS versions past v21 there is little trouble here. Even if Solaris does not find your SLOG device, it will still import the pool (with your data there). The same goes for the L2ARC cache. It is very unlikely that you have applications that will run unmodified on both FreeBSD and Solaris, so you will use the "other" OS only to access your files -- and for that it doesn't matter whether the SLOG/L2ARC is available or not.

For earlier pool versions you are in trouble if you lose your SLOG, one way or another.
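So it may be worth double-checking which version a pool is actually running, for example:

Code:
# show the pool's on-disk version
zpool get version pool
# list pools whose version is older than what the system supports
zpool upgrade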

PS: The v28 code is highly experimental. Perhaps some of the console logs will help developers figure out what happened to your pool.
 
danbi said:
PS: The v28 code is highly experimental. Perhaps some of the console logs will help developers figure out what happened to your pool.

The problem is that nothing shows on the console..
Just doing zpool status, or any zpool command for that matter, hangs forever..

I just wrote this on the freebsd-stable list; I've tracked down where the problem comes from. It's not the log as I thought it would be, but the cache drive.

Hi

On 27 December 2010 16:04, jhell <jhell@dataix.net> wrote:

> 1) Set vfs.zfs.recover=1 at the loader prompt (OK set vfs.zfs.recover=1)
> 2) Boot into single user mode without opensolaris.ko and zfs.ko loaded
> 3) ( mount -w / ) to make sure you can remove and also write a new
> zpool.cache as needed.
> 4) Remove /boot/zfs/zpool.cache
> 5) kldload both zfs and opensolaris, i.e. ( kldload zfs ) should do the trick
> 6) Verify that vfs.zfs.recover=1 is set, then ( zpool import pool )
> 7) Give it a little while; monitor activity using Ctrl+T.

Ok..

I've got into the same situation again, no idea why this time.

I've followed your instructions, and sure enough I could do an import of my pool again.

However, I wanted to find out what was going on..
So I did:
zpool export pool

followed by zpool import

And guess what... zpool hung again.. can't Ctrl-C it, had to reboot..

So here we go again.
Rebooted as above.
zpool import pool -> ok

This time, I decided that maybe what was screwing things up was the cache.
zpool remove pool ada1s2 -> ok
zpool status:
# zpool status
pool: pool
state: ONLINE
scan: scrub repaired 0 in 18h20m with 0 errors on Tue Dec 28 10:28:05 2010
config:

        NAME          STATE     READ WRITE CKSUM
        pool          ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            ada2      ONLINE       0     0     0
            ada3      ONLINE       0     0     0
            ada4      ONLINE       0     0     0
            ada5      ONLINE       0     0     0
            ada6      ONLINE       0     0     0
            ada7      ONLINE       0     0     0
        logs
          ada1s1      ONLINE       0     0     0

errors: No known data errors

# zpool export pool -> ok
# zpool import pool -> ok
# zpool add pool cache /dev/ada1s2 -> ok
# zpool status
pool: pool
state: ONLINE
scan: scrub repaired 0 in 18h20m with 0 errors on Tue Dec 28 10:28:05 2010
config:

        NAME          STATE     READ WRITE CKSUM
        pool          ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            ada2      ONLINE       0     0     0
            ada3      ONLINE       0     0     0
            ada4      ONLINE       0     0     0
            ada5      ONLINE       0     0     0
            ada6      ONLINE       0     0     0
            ada7      ONLINE       0     0     0
        logs
          ada1s1      ONLINE       0     0     0
        cache
          ada1s2      ONLINE       0     0     0

errors: No known data errors

# zpool export pool -> ok

# zpool import
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 15.11r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 15.94r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 16.57r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 16.95r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 32.19r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 32.72r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 40.13r 0.00u 0.03s 0% 2556k

Ah ah!
It's not the separate log that makes zpool hang, it's the cache!

Having the cache in prevents the pool from being imported again....

Rebooting: same deal... can't access the pool any longer!

Hopefully this is enough of a hint for someone to track down the bug...
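Until this gets fixed, the only workaround I can see (based purely on the behaviour above) is to drop the cache vdev before exporting and re-add it afterwards:

Code:
zpool remove pool ada1s2       # drop the L2ARC device first
zpool export pool
# export/import now works without the cache attached
zpool import pool
zpool add pool cache ada1s2    # put the cache back afterwards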

This time I started the network and ssh before running zpool, and I can log into the machine without problems.

But any zpool access is the same deal:
Code:
[jeanyves_avenard@ /]$ zpool status

load: 0.00  cmd: zpool 411 [spa_namespace_lock] 3.06r 0.00u 0.00s 0% 2068k
load: 0.00  cmd: zpool 411 [spa_namespace_lock] 3.91r 0.00u 0.00s 0% 2068k
load: 0.00  cmd: zpool 411 [spa_namespace_lock] 4.29r 0.00u 0.00s 0% 2068k

there's a lock in there screwing things up...
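In case it helps whoever looks at this, the kernel stack of the stuck zpool process can presumably be grabbed with procstat (411 being the PID shown in the Ctrl+T output above):

Code:
# dump the kernel stack trace of the hung zpool process
procstat -kk 411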
 
Well, after getting my pool back, I set vfs.zfs.zil_replay_disable to 1, which I assume disables the ZIL... And the result is that it now takes the exact same amount of time as with the ZIL enabled (around 16s still).

So either vfs.zfs.zil_replay_disable doesn't disable the ZIL, or something happened between v14 and v28, as v14+no_zil = 6s, v14+zil = 55s, v28+(zil|nozil) = 16s.
 
Disabling the ZIL replay does not disable the ZIL. It only allows you to run without your SLOG devices present/operational.
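If I remember correctly, what actually turns the ZIL off in the older FreeBSD ZFS (which is presumably how the v14 "no_zil" numbers above were obtained) is a different knob, set as a loader tunable:

Code:
# /boot/loader.conf (pre-v28 ZFS only)
vfs.zfs.zil_disable="1"

My understanding is that this tunable is gone in the v28 code, where per-dataset control via the sync property is supposed to replace it, but I haven't verified that in the current patch set.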
 