SSD drives on FreeBSD

If you have any experience with how SSDs work and behave on FreeBSD, please write it here. Specifically, I'm interested in how they work with different filesystems. I know the popular approach is to use an SSD as a cache device for ZFS, but how would it perform if it actually held the ZFS pool? Does ZFS support TRIM? Or just UFS?

Is ahci all that's needed for TRIM support? Is there any write degradation over time when using it with a UFS filesystem? I've heard lots of speculation about write limits, but I would be curious to hear actual anecdotes from people who have run into these problems.

I was planning to use the SSD for pretty much everything except /home and whatever else ends up under a ZFS pool. Would it be wise to run swap, a swap-backed /tmp, and /var on the SSD? What about constantly building stuff from sources? How would all that affect the SSD's lifespan?

So please, if you have any experience with these cool devices, let us know!
 
bbzz said:
Does ZFS support TRIM? Or just UFS?
To my knowledge neither do, but newfs(8) does have a "-E" parameter to have it trim the device when a filesystem is created.
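
For what it's worth, a minimal example of that -E usage (the device name here is just an illustration, adjust it for your layout):
Code:
# erase (TRIM) the whole device while creating a new filesystem with soft updates
newfs -E -U /dev/ada0p2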


bbzz said:
Would it be wise to run swap, a swap-backed /tmp, and /var on the SSD?
I don't know if it's wise, but that is what I do. If your system is swapping heavily, I'm sure it is a bad idea, but nowadays with RAM so abundant, heavy swapping is quite an exception.
 
aragon,
I stumbled on a post you made several months ago where you tested the effects of proper alignment. Good stuff. Surprised nobody commented on it.
So with UFS+ZFS on an SSD, you should align inside the BSD label for best performance, and then the ZFS partition that will be used for the pool should itself be aligned, correct?
I'll be getting a couple of Intel SSDs for my desktop/laptops, seeing that the whole point of getting an SSD for that use is the random 4K reads they are supposedly best at.

Anyway, please feel free to share other intricacies you might have found.
 
bbzz said:
So with UFS+ZFS on an SSD, you should align inside the BSD label for best performance,
Aligning inside the BSD label is the easiest because it can be done without having to also align on the legacy BIOS track boundaries. With MBR partitions your alignments have to meet two constraints: 1 MiB boundaries and 32256 byte boundaries (63*512).

bbzz said:
and then the ZFS partition that will be used for the pool should itself be aligned, correct?
Yes.

bbzz said:
Anyway, please feel free to share other intricacies you might have found.
For partitioning you might find another recent post of mine (post #133728) useful.

SSDs are great in that you can partition them up and use them for multiple tasks without much of a performance penalty from the resulting random accesses. Unfortunately partitioning is a bit of a pain. GPT is immature, and MBR is very, very old. :)
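
Just to illustrate the sort of alignment I mean, here is a rough sketch using gpart(8) with GPT on a recent FreeBSD that supports the -a flag (device name, sizes and labels are only examples):
Code:
# create a GPT scheme and add partitions aligned to 1 MiB boundaries
gpart create -s gpt ada0
gpart add -t freebsd-ufs -a 1m -s 20g -l ssd-ufs ada0
gpart add -t freebsd-zfs -a 1m -l ssd-zfs ada0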
 
@bbzz

I use an Intel SSDSA2M160G2GC (X25-M) 160GB for FreeBSD in my laptop: / is on UFS (512 MB) and all the rest is in a ZFS pool. It works like a charm with ahci.ko, showing up as ada0. I do not use swap (8GB RAM here), but I also do not 'limit' writes to the disk by disabling syslog or the like, and I did not even care to check whether TRIM is supported. Some benchmarks below:

Code:
# [color="Blue"]diskinfo -c -t -v ada0[/color]
ada0
        512             # sectorsize
        160041885696    # mediasize in bytes (149G)
        312581808       # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        310101          # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        CVPO01160261160AGN      # Disk ident.

I/O command overhead:
        time to read 10MB block      0.046067 sec       =    0.002 msec/sector
        time to read 20480 sectors   1.900484 sec       =    0.093 msec/sector
        calculated command overhead                     =    0.091 msec/sector

Seek times:
        Full stroke:      250 iter in   0.025751 sec =    0.103 msec
        Half stroke:      250 iter in   0.026163 sec =    0.105 msec
        Quarter stroke:   500 iter in   0.052073 sec =    0.104 msec
        Short forward:    400 iter in   0.040653 sec =    0.102 msec
        Short backward:   400 iter in   0.040956 sec =    0.102 msec
        Seq outer:       2048 iter in   0.077597 sec =    0.038 msec
        Seq inner:       2048 iter in   0.103460 sec =    0.051 msec
Transfer rates:
        outside:       102400 kbytes in   0.444337 sec =   230456 kbytes/sec
        middle:        102400 kbytes in   0.443060 sec =   231120 kbytes/sec
        inside:        102400 kbytes in   0.438879 sec =   233322 kbytes/sec

Code:
# [color="#0000ff"]time find / 1> /dev/null 2> /dev/null[/color]
find / > /dev/null 2> /dev/null  0.44s user 3.83s system 52% cpu 8.158 total

# [color="#0000ff"]df -m[/color]
Filesystem      1M-blocks   Used Avail Capacity  Mounted on
/dev/label/root       495     71   424    14%    /
storage/usr        144907 127316 17591    88%    /usr

Code:
# [color="#0000ff"]blogbench -i 10 -d BLOG[/color]

Frequency = 10 secs
Scratch dir = [BLOG]
Spawning 3 writers...
Spawning 1 rewriters...
Spawning 5 commenters...
Spawning 100 readers...
Benchmarking for 10 iterations.
The test will run during 1 minutes.

  Nb blogs   R articles    W articles    R pictures    W pictures    R comments    W comments
        58       403642          2804        290140          3172        229064          6738
        60       276701           121        195970            93        185351          5240
        60       273753            11        193679             7        218322          4016
        60       300775            21        212174            10        252008          2029
        64       285494           246        202465           221        246846          1833
        64       296025            17        206478            11        246857          2959
        64       293819            19        207351             9        250991          2423
        64       274489             9        193715             4        253598          4781
        70       303173           327        215951           393        263001          2063
        70       295326            22        157769            13        263545          2345

Final score for writes:            70
Final score for reads :         65929
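
By the way, ahci(4) is all I needed driver-wise; on releases where it is not compiled into the kernel it can simply be loaded from /boot/loader.conf, which is what gives you the ada(4) device names above:
Code:
# /boot/loader.conf
ahci_load="YES"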
 
bbzz said:
If you have any experience with how SSDs work and behave on FreeBSD, please write it here. Specifically, I'm interested in how they work with different filesystems. I know the popular approach is to use an SSD as a cache device for ZFS, but how would it perform if it actually held the ZFS pool? Does ZFS support TRIM? Or just UFS?

UFS supports TRIM in more recent builds of FreeBSD, but not in existing releases. ZFS does not, although I believe it's being worked on specifically for FreeBSD.
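
On a build where the UFS TRIM support is present, it is a per-filesystem flag; roughly something like this (device name is only an example, and you should first check that the drive advertises TRIM at all):
Code:
# does the drive report TRIM support?
camcontrol identify ada0 | grep TRIM
# enable the TRIM flag when creating a new filesystem...
newfs -t -U /dev/ada0p2
# ...or on an existing, unmounted one
tunefs -t enable /dev/ada0p2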

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide

This gives some nice info on designing your ZFS + SSD system.

Performance varies a lot across SSDs, so sweeping generalizations are often inaccurate. One thing that's pretty consistent is power consumption. IME, cheap SSDs are great for laptops: they cost about as much as a new battery, will greatly extend the life of your old one, and you get all the other SSD benefits on top.

One other tidbit not mentioned a lot is that SSDs actually have an exceptionally long write lifespan. Even one of the cheap MLC 10,000-write SSDs will outlast a normal laptop by decades. You can make it easier for your drive to operate effectively by reserving a portion of the disk, say 10%, and never utilizing it. This allows static wear leveling to achieve maximum efficiency. Using your SSD's maximum capacity is a bad idea.
 
Another forum I read daily has numerous posts about SSDs suddenly becoming inoperable after a short while and needing an RMA (if allowed), with all data lost. I only read the thread titles; there were, and are, too many to read daily. So if you rely on an SSD, I'd at least search around about its reliability, data recovery and so on, if not research beforehand which make/model is least likely to fail. (That keeps me from even thinking about SSDs; those posts only started appearing maybe a year or two after the SSD threads began, as I vaguely recall.) So one should have backups... though it may just be one or two active threads a day; I have to read those titles much too quickly to remember precisely.
 
I use two Intel 40GB SSDs for my root pool, zroot, and an 80GB Intel SSD I picked up cheaply (less than I paid for the 40GB ones) as L2ARC cache for storage, an HDD-based pool. Both pools are mirrors.

On the HDD pool I put /home, /usr/ports/distfiles and /usr/ports/packages, in addition to an interim backup of everything that is on zroot. Everything else is on the SSDs, if that makes sense.

I love SSDs. They're probably the coolest thing to happen to computing since the Pentium. At this stage I will only use Intel, which appears to be the most reliable vendor based on Newegg reviews. I've now got SSD boot disks in all my systems, and they are all so much nicer than HDD boot disks.

I do not even worry about TRIM support for my root mirror. The data on it hardly changes. I also don't worry about longevity. I have enough RAM now that my swap doesn't get used. At this rate my root mirror will probably outlive me assuming I don't replace it first.
 
Thanks for all the posts; thanks vermaden for that test, looks promising :e
Unfortunately, my two SSDs are going to have to be low budget, since I need to renew other components as well.
The choice came down to either the OCZ Vertex 2 50GB or the Intel X25-V 40GB. I don't really care about the extra 10GB as much as I care about actual application responsiveness, hence I was looking at 4K random reads. It seems Intel is better there. On the other hand, it has slow sequential writes (~35MB/s), which is not that bad, nor does it matter much. It may matter, though, if that figure additionally degrades over time.
Read performance, on the other hand, shouldn't be affected?

So, not sure. What's your take on it? The price is literally the same.

I read somewhere that RAID disables TRIM by default anyway, which is one of the reasons ZFS can't do it, apart from how it's designed in the first place. At the same time, supposedly RAID doesn't need TRIM as much, since the drive does a sort of garbage collection on its own while idle... Don't quote me on that, though.

@carlton_draught

I'm going to set it up something like that. Things that would benefit from a faster disk, i.e. kernel/world builds, /var, /tmp, will go on the SSD.

@jb_fvwm2

Yes, I heard that too. Basically, old-fashioned platters are still more reliable. In my case I don't care, since the important data is on the zpool; this would be just for the system. A serious server should mirror it, of course.
 
bbzz said:
Thanks for all the posts...
I've been running the Intel SSDs as boot drives for a year now and I notice no degradation. Not sure this is a valid test, but anyway:
# dd if=/dev/ada1 of=/dev/null bs=10m
Code:
3816+1 records in
3816+1 records out
40020664320 bytes transferred in 213.738392 secs (187241347 bytes/sec)
What's that, 178MB/s? Over spec of 170MB/s, anyway.

Intel says:
Intel said:
(TRIM) allows the operating system to inform the solid-state drive which data blocks (e.g. from deleted files) are no longer in use and can be wiped internally allowing the controller to ensure compatibility, endurance, and performance.

Hence if your root mirror basically behaves in a normal way, there will not be much deleting going on, and there will be lots of spare space (basically half, not even counting swap, which in my system doesn't get used).

If you set it up like I do, you will have plenty of space left on your SSDs and they should not get full. See:
# zfs list

Code:
NAME                                   USED  AVAIL  REFER  MOUNTPOINT
storage                                145G   148G    22K  /storage
storage/distfiles                     3.54G   148G  2.82G  /usr/ports/distfiles
storage/home                          77.8G   148G  56.0G  /home
storage/packages                      6.54G   148G  6.30G  /usr/ports/packages
storage/zrootbackup                   57.4G   148G  6.44G  /storage/zrootbackup
storage/zrootbackup/zroot             51.0G   148G  1.05G  /storage/zrootbackup/zroot
storage/zrootbackup/zroot/tmp         2.20M   148G  76.5K  /storage/zrootbackup/zroot/tmp
storage/zrootbackup/zroot/usr         41.0G   148G  8.75G  /storage/zrootbackup/zroot/usr
storage/zrootbackup/zroot/usr/ports   15.8G   148G   314M  /storage/zrootbackup/zroot/usr/ports
storage/zrootbackup/zroot/usr/src     1.19G   148G   308M  /storage/zrootbackup/zroot/usr/src
storage/zrootbackup/zroot/var         3.68G   148G  26.8M  /storage/zrootbackup/zroot/var
storage/zrootbackup/zroot/var/crash    112K   148G  21.5K  /storage/zrootbackup/zroot/var/crash
storage/zrootbackup/zroot/var/db      3.36G   148G   675M  /storage/zrootbackup/zroot/var/db
storage/zrootbackup/zroot/var/db/pkg   144M   148G  24.9M  /storage/zrootbackup/zroot/var/db/pkg
storage/zrootbackup/zroot/var/empty     96K   148G    20K  /storage/zrootbackup/zroot/var/empty
storage/zrootbackup/zroot/var/log     52.0M   148G  1.21M  /storage/zrootbackup/zroot/var/log
storage/zrootbackup/zroot/var/mail     510K   148G  23.5K  /storage/zrootbackup/zroot/var/mail
storage/zrootbackup/zroot/var/run     1.74M   148G   117K  /storage/zrootbackup/zroot/var/run
storage/zrootbackup/zroot/var/tmp     16.1M   148G  1.61M  /storage/zrootbackup/zroot/var/tmp
zroot                                 12.3G  16.2G  1.06G  legacy
zroot/tmp                             4.43M  16.2G  74.5K  /tmp
zroot/usr                             10.2G  16.2G  8.60G  /usr
zroot/usr/ports                       1.05G  16.2G   321M  /usr/ports
zroot/usr/src                          309M  16.2G   309M  /usr/src
zroot/var                             1.02G  16.2G  26.8M  /var
zroot/var/crash                       20.5K  16.2G  20.5K  /var/crash
zroot/var/db                           910M  16.2G   698M  /var/db
zroot/var/db/pkg                      34.3M  16.2G  25.5M  /var/db/pkg
zroot/var/empty                         20K  16.2G    20K  /var/empty
zroot/var/log                         6.45M  16.2G  1.21M  /var/log
zroot/var/mail                         372K  16.2G  24.5K  /var/mail
zroot/var/run                          819K  16.2G  92.5K  /var/run
zroot/var/tmp                         8.54M  16.2G  1.97M  /var/tmp

bbzz said:
@carlton_draught

I'm going to set it up something like that. Things that would benefit from a faster disk, i.e. kernel/world builds, /var, /tmp, will go on the SSD.
I have a set of articles on installing, backing up and restoring such a system, along with the scripts to do everything, already written and ready to post. All I'm waiting on is clarification of licensing for the install scripts, which are derived from the article linked in that post.

bbzz said:
Yes, I heard that too. Basically, old-fashioned platters are still more reliable. In my case I don't care, since the important data is on the zpool; this would be just for the system. A serious server should mirror it, of course.
I strongly disagree that platters are more reliable if Intel drives are the basis of comparison. Have a look at this, or at this. Find me one HDD with ratings that good (using the share of reviews with 3 stars or fewer as a proxy for dead drives): <= 8%. I don't think you will find a modern HDD with less than double that.
 
Interesting; is there any performance hit from running two zpools? I was under the impression that it would be silly to run two zpools.

About disks and reliability: I think SSDs have the potential to be more reliable, but HDDs are the more mature technology. So, I don't know; I think time will tell in the end.
What about SSDs and how long they can store data? Static charges dissipate over time, right?
 
bbzz said:
About disks and reliability: I think SSDs have the potential to be more reliable, but HDDs are the more mature technology.
The technology powering SSDs has been around for nearly 30 years and has been in widespread use for over 20. How long do you need to feel comfortable?
bbzz said:
What about SSDs and how long they can store data? Static charges dissipate over time, right?
Sure, but are you asserting that magnetic media doesn't suffer from the same degradation? There are standards for the media, so you can plan a strategy around them; see JESD22-A117B. Anyway, that's what ZFS is for. Anecdotally, ZFS scrubs turn up periodic checksum errors on 500GB HDDs. AFAIK, there isn't any media really suitable for long-term storage.
 
Galactic_Dominator said:
The technology powering SSDs has been around for nearly 30 years and has been in widespread use for over 20. How long do you need to feel comfortable?

I don't think it's that simple. I understand that the basis for what we have today has been known for quite some time. I simply meant that SSDs still haven't seen widespread use, compared to HDDs, for long enough to reach every conclusion quickly. But what do I know: I'm not running high-end storage data centers; I'm merely restating what other experts have said about this.
 
bbzz said:
I'm merely restating what other experts have said about this.

Anonymous "experts" are FUD. Please share what your sources are so everyone can decide if they are indeed experts and if their current storage strategies need to be reevaluated. For me, I trust in the engineering process and review standards that allow these devices to be created and sold on a massive scale. If you have evidence these devices don't meet the specs, or cause valid concern in some other area please don't keep it secret.
 
Like I said, the basic premise is that SSDs haven't been in widespread use long enough to know all the intricacies; nobody can argue with that. As for experts, a little digging on the net turns up some interesting debates.
I see you prefer SSDs, and that's cool. No need to make a big deal out of this, really.

Thanks again for everyone's suggestions here.
 
bbzz said:
Interesting; is there any performance hit from running two zpools? I was under the impression that it would be silly to run two zpools.
I'm not sure why there would be a performance hit, provided that you aren't making poor choices in the process (e.g. if you aren't using sufficient redundancy, or you put the wrong things on SSD). I certainly haven't detected any.

Since the size of a mirror is limited by the size of its smallest drive, it doesn't make sense to pair an HDD in the same pool with an SSD, outside of using an SSD as a dedicated cache for the HDD pool. But in many applications you want a lot of cheap drive space and can live with slow random access times, which is why you want an HDD pool. At the same time, your applications get the speed boost from running on SSD. E.g. reading a document from an HDD-based /home is going to be quick enough because it's just one file, while opening LibreOffice, which presumably involves reading many, many small files, will be sped up by being on SSD.

bbzz said:
About disks and reliability: I think SSDs have the potential to be more reliable, but HDDs are the more mature technology. So, I don't know; I think time will tell in the end.
What about SSDs and how long they can store data? Static charges dissipate over time, right?
I think the drives have been out long enough not to worry about that. The drive I linked to has been out for a year and a half, and still only 7% of its reviews are 3 stars or below. That's amazing, considering that pretty much any HDD would be over 20% in that time. I would think that if holding a charge were going to be an issue over the typical 5-year life of a drive, it would start rearing its head soon.

The other thing to consider is what you are using your SSDs for and how you are using them in a ZFS-based system. ZFS is designed for the world we are entering: a world of cheap drives. Redundancy should be a given. You should be regularly scrubbing your pools to detect errors, and you should swap out drives that give you errors. This means the chance of a drive error actually causing an unrecoverable error in your pool will be very low.
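
For what it's worth, the scrubbing itself is trivial; with the pool names from the listing above it is just:
Code:
# start a scrub on each pool, then check for errors once it completes
zpool scrub storage
zpool scrub zroot
zpool status -v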

Consider that if you set up your system similar to the way I've set things up (with an interim backup of the root-mirror SSD pool on the HDD-based pool), you can restore your SSD mirror if it dies for whatever reason. And I back up basically everything on the storage mirror to offline HDD mirror pools, so even if the whole system is toast I can still restore everything, even zroot.
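
An interim backup like that boils down to a recursive snapshot plus send/receive; a rough sketch (the snapshot name here is only an illustration):
Code:
# snapshot the root pool and replicate it into the HDD pool
zfs snapshot -r zroot@interim
zfs send -R zroot@interim | zfs receive -dFu storage/zrootbackup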

Also realize that if you are using an SSD for L2ARC (cache) duty, it only needs to be fast. There is no reason other than convenience for it to be reliable, because everything stored on the cache SSD is also stored on the pool itself. Everything in the cache is checksummed too, so if on read the block data in the cache doesn't match the checksum, ZFS knows to go back to the pool for the correct data.
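
For reference, attaching or removing a cache device is a one-liner (the partition label below is hypothetical):
Code:
# add an SSD partition as L2ARC for the HDD pool, and later remove it again
zpool add storage cache gpt/l2arc0
zpool remove storage gpt/l2arc0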
 
carlton_draught said:
I'm not sure why there would be a performance hit, provided that you aren't making poor choices in the process (e.g. if you aren't using sufficient redundancy, or you put the wrong things on SSD). I certainly haven't detected any.

I was thinking more along the lines of extra memory consumption by ZFS when there are several pools. I guess that is not the case, then. Some very useful info here as well, thanks.
 
Looks like I'm late to the party on this thread but for what it's worth here are my less technical two cents on SSD's and FreeBSD.

I'm running a Corsair CMFSSD-128D1 2.0 on FreeBSD 9.0 amd64 and love it. The install was fast and I haven't run into any problems yet. I had the same drive running Win 7, Ubuntu, and XP a while ago and had nothing but issues with lost data on Win 7 and Ubuntu; I looked into the drive and it still seems to be good. Since putting FreeBSD 9.0 on it I haven't had any of the same issues. I'm not running anything special on it, just the standard guided install options. I'm sure there are ways to boost performance, but I honestly don't care. I'm working with 4GB of RAM and a 2.5 GHz dual-core Intel processor.

The boot time with FreeBSD didn't change much, but that's because FreeBSD does more on boot than other OSes (bringing your WiFi online, etc.). Once XDM is up and I log in, pretty much everything seems instant when running software. LibreOffice and other heavy apps that used to take a while are running much faster as well.
 
Galactic_Dominator said:
One other tidbit not mentioned a lot is that SSDs actually have an exceptionally long write lifespan. Even one of the cheap MLC 10,000-write SSDs will outlast a normal laptop by decades. You can make it easier for your drive to operate effectively by reserving a portion of the disk, say 10%, and never utilizing it. This allows static wear leveling to achieve maximum efficiency. Using your SSD's maximum capacity is a bad idea.

Would you please expand on this? Do you mean that when partitioning, 10% of the space should be left unpartitioned? Thanks.
 
lele said:
Would you please expand on this? Do you mean that when partitioning, 10% of the space should be left unpartitioned?

He means space that is unused from the SSD's point of view, i.e. never written, or erased with TRIM or SECURITY ERASE. An SSD's wear-leveling algorithms reserve some amount of free space to equalize media wear: the more free space is available, the better the algorithms should work, and the faster and more reliable the device should be. TRIM does this dynamically, but a static allocation may also help.
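
In practice that can be as simple as leaving roughly 10% of the drive unpartitioned (assuming that area was never written to, or was secure-erased first). A made-up example for a 40GB SSD:
Code:
# use about 36 GB and leave the rest of the drive untouched
gpart create -s gpt ada0
gpart add -t freebsd-ufs -a 1m -s 36g -l ssd0 ada0
# the remaining ~4 GB is never partitioned, so static wear leveling can use it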
 
mav@ said:
He means space that is unused from the SSD's point of view, i.e. never written, or erased with TRIM or SECURITY ERASE. An SSD's wear-leveling algorithms reserve some amount of free space to equalize media wear: the more free space is available, the better the algorithms should work, and the faster and more reliable the device should be. TRIM does this dynamically, but a static allocation may also help.

Thanks for the explanation. What does that mean in layman's terms? Would leaving such space unpartitioned suffice? I suppose that unpartitioned space never gets touched.
 
My unverified impression is that UFS does not write to every block during a newfs(8). If true, those blocks are free for wear leveling, and unused filesystem space is as good a source of unused blocks as unpartitioned space. The difference is that it's easily available for use if you need it. Remember that UFS typically hides 8% of filesystem space anyway.
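
(If you are curious, the reserved percentage on an existing filesystem can be checked with tunefs(8); the device name below is just an example.)
Code:
# print the current tunables, including the minimum free space percentage
tunefs -p /dev/ada0p2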
 
wblock@ said:
My unverified impression is that UFS does not write to every block during a newfs(8). If true, those blocks are free for wear leveling, and unused filesystem space is as good a source of unused blocks as unpartitioned space. The difference is that it's easily available for use if you need it. Remember that UFS typically hides 8% of filesystem space anyway.

So, basically you don't have to think about it, just use UFS and be done with it, right?
 
So far, that's what I've done. Well, I mean I did think about it, but I have used UFS without leaving unpartitioned space. I do use TRIM, though.
 